On AI Narration

Since my eyes got bad a little over a year ago, I’ve had to listen to books rather than read. (It’s frustrating.) Having listened to a couple of hundred audiobooks over that span, I have a pretty decent idea of what constitutes good narration.

As an aside, about six months ago I got an email from Google Play inviting me to be one of their beta testers for AI (Artificial Intelligence) book narration. I’d just paid about $3,500 (!) to convert Reinventing Herself into an audiobook and thought perhaps this would be a less expensive way to get one of my other fiction books into that format. I don’t remember my specific objections to the contract, but I do remember thinking it wasn’t favorable to authors, so I nixed that idea.

Sometime last week, I read that Apple Books had quietly introduced AI Narration into their library. Out of curiosity, I looked into it. It took me a bit to find a genre I liked and a storyline I was willing to potentially waste some money on. There were a few free options but I wasn’t interested in the book descriptions. I finally settled on a crime fiction novel for $2.99. I won’t name the book or author – were I they, I’d be embarrassed.

Anyhoo, five minutes in, I could tell I wouldn’t be buying another AI-narrated book for some time. The (male) voice was pleasant but the pacing! I was hearing periods where there are probably commas or semicolons.

Then it became obvious AI hasn’t learned all the nuances of the English language. It definitely has difficulty with compound words: boyfriend is articulated as boy friend, seagull is sea gull, etc. The really obvious faux pas: roly-poly was pronounced as roly-polly. (I get where it got that pronunciation from but…)

It also had difficulty with one of the characters’ names. I don’t have the written word to check, but I believe his name is Captain Coate. There was no consistency to the pronunciation – sometimes it would be “Coat” (probably correct), sometimes it would be “Coaté.”

The last issue, and probably second only to the pacing, was the lack of differentiation in the characters’ voices. When you read a book, your imagination translates the written word into something you hear in your brain. You give the characters different voices. When listening to a book, it’s up to the narrator (or reader, or performer – whichever term you prefer) to make that translation for you. This AI did not do that. It simply read the book. There were a few places where it seemed to be attempting different voices, but it wasn’t consistent and they really weren’t all that dissimilar. That, at times, makes it difficult to follow conversations when there is no dialogue tag.

I finished the book because it’s a damned fine story but my verdict: AI Narration is Not Ready for Prime Time.