AI audiobooks – are we there yet?
If you’ve ever looked into creating an audiobook for your novel, you’ll already know it doesn’t come cheap. The easiest way is to create an account at Audible-backed ACX. There you upload text, set a budget and audition narrators. In theory, you can agree to pay little or nothing for the job and instead offer a share of profits, but don’t count on getting any takers. You’re likely not a bestselling author who can guarantee big numbers and make this worthwhile.
To record, produce and post-produce an hour of audio (to the standards demanded by Audible and expected by any listener), you’re looking at between four and six hours of work. Most books run to between seven and eleven hours. For a longer book, that’s well over a week of solid work for a narrator (and their producer). At the going rate of £50 – £100 per finished hour of audio, the bill runs from several hundred pounds to more than a thousand. And that’s before you sell a single copy. Audible royalty rates are woeful, and they set the price for you: unrealistically high (although steps have recently been taken to improve on this). You’ll need to go non-exclusive to have your book feature on a wide range of dedicated stores. This means cutting the royalty rate still further. In many cases, it becomes a vanity project. Nice to have. Actually, fabulous to have, but …
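For anyone who wants to plug in their own numbers, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. The names (multiply_ranges and the constants) are made up for illustration, and it ignores anything not mentioned here, such as revisions or distribution fees.

```python
# Figures quoted in the post, each as a (low, high) range.
# All names here are illustrative, not part of any real tool.
WORK_HOURS_PER_FINISHED_HOUR = (4, 6)    # recording, producing, post-producing
BOOK_LENGTH_HOURS = (7, 11)              # finished audio for a typical book
RATE_GBP_PER_FINISHED_HOUR = (50, 100)   # going rate per finished hour


def multiply_ranges(a, b):
    """Multiply two (low, high) ranges of positive numbers."""
    return (a[0] * b[0], a[1] * b[1])


# Total hours of work the narrator (and producer) put in.
work_hours = multiply_ranges(WORK_HOURS_PER_FINISHED_HOUR, BOOK_LENGTH_HOURS)

# Total fee if you pay per finished hour.
narration_cost = multiply_ranges(RATE_GBP_PER_FINISHED_HOUR, BOOK_LENGTH_HOURS)

print(f"Hours of work: {work_hours[0]} to {work_hours[1]}")
print(f"Narration cost: £{narration_cost[0]} to £{narration_cost[1]}")
```

Swap in your own book length and the rate you have been quoted to see where your project lands.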
Artificial intelligence (AI) is this week’s buzzword, and AI-generated voices are gaining popularity. Advances in machine learning have made it possible to generate human-like voices of reasonable quality. These are fine for on-hold messages or short instructional videos. However, while AI-generated voices offer several benefits, they have significant limitations that need to be considered before using them for audiobook narration.
Lack of Emotion and Expression
One of the biggest limitations of AI-generated voices is their inability to express emotions and convey tone accurately. Although AI voices can mimic human speech patterns, they struggle to capture the subtle nuances of human emotions. This makes it hard for listeners to connect with the story and characters, which hurts the overall listening experience. And here come the one- and two-star reviews!
Limited Voice Options
There is a limited range of voices available for AI-generated audiobooks, and listeners may not be able to find the perfect voice that suits their preferences. This can be a drawback compared to traditional audiobooks, which are usually narrated by professional voice actors with a wide range of voices and styles to choose from.
Lack of Consistency
Another limitation of AI-generated voices is their lack of consistency. Although AI voices can produce speech that is similar to human speech, they may produce different results each time they are used, making it challenging to maintain consistency in the audiobook’s tone and pacing. This can detract from the listening experience and make the audiobook feel disjointed.
Technical Limitations
AI-generated voices are still in the early stages of development, and many technical limitations remain to be overcome. For example, AI-generated voices may struggle with complex sentences and grammatical structures, resulting in errors and mispronunciations that can be distracting to listeners.
Or at least that used to be the case until I stumbled upon a new tool that I’ll be writing about tomorrow. Come back and learn more.