AI Voices – Should You Use Them?
Anyone who remembers the first voices from navigation systems can attest to how far modern AI voices have come: things have changed for the better. But the question remains: What are artificial voices good for in professional audio production, and where do pitfalls lurk?
There is no doubt about it: the use of AI voices has grown immensely across various applications, and companies offering "realistic" AI voices are founded every other day, it seems. From voice assistants and chatbots to audiobooks and eLearning, AI voices, providers would have us believe, are ready for use anywhere and equal to their flesh-and-blood counterparts. But do the superlatives about quality and naturalness hold up? We listened more closely, and – spoiler alert – we are far from ready to leave audio projects to bits and bytes with a clear conscience.
One of the most apparent reasons AI voices cannot replace professional voice actors is their lack of human intonation and emotion. While AI voices can pronounce words and sentences correctly these days, at least most of the time, they cannot place the correct stresses and pauses to add meaning and emotion to what is being said. Why? Because AI does not truly understand the copy it is reading. Only true understanding creates a bond between what is written and how it is read. This bond is essential for melody, emphasis, and dynamics, and it is this interplay that brings a spoken text to life.
Flawed understanding
Lack of comprehension leads to monotonous, flat-sounding reads, resulting in lower engagement and interest on the listener's part. This is not ideal in eLearning and disastrous in advertising. There is no real improvement to be expected here soon, either. Yes, the voices themselves sound natural. With machine learning and lots of training, AI can imitate sentence melodies that match the text, if you are lucky.
Consciousness is paramount for understanding copy
But what if they don't? What if the director would have liked the last paragraph to be spoken with a bit more "red carpet" or the previous sentence to be spoken with an emphasis on a particular word? What if the attitude is fundamentally too soft or too matter-of-fact? How do you convey to the language model that it should approach the copy with more pressure, sensitivity, warmth, or aggression?
It. Does. Not. Work. It might work one day. But not for the foreseeable future.
Awareness, experience, reflection
Why? Because the machine is not conscious of itself. It has no experience; it has never suffered a loss or enjoyed a victory. But consciousness is key. When a talent reads a text, they interpret the copy with their voice and with their whole experience as a talented, sentient human being. They give the copy an individual character, which directors can then fine-tune if necessary.
Who does the AI voice belong to?
Besides the creative and human aspects, trouble looms from a completely different direction: Who owns AI voices? In recent months, artists have fought against the unsolicited use of their works as training data for AI models. And there is also resistance from voice actors. The case law here is still in its infancy. Still, it takes little imagination to grasp how complicated the copyright question becomes when an AI has been trained on the audio material of thousands of talents.
There are growing indications that various companies offer AI voices whose models have been trained on recordings for which the usage rights seem unclear. The customers of these providers may also risk facing legal consequences down the line.
Conclusion
Without question, artificial voices have come a long way. With applications such as Siri, Alexa, and Co., we gladly accept the technical intonation of our digital assistants. However, the situation is completely different for all applications that expect a "human touch." Here, professional voices will remain the #1 choice for the foreseeable future.
bodalgo.com offers professional voice actors in over 80 languages and hundreds of dialects. All of them have been thoroughly vetted to ensure outstanding results.
About bodalgo
bodalgo is a multi-award-winning online casting website founded in 2008. bodalgo enables clients to find the perfect voiceover for audio/video productions. Free of charge for clients, bodalgo makes this process efficient, easy, and fun. bodalgo features more than 12,000 professional voiceover talents speaking more than 80 languages and has helped more than 60,000 projects find the perfect voice.
Since its launch, bodalgo has become an integral part of the backbone of the voiceover industry. Unlike other platforms of its kind, bodalgo encourages direct client-talent contact. To maintain its unrivaled quality, the platform strictly vets talents and jobs – a world-exclusive mechanism.
In 2023, bodalgo won the One Voice Award for "Best Voice Job Web Site of the Year" for the sixth time.
About the author
Armin Hierstetter, born in 1970, is a German entrepreneur best known as the founder of the online casting platform bodalgo. Before turning to the voiceover business, Armin worked in magazine publishing for close to two decades, most notably as Editor of "PENTHOUSE" and the highly successful teenage magazine "Sugar" as well as Publishing Director for the men’s magazine "FHM".
Armin lives with his wife and two daughters in Munich, Bavaria. 🥨
director at voice of spain ltd
9mo
I agree with the article, but only for some projects. Travelling abroad, I have had to listen to dreadful audio guides narrated by humans that a machine would have done much better, resulting in a much more uniform listening experience for sure. I have also listened to recordings in Spanish narrated by foreigners where most of it was difficult to understand because of poor diction, wrong intonation and stress, etc. The thing is that not all voiceover platforms are like Bodalgo, where a voice vetting system is implemented and the standards are very high. Some of them showcase speakers as native when that is not the case, and the cockups I have witnessed in my career are enough to make anyone cringe: from voice actors with a lisp dropping the "s" at the end of each word to avoid being caught, to so-called native speakers getting the stress of the words wrong. I think AI is going to be the nail in the coffin of those terrible voice actors who jumped on the bandwagon to make an easy buck.
Executive, Career & Mindset Coach | Voiceover Actor | Audiobook Narrator/Producer | Radio DJ/Host & Interviewer | Top Tier Facilitator | Consultant
9mo
Terrific insights about AI, and I completely agree. Thank you for posting it. Ask the question "why would a project use an AI generated voice?" And assess the answers. I don't think there are all that many answers. 🤔 In my opinion, the "why" still seems to be mostly outright cheating to steal a voice actor's craft, without asking and fairly compensating, and just wanting to use an artistic element (voice acting) without a thought about rights. Ignorance, lack of integrity, cheating, greediness, and yes, stealing, are all part of this conversation. Technology and its wow factor bedazzle, but there is a dark side. And we all should be mindful of steering these kinds of things with the appropriate amount of controls and convictions.
CTO & Media Technology Expert | Voice Over Artist | Digital Transformation Consultant
9mo
Fully on board with every point made in the discussion. As we witness AI voices making strides, the unique essence and emotional depth we, as voiceover professionals, inject into our work remain unmatched by technology. What truly alarms me, however, is the burgeoning trend of ‘pay-to-play’ offers: companies paying for hours of voice recordings under the guise of AI training, with scant transparency about the ultimate use of our voices. The notion that one could lose perpetual control over their own voice for a single payment is not just frightening but a grim reality we must confront. This practice doesn’t only undermine our profession but also entangles us in a web of potential legal and ethical dilemmas. It’s a stark wake-up call for all voice artists to tread carefully and safeguard our irreplaceable asset, our voice. Let’s champion our rights and the integrity of our craft in the face of these challenges.