The AI Voice Marketplace, Advertising and Bridging the Gap

The topic of AI is still hot in 2024, with the conversation making a noticeable shift from ‘how does it work?’ to ‘how can I use it?’.

Why wouldn’t it be? The tech industry is making massive advancements in artificial intelligence and consumers are hungry for it, adopting these new technologies at such a rate that the industry itself has been shocked by the demand.

This week alone, one two-year-old start-up announced it had landed $80 million in Series B funding.

One industry increasingly aware of how quickly this appetite is growing is the advertising industry. Interest in the synthetic voice space is gaining momentum, with brands and advertisers keen to create efficiencies in their businesses and communications. As marketing budgets continue to shrink, it comes as no surprise that they are looking to AI voices as an opportunity to do so.

As the synthetic voice market continues to be flooded with start-ups releasing new software platforms and claiming to have the largest library of multilingual and most ‘human sounding’ AI voices, knowing where to start can feel massively overwhelming. It’s easy to assume this perceived overwhelm is the reason why the advertising industry is slow to adopt and use AI voices in its campaigns.

Why is the adoption rate of AI voice in advertising slow?

The advertising industry and its clients are looking for quality, particularly when it comes to voices. In an ad campaign it’s the voice-over’s responsibility to make you feel something. It needs to be trustworthy, authentic and drive an emotional response or call to action.

For brands and advertisers to even consider using AI voices, the quality of AI voices in these libraries needs to be better than the qualities unique to the human voice, like authenticity, character, nuance, and empathy. With consumers more digitally savvy than ever, casting an inauthentic, obviously ‘AI sounding’ voice on their ads will cut more than their production costs; it will cut their bottom line.

Oh, so it’s the tech that’s not there yet.

No, AI voice synthesis technology already has the capacity to produce sophisticated AI voices and is continually evolving at an exponential rate.

When creating an AI voice, the results depend on the model, and the model depends on the quality of the data used to train it. To produce quality AI voices, you need to cast and pay for quality source voices. This is where the tech industry is letting itself down.

Until the quality of synthetic voices in the AI Voice Marketplace dramatically improves, it will struggle to meet the standards required of the advertising industry and its clients.

But the tech is constantly evolving, so what’s the problem?

Voice casting is a craft, and the process is the same whether it be for an advertising campaign, an animated film, or a source voice to be used in the training of a language model.

As a voice casting professional with a background in audio production, I’ve spent years attending theatre performances, scene workshops, acting school showcases, comedy clubs, musicals, and live performances, searching for talent, tuning my ears, and mastering the craft of matching the unique vocal qualities of a human voice to a brand or product in a way that creates an emotional connection. I have also worked with many directors who have spent hours in the studio directing that talent, drawing out through their performances the intricate nuances that showcase the authenticity in their voices. And that’s for a 60-second ad!

If you want to create a quality AI voice, cast a quality source voice, and direct that voice in a studio with a sound engineer. Ensure their sole focus is vocal performance, free from the complicated technical requirements associated with ensuring the voice data is useable.

How is this relevant to the AI Voice Marketplace when they are all AI voices?

An AI voice begins with a source voice.

Source voices are human voices, and those human voices are performers. Not everyone can be a voice-over; it’s a highly specialised skill that takes years of vocal training. Equally, recording scripts and utterances for use in the training of language models and the creation of synthetic voices is a skill that requires strong vocal stamina.

Currently, there is a disconnect between the creative and technology industries. Technology businesses and platforms commercially benefit from a performer’s intellectual property (IP) when synthesising their voice and using their voice to internally train language models.

A performer’s voice is their IP, and the use of that IP is pivotal to the developmental success of the AI Voice Marketplace. Often, the fee on offer and contract for the provision of such services is not reflective of the value of this specialised skill, let alone enough to cover the time required to complete the services and license the IP.

Don’t get me wrong, there are tech businesses in the market that respect, value and understand the important role a performer plays in the process of voice synthesis. They are actively educating clients and setting standards for the rest of the industry to follow.

But if we want to see a distinct shift in the quality of AI voices available in the AI Voice Marketplace, we need to bridge the gap between the creative and technology industries.
