Video Game Voice Casting and AI

Bonnie Bogovich

January 9, 2024

The world of creative media is ablaze with discussions around AI, machine learning, and how to use these technologies to their greatest advantage. Few roles have commanded more of the spotlight in these discussions than the actors we all rely on to bring characters and stories to life through their visual likenesses and vocal performances.

The SAG-AFTRA union strikes have shown that the fight to protect performers’ original contributions to entertainment extends beyond movies and television. Many of the organization’s members have forged iconic presences in video games as well. For this reason, the union voted overwhelmingly in favor of authorizing a new strike should contract negotiations with top game companies continue to stall.

As a complete game audio partner that casts and directs both union and non-union voice actors, we understand that this movement concerns more than just card-carrying guild members. It affects non-union negotiations in equal measure, too! The efforts of the National Association of Voice Actors (NAVA) have ensured that even non-union performers can incorporate similar protections into their contract negotiations by supplying AI riders that don’t just dictate contract terms but also create a basis for discussion and mutual agreement around equitable performance, protection, and use.

Negotiation Is Not Denial

As experientially minded creators, we see great opportunities for the use of AI, algorithms, and machine learning throughout game sound design and implementation. Many of these uses add new dimensions to the player experience at no artist’s expense. We also recognize that the current strikes and the resolutions they seek are not intended to prevent entertainment producers from using AI or synthesis in their productions. It’s far more a matter of how, why, and to what end.

Many prominent concerns revolve around the cloning of a vocal performance or personal likeness for broader use beyond a talent’s direct availability. This is an appealing prospect to producers who want to extend the scope and longevity of the entertainment they produce, yet it could undermine the agency of a performance cast if such use exceeds their own comfort or intent.

Negotiation still boils down to individual agreements, and some actors are more willing to entertain the possibilities of AI synthesis than others if the conditions are right. It’s all about finding the right balance between how much you want from your talent and respecting the value of your talent’s contributions with or without the application of AI.

Charting New Contractual Territory

The rapid rise of synthetic and generative technologies has left our prolific industry lagging in defining ways to ensure their applications are equitable for the folks we hire to breathe life into our scenes and scripts.

With such personal characteristics as voice or image at play, it’s only reasonable that a production should recognize these assets as original to their talent, be ready to fairly compensate for their agreed use, and protect and preserve their models against misappropriation or misuse.

While the current collective pushback on producers might feel like a squeeze, we know that the call for more explicit upfront agreements about these technological innovations comes from actors’ deep and personal investment in the quality of any works that feature their performance or imprint.

But studios shouldn’t feel alone in driving these discussions. Unlock’s own voice director extraordinaire Bonnie Bogovich has these words of encouragement to share: “Working with Unlock Audio’s dialogue team can help ease the burden on studios, taking the guesswork out of how to start with contract negotiations, and helping them to find what agreement works best for both their studio and prospective voice talent.”

As our team approaches new voice contracts in our influential role as a game audio outsource partner, we’re helping to establish clear-cut terms that are mutually additive, not exploitative, for our developer and performer partners alike.

Consent: The Most Important Condition

Did you know that AI can now convincingly simulate a voice with just a three-second source recording? The immense potential of this technology has made vocal synthesis a point of discussion not only in creative entertainment but in cybersecurity circles as well.

The world has already seen the power of AI voice modeling used for nefarious purposes, such as faking emergency phone calls from loved ones to scam vulnerable people or spreading misinformation about what prominent public figures have or have not said. Of course, there have been amusing applications as well, such as the mimicry of popular musicians to create fake songs in their style, yet that amusement comes at the cost of those artists’ security over their creative signatures.

When it comes to contract performances for video games and voice acting, it’s vitally important for all parties to understand what is being consented to and under what conditions. If synthesizing a voice model to further a voice actor’s performance is part of the discussion, then several concerns will likely need to be addressed before earning any performer’s consent.

Concerns may include the assurance that the synthesized voice model is used only for specific purposes (such as new game content or promotional content not extracted from the game) or the actor’s continued script oversight, so that they know nothing communicated with their voice will inadvertently reflect badly upon them as an individual. (Actors bring an instinct for how audiences will perceive a performance that AI cannot recreate.)

Furthermore, actors who are open to the possibilities of AI voice modeling want to know that developers will have safeguards in place so that outside parties cannot easily exploit their vocal imprint for defamatory or misleading purposes like the deepfakes or scam calls described above. While nothing prevents anyone from training a new voice model on any recorded and publicly available material, actors should be able to trust that studios and developers will explicitly protect any synthesis they consent to.

Voice Casts Build Equity

As we continue to navigate new opportunities for AI in game sound, I want to emphasize one advantage that a real voice cast has that no AI model can truly replace, and that’s fan equity. I don’t mean just from notable actors with fervent followings who accompany them into a new game’s fold. Even actors making their video game debut have the personal power to fuel fandoms long after a launch through their social followings, at events, and wherever else they directly engage with gamers.

A synthesized voice model can recite whatever dialogue it is fed, but it can’t partake in the discussions and celebrations that lengthen the lifespan of a game or franchise. Voice actors deliver more than recitation: they are the heart and soul of the characterization that gamers connect with so well. If a voice model fills in for content beyond an actor’s recorded work, the actor’s intimacy with the game diminishes, setting them up for some awkward encounters when players pick their brains about details or theories involving content they did not directly help produce.

When considering AI in game audio, we emphasize the need for an additive approach to application, especially when it comes to voice roles. Certain values that voice actors bring both in and beyond the sound booth cannot be replicated by any program or algorithm. We’re keeping an open mind as we navigate new negotiations with our own voice casts and developer partners, and we’re confident in our industry’s power to move collectively forward in creativity with the best interests of all in mind—the creators, the casts, and the players alike.
