Commentary:
The OpenAI-Scarlett Johansson case serves as a caution to all of us that the voices we hear may have been divorced from their original owners.
In the Disneyfied fairy tale The Little Mermaid, the title figure Ariel is duped out of her voice by the sea witch Ursula. Only after her voice is severed from her body does Ariel realize how important it is to her identity.
This week, the Hans Christian Andersen classic appeared to be a tale for our times, as movie star Scarlett Johansson publicly debated who owns her voice and what she can do if someone imitates it for commercial advantage.
When Johansson declined to provide her voice for OpenAI's latest generative AI tool, ChatGPT-4o, CEO Sam Altman's company used another actor who sounded remarkably like her. Altman denied attempting to make GPT-4o sound like Johansson. However, his denial rings hollow given that on the same day OpenAI revealed GPT-4o, he posted a one-word tweet, "her," the title of the 2013 sci-fi film in which Johansson voiced the AI assistant Samantha.
Whatever the case may be, the Johansson episode serves as a caution to all of us. However attached we are to the sound of our own voices, AI has created a world in which our sense of ownership over them is under threat. Whether we provide our voices freely or they are taken from us, AI can be used to make it appear as if we have said things we never said.
Audiobox and me
I just had a taste of it myself. At VivaTech in Paris this week, I tested Meta's generative AI speech tool Audiobox, which the company first revealed last summer. The tool works by analyzing a short audio recording of your voice and synthesizing speech from it, allowing it to read text aloud as if you were speaking the words yourself. You can try it here.
I recorded several seconds of my voice on an iPad and then typed in a phrase for the tool to read aloud to me. Within a minute, it read the text back to me in my own voice. Hearing my voice emerge from the iPad, reading a line I'd never actually said out loud, was an uncanny experience.
Remember that election robocall that sounded like President Joe Biden's voice?
In the age of AI, we have little choice but to assume that our voices will never be completely protected from spoofing. Just as we've learned to be wary of AI-generated images, we must accept that not everything we hear is real.
When nothing can be taken at face value, trust becomes more crucial than ever. The people developing our technology must demonstrate that they can be trusted to do the right thing for the people they are designing for: us. Companies like Meta and Google are used to public scrutiny of their trust and safety policies, as well as their capacity to enforce them. Even though we're still concerned, most of us recognize their efforts and have a sense of where we stand.
OpenAI needs to build trust, not break it
Newer organizations, such as OpenAI, that lack a track record on trust and safety must still prove themselves. This week, Altman's company failed one of its first tests.
OpenAI may not have trained ChatGPT-4o to mimic Johansson's voice, but by finding a way to do so regardless of the actor's wishes, the company plainly showed that it views people's vocal likenesses as fair game. And the fact that she refused to provide her voice to OpenAI, yet the company made its tool sound like her anyway, implies that it was not interested in respecting her refusal to consent.
Altman disputes this account of events, claiming in a statement that OpenAI cast the actress who voices the chatbot Sky before reaching out to Johansson. "Out of respect for Ms. Johansson, we have temporarily stopped using Sky's voice in our products," he added. "We are sorry to Ms. Johansson that we didn't communicate better." OpenAI did not immediately respond to a request for additional information.
Consent has also been at the core of copyright lawsuits filed against OpenAI and Microsoft over the texts used as training material for their large language models.
AI has rendered copyright and intellectual property ownership "a little bit fuzzy," according to Dario Amodei, CEO of AI company Anthropic, speaking at VivaTech. Amodei worked at OpenAI before founding Anthropic, maker of the Claude chatbot, and he has been critical of his former employer. He said Anthropic has so far relied on text rather than "other modalities," owing to the complexity of these concerns.
As AI gets more intelligent and powerful, Amodei believes we will need to confront as a society the reality that AI will impinge, sometimes painfully, on what humans can do.
AI, like other technologies before it, has entered our lives before the guardrails are in place. Governments are trying to catch up and present tech companies with a set of rules to follow. Earlier this year, the US Federal Communications Commission prohibited the use of AI-cloned voices in robocalls.
As this debate continues, we may not be able to entirely safeguard our voices, but we can protect ourselves in the brave new world of AI by recognizing that the voices we hear coming from our TVs, radios, phones, PCs, and other gadgets may have been disconnected from their original owners.