
Suppose person A is a popular voiceover artist who offers paid voiceover services. Person B wants person A's voice for their revenue-generating internet video projects, because they know it will offer a significant profit boost. Is it legally all right for person B to synthesize person A's voice (without their knowledge or consent) using a neural net and use that audio for the videos? That is, are people's voices copyrighted?

Shane

3 Answers


Not copyright as such, because copyright is about protecting a 'work' — a voice is not a 'work'. As the court said in one of the following examples, "A voice is not copyrightable. The sounds are not 'fixed.'"

(You could copyright a roar or a yell — some kind of fixed arrangement of sound(s).)

But some jurisdictions have recognised property rights in voices and/or that the voice is protected by the person's 'right of publicity' (the right to control the commercial exploitation of their identity, of which the voice is a part). For example:

Bette Wins Ruling In ‘Sound-Alike’ Lawsuit - AP News

June 23, 1988

SAN FRANCISCO (AP) _ A federal appeals court has reinstated a lawsuit filed by entertainer Bette Midler after an advertising agency allegedly tried to duplicate her voice and singing style in one of its ad campaigns.

The 9th U.S. Circuit Court of Appeals unanimously ruled Wednesday that Midler could pursue her suit against the Ford Motor Co. and the Young & Rubicam advertising agency. The court said certain personal attributes - such as a voice - can be considered property rights, protected by state law.

...

U.S. District Judge Ferdinand Fernandez said Young & Rubicam acted like "the average thief" but dismissed Midler's suit, saying no law prohibits imitation of a singer's voice.

But the appeals court disagreed.

"A voice is as distinctive and personal as a face,″ the appeals court said. ″When a distinctive voice of a professional singer is widely known and is deliberately imitated in order to sell a product, the sellers have appropriated what is not theirs."

judgment in Midler v Ford

Another case in the US is Waits v Frito-Lay Inc. The US Court of Appeals found that a radio commercial's imitation of the voice of Tom Waits constituted a civil tort, "voice misappropriation".

I'm not aware of any cases involving computer synthesis of voices.

Laurel
Lag
  • I'll just add that Jess Harnell, lead singer of Rock Sugar, is unable to release music commercially because Steve Perry sued the band over imitating his voice in their mashups (see https://loudwire.com/voice-actor-jess-harnell-interview-band-sued/). – Kevin McKenzie Jun 18 '22 at 21:00
  • Over half of the U.S. states recognize a "right of publicity" while a few only recognize "celebrity rights", which is probably more limited to those with a recognized public persona. California Civil Code § 3344 provides that it is unlawful, for the purpose of advertising or selling, to knowingly use another's voice or likeness without that person's prior consent. The only exception might be in cases where the other party has a right to the likeness, for example, they naturally look or sound like the other person. Achieving this by the use of AI is a distinction without a difference. – kisspuska Jun 20 '22 at 07:23
  • 1/2 A better explanation than the 9th Circuit's "A voice is not copyrightable. The sounds are not 'fixed.'" would be that a voice is the abstract of one's oral sound production, determined by anatomic constraints as well as the intricate, mostly unconscious and involuntarily adapted and applied rules of oral production. This abstract may not be the subject of copyright; the sound of one's voice recorded, for example, may be. An interesting thought experiment could be whether, if set forth in human or computer language, one's voice abstract could be the subject of copyright. […] – kisspuska Jun 20 '22 at 07:32
  • 2/3 In that case, feeding that information to a computer which could then reproduce another line of recorded speech according to the abstract, the imitated sound product of the abstract voice could — on top of the above theories — be deemed a derivative work. The California publicity law provides for a 50-year post-mortem protection period, but by decoding the abstract of one's voice stored on a tangible medium, and getting it copyrighted as the digital data of the voice abstract, the protection as a derivative work of one's voice could be extended up to 95 post-mortem years, + © is global. – kisspuska Jun 20 '22 at 07:39
  • 3/4 This theory of copyrighting the abstract of one's voice by defining and storing the specific rules that enable deepfake voice synthesizers to work would probably only be valid wherever a right of publicity or a celebrity right is recognized for the person whose voice abstract is created. Any claim by anyone else, without the consent of the person or celebrity, to create such a voice abstract would not be copyrightable otherwise, since the creation absent the consent would first violate the publicity or celebrity right, and any U.S. registered copyright would be against public policy & void. – kisspuska Jun 20 '22 at 07:45
  • 4/5 The only holdback could be the international rule implemented not so long ago prohibiting copyright protection of AI-generated art. In this case, the art would be the manifest abstract of one's voice, the binary data defining the rules of the dynamics of the sound of one's voice production. Likely, unless someone is able to set forth the same rules in a comprehensible manner manually, copyright could not be granted on the digital data of the abstract like in the case of other computer code or information-storing works of art. – kisspuska Jun 20 '22 at 07:50
  • So if I'm sounding like someone famous, I'm not allowed to exploit my own voice commercially? That sounds.... strange. – Polygnome Jun 20 '22 at 12:55
  • @Polygnome it's also strange that if you invent something and somebody else has the same idea and patents it before you have a chance to publish it yourself, then you're not allowed to exploit your own idea commercially. There will always be such unfortunate possibilities, unless all copyright / patent law were abolished. – leftaroundabout Jun 20 '22 at 14:05
  • @leftaroundabout Inventing is a choice. You can invent different things. Your voice is something you cannot change (or don't want to, for good reason). Discrimination because of things you cannot change (e.g. sex, skin color) is usually illegal, discrimination because of things that are your choice (e.g. saying stupid things) is usually not. – Polygnome Jun 21 '22 at 08:13
  • @Polygnome well, in theory that is so, and also in most practical situations. But there are always weird extreme cases. Like, if your family is from some Indian community that happens to use swastika symbols a lot, you may find that you're not allowed in Germany to publicly wear the clothes you inherited from your grandparents. “Having a voice like Tom Waits” is such a case. Anyway it's only a very mild discrimination that arises from it: it's not that you aren't allowed to use your voice even commercially, it's just that you must do it in a way that makes it clear you're not Tom Waits. – leftaroundabout Jun 21 '22 at 22:37

Copyright is the least of B's concerns

Voice, like visual appearance, is an inherent trait of a person. It is not a creative work (let's not dive into philosophy here about creation of human beings). It simply cannot be protected by copyright.

However, that doesn't automatically mean that B can synthesise and publish/make money on A's voice

without their knowledge or consent

— unless B explicitly, prominently and audibly warns their audience that this is not A's real voice and that A actually has nothing to do with voicing it.

Such a warning is required as it is difficult to tell AI voice imitation from the real voice owner voicing it, and, by default, the audience will assume it really is A's voice. This is B's main concern because, unless the warning is given:

  • B would effectively be misleading their customers. For many customers (not all, of course), the main reason for subscribing/purchasing the media would be the fact that it is really A speaking/singing. If these customers knew that was not the case, they would skip;
  • B could make A's voice speak to express views or "testify" facts that A themselves would never pronounce. Such a misuse could damage A's reputation and, hence, be the subject of a claim/lawsuit.
Greendrake
  • "it is difficult to tell AI voice imitation from the real voice owner voicing it" - I wasn't aware we could do that yet. Aren't all the 'deep fakes' voiceless? – Mazura Jun 18 '22 at 02:56
  • @Mazura People making deep fakes indeed rarely bother imitating the voice — they primarily care about the visual part. But this doesn't mean that voice imitation tech is not there. In fact, it is much easier than visuals, and decent (although not perfect) voice imitation tech emerged long before AI. Now with AI it's just become impeccable (although, again, deep fake makers often don't bother using it). – Greendrake Jun 18 '22 at 03:37
  • "Can I change Siri's voice to Morgan Freeman?" - "Unless you can afford to pay the veteran Hollywood actor, it is pretty much impossible to switch Siri's voice to Morgan Freeman's." - That's the result from googling make siri sound like. If it was technically possible, there'd be an app for that. I presume it isn't and that's why there's no laws against it, but there will be, just like how it became illegal to 'sample' something. – Mazura Jun 18 '22 at 03:54
  • The Sinatra hologram deal, I'd assume was an archive recording with both audio and video. IMO the easier thing to fake was the video, if it was. You're telling me a computer could sing new Sinatra songs? – Mazura Jun 18 '22 at 03:54
  • @Mazura It has been possible at least since the mid 90s to transform one's voice into a voice sounding very much like someone else's. It was just some smart sound wave processing algorithm, not an AI. Now, there isn't an app that makes Siri sound like Morgan Freeman apparently just because he won't approve it and those who would create such an app are afraid of possibly being sued. – Greendrake Jun 18 '22 at 04:04
  • There is this app, OTOH: https://futurism.com/the-byte/app-morgan-freeman-voice-real-time – Jeremy Friesner Jun 18 '22 at 04:10
  • "It depends on the terms of the voiceover contract" My interpretation of the question is that they didn't get a voiceover contract from the artist, so they decided to fake the voice instead. – Barmar Jun 18 '22 at 13:04
  • @Barmar That wasn't clear in the original version of the question, which this answer was written for. I'll see if I feel like adapting it to the current question. – Greendrake Jun 18 '22 at 13:13
  • From a technical point of view, every Siri voice (different languages and accents, male/female) is about 400 megabytes, which is probably 10 hours or more of voice recording. I suppose the artists get paid for a few weeks of work, but mostly for no longer being able to make a phone call without the receiver thinking Siri called them :-). I don’t think you can get the sound quality without the hours and hours of voice recordings. – gnasher729 Jun 18 '22 at 17:24
  • Your caveat "explicitly, prominently, etc" reminds me of the South Park disclaimer. – preferred_anon Jun 19 '22 at 20:04
  • @Mazura See this criminally underwatched example of a deepfake of David Attenborough's voice: https://www.youtube.com/watch?v=d2A07ToxkTI – Charles Staats Jun 19 '22 at 21:10
  • In the UK, as well as a possible crime of "personation" (purporting to be someone else with intent to deceive), a charge could also, one assumes, be brought under the Trade Descriptions Act - selling something that is not as it has been described. – WS2 Jun 19 '22 at 22:59
  • @Mazura yeah, but your question to google amounted to: "can any layperson do that with the computational capability of a phone AND interface it with another app", which is almost the opposite of "would just the technical feat be possible for semi-experts with decent hardware". Because the latter totally is, and decent hardware merely means a PC worth more than 1000€. – Hobbamok Jun 20 '22 at 10:02
  • Some voice actors are very good at imitating other people's voices, and provided that they only do so in contexts involving suitable disclaimers or where the impersonation would be obvious (e.g. if a stand-up comedian imitates someone else's voice, it would be rather obvious that there is no pretense of the original person's involvement), there is nothing even remotely resembling a copyright issue. If an AI is trained by a comedian X who is imitating person Y, then I would see no basis for Y having any stronger copyright claim on the result than they would have on the comedian's performances. – supercat Jun 20 '22 at 16:42
  • @Mazura are you familiar with the Vocaloid software? Those are special (Japanese) programs in which the Japanese syllables have been recorded and remixed in such a way that you can form every Japanese word and most non-Japanese ones (with a heavy Japanese accent) in the voice of the singer. In the remixing, they gave the Vocaloids unique traits different from the voice donor, but without those filters you could not separate the singer from the Vocaloid and vice versa. – Trish Jun 20 '22 at 23:57
  • @Mazura Maybe it was easy to tell last year when you wrote that comment, but generative AI is now quite good at imitating voices. Planet Money recently did a podcast about using AI to recreate a former cohost's voice, and use it in their GPT-generated episode. – Barmar Jun 13 '23 at 21:23

"AI" is a function of its training set, which would contain copyrighted data that isn't licensed for that use.

The AI toys we get to see right now are in a kind of legal limbo, because this hasn't been litigated yet. My initial interpretation is that this is currently merely tolerated by rightsholders, and there is a bit of a rush to create an AI product that is useful enough that regulating it would cause a backlash.

In software development, the use of tools like GitHub Copilot is controversial, because these neural nets often reproduce fragments of their training set, which consists mostly of open source software that has been released under various licenses, some of which stipulate terms under which derived works may be produced.

Simon Richter