An AI startup made a hyperrealistic deepfake of me that’s so good it’s scary
By Melissa Heikkilä, MIT Technology Review
I’m stressed and running late, because what do you wear for the rest of eternity? This makes it sound like I’m dying, but it’s the opposite. I am, in a way, about to live forever, thanks to the AI video startup Synthesia. For the past several years, the company has produced AI-generated avatars, but today it launches a new generation, its first to take advantage of the latest advancements in generative AI, and they are more realistic and expressive than anything I’ve ever seen. While today’s release means almost anyone will now be able to make a digital double, on this early April afternoon, before the technology goes public, they’ve agreed to make one of me.
When I finally arrive at the company’s stylish studio in East London, I am greeted by Tosin Oshinyemi, the company’s production lead. He is going to guide and direct me through the data collection process—and by “data collection,” I mean the capture of my facial features, mannerisms, and more—much like he normally does for actors and Synthesia’s customers.
He introduces me to a waiting stylist and a makeup artist, and I curse myself for wasting so much time getting ready. Their job is to ensure that people have the kind of clothes that look good on camera and that they look consistent from one shot to the next. The stylist tells me my outfit is fine (phew), and the makeup artist touches up my face and tidies my baby hairs. The dressing room is decorated with hundreds of smiling Polaroids of people who have been digitally cloned before me.
Apart from the small supercomputer whirring in the corridor, which processes the data generated at the studio, this feels more like going into a news studio than entering a deepfake factory.
I joke that Oshinyemi has what MIT Technology Review might call a job title of the future: “deepfake creation director.”
“We like the term ‘synthetic media’ as opposed to ‘deepfake,’” he says.
It’s a subtle but, some would argue, notable difference in semantics. Both refer to AI-generated videos or audio recordings of people doing or saying something that didn’t necessarily happen in real life. But deepfakes have a bad reputation. Since its coinage nearly a decade ago, the term has come to signal something unethical, says Alexandru Voica, Synthesia’s head of corporate affairs and policy. Think of sexual content produced without consent, or political campaigns that spread disinformation or propaganda.
“Synthetic media is the more benign, productive version of that,” he argues. And Synthesia wants to offer the best version of that version.
Until now, AI-generated videos of people have tended to have some stiffness, glitchiness, or other unnatural elements that make them pretty easy to differentiate from reality. Because they’re so close to the real thing but not quite it, these videos can make people feel annoyed or uneasy or icky—a phenomenon commonly known as the uncanny valley. Synthesia claims its new technology will finally lead us out of the valley.
Thanks to rapid advancements in generative AI and a glut of training data created by human actors that has been fed into its AI model, Synthesia has been able to produce avatars that are indeed more humanlike and more expressive than their predecessors. The digital clones are better able to match their reactions and intonation to the sentiment of their scripts—acting more upbeat when talking about happy things, for instance, and more serious or sad when talking about unpleasant things. They also do a better job matching facial expressions—the tiny movements that can speak for us without words.
But this technological progress also signals a much larger social and cultural shift. Increasingly, so much of what we see on our screens is generated (or at least tinkered with) by AI, and it is becoming more and more difficult to distinguish what is real from what is not. This threatens our trust in everything we see, which could have very real, very dangerous consequences.
“I think we might just have to say goodbye to finding out about the truth in a quick way,” says Sandra Wachter, a professor at the Oxford Internet Institute, who researches the legal and ethical implications of AI. “The idea that you can just quickly Google something and know what’s fact and what’s fiction—I don’t think it works like that anymore.”
So while I was excited for Synthesia to make my digital double, I also wondered if the distinction between synthetic media and deepfakes is fundamentally meaningless. Even if the former centers a creator’s intent and, critically, a subject’s consent, is there really a way to make AI avatars safely if the end result is the same? And do we really want to get out of the uncanny valley if it means we can no longer grasp the truth?
But more urgently, it was time to find out what it’s like to see a post-truth version of yourself.