Duma has been destroyed, laid to rubble by Syrian shells. Across the streets of this city northeast of the capital, Damascus, what hasn’t gone up in smoke disappears behind a thick fog. In places, the rubble heaps erupt with flames. Inside the hospital, the atmosphere is even more choking. Children brought here receive water and oxygen masks. On April 7, 2018, rebel forces lost the city. The next day, they sent out these terrible images to prove the use of chemical weapons by the regime of Bashar al-Assad. “Staged,” was the response of Damascus’s Russian allies, while the United States claimed to have evidence that Barack Obama’s “red line” had been crossed. The former president had promised to intervene in such a case. In the end, he decided to do nothing; his successor would later order air strikes.
Ten days later, BuzzFeed published a video in which the former gives his frank opinion of the latter. “Donald Trump is a total and complete dipshit,” Barack Obama says from the White House. Never had such a candid insult been uttered by an American head of state. Except the statement never existed. In reality, with the help of comedian Jordan Peele, the American site doctored one of Obama’s speeches, changing his voice and the movement of his lips. It was a warning. “We’re entering an era in which our enemies can make anyone say anything at any point in time,” BuzzFeed cautioned. In these times of information war, authenticating videos of the Syrian conflict is already challenging. But if even the statements themselves can be faked, the truth could be destroyed like Duma.
The Deep Fake
BuzzFeed used a program called FakeApp, which lets people create videos called “deepfakes.” From two video sequences, its artificial intelligence extracts frames and identifies the faces in them. Their lighting and expressions are analyzed; then the faces are blended into each other. In 2017, researchers at the University of Washington had designed another tool, presented at the SIGGRAPH conference, to put words in Barack Obama’s mouth. A German team led by Matthias Nießner did the same with George W. Bush. Its program was called Face2Face. “We’ve developed an approach to ‘synthesize by analyzing’ that reconstructs a face in 3D,” says one of its members, Justus Thies. “We can then easily apply other people’s expressions to it.” The software giant Adobe, for its part, is working on an “audio Photoshop” with which anyone could manipulate recorded speech. A similar project is underway at the Canadian startup Lyrebird.
Using the deepfake technique, an Internet user embedded a young Carrie Fisher’s face into a scene from Rogue One: A Star Wars Story, even though the actress who played Princess Leia was 60 years old at the time of filming. The user achieved a result close to the one in the film with a much simpler technique: the film relied on “a super high-tech and laborious makeup version,” according to John Knoll, creative director of the special effects house Industrial Light & Magic. FakeApp, by contrast, is relatively simple to use; almost anyone can see their face transposed onto someone else’s body. “Imagine what big tech companies and governments could do,” says “derpfakes,” the author of the de-aged Carrie Fisher. On the American forum Reddit, however, where the technology was first shared, he acknowledges that “for the moment, people are mostly using it to make obscene videos of celebrities.”
It was another Reddit user, “deepfakes,” who launched the movement on September 30, 2017 by posting a series of pornographic videos. Over the bodies of pornstars appeared the faces of stars like Scarlett Johansson, Maisie Williams, Taylor Swift, Aubrey Plaza, and Gal Gadot. His algorithm was picked up and used by others. One user, under the name “deepfakeapp,” used it to build the program FakeApp. He claims not to be the same person as “deepfakes” but says he collaborated with them to a certain extent.
On February 7, 2018, Reddit deleted the deepfakes thread, saying it violated one of its rules: it is forbidden to publish pornographic content of someone without their consent. Still, “it’s not really a pornographic video of the person, just their face on the body of someone else,” Jonathan Masur points out. The American legal scholar is not sure a celebrity would win such a case in court. Justus Thies, on the other hand, believes that “it’s clearly illegal.” Meanwhile, the profile “deepfakeapp” has relaunched a thread on Reddit to share its program. And although the image host Gfycat and the adult site Pornhub say they censor such content, that only slows its spread.
Unlike the coarse montages featuring celebrities, BuzzFeed’s Barack Obama could fool a poorly informed audience. As the technology keeps improving, Internet users could soon be navigating a sea of false videos. Will they then lose all confidence in the Internet, given that they already struggle to separate the American and Russian versions of the attack in Duma? After anticipating the danger of fake news in 2016, Aviv Ovadya, a computer scientist at the Center for Social Media Responsibility at the University of Michigan, now warns of the threat of “reality apathy.” Not only can deepfakes engender generalized mistrust; by fabricating a provocation by one state against another, they could also stoke tensions, even conflict. An “infocalypse” is brewing, he says.
Art and Machine
Long before issuing that warning, Aviv Ovadya interned at Google in 2008. The American giant behind the world’s most popular search engine was investing massively in artificial intelligence. At the same time, the young man was studying at the Massachusetts Institute of Technology (MIT), a university also deeply involved in algorithms research – to the extent that on Tuesday, November 7, 2017, it invited the man whose work made FakeApp possible. It wasn’t “deepfakes” or “deepfakeapp,” but a Google employee, Ian Goodfellow. The machine learning techniques he pioneered, together with TensorFlow, the library Google made public in 2015, are the basis of FakeApp’s operation.
This type of tool, Goodfellow admitted at the Emtech conference organized at MIT, is capable of creating ever more convincing false images. Photo montages have a long history, and skepticism toward video will likely grow too, even though videos are “a chance to be able to prove that something has taken place,” he says. In the United States, one of the first amateur videos used as evidence was filmed on March 3, 1991. From his balcony that evening, George Holliday, armed with his Sony Handycam, filmed an unarmed man named Rodney King being beaten by Los Angeles police. Thanks to those images, King would eventually obtain a measure of justice in court.
With the spread of cameras on cell phones, this has become more commonplace. In 2009, the Iranian Green Movement was largely documented by anonymous citizens. The killing of Neda Agha-Soltan by security forces was considered “the most witnessed death in history,” according to the American magazine Time. Many more abuses were documented by smartphones during the Arab Spring of 2010 and 2011.
Around then, Ian Goodfellow had just entered the University of Montreal to begin his PhD in machine learning. “I first studied neuroscience at Stanford, but one of my professors inspired me to take AI classes,” he remembers. “It mostly made me think of video games, but I quickly realized it was a real science.” He enjoyed programming graphics processors with his friend Ethan Dreifuss. At a party organized by a friend, Razvan Pascanu, at a Montreal bar, Les 3 Brasseurs, Ian Goodfellow was a little drunk. So when a colleague raised the possibility of mathematically cataloguing all the qualities of photos and feeding them as statistics to a machine capable of composing its own images, he scoffed. It’s impossible, he said that evening in 2014.
But another idea came to him. In the same way that the meeting of two brains in a bar can stimulate reflection, the meeting of two neural networks could also be fruitful. Pitted against a machine dedicated to detecting false images, a system attempting to compose the most realistic photo possible would learn from its corrections. Despite his friends’ doubts, Ian Goodfellow set to work as soon as he got home. Still a little drunk, he started coding. And, by what he describes as a stroke of luck, it worked.
In an article published later that year, the student described his system as generative adversarial networks (GANs). “It’s like an exchange between an artist and an art critic,” Goodfellow says. “The generative model can fool the critic by making him believe that its images are real.” An outside eye is needed to point out the flaws. According to Yann LeCun, director of artificial intelligence research at Facebook, this is “the coolest idea in deep learning in the past 20 years.” It didn’t take long to become publicly accessible: in November 2015, Google open-sourced TensorFlow, a machine learning library that can be used to build generative adversarial networks.
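The artist-and-critic exchange Goodfellow describes can be sketched in a few lines of code. The toy example below (an illustration written for this article, not Goodfellow’s or FakeApp’s actual code; all names and numbers are invented) pits a linear generator against a logistic-regression critic on one-dimensional data, with the gradients worked out by hand:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data the generator must imitate: a 1-D Gaussian.
def real_batch(n):
    return rng.normal(4.0, 1.25, size=(n, 1))

# Generator (the artist): a linear map from noise z, g(z) = gw*z + gb.
gw, gb = 1.0, 0.0
# Critic (the art critic): logistic regression, d(x) = sigmoid(a*x + c).
a, c = 0.1, 0.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.01
for step in range(2000):
    z = rng.normal(size=(64, 1))
    fake = gw * z + gb
    real = real_batch(64)

    # Critic step: push d(real) toward 1 and d(fake) toward 0
    # (hand-derived gradients of the binary cross-entropy loss).
    pr, pf = sigmoid(a * real + c), sigmoid(a * fake + c)
    a -= lr * (np.mean((pr - 1) * real) + np.mean(pf * fake))
    c -= lr * (np.mean(pr - 1) + np.mean(pf))

    # Generator step: push d(fake) toward 1, i.e. fool the critic.
    pf = sigmoid(a * fake + c)
    gw -= lr * np.mean((pf - 1) * a * z)
    gb -= lr * np.mean((pf - 1) * a)

# After training, generated samples should drift toward the real
# distribution (mean near 4.0), though toy GANs can oscillate.
samples = gw * rng.normal(size=(1000, 1)) + gb
```

Each side improves only because the other does: the critic sharpens its decision boundary, and the generator follows its gradient straight through that boundary.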
According to Kenneth Tran, specialist in machine learning at Microsoft, Google had little interest in keeping its technology private, given that its research in this field was widespread and often public. The company would benefit from improvements made by people who used it. A little like an artist inspired by a critic.
The Image in Doubt
In 2016, while a group of researchers at Google Brain described the workings of TensorFlow in an article, Aviv Ovadya understood that something was rotten in the Internet universe. Then working for Quora, a site that tries to surface answers from the people best placed to give them, he believed that information spread poorly on Facebook and Twitter. Their ecosystems were permeable to propaganda, disinformation, and malicious content – in other words, what we’ve now come to call fake news.
But deceptive content is not necessarily called out and refuted in the little niches in which it forms on social networks. Quite the opposite: the algorithms tend to reinforce our prejudices. “By selecting links and information based on the profiles of Internet users, these filters enclose citizens in an intellectual cocoon,” Le Monde summarized in September 2016. Two months later, the British magazine New Statesman published an investigation titled “This film only exists in the minds of Reddit users.” On the American forum, a small community discussed Shazaam, a feature film from the ’90s in which an incompetent genie grants the wishes of two children. No such film had ever been made.
When the New Statesman pointed out that they were wrong, many didn’t believe it. “It’s as if a part of my childhood has been stolen,” one wrote after recognizing his error. Like Facebook and Twitter, Reddit gathered people who shared common opinions, never mind if those opinions were wrong. That reinforced not only their convictions but their very perceptions of reality, to the point of fabricating memories. In a 2015 study, two psychologists demonstrated that it’s possible to make a person believe they’d committed a crime in the past. To that effect, images seemed particularly effective. “If you tell someone that a particular person, in a particular situation, did such and such, and a video corresponds to that description, it’s going to be convincing,” says University of California psychologist Linda Levine.
Succinctly, “the idea behind generative adversarial networks is to create images that are as realistic as possible,” Ian Goodfellow explains. Such a system pairs a generator (the artist), which tries to produce a result as close as possible to reality, with a detector of fakes (the art critic). FakeApp itself relies on a related kind of network called an autoencoder, capable of transposing the movements of one video onto another. Likewise, according to Justus Thies, “manipulating videos is like a game of cat and mouse”: the creators of deepfakes can evade techniques for detecting false videos once they understand them, correcting their work accordingly. Matthias Nießner notes that “for the moment, detection is much easier than manipulation.” But a naive audience will be none the wiser.
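The autoencoder trick behind face swapping can be shown structurally: one shared encoder compresses any face to a small code capturing pose and expression, while each person gets their own decoder that paints that person’s appearance back onto the code. The swap is then just a decoder swap. The sketch below is a minimal structural illustration with invented dimensions and untrained random weights, not FakeApp’s real architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# A flattened 8x8 grayscale "face" and a small latent code.
FACE, LATENT = 64, 8

# One shared encoder; one decoder per person.
W_enc = rng.normal(scale=0.1, size=(LATENT, FACE))
W_dec_a = rng.normal(scale=0.1, size=(FACE, LATENT))  # decoder for person A
W_dec_b = rng.normal(scale=0.1, size=(FACE, LATENT))  # decoder for person B

def encode(face):
    # Compress a face to a latent code (pose, expression).
    return np.tanh(W_enc @ face)

def decode(code, W_dec):
    # Reconstruct a face from a code, in one person's likeness.
    return W_dec @ code

# Training (omitted here) would fit each decoder to reconstruct its own
# person's frames. The face swap is then a decoder swap:
frame_of_a = rng.normal(size=FACE)
code = encode(frame_of_a)          # A's pose and expression
swapped = decode(code, W_dec_b)    # rendered with B's appearance

assert swapped.shape == (FACE,)
```

Because the encoder is shared, the code it produces is person-agnostic, which is what lets person B’s decoder render person A’s expression.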
In March 2015, the German channel ZDF broadcast a video of Yanis Varoufakis, then the Greek Minister of Finance, flipping off Berlin during a conference two years earlier in Croatia. As a diplomatic scandal brewed, the channel admitted to having doctored the video, all to show how easily it could be done. It didn’t even need AI. “Face2Face operates without any special processor,” says Justus Thies. “All you need is a classic webcam and a computer equipped with a graphics card similar to those that let you play video games.” 3D modeling takes care of the rest.
By different means, “any media content (text, audio or image) can be manipulated,” Justus Thies summarizes. “I hope that people are going to realize that they can’t trust an image of unknown origin. For the most part, they already know that many photos are touched up in Photoshop.” Now that FakeApp is freely available on the Internet, the halo of doubt around information will keep growing. Honest, irreproachable content will be increasingly suspected of doctoring as the general public comes to understand how easily doctoring can be done. The Syrian drama shows how much the origin and authenticity of a video matter. “We have to develop systems to efficiently determine if it’s manipulated,” Aviv Ovadya says. Reality is at stake.