What if, in the future, our computer data were stored not in giant servers but in DNA molecules? The idea might seem bizarre, even straight out of a work of science fiction. And yet DARPA, the Defense Advanced Research Projects Agency charged with researching new technologies, is studying it carefully. It even has a program dedicated to storing computer information in molecules (DNA among them) – the Molecular Informatics Program.

Launched last year, the program has invested several million dollars in research projects led by faculty at several American universities: Brown, Harvard, and the Universities of Illinois and Washington. In 2016, the University of Washington, collaborating with Microsoft, managed to store an entire piece of music in DNA, beating the record for the largest quantity of data ever stored in this manner. To revolutionize the way we store information, researchers at Microsoft and the University of Washington can count on the help of Twist Bioscience, a San Francisco-based startup that specializes in the creation of synthetic DNA.

The Dogpatch, in San Francisco
Credits: rosatrieu

The startup is based in Dogpatch, an old industrial neighborhood on the eastern edge of San Francisco. On the other side of the Bay, beyond the great cerulean and silver expanse, the buildings of Oakland look like dollhouses of uncertain outline, shimmering under the sun’s rays. The factories and warehouses that shape the Dogpatch landscape, abandoned in the second half of the 20th century, have been reborn thanks to the growth of new technologies. The old, unused factories have drawn in a crowd of startups; the warehouses have been rebranded as trendy bars, upscale cafés and sushi restaurants; and the stevedores have been replaced by plaid-covered hipsters.

The clearest sign of the dynamism blowing through the neighborhood is its omnipresent construction sites. To get to the building where Twist Bioscience is located, one slaloms between helmeted men, cement mixers and trucks, amid the blast of jackhammers. The din contrasts sharply with the large offices that Twist Bioscience shares with other startups, which, at this early hour, seem at total peace, punctuated only by the soft touch of piano keys. Through a big bay window lies a stunning view of its namesake. The last breath of morning mist evaporates on the water’s surface.

The Art of Synthesis

A funny note: Twist Bioscience was recently acclaimed for storing Deep Purple’s “Smoke on the Water” (as well as Miles Davis’ “Tutu”) in DNA, in collaboration with Microsoft and the University of Washington. Besides this, the startup uses its lab-created DNA for a host of revolutionary projects. But before we list them, let’s jump backward, to the 1960s.

Emily Leproust, CEO of Twist Bioscience

At the beginning of the film The Graduate, Mr. McGuire, a prosperous businessman, takes aside Benjamin, a young and ingenuous graduate played by Dustin Hoffman. Without hesitating, he advises him to make his career in “plastics,” an industry he promises has a huge future. Not only has the scene become a cult classic, but Mr. McGuire’s instinct proved correct: plastic is now ubiquitous. And yet it is made essentially from petroleum, an exhaustible fossil fuel that damages the environment. According to Emily Leproust, CEO of Twist Bioscience, things could be changing soon. “If you have access to synthetic DNA, you can manipulate the genome of yeast in such a way that when it ferments sugar, you get adipic acid instead of alcohol,” she explains. “And from this adipic acid, you can manufacture nylon!”

So it’s possible to get plastic from fermentation, in the same way we manufacture beer. Welcome to the era of bioengineering. We can make carpets, but also tires or plastic bottles, without using petroleum. And the icing on the cake: this kind of production is not just better for the environment, it’s also cheaper than traditional techniques. Synthetic DNA also makes it possible to work with materials that exist in nature but have until now been unusable by humans.

One example is spider silk. Spider silk has surprising properties: both ultra-light and stronger than steel, it would be capable, at the human scale, of stopping a full-speed train. But it’s impossible to raise spiders to harvest their silk. The critters kill each other systematically. Thanks to synthetic DNA, we can, according to Emily Leproust, isolate the gene that allows spiders to create silk and, by using the aforementioned fermentation process, manufacture artificial spider silk.

The possibilities are endless. One of Twist Bioscience’s clients, Ginkgo Bioworks, has managed to isolate the gene that creates rose perfume. By adding yeasts and sugars to the mix, it has obtained a rose extract purer than any found in nature, which can be used in upscale fragrances. Another client, Evolva, used a similar process to manufacture vanilla extract.

But synthetic DNA’s potential doesn’t stop there. Its applications are equally promising in the field of health, particularly for the discovery of new medicines. “Laboratories are testing millions of different antibodies to find ones that are most resistant to illness,” Emily Leproust says. “And yet, for each antibody tested, you need a DNA sample. By reducing the cost of these samples, we’re speeding up the research.” Twist Bioscience is currently working with nine major pharmaceutical laboratories.

A third possible use of synthetic DNA is in combatting world hunger, by generating crops that no longer depend on fertilizer. Fertilizers rely on a simple principle: feeding the plant nitrogen to speed up its growth. They are produced with various chemical methods that often involve burning oil.

Credits: Twist Bioscience

To bypass this harmful step, one of Twist Bioscience’s clients uses synthetic DNA to modify bacteria living in the soil so that they automatically trap nitrogen from the air and pass it on to the plant. That way, they get harvests that require neither fertilizer nor GMOs, since the plants pull their nitrogen directly from the ground. The payoff is a healthier agriculture that can feed more people. That’s all fascinating, but how does it work?

DNA Grammar

Progress in genetic engineering has been explosive in recent decades. It all began in 1953, when James Watson and Francis Crick, two young biology researchers, first identified the double-helix structure of DNA. This discovery let us understand how genetic information copies and transmits itself. In other words, the genetic code could now be decrypted.

The 1970s brought the first attempts to sequence DNA; that is, to decode the information necessary for living beings to survive and reproduce. In 1977, we successfully sequenced the first full genome, that of the virus Phi X174. However, it wasn’t until 2000 that we finally managed to sequence the whole of the human genome. So much for “reading” DNA. Twist Bioscience, for its part, specializes in “writing,” or the creation of synthetic DNA. “To read DNA, we take a sample, analyze it with a machine, and extract data from it. Writing is the exact opposite: we take data, and, with the help of a machine, we create a corresponding DNA sample,” Emily Leproust explains.

Inside the Twist Bioscience laboratory

Although Twist Bioscience didn’t invent the process, the startup has managed to considerably improve its efficiency, inspired by progress in DNA sequencing. “In the 2000s, sequencing the totality of the human genome cost $3 billion,” Emily Leproust says. “Today, the price has fallen to around $1,000. This considerable drop was made possible by miniaturization: businesses like 454 Life Sciences or Illumina have managed to reduce the size of DNA samples used in order to achieve a more efficient technique at a reduced cost. We did the same thing with writing.”

Launched in 2013, the startup relies on advanced engineering techniques. While others traditionally use plastic plates to hold DNA samples, the company uses heat-resistant silicon. The material, also used in computer manufacturing (which, by the way, is the source of the name Silicon Valley), allows it to miniaturize samples.

So while a traditional plate contains 96 DNA samples, the silicon version made by Twist Bioscience holds a million samples on the same surface, which considerably reduces costs. The startup’s innovation, then, is one of engineering rather than biology. It is, strictly speaking, a “product innovation.” But the biological applications of that innovation are staggering.

Although Twist Bioscience was founded by engineers, its clients are generally biology research laboratories, for whose experiments the startup creates DNA samples on demand. “When you do research,” Emily Leproust explains, “you need to have ongoing experiments. The more DNA samples you have at your disposal, the more you can test different hypotheses. By reducing cost, we’re letting them test more ideas and take on more projects.”

The World in a Sequence

This rapid progress in mastering DNA is raising hopes that the molecule could be used to store digital data. Which leads us back to Deep Purple. Of all Twist Bioscience’s projects, this one is perhaps both the most important and the most complex. With the development of the Internet and connected devices, we’re now producing an unprecedented amount of data. More data was generated in 2017 than in the past 5,000 years of human history. Currently, all that data is stored in massive servers.

But this method has several faults. It takes up space and pollutes: the Internet is presently responsible for 3% of world carbon emissions. Nor is it durable: data stored on magnetic tape must be rerecorded, on average, every seven years. Finally, the miniaturization of computing (as expressed by Moore’s Law) – which, in the course of the past 60 years, has allowed us to continually reduce the size of the material necessary to store information – cannot continue forever. There will come a day when our engineering collides with the laws of physics.

This is where DNA comes into play. “A computer document is composed of 0’s and 1’s,” Emily Leproust explains. “But this base 2 code can easily be converted into a base 4 code, using the numbers 0, 1, 2, and 3. And from this base 4 code, we can then convert the document by using the four letters of DNA, A, C, G, T. We can then take any digital document and convert it into DNA.”
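The conversion Leproust describes can be sketched in a few lines of Python. This is only an illustration: the article does not say which letter stands for which digit, so the mapping below (0 to A, 1 to C, 2 to G, 3 to T) and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions, and the sketch leaves out the error correction that a real storage scheme would presumably need.

```python
# Assumed mapping for illustration: base-4 digits 0, 1, 2, 3 -> A, C, G, T.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four letters, two bits per letter, high bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Reverse bytes_to_dna: read four letters at a time back into one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on any letter outside A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = bytes_to_dna(b"Smoke on the Water")
    print(strand)
    print(dna_to_bytes(strand))
```

Each byte becomes four letters, so a document of n bytes turns into a strand of 4n letters, and reading the strand back recovers the original bytes exactly.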

This has several advantages. DNA is more durable, for one: the DNA of the mammoth and of Neanderthal man has come down to us intact across thousands of years. DNA is also highly economical in terms of space: a petabyte of data, the equivalent of 1,000 hard disks, would occupy, in the form of DNA, the volume of a grain of sand. We could store all the data available in the world in the space of a semi-trailer. Servers require large amounts of energy to cool their circuits and prevent overheating. The size of DNA being negligible, the energy used for cooling would be equally negligible – the consumption of a light bulb for an entire data center. Lastly, copying DNA takes much less time than copying digital data.

The main drawback right now is the cost of storage, which remains prohibitive. That’s why Twist Bioscience is now working with Microsoft, the University of Washington, and the Swiss Federal Institute of Technology in Lausanne on a research project aiming to reduce the cost. It was within this framework that Deep Purple’s and Miles Davis’ songs were stored in DNA.

In 2012, Harvard University researchers had already done the same with the contents of a book. More recently, researchers at the same university went a step further by saving, in a DNA sequence, the plates of Eadweard Muybridge, which, placed end to end, depict a galloping horseman and constitute the ancestor of video. It’s also why, since last year, DARPA has been committing substantial financial resources to developing DNA storage.

Another danger lies ahead for the revolutionary storage method. Although DNA is not a digital medium, it’s still not safe from hackers. Last year, researchers at the University of Washington demonstrated that it’s possible to encode malicious software into DNA. When that DNA is read and the result loaded as data on a computer, the software can infect the machine.

The researchers assert, however, that this threat remains, for now, a distant one. Assuming we’re able to reduce the costs and to thwart biohackers, DNA data storage could well prove to be the most efficient and responsible way to safeguard our digital information in a post-server world. As of 2018, Emily Leproust seems to have been well advised to make her career in DNA instead of plastic.