The Quest for a DNA Data Drive
This article is featured in IEEE Spectrum and written by Rob Carlson.
How much thought do you give to where you keep your bits? Every day we produce more data, including emails, texts, photos, and social media posts. Though much of this content is forgettable, every day we implicitly decide not to get rid of that data. We keep it somewhere, be it in on a phone, on a computer’s hard drive, or in the cloud, where it is eventually archived, in most cases on magnetic tape. Consider further the many varied devices and sensors now streaming data onto the Web, and the cars, airplanes, and other vehicles that store trip data for later use. All those billions of things on the Internet of Thingsproduce data, and all that information also needs to be stored somewhere.
Data is piling up exponentially, and the rate of information production is increasing faster than the storage density of tape, which will only be able to keep up with the deluge of data for a few more years. The research firm Gartner predicts that by 2030, the shortfall in enterprise storage capacity alone could amount to nearly two-thirds of demand, or about 20 million petabytes. If we continue down our current path, in coming decades we would need not only exponentially more magnetic tape, disk drives, and flash memory, but exponentially more factories to produce these storage media, and exponentially more data centers and warehouses to store them. Even if this is technically feasible, it’s economically implausible.
Fortunately, we have access to an information storage technology that is cheap, readily available, and stable at room temperature for millennia: DNA, the material of genes. In a few years your hard drive may be full of such squishy stuff.
Storing information in DNA is not a complicated concept. Decades ago, humans learned to sequence and synthesize DNA—that is, to read and write it. Each position in a single strand of DNA consists of one of four nucleic acids, known as bases and represented as A, T, G, and C. In principle, each position in the DNA strand could be used to store two bits (A could represent 00, T could be 01, and so on), but in practice, information is generally stored at an effective one bit—a 0 or a 1—per base.
Moreover, DNA exceeds by many times the storage density of magnetic tape or solid-state media. It has been calculated that all the information on the Internet—which one estimate puts at about 120 zettabytes—could be stored in a volume of DNA about the size of a sugar cube, or approximately a cubic centimeter. Achieving that density is theoretically possible, but we could get by with a much lower storage density. An effective storage density of “one Internet per 1,000 cubic meters” would still result in something considerably smaller than a single data center housing tape today.
Fortunately, we have access to an information storage technology that is cheap, readily available, and stable at room temperature for millennia: DNA, the material of genes. In a few years your hard drive may be full of such squishy stuff.
Storing information in DNA is not a complicated concept. Decades ago, humans learned to sequence and synthesize DNA—that is, to read and write it. Each position in a single strand of DNA consists of one of four nucleic acids, known as bases and represented as A, T, G, and C. In principle, each position in the DNA strand could be used to store two bits (A could represent 00, T could be 01, and so on), but in practice, information is generally stored at an effective one bit—a 0 or a 1—per base.
Moreover, DNA exceeds by many times the storage density of magnetic tape or solid-state media. It has been calculated that all the information on the Internet—which one estimate puts at about 120 zettabytes—could be stored in a volume of DNA about the size of a sugar cube, or approximately a cubic centimeter. Achieving that density is theoretically possible, but we could get by with a much lower storage density. An effective storage density of “one Internet per 1,000 cubic meters” would still result in something considerably smaller than a single data center housing tape today.
Please click on this link to read the full article.
Image credit: Image by Freepik