Every day the human race creates 2.5 quintillion bytes of new data. Data is at the center of the world’s digital progression. The problem, however, is that we’re creating more data than we know how to store.
IDC estimates that in five years, 175 zettabytes of data will be generated annually. For context, IDC estimates that all of humanity has so far produced a grand total of almost 20 zettabytes. Already, the world sends 306 billion emails, watches 5.9 billion YouTube videos, and posts 67 million Instagram photos every day. People won’t be working with amounts of data that can be stored comfortably on thumb drives, in the seemingly omnipresent cloud, or even on distributed storage systems. So, what’s the next step in the evolution of data storage?
To keep up with the exponential growth of data across the world, large companies currently have to build warehouse-sized data centers that average 100,000 square feet of space (or about the size of two football fields). These data centers account for almost 2% of the United States’ electricity consumption. With the ever-increasing demand for data storage, this is not a sustainable model of data preservation.
It sounds like an option that came straight from Jurassic Park, but DNA might hold the answer to the world’s data storage concerns.
How Does DNA Data Storage Work?
Data storage in DNA works much like the binary coding system used today. In binary coding, digital information is reduced to a string of ones and zeros that the computer then translates back into readable information. DNA encodes information in a similar way. As you may recall from high school biology, DNA is made up of four nucleotide bases: A, T, C, and G. Because each position can take one of four values rather than two, a single base can represent up to two bits, so this quaternary code can store information more densely than the current binary system.
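To make the binary-to-quaternary idea concrete, here is a minimal sketch in Python. The mapping of bit pairs to bases (A=00, C=01, G=10, T=11) and the function names encode and decode are assumptions chosen for illustration, not the scheme any particular lab uses. Real encodings also add error correction and avoid problematic sequences, which this sketch ignores.

```python
# Hypothetical mapping for illustration: A=00, C=01, G=10, T=11.
BASES = "ACGT"

def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)

def decode(strand: str) -> bytes:
    """Reverse encode(): read four bases at a time back into one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Under this assumed mapping, every byte becomes exactly four bases, so the two-character message "Hi" is written as an eight-base strand and read back unchanged.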
One gram of DNA can store 215 petabytes of information. With one million gigabytes in one petabyte, DNA data storage dwarfs any of the world’s current storage options. For reference, one teaspoon (about 4.2 grams) of DNA holds 903 petabytes, which means that a single teaspoon of DNA falls just short of holding as much data as the new 62,000-square-foot Facebook data center, which can hold 1,000 petabytes (one exabyte).
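The comparison above is simple division, and it can be checked in a few lines. This sketch uses only the figures quoted in the article (215 petabytes per gram, 903 petabytes per teaspoon, 1,000 petabytes for the data center); the constant and function names are made up for illustration.

```python
# Figures quoted in the article; names are hypothetical.
PB_PER_GRAM = 215        # petabytes stored per gram of DNA
GB_PER_PB = 1_000_000    # gigabytes in one petabyte (decimal units)
TEASPOON_PB = 903        # petabytes in one teaspoon of DNA
DATA_CENTER_PB = 1_000   # petabytes in the Facebook data center (one exabyte)

def grams_needed(petabytes: float) -> float:
    """Grams of DNA required to hold the given number of petabytes."""
    return petabytes / PB_PER_GRAM

print(f"Gigabytes per gram: {PB_PER_GRAM * GB_PER_PB:,}")
print(f"Grams in the teaspoon figure: {grams_needed(TEASPOON_PB):.2f}")
print(f"Grams to match the data center: {grams_needed(DATA_CENTER_PB):.2f}")
```

By these numbers, the 903-petabyte teaspoon corresponds to roughly 4.2 grams of DNA, and matching the one-exabyte data center would take a little under 5 grams.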
So Why Aren’t We Using It?
The primary hurdle DNA storage needs to overcome is cost. While DNA can be replicated rather inexpensively, the technology needed to write and read the encoded information is expensive. Chemically synthesizing one megabyte of data can cost around $3,500. The upside of that high price is data security: hacking information that’s stored as DNA in liquid or powder form would be nearly impossible.
Currently, the capacity to use DNA storage on a large scale is still extremely limited. For DNA storage to work, data has to be encoded into the quaternary DNA language and then decoded back into readable information. For this process to succeed on a global scale, it has to be automated. Right now, scientists still have to be involved in every step of the encoding and decoding process, since DNA sequencing technology wasn’t initially designed for reading data. Even then, scientists have only been able to encode small, individual pieces of information.
Despite the obstacles, labs around the world are well on their way to finding a realistic solution to large-scale DNA data storage. Microsoft has teamed up with the University of Washington to encode media like an OK Go music video featuring a massive Rube Goldberg machine, the Universal Declaration of Human Rights (in more than 100 languages), and the top 100 books of all time from Project Gutenberg. While initial results like Microsoft’s are promising, recreating the DNA encoding process in a scalable way is still far off.
Eventually, labs will need to pivot from testing the process on cold data (data that isn’t accessed often, like archival content) to hot data (real-time and frequently accessed information). Encoding real-time data will be the real test of whether DNA data storage can be a viable alternative to the current electronic system.
The growth of digital tech has transformed, and will continue to transform, the way people learn, communicate, and live. But with the development of new technologies comes the need for innovative ways to store the data. DNA data storage is a promising option, but it is held back by the myriad roadblocks common to every world-altering technology. If we want to solve the world’s current electronic storage dependency, we first need to figure out how to resolve those scalability issues.
And let’s hope we resolve those issues soon, because the world is quickly running out of space.