Storing Massive Attack's Mezzanine Album in DNA

Robert Grass, Reinhard Heckel, Marcelo Caraballo, Rob 3D Del Naja, Mark Picken

In March 2018, the management of Massive Attack (3rd space Agency Ltd) approached us: the band was interested in celebrating the 20th anniversary of its seminal album Mezzanine by storing it in DNA for eternity.

Over several emails, we discussed the specifics and possibilities that DNA offers as a storage medium. The project became our first commercial application of DNA data storage. Below, we describe its technical details.

 

Music compression

We received the album as high-quality mp3 files, totaling 153 MB of digital information. Because of cost constraints in DNA synthesis, we first compressed the music with the Opus codec. This less widely known audio compression scheme has a long history in telephony, Voice over IP, and videoconferencing, and achieves higher sound quality at significantly lower bit rates than mp3 (32 kb/s .opus is roughly equivalent to 70 kb/s .mp3). We compressed the original music with the software Switch Plus (NCH Software, fully licensed) in mono at a constant bit rate of 32 kb/s, resulting in a file size of ca. 15 MB. We then added the album metadata, extended by the following line, to the compressed music:

encoded in DNA in 2018 by Robert Grass (ETHZ and TurboBeads), Reinhard Heckel (Rice University) and Marcelo Caraballo (Customarray Inc.).

The folder containing the music in Opus format and the metadata was compressed to a zip archive of 15.3 MB.
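The compression figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming decimal units (1 MB = 10^6 bytes, 1 kb/s = 1000 bits per second); the function name is made up for this illustration, and the playing time it derives is implied by the numbers in the text rather than stated there.

```python
# Sanity check of the compression figures quoted in the text.
# Assumptions (not from the article): decimal units throughout.

ORIGINAL_MB = 153.0      # mp3 files as received
COMPRESSED_MB = 15.0     # ca. size after Opus compression
BITRATE_BPS = 32_000     # constant bit rate, 32 kb/s


def implied_duration_s(size_bytes: float, bitrate_bps: int) -> float:
    """Playing time implied by a constant-bit-rate file of the given size."""
    return size_bytes * 8 / bitrate_bps


duration_s = implied_duration_s(COMPRESSED_MB * 1e6, BITRATE_BPS)
compression_ratio = ORIGINAL_MB / COMPRESSED_MB

print(f"implied playing time: {duration_s / 60:.1f} min")
print(f"size reduction vs. the mp3 files: {compression_ratio:.1f}x")
```

At a constant 32 kb/s, a file of about 15 MB corresponds to roughly an hour of audio, which is consistent with a full-length album.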

Converting the music to DNA

The conversion of the digital bitstream of the .zip archive to a list of DNA sequences was performed by Reinhard Heckel (Rice University). We protected the data stored in the DNA by adding redundancy with error-correcting codes. To this end, we used a newly developed scheme similar to the one introduced in our previous publication (Grass et al. Angew. Chem. Int. Ed. 2015). Using that scheme, the data was encoded in 901'065 DNA sequences, each 105 nt long.

In more detail, our scheme is motivated by the following constraints and imperfections of writing (synthesis) and reading (sequencing) DNA. It is difficult to synthesize DNA strands of medium to long length, and current DNA synthesis technologies can therefore only generate strands of about 200 nucleotides. In addition, those strands cannot be spatially ordered. We therefore stored the data on short (105 nt) sequences and added a digital identifier (index) to each sequence so that the sequences can be put back in order after read-out.
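The role of the index can be illustrated with a toy example. This is a minimal sketch, not the encoding used in the project: the 2-bits-per-nucleotide mapping, the index width, and the chunk size are all assumptions chosen for readability, and no error correction is included.

```python
import random

BASES = "ACGT"       # assumed mapping: 00->A, 01->C, 10->G, 11->T
INDEX_BYTES = 3      # hypothetical index width
CHUNK_BYTES = 16     # hypothetical payload size per sequence


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bits first."""
    return "".join(BASES[(b >> s) & 0b11] for b in data for s in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def encode(data: bytes) -> list[str]:
    """Split data into chunks and prefix each chunk with its index."""
    seqs = []
    for idx, start in enumerate(range(0, len(data), CHUNK_BYTES)):
        chunk = data[start:start + CHUNK_BYTES]
        seqs.append(bytes_to_dna(idx.to_bytes(INDEX_BYTES, "big") + chunk))
    return seqs


def decode(seqs: list[str]) -> bytes:
    """Read the index of every sequence and reassemble in index order."""
    records = []
    for s in seqs:
        raw = dna_to_bytes(s)
        records.append((int.from_bytes(raw[:INDEX_BYTES], "big"), raw[INDEX_BYTES:]))
    records.sort(key=lambda r: r[0])
    return b"".join(payload for _, payload in records)


message = b"Mezzanine, stored as an unordered pool of short strands."
pool = encode(message)
random.shuffle(pool)             # a DNA pool has no physical order
restored = decode(pool)
print(restored == message)
```

Because every sequence carries its own position, the shuffled pool can be decoded without any knowledge of the order in which the strands were written.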

DNA synthesis, DNA storage, and DNA sequencing are error-prone processes that introduce errors within sequences and can cause the loss of entire sequences. We therefore added digital redundancy to all sequences, using an inner and an outer Reed-Solomon error-correcting code. The inner code protects the information within a given sequence and the corresponding index, and the outer code adds additional sequences that enable recovery of lost sequences. The outer code introduces about 24% additional sequences, which enable recovery even when 24% of all sequences are lost, and the inner code adds about 9% redundancy, which allows correction of at least one error within a sequence.
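The idea behind the outer code, adding extra sequences so that lost ones can be rebuilt, can be shown with a much simpler stand-in. The sketch below uses a single XOR parity block, which is not a Reed-Solomon code and can only restore one missing block per group; Reed-Solomon codes extend the same principle to many lost sequences. All names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(blocks: list[bytes]) -> list[bytes]:
    """Append one redundant block: the XOR of all data blocks."""
    return list(blocks) + [reduce(xor_bytes, blocks)]


def recover(received: list) -> list[bytes]:
    """Rebuild at most one lost block (marked None) and drop the parity."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can only restore one lost block")
    repaired = list(received)
    if missing:
        present = [block for block in received if block is not None]
        # XOR of all surviving blocks equals the missing block
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]


data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
coded = add_parity(data)
coded[2] = None                  # one sequence lost during storage or read-out
restored_blocks = recover(coded)
print(restored_blocks == data)
```

The position of the lost block is known here because it is marked explicitly; in the scheme described above, the index carried by each sequence plays that role.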

To facilitate downstream read-out and DNA amplification, a constant DNA primer was added to each end of every sequence (5': 5'-ACACGACGCTCTTCCGATCT-3', 3': 5'-AGATCGGAAGAGCACACGTCT-3'), resulting in a final DNA design of 901'065 sequences, each 146 nucleotides long.
 

The first few sequences with constant primers at both ends
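The sequence lengths in the final design follow directly from the primer lengths. The sketch below assumes that both primer sequences are attached exactly as written (5' primer in front of the payload, 3' sequence behind it) and uses a placeholder payload. The resulting net density, archive bits per synthesized nucleotide, is our own back-of-the-envelope figure, not a number from the project.

```python
# Primer sequences and counts as quoted in the text.
PRIMER_5 = "ACACGACGCTCTTCCGATCT"
PRIMER_3 = "AGATCGGAAGAGCACACGTCT"
PAYLOAD_NT = 105
N_SEQUENCES = 901_065
ARCHIVE_BYTES = 15.3e6           # zip archive, assuming 1 MB = 10^6 bytes


def add_primers(payload: str) -> str:
    """Flank a payload with the constant primers (assumed orientation)."""
    if len(payload) != PAYLOAD_NT:
        raise ValueError(f"expected {PAYLOAD_NT} nt, got {len(payload)}")
    return PRIMER_5 + payload + PRIMER_3


placeholder = "ACGT" * 26 + "A"  # arbitrary 105-nt payload for illustration
strand = add_primers(placeholder)
total_nt = N_SEQUENCES * len(strand)
net_bits_per_nt = ARCHIVE_BYTES * 8 / total_nt

print(len(PRIMER_5), len(PRIMER_3), len(strand))
print(f"total nucleotides to synthesize: {total_nt:,}")
print(f"net density: {net_bits_per_nt:.2f} bits per nucleotide")
```

The primers are 20 and 21 nt long, so 105 + 20 + 21 gives the 146 nt stated above. The net density comes out below the theoretical 2 bits per nucleotide because the index, the error-correction redundancy, and the primers all occupy nucleotides.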

DNA synthesis

The sequences of the final DNA design were synthesized by Custom Array Inc. using their proprietary electrochemical array synthesis technology, on ten of their 92'918-oligo chips. The DNA was cleaved from the synthesis chips and shipped to Zurich in ten vials, each containing 80 microliters of water.
 

Synthesized DNA received and unpacked.

DNA preparation and amplification by TurboBeads

Each of the ten vials contained between 11.8 and 21.8 micrograms of DNA (in 80 µl). From each vial, 1 µl was taken and diluted 1:10 with water. A first qPCR test was performed for each vial to test the amplifiability of the DNA. For this, 1 µl of the diluted DNA was mixed with 7 µl water, the appropriate DNA primers (1 µl, 10 µM each), and 10 µl qPCR master mix. Because of slight differences in the initial DNA concentrations and amplification yields of the individual tubes, a second qPCR experiment was performed, in which varying amounts of DNA from every tube (0.5 µl - 2 µl) were individually amplified with the same primers and master mix, yielding a Ct value of 10.1 ± 0.62.

The amplification products of tubes 1-5 and 6-10 were pooled and purified by gel electrophoresis (Invitrogen E-Gel system) and the Roche DNA clean-up kit.
 

Following elution from the Roche columns, the DNA was combined to give a volume of 160 µl, and the DNA concentration was measured by Qubit as 12 ng/µl, yielding a total of about 2 µg of purified DNA. Taking a dsDNA length of 146 bp (as per the DNA design), this corresponds to 12'453'000'000'000 DNA sequences. Assuming that all sequences are equally represented in the pool, this is equivalent to about 14 million copies of the whole Mezzanine album.
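The copy-number estimate can be reproduced approximately as follows. This sketch assumes an average molar mass of 660 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the text, so the result differs slightly from the figure quoted above.

```python
AVOGADRO = 6.02214076e23         # molecules per mole
G_PER_MOL_PER_BP = 660.0         # assumed average mass of one dsDNA base pair

VOLUME_UL = 160.0
CONC_NG_PER_UL = 12.0
LENGTH_BP = 146
N_SEQUENCES = 901_065


def dsdna_molecules(mass_g: float, length_bp: int) -> float:
    """Number of dsDNA molecules of a given length in a given mass."""
    return mass_g / (length_bp * G_PER_MOL_PER_BP) * AVOGADRO


measured_ug = VOLUME_UL * CONC_NG_PER_UL / 1000   # 1.92 µg, rounded to 2 µg in the text
molecules = dsdna_molecules(2e-6, LENGTH_BP)      # using the rounded 2 µg
album_copies = molecules / N_SEQUENCES            # assumes equal representation

print(f"measured mass: {measured_ug:.2f} µg")
print(f"molecules in 2 µg: {molecules:.3e}")
print(f"album copies: {album_copies / 1e6:.1f} million")
```

With these assumptions the estimate lands at about 1.25 x 10^13 molecules and close to 14 million album copies, in line with the numbers given above.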

For a second run, and to produce larger amounts of DNA, the DNA from the individual synthesis vials was first mixed (taking between 0.5 and 2 µl from every vial) and amplified for 23 cycles via qPCR in 40 reactions. The amplification products were combined and purified (Qiagen) to yield 600 µl of solution containing 10 ng/µl DNA.

Integration of the DNA into products

The DNA was then encapsulated in silica nanoparticles using TurboBeads' proprietary DNA fossilization technology (Paunescu et al. Nat. Protocols 2013) to yield 6 mg of silica-encapsulated DNA for further processing.

To be continued...

Massive Attack announcing the storage of Mezzanine album in DNA on their official Instagram channel.