MARS-seq / MARS-seq2.0

(1) I think the sequence of "2nd RT Primer" in Supplementary Table S7 in the original publication (Science 343, 776-779 (2014)) might be in the wrong orientation (i.e. they showed the sequence from 3' -> 5').

(2) The authors state in the Supplementary Methods (page 8) that UMIs are 4-8 bp in length, but according to the oligo sequences in Supplementary Table S7, they are only 4 bp long. A 4-bp UMI is drawn in this workflow.

(3) The authors state in the Supplementary Methods (page 8) that plate barcodes are 6 bp in length, but according to the oligo sequences in Supplementary Table S8, they seem to be 7 bp long (4 bp + 3 Ns). Maybe only 6 bp are used to identify a plate. I'm not entirely sure about this.

(4) In May 2019, MARS-seq2.0 was published in Nature Protocols, and the oligos used in that protocol are almost the same as those in the original MARS-seq, except that the cell barcodes and UMIs are longer in MARS-seq2.0. The improvements are mainly related to throughput, robustness, noise reduction and cost. Check the publication for more details.

(5) The step-by-step workflow below is drawn with the oligos used in the original MARS-seq.



Adapter and primer sequences:

1st RT primer (MARS-seq): 5'- CGATTGAGGCCGGTAATACGACTCACTATAGGGGCGACGTGTGCTCTTCCGATCT[6-bp cell barcode][4-bp UMI]TTTTTTTTTTTTTTTTTTTTN -3'

1st RT primer (MARS-seq2.0): 5'- CGATTGAGGCCGGTAATACGACTCACTATAGGGGCGACGTGTGCTCTTCCGATCT[7-bp cell barcode][8-bp UMI]TTTTTTTTTTTTTTTTTTTTN -3'

T7 promoter: 5'- TAATACGACTCACTATAGGG -3'

Barcode plate ligation adaptor:


For MARS-seq, they are called Lig_NNNX4_ix[1-8]: 5'/5Phos/- [7-bp plate barcode]AGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'

For MARS-seq2.0, they are called lig_N5X4_ix[1-32]: 5'/5Phos/- [9-bp plate barcode]AGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'

2nd RT primer: 5'- CTACACGACGCTCTTCCGATCT -3'

P5_Rd1_PCR primer: 5'- AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT -3'

P7_Rd2_PCR primer: 5'- CAAGCAGAAGACGGCATACGAGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT -3'

Illumina Truseq Read1 primer: 5'- TCTTTCCCTACACGACGCTCTTCCGATCT -3'

Illumina Truseq Read2 primer: 5'- GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT -3'

Illumina P5 adapter: 5'- AATGATACGGCGACCACCGAGATCTACAC -3'

Illumina P7 adapter: 5'- CAAGCAGAAGACGGCATACGAGAT -3'
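
The relationships between the oligos listed above can be checked with a few lines of code. This is only a sanity-check sketch using the sequences as quoted on this page, not something taken from either publication; the helper name `revcomp` and the variable names are my own. It checks that the 2nd RT primer (in the orientation used here) is the reverse complement of the constant part of the plate ligation adaptor, and that each PCR primer is the Illumina P5 or P7 adapter followed by the matching Truseq read primer.

```python
# Sanity checks on the oligo sequences listed above (illustrative only).

def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

LIG_CONSTANT = "AGATCGGAAGAGCGTCGTGTAG"  # constant part of the ligation adaptor
SECOND_RT = "CTACACGACGCTCTTCCGATCT"
P5 = "AATGATACGGCGACCACCGAGATCTACAC"
P7 = "CAAGCAGAAGACGGCATACGAGAT"
READ1 = "TCTTTCCCTACACGACGCTCTTCCGATCT"
READ2 = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
P5_RD1_PCR = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P7_RD2_PCR = "CAAGCAGAAGACGGCATACGAGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"

checks = {
    "2nd RT primer is the reverse complement of the adaptor": revcomp(LIG_CONSTANT) == SECOND_RT,
    "P5_Rd1_PCR = P5 adapter + Read1 primer": P5 + READ1 == P5_RD1_PCR,
    "P7_Rd2_PCR = P7 adapter + Read2 primer": P7 + READ2 == P7_RD2_PCR,
    "Read1 primer ends with the 2nd RT primer": READ1.endswith(SECOND_RT),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

If the 2nd RT primer were written 3' -> 5', the first check would fail, which is one way to test the orientation question raised in note (1).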

Sequences of the barcode plate ligation adaptors:


For MARS-seq:
    lig_NNNX4_ix1 5'-/5Phos/ GACTNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix2 5'-/5Phos/ CATGNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix3 5'-/5Phos/ CCAANNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix4 5'-/5Phos/ CTGTNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix5 5'-/5Phos/ GTAGNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix6 5'-/5Phos/ TGATNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix7 5'-/5Phos/ ATCANNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_NNNX4_ix8 5'-/5Phos/ TAGANNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'

For MARS-seq2.0, there are 32 of them, and there are 5 Ns in each adaptor:
    lig_N5X4_ix1  5'-/5Phos/GACTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix2  5'-/5Phos/CATGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix3  5'-/5Phos/CCAANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix4  5'-/5Phos/CTGTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix5  5'-/5Phos/GTAGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix6  5'-/5Phos/TGATNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix7  5'-/5Phos/ATCANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix8  5'-/5Phos/TAGANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix9  5'-/5Phos/AAGTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix10 5'-/5Phos/GGCGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix11 5'-/5Phos/GTTTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix12 5'-/5Phos/GCGCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix13 5'-/5Phos/GAAANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix14 5'-/5Phos/TACCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix15 5'-/5Phos/CGGANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix16 5'-/5Phos/CCCTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix17 5'-/5Phos/TCAGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix18 5'-/5Phos/CTCGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix19 5'-/5Phos/CTACNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix20 5'-/5Phos/CTTANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix21 5'-/5Phos/TGGCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix22 5'-/5Phos/AGCTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix23 5'-/5Phos/CAGCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix24 5'-/5Phos/ACTTNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix25 5'-/5Phos/TCTANNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix26 5'-/5Phos/ACCGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix27 5'-/5Phos/ATGCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix28 5'-/5Phos/GATCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix29 5'-/5Phos/GGACNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix30 5'-/5Phos/GTCCNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix31 5'-/5Phos/CGAGNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
    lig_N5X4_ix32 5'-/5Phos/GCATNNNNNAGATCGGAAGAGCGTCGTGTAG /3SpC3/-3'
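
Only the fixed 4-bp part of each adaptor distinguishes one plate from another, so how many sequencing errors a demultiplexer can tolerate depends on how different those 4-bp indices are from each other. The sketch below computes the smallest pairwise Hamming distance for the indices copied from the lists above. The function names are my own and this is not part of the published pipeline.

```python
from itertools import combinations

# 4-bp plate indices copied from the lig_N5X4_ix[1-32] list above
# (the first eight are the same as lig_NNNX4_ix[1-8] of MARS-seq).
PLATE_INDICES = """
GACT CATG CCAA CTGT GTAG TGAT ATCA TAGA
AAGT GGCG GTTT GCGC GAAA TACC CGGA CCCT
TCAG CTCG CTAC CTTA TGGC AGCT CAGC ACTT
TCTA ACCG ATGC GATC GGAC GTCC CGAG GCAT
""".split()

def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(indices) -> int:
    """Smallest Hamming distance over all pairs of indices."""
    return min(hamming(a, b) for a, b in combinations(indices, 2))

print("MARS-seq    (ix1-8): ", min_pairwise_distance(PLATE_INDICES[:8]))
print("MARS-seq2.0 (ix1-32):", min_pairwise_distance(PLATE_INDICES))
```

A minimum distance of d means up to d - 1 substitutions in the index can be detected. Correcting them unambiguously needs a larger distance.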



Step-by-step library generation (the 5'-/acrydite/iSpPC/ is omitted for simplicity)

(1) Anneal 1st RT primer to mRNA and reverse transcription:


5'- XXXXXXXXXXXXXXXXXXXX(A)n
                 <-----N(T)20[4-bp UMI][6-bp cell barcode]TCTAGCCTTCTCGTGTGCAGCGGGGATATCACTCAGCATAATGGCCGGAGTTAGC -5'

(2) Pool all single cells, and RNaseH and DNA Pol I based second strand synthesis:


5'- XXXXXXXXXXXXXXXXXXXX(pA)[4-bp UMI][6-bp barcode]AGATCGGAAGAGCACACGTCGCCCCTATAGTGAGTCGTATTACCGGCCTCAATCG
    XXXXXXXXXXXXXXXXXXXX(dT)[4-bp UMI][6-bp barcode]TCTAGCCTTCTCGTGTGCAGCGGGGATATCACTCAGCATAATGGCCGGAGTTAGC -5'
                                                                         ↵
                                                                    IVT starts from here

(3) T7 in vitro transcription to amplify cDNA (resulting in single stranded RNA):


5'- GGCGACGUGUGCUCUUCCGAUCU[6-bp cell barcode][4-bp UMI](dU)XXXXXXXXXXXXXXXXXXXXX -3'

(4) Heat fragment the amplified RNA (aRNA), and perform ssDNA/RNA ligation (T4 RNA ligase I) with the plate-barcoded lig_NNNX4_ix[1-8] adaptors:


Due to the 3' block of the lig_NNNX4_ix[1-8], there is only one ligation possibility,
which is the 5' end of the lig_NNNX4_ix[1-8] ligating to 3' of aRNA:

3'- GATGTGCTGCGAGAAGGCTAGA[7-bp plate barcode]XXX...XXX(dU)[4-bp UMI][6-bp cell barcode]UCUAGCCUUCUCGUGUGCAGCGG -5'

(5) Add 2nd RT primer to reverse transcribe the aRNA:


5'- CTACACGACGCTCTTCCGATCT-------->
3'- GATGTGCTGCGAGAAGGCTAGA[7-bp plate barcode]XXX...XXX(dU)[4-bp UMI][6-bp cell barcode]UCUAGCCUUCUCGUGUGCAGCGG -5'

(6) Resulting first strand cDNA looks like this:


5'- CTACACGACGCTCTTCCGATCT[7-bp plate barcode]XXX...XXX(pA)[4-bp UMI][6-bp cell barcode]AGATCGGAAGAGCACACGTCGCC -3'

(7) Add P5_Rd1_PCR & P7_Rd2_PCR primers for library preparation and amplification:


5'- AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-------------->
                                    5'- CTACACGACGCTCTTCCGATCT[7-bp plate barcode]XXX...XXX(pA)[4-bp UMI][6-bp cell barcode]AGATCGGAAGAGCACACGTCGCC -3'
                                                                                                              <-------------TCTAGCCTTCTCGTGTGCAGACTTGAGGTCAGTGTAGAGCATACGGCAGAAGACGAAC -5'

(8) Final library structure (I'm not sure what the NNN between the partial Read 1 sequence and the 4-bp plate barcode is for):


5'- AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNXXX...XXX(pA)NNNNNNNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCTCGTATGCCGTCTTCTGCTTG -3'
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNXXX...XXX(dT)NNNNNNNNNNTCTAGCCTTCTCGTGTGCAGACTTGAGGTCAGTGTAGAGCATACGGCAGAAGACGAAC -5'
            Illumina P5              Illumina Truseq Read1      7bp     cDNA      4bp   6bp       Illumina Truseq Read2              Illumina P7
                                                               plate              UMI   cell
                                                              barcode                  barcode
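
Steps (3) to (8) can be followed in silico for a single molecule to see where each element ends up and in which orientation. The sketch below is only my reading of the diagrams above, written with T instead of U. The function `build_library`, the helper `revcomp` and all example sequences are made up for illustration. The `cdna_sense` argument stands for the piece of transcript that remains attached to the barcode after heat fragmentation.

```python
def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

ARNA_5P = "GGCGACGTGTGCTCTTCCGATCT"      # 5' end of the aRNA in step (3), as DNA
LIG_CONSTANT = "AGATCGGAAGAGCGTCGTGTAG"  # constant part of the ligation adaptor
SECOND_RT = "CTACACGACGCTCTTCCGATCT"
P5_RD1_PCR = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P7_RD2_PCR = "CAAGCAGAAGACGGCATACGAGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"

def build_library(cdna_sense, cell_barcode, umi, plate_index, random_n, poly_t=20):
    """Return the top strand of the final library for one molecule."""
    # Step (3): the aRNA is antisense to the mRNA.
    arna = ARNA_5P + cell_barcode + umi + "T" * poly_t + revcomp(cdna_sense)
    # Step (4): the adaptor (index, random Ns, constant part) joins the 3' end.
    ligated = arna + plate_index + random_n + LIG_CONSTANT
    # Steps (5)-(6): the 2nd RT primer copies the ligated aRNA.
    first_strand = revcomp(ligated)
    assert first_strand.startswith(SECOND_RT)
    # Step (7): the PCR primers overlap both ends and add P5/P7 with their tails.
    insert = first_strand[len(SECOND_RT):len(first_strand) - len(ARNA_5P)]
    return P5_RD1_PCR + insert + revcomp(P7_RD2_PCR)

# Toy example: every sequence below is invented.
library = build_library(
    cdna_sense="ACGTTGCAAGGCTA",
    cell_barcode="ACGTAC",
    umi="GGTA",
    plate_index="GACT",
    random_n="CAT",
)
print(library)
```

Under these assumptions, the top strand carries the random bases first and then the 4-bp plate index right after the Read 1 primer site, both as reverse complements of the adaptor as ordered. The UMI and cell barcode after the poly(A) are also reverse complements of the 1st RT primer.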



Library sequencing:

(1) Add Illumina Truseq Read1 sequencing primer to sequence the first read (bottom strand as template; the first 6-7 bp are the plate barcode, followed by the cDNA sequence):


                             5'- TCTTTCCCTACACGACGCTCTTCCGATCT----------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNXXX...XXX(dT)NNNNNNNNNNTCTAGCCTTCTCGTGTGCAGACTTGAGGTCAGTGTAGAGCATACGGCAGAAGACGAAC -5'
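
A Read 1 sequence can be split into its parts as sketched below. This is a guess based on the diagrams above, not the authors' pipeline. It assumes that Read 1 starts at the first base after the Read1 primer, that the random Ns come first, and that the 4-bp index appears as the reverse complement of the adaptor sequence as ordered. `parse_read1` and `Read1` are hypothetical names, and the example read is invented.

```python
from typing import NamedTuple

def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

class Read1(NamedTuple):
    random_n: str     # random bases, as they appear in the read
    plate_index: str  # 4-bp index, converted back to the adaptor orientation
    cdna: str         # remainder of the read

def parse_read1(read: str, n_random: int = 3, index_len: int = 4) -> Read1:
    """Split Read 1 into random bases, plate index and cDNA.

    Defaults follow the original MARS-seq adaptors (3 Ns + 4-bp index);
    use n_random=5 for the MARS-seq2.0 adaptors listed above.
    """
    prefix_len = n_random + index_len
    if len(read) < prefix_len:
        raise ValueError("read is shorter than the plate barcode region")
    index_as_read = read[n_random:prefix_len]
    return Read1(read[:n_random], revcomp(index_as_read), read[prefix_len:])

# Invented example read: ATG (random) + AGTC (index as read) + cDNA.
print(parse_read1("ATGAGTCACGTTGCAAGGCTA"))
```

The returned `plate_index` can then be looked up directly in the adaptor list (for example GACT is ix1).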

(2) Cluster regeneration, and add Illumina Truseq Read2 sequencing primer to sequence read 2 (top strand as template; this read contains the cell barcode and UMI, followed by some dT depending on the number of cycles):


5'- AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNXXX...XXX(pA)NNNNNNNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCTCGTATGCCGTCTTCTGCTTG -3'
                                                                             <--------------TCTAGCCTTCTCGTGTGCAGACTTGAGGTCAGTG -5'
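
Read 2 can be split in the same spirit. Going by the 1st RT primer layout, the read should start with the cell barcode, then the UMI, then dT. The sketch below assumes exactly that and uses the original MARS-seq lengths (6 + 4) as defaults; pass 7 and 8 for MARS-seq2.0. `parse_read2` and `umis_per_cell` are hypothetical names and the reads are invented. Real UMI counting is done per cell and per gene, which needs the mapped Read 1 as well, so this only shows the bookkeeping on the barcode side.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set

class Read2(NamedTuple):
    cell_barcode: str
    umi: str
    tail: str  # whatever follows the UMI, expected to be mostly T

def parse_read2(read: str, barcode_len: int = 6, umi_len: int = 4) -> Read2:
    """Split Read 2 into cell barcode, UMI and the remaining bases."""
    prefix_len = barcode_len + umi_len
    if len(read) < prefix_len:
        raise ValueError("read is shorter than barcode + UMI")
    return Read2(read[:barcode_len], read[barcode_len:prefix_len], read[prefix_len:])

def umis_per_cell(reads: Iterable[str], barcode_len: int = 6,
                  umi_len: int = 4) -> Dict[str, Set[str]]:
    """Collect the distinct UMIs seen for each cell barcode."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        parsed = parse_read2(read, barcode_len, umi_len)
        seen[parsed.cell_barcode].add(parsed.umi)
    return dict(seen)

# Invented example reads (cell barcode + UMI + dT).
reads = [
    "ACGTACGGTATTTTTTTT",
    "ACGTACGGTATTTTTTTT",  # same cell and UMI as above
    "ACGTACCCATTTTTTTTT",
    "TTGCAAGGTATTTTTT",
]
for barcode, umis in umis_per_cell(reads).items():
    print(barcode, len(umis))
```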