dscATAC-seq/dsciATAC-seq (SureCell ATAC-seq)

dscATAC-seq and dsciATAC-seq are droplet-based scATAC-seq methods. They were published in Nature Biotechnology on 24 June 2019 (Lareau et al., Nature Biotechnology 37, 916–924), and the kit for this method is the SureCell ATAC-seq Library Prep Kit.

For beginners who are new to droplet technology, it might not be clear what improvement dscATAC-seq has made, so I will briefly explain here. In droplet technologies, barcoded beads go into droplets, and the number of beads per droplet follows a Poisson distribution with a certain mean. Cells also go into droplets, and the number of cells per droplet follows a Poisson distribution with a different mean. These two loading processes are independent, so you get droplets with neither beads nor cells, droplets with beads but no cells, and droplets with cells but no beads. Only a small proportion of the droplets contain exactly one bead and one cell. That is why, in many droplet technologies, you load 100k cells but only get back 1-2k cells after sequencing.
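To make the loading argument concrete, here is a minimal sketch that computes the fraction of droplets in each occupancy class from two independent Poisson distributions. The function names and the loading means (0.1 beads and 0.1 cells per droplet) are assumptions chosen only for illustration; they are not values from the paper or the kit.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson distribution with this mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def droplet_fractions(bead_mean: float, cell_mean: float) -> dict:
    """Fraction of droplets per occupancy class, assuming independent loading."""
    p_no_bead = poisson_pmf(0, bead_mean)
    p_no_cell = poisson_pmf(0, cell_mean)
    return {
        "no_bead_no_cell": p_no_bead * p_no_cell,
        "bead_but_no_cell": (1 - p_no_bead) * p_no_cell,
        "cell_but_no_bead": p_no_bead * (1 - p_no_cell),
        "exactly_one_bead_one_cell": poisson_pmf(1, bead_mean) * poisson_pmf(1, cell_mean),
    }


# Hypothetical loading means, for illustration only.
fractions = droplet_fractions(bead_mean=0.1, cell_mean=0.1)
for name, value in fractions.items():
    print(f"{name}: {value:.4f}")
```

With low means for both beads and cells, almost all droplets are empty and the useful class is a small minority, which is the cell loss described above.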

In 2009, Abate et al. showed that if you increase the size of the beads to the point that they are close-packed, you can insert a controllable number of beads into every droplet. inDrop and 10x Genomics use this type of "big" close-packed hydrogel bead to solve the bead loading problem, though the hydrogel beads used by inDrop and 10x Genomics are different. However, this cannot be achieved with small beads, such as those used in Drop-seq, Bio-Rad SureCell and many other home-made droplet devices.

What Lareau et al. did was to load many more beads than usual (superloading), which significantly reduces the number of drops without beads. However, this also increases the number of drops with more than one bead. Here is the clever bit: if more than one bead ends up in the same drop, those beads share the same set of genomic fragments as the starting material for amplification in that drop. Therefore, beads from the same drop should have a much higher overlap of Tn5 insertion positions with each other than beads from different drops (see Supplementary Fig. 1 and 2 in the paper), and one can figure out whether beads came from the same drop by checking the extent of read overlap between different bead barcodes. If the cells are loaded at a normal concentration, most drops still contain at most one cell, as usual.
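The overlap idea can be sketched as follows. This is not the algorithm used by the authors' software; it is a simplified stand-in that scores each pair of bead barcodes with a Jaccard index over their fragment coordinates and groups barcodes above a threshold with a small union-find. The function names, the toy fragments and the threshold of 0.3 are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets of fragment coordinates."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_beads(fragments_by_barcode: dict, threshold: float) -> list:
    """Group bead barcodes whose pairwise fragment overlap exceeds the threshold."""
    parent = {bc: bc for bc in fragments_by_barcode}

    def find(bc: str) -> str:
        # Follow parent links to the group representative (with path halving).
        while parent[bc] != bc:
            parent[bc] = parent[parent[bc]]
            bc = parent[bc]
        return bc

    for bc1, bc2 in combinations(fragments_by_barcode, 2):
        if jaccard(fragments_by_barcode[bc1], fragments_by_barcode[bc2]) > threshold:
            parent[find(bc1)] = find(bc2)

    groups = {}
    for bc in fragments_by_barcode:
        groups.setdefault(find(bc), []).append(bc)
    return sorted(sorted(group) for group in groups.values())


# Toy fragments as (chromosome, start, end); BC_A and BC_B share two fragments.
toy_fragments = {
    "BC_A": {("chr1", 100, 250), ("chr1", 900, 1100), ("chr2", 50, 300)},
    "BC_B": {("chr1", 100, 250), ("chr1", 900, 1100), ("chr3", 10, 200)},
    "BC_C": {("chr5", 400, 650), ("chr7", 20, 180)},
}
droplets = merge_beads(toy_fragments, threshold=0.3)
print(droplets)
```

Each returned group stands for one inferred drop; on real data the threshold would have to be chosen from the observed distribution of pairwise overlaps rather than fixed in advance.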

dsciATAC-seq goes a step further. It builds on dscATAC-seq but borrows the combinatorial indexing strategy used in sci-ATAC-seq: cells are treated with barcoded Tn5 in different reactions, and all the reactions are pooled together before loading onto the machine to make the emulsion. In this case, both beads and cells are overloaded. One can use the aforementioned strategy to figure out whether beads are from the same drop and, on top of that, use the Tn5 barcodes to figure out whether reads are from the same single cell. In this way, the throughput becomes really high.
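As a rough illustration of the two-level identity in dsciATAC-seq, the sketch below treats a cell as the pair (drop, Tn5 barcode). It assumes that bead barcodes have already been assigned to drops by some upstream step; the mapping, the barcode strings and the function name are hypothetical.

```python
from collections import Counter


def count_reads_per_cell(reads, bead_to_drop):
    """Count reads per cell, where a cell is a (drop, Tn5 barcode) pair."""
    counts = Counter()
    for bead_barcode, tn5_barcode in reads:
        drop = bead_to_drop[bead_barcode]
        counts[(drop, tn5_barcode)] += 1
    return counts


# Hypothetical result of bead merging: BC_A and BC_B were in the same drop.
bead_to_drop = {"BC_A": "drop1", "BC_B": "drop1", "BC_C": "drop2"}

# Each read carries a bead barcode and a Tn5 barcode (toy values).
reads = [
    ("BC_A", "TN5_01"),
    ("BC_B", "TN5_01"),  # same drop, same Tn5 barcode: same cell
    ("BC_B", "TN5_02"),  # same drop, different Tn5 barcode: a second cell
    ("BC_C", "TN5_01"),  # different drop: a third cell
]
cells = count_reads_per_cell(reads, bead_to_drop)
print(cells)
```

Two cells that land in the same drop remain separable as long as they were tagged in different Tn5 reactions, which is why the cells can be overloaded as well.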



Adapter and primer sequences:

Huge thanks to @caleblareau for sharing the following information and making this page possible.



ATAC-v2.1 beads-p5-bc-nextera-read1: |--5'- TTTTTTTUUUTTTTTAATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC[7-bp barcode1][0-4bp Phase Block]TATGCATGAC[7-bp barcode2]AGTCACTGAG[7-bp barcode3]TCGTCGGCAGCGTC -3'

* The bead barcode in this protocol is the combination of barcode1 + barcode2 + barcode3, 7 bp each and 21 bp in total. The full oligos are generated in a split-pool manner. The Phase Block is there to make sure each sequencing cycle has decent base complexity. The full sequences (including the version history) can be found in this Excel file in the bap GitHub repository.
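Given the oligo layout above, the 21-bp bead barcode can be pulled out of the start of Read 1 by anchoring on the two constant linkers. The following is a minimal sketch that only accepts exact linker matches; real pipelines usually tolerate mismatches and check barcodes against a whitelist, neither of which is done here. The function name and the example read are made up for illustration.

```python
import re

# Layout at the start of Read 1, taken from the bead oligo above:
# [barcode1, 7 bp][Phase Block, 0-4 bp]TATGCATGAC[barcode2, 7 bp]AGTCACTGAG[barcode3, 7 bp]TCGTCGGCAGCGTC
BEAD_PATTERN = re.compile(
    r"^([ACGTN]{7})([ACGTN]{0,4})TATGCATGAC([ACGTN]{7})AGTCACTGAG([ACGTN]{7})TCGTCGGCAGCGTC"
)


def parse_bead_barcode(read1: str):
    """Return (21-bp bead barcode, Phase Block length, offset after s5), or None."""
    match = BEAD_PATTERN.match(read1)
    if match is None:
        return None
    bc1, phase_block, bc2, bc3 = match.groups()
    return bc1 + bc2 + bc3, len(phase_block), match.end()


# Made-up read: barcodes ACGTACG / TTGACCA / CGATCGA with a 2-bp Phase Block.
example_read = (
    "ACGTACG" + "GT" + "TATGCATGAC" + "TTGACCA" + "AGTCACTGAG" + "CGATCGA"
    + "TCGTCGGCAGCGTC" + "AGATGTGTATAAGAGACAG" + "GGGGCCCC"
)
parsed = parse_bead_barcode(example_read)
print(parsed)
```

The returned offset marks where s5 ends, so the ME (and the Tn5 barcode in dsciATAC-seq) and then the gDNA insert follow from that position.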

Nextera Tn5 binding site (19-bp Mosaic End (ME)): 5'- AGATGTGTATAAGAGACAG -3'

Nextera S5xx primer entry point (s5): 5'- TCGTCGGCAGCGTC -3'

Nextera N7xx primer entry point (s7): 5'- GTCTCGTGGGCTCGG -3'

SureCell ddSEQ Sample Index (12009360): 5'- CAAGCAGAAGACGGCATACGAGAT[8-bp sample index]GTCTCGTGGGCTCGG -3'

Read 1 sequencing primer: 5'- GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC -3'

Read 2 sequencing primer: 5'- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG -3'

Sample Index sequencing primer: 5'- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -3'

Illumina P5 adapter: 5'- AATGATACGGCGACCACCGAGATCTACAC -3'

Illumina P7 adapter: 5'- CAAGCAGAAGACGGCATACGAGAT -3'
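The sequences listed above are related to each other, and a short script can check those relationships. The sketch below only uses the sequences given on this page; the helper names and the example 8-bp index "ACGTACGT" are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


ME = "AGATGTGTATAAGAGACAG"
S5 = "TCGTCGGCAGCGTC"
S7 = "GTCTCGTGGGCTCGG"
P5 = "AATGATACGGCGACCACCGAGATCTACAC"
P7 = "CAAGCAGAAGACGGCATACGAGAT"
READ1_PRIMER = "GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC"
READ2_PRIMER = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
INDEX_PRIMER = "CTGTCTCTTATACACATCTCCGAGCCCACGAGAC"
# Constant part of the bead oligo between the poly-T/U stretch and barcode1.
BEAD_OLIGO_CONSTANT = "AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC"


def sample_index_oligo(index: str) -> str:
    """SureCell ddSEQ Sample Index oligo layout: P7 + 8-bp index + s7."""
    if len(index) != 8:
        raise ValueError("the sample index must be 8 bp long")
    return P7 + index + S7


checks = {
    "Read 2 primer is s7 + ME": READ2_PRIMER == S7 + ME,
    "Index primer is the reverse complement of the Read 2 primer":
        INDEX_PRIMER == reverse_complement(READ2_PRIMER),
    "Bead oligo carries P5 followed by the Read 1 primer site":
        BEAD_OLIGO_CONSTANT == P5 + READ1_PRIMER,
}
for description, passed in checks.items():
    print(f"{description}: {passed}")

print(sample_index_oligo("ACGTACGT"))  # hypothetical index sequence
```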



Step-by-step library generation (only dscATAC-seq is shown here; dsciATAC-seq is the same, but with barcoded Tn5):

(1) Bulk Tn5 tagging by incubation of nuclei and Tn5:

[Figure: Tn5 dimer]

(2) There are 3 different products after step (1) (Tn5 insertion will create a 9-bp gap):


Product 1 (s5 at both ends, not amplifiable due to semi-suppressive PCR):

5'- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXXXXXXXXXXX...XXX         CTGTCTCTTATACACATCT
                  TCTACACATATTCTCTGTC         XXX...XXXXXXXXXXXXGACAGAGAATATGTGTAGACTGCGACGGCTGCT -5'


Product 2 (s7 at both ends, not amplifiable due to semi-suppressive PCR):

5'- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGXXXXXXXXXXXX...XXX         CTGTCTCTTATACACATCT
                   TCTACACATATTCTCTGTC         XXX...XXXXXXXXXXXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'


Product 3 (different ends, amplifiable):

5'- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXXXXXXXXXXX...XXX         CTGTCTCTTATACACATCT
                  TCTACACATATTCTCTGTC         XXX...XXXXXXXXXXXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'

(3) Add SureCell Sample Index, capture in droplets, treat with NEB USER Enzyme to cut at U and release the oligos from the beads into the drops (the first step of Barcoding and Amplification of Fragments, 37 °C for 30 min), fill in the gap (the third step, 72 °C for 5 min), and amplify. This step adds the bead barcodes and the sample index:


|--5'- TTTTTTT   TTTTTAATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC[7-bp barcode1][0-4bp Phase Block]TATGCATGAC[7-bp barcode2]AGTCACTGAG[7-bp barcode3]TCGTCGGCAGCGTC------------------>
                                                                                                                                                                        5'- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXXXXXXXXXXX...XXXXXXXXXXXXCTGTCTCTTATACACATCTCCGAGCCCACGAGAC
                                                                                                                                                                            AGCAGCCGTCGCAGTCTACACATATTCTCTGTCXXXXXXXXXXXX...XXXXXXXXXXXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'
                                                                                                                                                                                                                                        <------------------GGCTCGGGTGCTCTG[8-bp sample index]TAGAGCATACGGCAGAAGACGAAC -5'

(4) DNA purification. This is the product from above:


5'- TTTTTAATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC[7-bp barcode1][0-4bp Phase Block]TATGCATGAC[7-bp barcode2]AGTCACTGAG[7-bp barcode3]TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGAC[8-bp sample index]ATCTCGTATGCCGTCTTCTGCTTG -3'
3'- AAAAATTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATG[7-bp barcode1][0-4bp Phase Block]ATACGTACTG[7-bp barcode2]TCAGTGACTC[7-bp barcode3]AGCAGCCGTCGCAGTCTACACATATTCTCTGTCXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG[8-bp sample index]TAGAGCATACGGCAGAAGACGAAC -5'

(5) Second amplification using ATAC Primer Mix (I assume this is the mix of Illumina P5 and P7 primers):


     5'- AATGATACGGCGACCACCGAGATCTACAC------------------->
5'- TTTTTAATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC[7-bp barcode1][0-4bp Phase Block]TATGCATGAC[7-bp barcode2]AGTCACTGAG[7-bp barcode3]TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGAC[8-bp sample index]ATCTCGTATGCCGTCTTCTGCTTG -3'
3'- AAAAATTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATG[7-bp barcode1][0-4bp Phase Block]ATACGTACTG[7-bp barcode2]TCAGTGACTC[7-bp barcode3]AGCAGCCGTCGCAGTCTACACATATTCTCTGTCXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG[8-bp sample index]TAGAGCATACGGCAGAAGACGAAC -5'
                                                                                                                                                                                                                                          <-------------------TAGAGCATACGGCAGAAGACGAAC -5'

(6) Final library structure:

dscATAC-seq library:


5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTACNNNNNNNNNNNTATGCATGACNNNNNNNAGTCACTGAGNNNNNNNTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG
    TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGNNNNNNNNNNNATACGTACTGNNNNNNNTCAGTGACTCNNNNNNNAGCAGCCGTCGCAGTCTACACATATTCTCTGTCXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'
            Illumina P5                    Read 1 sequence           barcode1 Phase        barcode2        barcode3      s5               ME          gDNA          ME                s7      8-bp sample      Illumina P7
                                                                              Block                                                                                                             index

dsciATAC-seq library:


5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTACNNNNNNNNNNNTATGCATGACNNNNNNNAGTCACTGAGNNNNNNNTCGTCGGCAGCGTCNNNNNNAGATGTGTATAAGAGACAGXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG
    TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGNNNNNNNNNNNATACGTACTGNNNNNNNTCAGTGACTCNNNNNNNAGCAGCCGTCGCAGNNNNNNTCTACACATATTCTCTGTCXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'
            Illumina P5                    Read 1 sequence           barcode1 Phase        barcode2        barcode3      s5        Tn5          ME          gDNA          ME                s7      8-bp sample      Illumina P7
                                                                              Block                                              barcode                                                               index
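The two library diagrams above can be reproduced by concatenating the individual parts. Below is a minimal sketch that builds the top strand (5' to 3') for either library type; the function name, the toy barcodes and the toy insert are hypothetical. The `index_as_read` argument is the 8-bp sample index as it appears on the top strand, which is the reverse complement of the index as written in the sample index oligo.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


P5 = "AATGATACGGCGACCACCGAGATCTACAC"
READ1_SITE = "GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC"
LINKER1 = "TATGCATGAC"
LINKER2 = "AGTCACTGAG"
S5 = "TCGTCGGCAGCGTC"
ME = "AGATGTGTATAAGAGACAG"
S7 = "GTCTCGTGGGCTCGG"
P7 = "CAAGCAGAAGACGGCATACGAGAT"


def build_top_strand(bc1, phase_block, bc2, bc3, insert, index_as_read, tn5_barcode=""):
    """Concatenate the library parts in the order shown in the diagrams above."""
    left = (P5 + READ1_SITE + bc1 + phase_block + LINKER1 + bc2 + LINKER2 + bc3
            + S5 + tn5_barcode + ME)
    right = (reverse_complement(ME) + reverse_complement(S7) + index_as_read
             + reverse_complement(P7))
    return left + insert + right


# Toy values, for illustration only.
toy = dict(bc1="ACGTACG", phase_block="GT", bc2="TTGACCA", bc3="CGATCGA",
           insert="GATTACAGATTACA", index_as_read="ACGTACGT")
dsc_library = build_top_strand(**toy)
dsci_library = build_top_strand(**toy, tn5_barcode="NNNNNN")
print(dsc_library)
print(dsci_library)
```

The only difference between the two outputs is the extra Tn5 barcode between s5 and the ME in the dsciATAC-seq library.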



Library sequencing:

(1) Add Read 1 sequencing primer to sequence the first read (118 cycles, bottom strand as template). The first 74 - 84 bases cover barcode1, the Phase Block, barcode2 and barcode3 with their linkers, s5, the Tn5 barcode if using dsciATAC-seq, and the ME. After that, it will be the gDNA insert:


                             5'- GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC-------------------------------------------------------------------------------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGNNNNNNNNNNNATACGTACTGNNNNNNNTCAGTGACTCNNNNNNNAGCAGCCGTCGCAGTCTACACATATTCTCTGTCXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'
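The range of 74 - 84 bases can be derived from the lengths of the parts that precede the gDNA insert in Read 1. The sketch below does that arithmetic; the 6-bp Tn5 barcode length is read off the NNNNNN in the dsciATAC-seq library diagram, and the function name is hypothetical.

```python
BARCODE_LEN = 7                 # barcode1, barcode2 and barcode3 are 7 bp each
PHASE_BLOCK_LENGTHS = range(0, 5)  # the Phase Block is 0-4 bp
LINKER1 = "TATGCATGAC"
LINKER2 = "AGTCACTGAG"
S5 = "TCGTCGGCAGCGTC"
ME = "AGATGTGTATAAGAGACAG"
TN5_BARCODE_LEN = 6             # assumption: length of NNNNNN in the dsciATAC-seq diagram
READ1_CYCLES = 118


def bases_before_insert(phase_block_len: int, tn5_barcode_len: int = 0) -> int:
    """Number of Read 1 bases sequenced before the gDNA insert starts."""
    return (3 * BARCODE_LEN + phase_block_len + len(LINKER1) + len(LINKER2)
            + len(S5) + tn5_barcode_len + len(ME))


dsc_offsets = [bases_before_insert(p) for p in PHASE_BLOCK_LENGTHS]
dsci_offsets = [bases_before_insert(p, TN5_BARCODE_LEN) for p in PHASE_BLOCK_LENGTHS]

print("dscATAC-seq offsets:", dsc_offsets)
print("dsciATAC-seq offsets:", dsci_offsets)
print("gDNA bases left in Read 1:",
      READ1_CYCLES - max(dsci_offsets), "to", READ1_CYCLES - min(dsc_offsets))
```

Under these assumptions, dscATAC-seq reads reach the insert after 74 to 78 bases and dsciATAC-seq reads after 80 to 84 bases, depending on the Phase Block length.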

(2) Add Index sequencing primer to sequence sample index (bottom strand as template):


                                                                                                                                                         5'- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGNNNNNNNNNNNATACGTACTGNNNNNNNTCAGTGACTCNNNNNNNAGCAGCCGTCGCAGTCTACACATATTCTCTGTCXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'

(3) Cluster regeneration, then add Read 2 sequencing primer to sequence the second read (40 cycles, top strand as template; these are gDNA reads):


5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTACNNNNNNNNNNNTATGCATGACNNNNNNNAGTCACTGAGNNNNNNNTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG
                                                                                                                                                    <--------GACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'