PIP-seq

The PIP-seq method was published in Nature Biotechnology on 6 March 2023: Clark et al. Nat Biotechnol. 2023 [Clark2023]. It performs high-throughput droplet-based single-cell RNA-seq without a microfluidic device (well ... you still need one to generate the hydrogel beads). Mix your cells, RT reagents, barcoded hydrogel beads and oil in a tube, then simply vortex. That is all it takes to generate the droplets and encapsulate the cells. It is for real!

Before you look at the library construction procedures, there are a few technical notes you need to know. In short, there are several versions of PIP-seq, and they are quite different!

NOTE 1: In the Clark2023 paper, the authors indicated that the barcode design is based on a previous study published in Scientific Reports: Modular barcode beads for microfluidic single cell genomics [Delley2021]. However, they changed the oligo design in PIP-seq. If you want to see how the method in the Delley2021 paper works, go to this page.

NOTE 2: If you look into the FastQ files in the Clark2023 paper, you will realise that there are at least three different library structures. The first type, which I call v1prototype, was used in the experiments looking at the single-cell transcriptional responses of two cancer cell lines (H1975 and PC9) to gefitinib, e.g. SRR19086115, SRR19086119 and a few additional samples. This library type is probably obsolete now and no longer used. Its linker sequence and barcode configuration resemble those in inDrop V2. The library structure is very different from the latest PIP-seq libraries, so I put it on a separate page for the sake of record keeping. You can visit this page to have a look.

NOTE 3: The second type, which I call PIP-seq V2 here, was used in the species mixing experiment (for example, SRR19180490) in the Clark2023 paper. The cell barcodes consist of three rounds of split-pool barcodes. It is probably obsolete as well, but it is similar to the latest PIP-seq version, so I show it here on this page.

NOTE 4: The third type, called FluentBio PIPseq™ V3.0, was used in many samples in the Clark2023 paper (for example, SRR19184609, SRR21853664 and many others). FluentBio is the company that provides the commercial version of the PIP-seq method. FluentBio PIPseq™ V3.0 was the main library structure back in 2022. It is shown on this page and should be your starting point if you want to analyse PIP-seq data yourself.

NOTE 5: In early 2023, the company announced FluentBio PIPseq™ V4.0. There were no example data at the time of writing (2023 May 27). You can check the news on their website. When example data are available, I will update here. Update (2023-July-02): the V4 FASTQ files are available on the website, and the whitelists are exactly the same as V3. The library structure is also extremely similar to V3, except that there are 0 - 3 extra bases at the beginning of Read 1 in V4 to desynchronise the sequencing cycles. See this Twitter thread for more details.

NOTE 6: About the name: PIP-seq or PIPseq™? Well ... PIP-seq is from the Clark2023 paper, and PIPseq™ is the commercial product.


PIP-seq V2 / FluentBio PIPseq™ V3.0 & V4.0


PIP-seq V2

Adapter and primer sequences:

* The oligo sequences presented here are based on Read 1 FastQ from SRR19180490 and the method section of the Clark2023 paper, so the final sequences and library structures should be correct (the exact procedures to produce the beads and library might not be). The cell barcodes are the combination of three rounds of split-pool ligations. Plain files for the barcodes in each round can be downloaded as follows:

PIP-seq V2 Round 1 Barcodes (96)

PIP-seq V2 Round 2 Barcodes (96)

PIP-seq V2 Round 3 Barcodes (96)
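Since the cell barcode is a combination of one barcode from each round, the full whitelist is simply the product of the three lists, i.e. 96 x 96 x 96 = 884,736 possible cell barcodes. Here is a minimal Python sketch of that idea. The toy barcodes below are made up for illustration (they are not taken from the real lists), and `build_whitelist` is just a hypothetical helper name:

```python
from itertools import product

def build_whitelist(round1, round2, round3):
    """Return every barcode1+barcode2+barcode3 concatenation (linkers excluded)."""
    return ["".join(combo) for combo in product(round1, round2, round3)]

# Toy 8-bp barcodes for illustration only; the real lists have 96 entries each.
r1 = ["AAACCCGG", "TTTGGGCC"]
r2 = ["ACGTACGT", "TGCATGCA"]
r3 = ["GGGGAAAA", "CCCCTTTT"]

whitelist = build_whitelist(r1, r2, r3)
print(len(whitelist))  # 2 * 2 * 2 combinations for the toy lists
print(96 ** 3)         # number of combinations with the full 96-entry lists
```

With the real files you would read one barcode per line into each list before calling the function.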

Sequence used for barcoded bead generation (before the PIP-seq experiment):

pBB1: 5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCC -3'

pBB2: 5'- /5Phos/AGATCGGAAGAGCGTCGTGTAGGGAAAGAGGAGTCGTACTCTGCGTTGATACCACTGCTT -3'

plate-1-BC: 5'- /5Phos/GATCT[8-bp barcode1]ATGCATC -3'

plate-1-SP: 5'- /5Phos/[8-bp barcode1 rc] -3'

plate-2-BC: 5'- /5Phos/CTCGAGG[8-bp barcode2 rc]GATGCAT -3'

plate-2-SP: 5'- /5Phos/[8-bp barcode2] -3'

plate-3-BC: 5'- /5Phos/CCTCGAG[8-bp barcode3][12-bp UMI]TTTTTTTTTTTTTTTTTTTV -3'

plate-3-SP: 5'- /5Phos/[8-bp barcode3 rc] -3'
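The BC/SP oligos of each round are related by reverse complementation, so they can be derived from the barcode alone. The following is a rough sketch, assuming the oligo layouts listed above and ignoring the 5' modifications. The helper names (`revcomp`, `round_oligos`) and the example barcodes are my own and purely illustrative:

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq):
    """Reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(COMPLEMENT)[::-1]

def round_oligos(bc1, bc2, bc3):
    """Return the six oligos (5'->3', modifications omitted) for one well per round."""
    return {
        "plate-1-BC": "GATCT" + bc1 + "ATGCATC",
        "plate-1-SP": revcomp(bc1),
        "plate-2-BC": "CTCGAGG" + revcomp(bc2) + "GATGCAT",
        "plate-2-SP": bc2,
        # 12-bp UMI written as N; the dT stretch is copied from the list above
        "plate-3-BC": "CCTCGAG" + bc3 + "N" * 12 + "TTTTTTTTTTTTTTTTTTTV",
        "plate-3-SP": revcomp(bc3),
    }

# Hypothetical barcodes, not taken from the real plates
oligos = round_oligos("AAACCCGG", "ACGTTGCA", "GGGGAAAA")
for name, seq in oligos.items():
    print(name, seq)
```

Note that the linkers of plate-2-BC (CTCGAGG and GATGCAT) are the reverse complements of CCTCGAG and ATGCATC, which is what lets the three rounds ligate in order.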

Sequence used during the PIP-seq experiment:

Barcoded beads-oligo: |--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI]TTTTTTTTTTTTTTTTTTTV -3'

PIPS_TSO: 5'- AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG -3'

PIPS_WTA_primer: 5'- AAGCAGTGGTATCAACGCAGAGT -3'

PIPs_P5library: 5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGT*A*C -3'

Nextera N7xx: 5'- CAAGCAGAAGACGGCATACGAGAT[8-bp i7 index]GTCTCGTGGGCTCGG -3'

TruSeq Read 1: 5'- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3'

Index 1 sequencing primer (i7): 5'- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -3'

Nextera Read 2: 5'- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG -3'

Illumina P5 adapter: 5'- AATGATACGGCGACCACCGAGATCTACAC -3'

Illumina P7 adapter: 5'- CAAGCAGAAGACGGCATACGAGAT -3'


Step-by-step generation of barcoded beads:

(1) Anneal plate-1-BC with plate-1-SP:


5'- GATCT[8-bp barcode1]ATGCATC - 3'
     3'- [8-bp barcode1] -5'

(2) Anneal plate-2-BC with plate-2-SP:


       5'- [8-bp barcode2] -3'
3'- TACGTAG[8-bp barcode2]GGAGCTC -5'

(3) Anneal plate-3-BC with plate-3-SP:


5'- CCTCGAG[8-bp barcode3][12-bp UMI]TTTTTTTTTTTTTTTTTTTV -3'
       3'- [8-bp barcode3] -5'

(4) Form dissolvable acrylamide gel beads together with pBB1 and anneal pBB2 to the bead oligo:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCC     -3'
                  3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA -5'

(5) Split the gel beads into wells and add annealed plate-1 for barcode1 ligation:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC -3'
                  3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1] -5'

(6) Pool and redistribute to a new 96-well plate and add annealed plate-2 for barcode2 ligation:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2] -3'
                  3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TACGTAG[8-bp barcode2]GGAGCTC -5'

(7) Pool and redistribute to a new 96-well plate and add annealed plate-3 for barcode3 ligation:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI]TTTTTTTTTTTTTTTTTTTV -3'
                  3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TACGTAG[8-bp barcode2]GGAGCTC[8-bp barcode3] -5'

(8) Denature with NaOH and remove the bottom strand. The bead oligos are now ready to use for experiments:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI]TTTTTTTTTTTTTTTTTTTV -3'
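To double check the ligation steps above, the top strand of the final bead oligo can be put together piece by piece. This is only a sketch under the assumption that the diagrams above are right: pBB1 is extended by plate-1-BC, then by plate-2-SP, then by plate-3-BC. The function name and the example barcodes are hypothetical:

```python
PBB1 = "TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCC"
DT_V = "TTTTTTTTTTTTTTTTTTTV"  # oligo-dT plus anchor base, copied from plate-3-BC

def bead_oligo(bc1, bc2, bc3, umi="N" * 12):
    """Assemble the top strand of the bead oligo (5'->3', acrydite omitted)."""
    top = PBB1
    top += "GATCT" + bc1 + "ATGCATC"       # round 1: plate-1-BC
    top += bc2                             # round 2: plate-2-SP
    top += "CCTCGAG" + bc3 + umi + DT_V    # round 3: plate-3-BC
    return top

# Hypothetical barcodes for illustration
oligo = bead_oligo("AAACCCGG", "ACGTTGCA", "GGGGAAAA")
print(oligo)
```

The junction between pBB1 and plate-1-BC recreates the 3' part of the TruSeq Read 1 site (...CTCTTCCGATCT), which is why Read 1 starts right at barcode1.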


Step-by-step library generation

(1) Cell encapsulation by vortexing, cell lysis by heat, mRNA capture, then add RT reagent for reverse transcription:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI]TTTTTTTTTTTTTTTTTTTV---->
                                                                                                                                                           AAAAAAA...AAAAAABXXX...XXX -5'

(2) The terminal transferase activity of MMLV adds extra Cs:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXXCCC
                                                                                                                                                        (pA)BXXX...XXX -5'

(3) The TSO is already in the RT reagent; it anneals to the extra Cs and serves as the template for further extension:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXXCCC----->
                                                                                                                                                        (pA)BXXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA -5'

(4) This is the first-strand cDNA after reverse transcription:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'

(5) Without purification, immediately add PIPS_WTA_primer for single-primer semi-suppressive PCR:


                  5'- AAGCAGTGGTATCAACGCAGAGT------------------>
|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'
                                                                                                                                                        <--------------------TGAGACGCAACTATGGTGACGAA -5'

(6) Purify amplified double-stranded cDNA:


5'- AAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'
3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TACGTAG[8-bp barcode2]GGAGCTC[8-bp barcode3][12-bp UMI](pA)BXXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA -5'

(7) Use the Illumina Nextera XT kit for cDNA fragmentation:

Tn5 dimer

 Product 1 (left end of cDNA + Nextera s7, the only amplifiable fragment):

5'- AAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXX         CTGTCTCTTATACACATCT -3'
3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TACGTAG[8-bp barcode2]GGAGCTC[8-bp barcode3][12-bp UMI](pA)BXXX...XXXXXXXXXXXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'


 Product 2 (right end of cDNA + Nextera s7, not amplifiable because the PIPs_P5library primer ends with AC):

5'- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGXXXXXXXXXXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'
                   TCTACACATATTCTCTGTC         XXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA -5'


 Products 3 - 7 (omitted, none of them are amplifiable due to the primers used in the next round):

    Left end of cDNA + Nextera s5
    Right end of cDNA + Nextera s5
    Nextera s5 or s7 + middle part of the cDNA + Nextera s5 or s7

(8) Library Amplification using PIPs_P5library and Nextera N7xx primers:


5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC------------------>
                                         5'- AAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATGCATC[8-bp barcode2]CCTCGAG[8-bp barcode3][12-bp UMI](dT)VXXX...XXX         CTGTCTCTTATACACATCT -3'
                                         3'- TTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TACGTAG[8-bp barcode2]GGAGCTC[8-bp barcode3][12-bp UMI](pA)BXXX...XXXXXXXXXXXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'
                                                                                                                                                                                                          <--------------GGCTCGGGTGCTCTG[i7]TAGAGCATACGGCAGAAGACGAAC -5'

(9) Final library structure:


5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNATGCATCNNNNNNNNCCTCGAGNNNNNNNNNNNNNNNNNNNN(dT)VXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -3'
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNNTACGTAGNNNNNNNNGGAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'
             Illumina P5                              TSO                          TruSeq Read 1           8-bp           8-bp           8-bp      12-bp          cDNA           ME               s7         8-bp        Illumina P7
                                                                                                        barcode1        barcode2       barcode3     UMI                                                   sample index
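Why is only the bead-oligo end of the cDNA amplified by PIPs_P5library in steps (7) and (8)? Both ends of the cDNA carry the AAGCAGTGGTATCAACGCAGAGT site, but the bead-oligo end continues with AC while the TSO end continues with GAAT, so the last two bases of the primer only match one end. The sketch below illustrates this with a simple exact-match check at the primer 3' end. It is a simplification of my own (real priming is not all-or-nothing), and the names are hypothetical:

```python
WTA_SITE = "AAGCAGTGGTATCAACGCAGAGT"   # shared by the bead oligo and the TSO
BEAD_END = WTA_SITE + "ACGACTCC"       # 5' end of the strand on the bead-oligo side
TSO_END = WTA_SITE + "GAATGGG"         # 5' end of the strand on the TSO side
P5_PRIMER_3P = WTA_SITE + "AC"         # 3' portion of PIPs_P5library

def primes(primer_3p, strand_5p_end):
    """True if the primer matches the 5' end of the strand it is meant to copy.

    A primer is identical to the 5' end of the strand it regenerates,
    so an exact prefix match stands in for perfect annealing here.
    """
    return strand_5p_end.startswith(primer_3p)

print(primes(P5_PRIMER_3P, BEAD_END))  # bead-oligo side
print(primes(P5_PRIMER_3P, TSO_END))   # TSO side
print(primes(WTA_SITE, TSO_END))       # PIPS_WTA_primer matches both ends
```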


Library sequencing using Illumina primers

(1) Add TruSeq Read 1 sequencing primer to sequence the first read (bottom strand as template, cell barcodes and UMI, at least 50 cycles):


                                                                    5'- ACAC
                                                                            TCTTTCCCTACACGACGCTCTTCCGATCT------------------------------------------------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNNTACGTAGNNNNNNNNGGAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'

(2) Add Index 1 sequencing primer to sequence the sample index at the i7 side (bottom strand as template, 8 cycles):


                                                                                                                                                                     5'- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGCGGACAGGCGCCTTCGTCACCATAGTTGCGTCTCATGCTGAGGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNNTACGTAGNNNNNNNNGGAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'

(3) Cluster regeneration, add Nextera Read 2 primer to sequence the second read (top strand as template, >67 cycles, these are cDNA reads):


5'- AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNATGCATCNNNNNNNNCCTCGAGNNNNNNNNNNNNNNNNNNNN(dT)VXXX...XXXCTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -3'
                                                                                                                                                                  <------GACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'
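Based on the structure above, the first 50 bases of Read 1 are 8-bp barcode1 + ATGCATC + 8-bp barcode2 + CCTCGAG + 8-bp barcode3 + 12-bp UMI (8 + 7 + 8 + 7 + 8 + 12 = 50). A minimal sketch of pulling these out of a Read 1 sequence could look like the following. The function name is made up, and it requires exact linker matches, which is stricter than what a real pipeline would probably do:

```python
LINKER1, LINKER2 = "ATGCATC", "CCTCGAG"

def parse_v2_read1(seq):
    """Return (cell_barcode, umi) from a PIP-seq V2 Read 1, or None if it does not fit."""
    if len(seq) < 50:
        return None
    bc1, l1 = seq[0:8], seq[8:15]
    bc2, l2 = seq[15:23], seq[23:30]
    bc3, umi = seq[30:38], seq[38:50]
    if l1 != LINKER1 or l2 != LINKER2:
        return None  # linkers not where expected (no mismatch tolerance here)
    return bc1 + bc2 + bc3, umi

# Hypothetical read: toy barcodes and UMI followed by the start of the dT stretch
read1 = "AAACCCGG" + "ATGCATC" + "ACGTTGCA" + "CCTCGAG" + "GGGGAAAA" + "ACGTACGTACGT" + "TTTT"
print(parse_v2_read1(read1))
```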


FluentBio PIPseq™ V3.0 & V4.0

* The commercial kit comes with barcoded hydrogel beads, so you do not need to worry about generating them yourself. We can start immediately with cell encapsulation. The sequences here are educated guesses based on the example data sets from their website. In this version, the cell barcodes are the combination of four rounds of split-pool ligations. The same four barcode sets are used in V3.0 and V4.0, so the whitelist is the same for both. Plain files for the barcodes in each round can be downloaded as follows:

FluentBio PIPseq™ V3.0 Round 1 Barcodes (8bp, 96)

FluentBio PIPseq™ V3.0 Round 2 Barcodes (6bp, 96)

FluentBio PIPseq™ V3.0 Round 3 Barcodes (6bp, 96)

FluentBio PIPseq™ V3.0 Round 4 Barcodes (8bp, 96)
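With four rounds of 96 barcodes there are 96^4 = 84,934,656 possible cell barcodes, each 8 + 6 + 6 + 8 = 28 bp long once the linkers are removed. Instead of writing out all combinations, one option is to check each part against the list of its own round. This is just a sketch of that idea with made-up toy barcodes and hypothetical function names:

```python
def split_cell_barcode(cb):
    """Split a 28-bp V3.0/V4.0 cell barcode into its four round barcodes."""
    return cb[0:8], cb[8:14], cb[14:20], cb[20:28]

def is_whitelisted(cb, rounds):
    """Check every part of the cell barcode against the barcode set of its round."""
    if len(cb) != 28:
        return False
    return all(part in allowed
               for part, allowed in zip(split_cell_barcode(cb), rounds))

# Toy sets for illustration only; the real lists have 96 entries per round.
rounds = [{"AAACCCGG"}, {"ACGTAC"}, {"TGCATG"}, {"GGGGAAAA"}]
cb = "AAACCCGG" + "ACGTAC" + "TGCATG" + "GGGGAAAA"
print(is_whitelisted(cb, rounds))
print(96 ** 4)  # size of the full whitelist
```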

Adapter and primer sequences:

Sequence used during the PIP-seq experiment:

Barcoded beads-oligo (T2/20/100 PIPs, FB0003913/FB0003914/FB0003915):

    
        V3.0: |--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI]TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTV -3'

        V4.0: |--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[None/T/GT/TGA][8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI]TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTV -3'
    

TSO (FB0001042/FB0003140/FB0003078): 5'- AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG -3'

WTA Primer (FB0002006/FB0003084):

    
        Forward: 5'- CTCTTTCCCTACACGACGCTC -3'

        Reverse: 5'- AAGCAGTGGTATCAACGCAGAGT -3'
    

Library Adapter Mix (FB0001605):

    
        5'-/5Phos/ CTGTCTCTTATACACATCTCCGAGCC -3'
        3'-       TGACAGAGAATAT                        -5'
    

Library P5 Index (FB0001915-1918, FB0001666-1669): 5'- AATGATACGGCGACCACCGAGATCTACAC[i5]ACACTCTTTCCCTACACGACGC -3'

Library P7 Index (FB0001626-1627, FB0001629-1633, FB0002092): 5'- CAAGCAGAAGACGGCATACGAGAT[i7]GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG -3'

TruSeq Read 1: 5'- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3'

Index 1 sequencing primer (i7): 5'- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -3'

Nextera Read 2 sequencing primer: 5'- GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG -3'

Index 2 sequencing primer (i5): 5'- AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -3'

Illumina P5 adapter: 5'- AATGATACGGCGACCACCGAGATCTACAC -3'

Illumina P7 adapter: 5'- CAAGCAGAAGACGGCATACGAGAT -3'
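A quick sanity check on the primer list above: the two index sequencing primers should be the reverse complements of the two read sequencing primers, and the 3' ends of the library index primers should overlap the Read 1 and Read 2 sites. The snippet below checks exactly that, using only the sequences listed above (`revcomp` is a small helper of my own):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

TRUSEQ_READ1 = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
NEXTERA_READ2 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
INDEX1_PRIMER = "CTGTCTCTTATACACATCTCCGAGCCCACGAGAC"
INDEX2_PRIMER = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"
P5_INDEX_3P = "ACACTCTTTCCCTACACGACGC"              # part after [i5] in Library P5 Index
P7_INDEX_3P = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # part after [i7] in Library P7 Index

checks = {
    "Index 1 primer is rc of Nextera Read 2": revcomp(NEXTERA_READ2) == INDEX1_PRIMER,
    "Index 2 primer is rc of TruSeq Read 1": revcomp(TRUSEQ_READ1) == INDEX2_PRIMER,
    "P5 index 3' end is a prefix of TruSeq Read 1": TRUSEQ_READ1.startswith(P5_INDEX_3P),
    "P7 index 3' end equals Nextera Read 2": P7_INDEX_3P == NEXTERA_READ2,
}
for name, ok in checks.items():
    print(name, ok)
```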


Step-by-step library generation (only showing V3.0)

(1) Cell encapsulation by vortexing, cell lysis by heat, mRNA capture, then add RT Additive Mix, RT Enzyme Mix reagent and TSO for reverse transcription:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI]T30V---->
                                                                                                                                                       AAAAAAA...AAAAAABXXX...XXX -5'

(2) The terminal transferase activity of MMLV adds extra Cs:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXCCC
                                                                                                                                                                    (pA)BXXX...XXX -5'

(3) The TSO is already in the RT reagent; it anneals to the extra Cs and serves as the template for further extension:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXCCC----->
                                                                                                                                                                    (pA)BXXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA -5'

(4) This is the first-strand cDNA after reverse transcription:


|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'

(5) cDNA amplification with WTA buffer mix and WTA Primer:


                                                5'- CTCTTTCCCTACACGACGCTC------------------->
|--5'- /5Acryd/TTTTTTTAAGCAGTGGTATCAACGCAGAGTACGACTCCTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'
                                                                                                                                                                    <--------------------TGAGACGCAACTATGGTGACGAA -5'

(6) Purify amplified double-stranded cDNA:


5'- CTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTT -3'
3'- GAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TAC[6-bp barcode2]CTC[6-bp barcode3]AGCTC[8-bp barcode4][12-bp UMI](pA)BXXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA -5'

(7) cDNA fragmentation (possibly by fragmentase), end repair and A-tailing:


 Product 1 (left end of cDNA):

5'-  CTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXA -3'
3'- AGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TAC[6-bp barcode2]CTC[6-bp barcode3]AGCTC[8-bp barcode4][12-bp UMI](pA)BXXX...XXX  -5'


 Product 2 (right end of cDNA):

5'-  XXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTTA  -3'
3'- AXXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA   -5'


 Product 3 (middle part of cDNA):

5'-  XXXXXXXXXXXXXXXXXXXXXXXXXX...XXXXXXXXXXXXXXXXXXXXXXXXXXA -3'
3'- AXXXXXXXXXXXXXXXXXXXXXXXXXX...XXXXXXXXXXXXXXXXXXXXXXXXXX  -5'


(8) Add Library Adapter Mix for ligation:


 Product 1 (left end of cDNA, assuming the 5' of the WTA Primer is blocked, the only amplifiable fragment):

5'-  CTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXACTGTCTCTTATACACATCTCCGAGCC -3'
3'- AGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TAC[6-bp barcode2]CTC[6-bp barcode3]AGCTC[8-bp barcode4][12-bp UMI](pA)BXXX...XXXTGACAGAGAATAT   -5'


 Product 2 (right end of cDNA, assuming the 5' of TSO is blocked, not amplifiable due to the primers used in the next round):

5'-               TATAAGAGACAGTXXX...XXXCCCATTCACTCTGCGTTGATACCACTGCTTA -3'
3'- CCGAGCCTCTACACATATTCTCTGTCAXXX...XXXGGGTAAGTGAGACGCAACTATGGTGACGAA  -5'


 Product 3 (middle part of cDNA, not amplifiable due to the primers used in the next round):

5'-               TATAAGAGACAGTXXXXXXXXXXXXXXXXXXXXXXXXXX...XXXXXXXXXXXXXXXXXXXXXXXXXXACTGTCTCTTATACACATCTCCGAGCC -3'
3'- CCGAGCCTCTACACATATTCTCTGTCAXXXXXXXXXXXXXXXXXXXXXXXXXX...XXXXXXXXXXXXXXXXXXXXXXXXXXTGACAGAGAATAT  -5'


(9) Library Amplification using Library P5 Index and Library P7 Index:


5'- AATGATACGGCGACCACCGAGATCTACAC[i5]ACACTCTTTCCCTACACGACGC------------------------->
                                   5'-  CTCTTTCCCTACACGACGCTCTTCCGATCT[8-bp barcode1]ATG[6-bp barcode2]GAG[6-bp barcode3]TCGAG[8-bp barcode4][12-bp UMI](dT)VXXX...XXXACTGTCTCTTATACACATCTCCGAGCC -3'
                                   3'- AGAGAAAGGGATGTGCTGCGAGAAGGCTAGA[8-bp barcode1]TAC[6-bp barcode2]CTC[6-bp barcode3]AGCTC[8-bp barcode4][12-bp UMI](pA)BXXX...XXXTGACAGAGAATAT   -5'
                                                                                                                                                        <--------------GACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG[i7]TAGAGCATACGGCAGAAGACGAAC -5'

(10) Final library structure:


V3.0:

5'- AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNATGNNNNNNGAGNNNNNNTCGAGNNNNNNNNNNNNNNNNNNNN(dT)VXXX...XXXACTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -3'
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGNNNNNNNNTGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNNTACNNNNNNCTCNNNNNNAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXTGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'
             Illumina P5           8-bp             TruSeq Read 1           8-bp      6-bp     6-bp        8-bp       12-bp         cDNA   ↓        ME               s7         8-bp        Illumina P7
                                 i5 index                                 barcode1  barcode2  barcode3   barcode4      UMI                 ↓                                  i7 index
                                                                                                                                           ↓
                                                                                                                            This means the second read will
                                                                                                                                always start with a T

V4.0:

5'- AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCTN.NNNNNNNNNATGNNNNNNGAGNNNNNNTCGAGNNNNNNNNNNNNNNNNNNNN(dT)VXXX...XXXACTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -3'
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGNNNNNNNNTGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGAN.NNNNNNNNNTACNNNNNNCTCNNNNNNAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXTGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'
             Illumina P5           8-bp             TruSeq Read 1              8-bp      6-bp     6-bp        8-bp       12-bp         cDNA   ↓        ME               s7         8-bp        Illumina P7
                                 i5 index                                  ↓ barcode1  barcode2  barcode3   barcode4      UMI                 ↓                                  i7 index
                                                                           ↓                                                                  ↓
                                                                None, or T, or GT, or TGA                                   This means the second read will always start with a T,
                                                                                                                        but the FASTQ files suggest otherwise. Maybe there is a dark cycle.
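Putting the V3.0 and V4.0 structures together, the informative part of Read 1 is 8 + 3 + 6 + 3 + 6 + 5 + 8 + 12 = 51 bases, preceded in V4.0 by nothing, T, GT or TGA. One simple way to handle both versions is to try each of the four possible offsets and keep the first one where the prefix and the three linkers (ATG, GAG, TCGAG) are all where they should be. This is only a sketch with exact matching and hypothetical names. A real pipeline would likely tolerate mismatches, and a read could in principle fit more than one offset by chance, in which case the smallest one wins here:

```python
V4_PREFIXES = ("", "T", "GT", "TGA")  # offset 0 is also the V3.0 layout

def parse_read1(seq):
    """Return (offset, cell_barcode, umi) for a V3.0/V4.0 Read 1, or None."""
    for offset, prefix in enumerate(V4_PREFIXES):
        s = seq[offset:]
        if len(s) < 51 or not seq.startswith(prefix):
            continue
        if s[8:11] == "ATG" and s[17:20] == "GAG" and s[26:31] == "TCGAG":
            # barcode1 + barcode2 + barcode3 + barcode4, linkers dropped
            cell_barcode = s[0:8] + s[11:17] + s[20:26] + s[31:39]
            return offset, cell_barcode, s[39:51]
    return None

# Hypothetical read body: toy barcodes and UMI joined by the three linkers
core = "AAACCCGG" + "ATG" + "ACGTAC" + "GAG" + "TGCATG" + "TCGAG" + "GGGGAAAA" + "ACGTACGTACGT"
for prefix in V4_PREFIXES:
    print(repr(prefix), parse_read1(prefix + core))
```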

Library sequencing using Illumina primers

(1) Add TruSeq Read 1 sequencing primer to sequence the first read (bottom strand as template, cell barcodes and UMI, 51 cycles):


                                     5'- ACACTCTTTCCCTACACGACGCTCTTCCGATCT-------------------------------------------------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGNNNNNNNNTGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNNTACNNNNNNCTCNNNNNNAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXTGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'

(2) Add Index 1 sequencing primer to sequence the sample index at the i7 side (bottom strand as template, 8 cycles):


                                                                                                                                        5'- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC------->
3'- TTACTATGCCGCTGGTGGCTCTAGATGTGNNNNNNNNTGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGANNNNNNNNTACNNNNNNCTCNNNNNNAGCTCNNNNNNNNNNNNNNNNNNNN(pA)BXXX...XXXTGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGNNNNNNNNTAGAGCATACGGCAGAAGACGAAC -5'

(3) Cluster regeneration, add Index 2 sequencing primer to sequence the sample index at the i5 side (top strand as template, 8 cycles):


5'- AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNATGNNNNNNGAGNNNNNNTCGAGNNNNNNNNNNNNNNNNNNNN(dT)VXXX...XXXACTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -3'
                                 <-------TGTGAGAAAGGGATGTGCTGCGAGAAGGCTAGA -5'

(4) Add Nextera Read 2 primer to sequence the second read (top strand as template, >67 cycles, these are cDNA reads, the first base is T, in theory):


5'- AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNATGNNNNNNGAGNNNNNNTCGAGNNNNNNNNNNNNNNNNNNNN(dT)VXXX...XXXACTGTCTCTTATACACATCTCCGAGCCCACGAGACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG -3'
                                                                                                                                     <------GACAGAGAATATGTGTAGAGGCTCGGGTGCTCTG -5'
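Since the structure predicts that Read 2 starts with a T (the base opposite the A added during A-tailing), it is easy to test this on real data by tabulating the first base of the Read 2 sequences. Here is a small sketch. The function name is hypothetical, and the reads in the example are made up rather than taken from any FASTQ file:

```python
from collections import Counter

def first_base_fractions(read2_seqs):
    """Return the fraction of reads starting with each base."""
    counts = Counter(seq[0] for seq in read2_seqs if seq)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}

# Made-up Read 2 sequences for illustration
example = ["TACGGA", "TGGATC", "ACGTAA", "TTTTCG"]
print(first_base_fractions(example))
```

If the fraction of T is close to 1, the first cycle of Read 2 can simply be trimmed before mapping. If it is not, as the V4.0 example FASTQ files seem to suggest, the first base is probably already cDNA.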