scg_lib_structs

Single Cell Genomics Library Structure

A collection of library structures and sequences for popular single cell genomic methods (mainly scRNA-seq).

Before you start

Make sure you understand the basic configuration of the Illumina libraries, because most single cell sequencing methods are developed to be sequenced on the Illumina platforms. If you are not familiar with the Illumina sequencing libraries, click here to check some general information about Illumina library structures and the nature of library preparation.

The HTML pages listed below contain step-by-step procedures of how the libraries are generated experimentally. For the computational preprocessing pipelines for each method, please see this accompanying ReadTheDocs documentation. For the machine-readable format of the library structure, check seqspec.

How to use?

Click the following links to view the methods. Notes:

Index1 (i7) is always sequenced using the bottom strand as template, regardless of the Illumina machine in use. That is why the index sequences are reverse-complementary to the primer sequences.
IMPORTANT: In a dual-index library, how index2 (i5) is sequenced differs from machine to machine. According to the Index Sequencing Guide from Illumina, MiSeq, HiSeq 2000/2500, MiniSeq (Rapid) and NovaSeq 6000 (v1.0) use the bottom strand as template (Forward Strand Workflow), which is why the index sequences are the same as the primer sequences on those machines. iSeq 100, MiniSeq (Standard), NextSeq, HiSeq X, HiSeq 3000/4000 and NovaSeq 6000 (v1.5) use the top strand as template (Reverse Complement Workflow), which is why the index sequences are reverse-complementary to the primer sequences on those machines. All methods listed below use iSeq 100, MiniSeq (Standard), NextSeq, HiSeq X, HiSeq 3000/4000 and NovaSeq 6000 (v1.5) as examples, because this configuration is more frequently used nowadays.
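The two notes above can be sketched in a few lines of Python. This is only an illustration, not part of the repository: the function names and the example index sequences are made up, and the machine grouping is copied from the notes above rather than checked against the latest Illumina guide.

```python
from typing import Tuple

# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Machine grouping as stated in the notes above (assumption: names are
# matched as exact strings; this is not an official Illumina lookup).
FORWARD_STRAND = {
    "MiSeq", "HiSeq 2000/2500", "MiniSeq (Rapid)", "NovaSeq 6000 (v1.0)",
}
REVERSE_COMPLEMENT = {
    "iSeq 100", "MiniSeq (Standard)", "NextSeq", "HiSeq X",
    "HiSeq 3000/4000", "NovaSeq 6000 (v1.5)",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def expected_index_reads(i7_in_primer: str, i5_in_primer: str,
                         machine: str) -> Tuple[str, str]:
    """Return (index1_read, index2_read) as the sequencer would report them.

    i7_in_primer / i5_in_primer are the index bases as written in the
    primer sequences (5' to 3').
    """
    # Index1 (i7) is always read from the bottom strand, so the read is
    # the reverse complement of the primer sequence on every machine.
    index1 = reverse_complement(i7_in_primer)

    if machine in FORWARD_STRAND:
        # Forward Strand Workflow: i5 read equals the primer sequence.
        index2 = i5_in_primer
    elif machine in REVERSE_COMPLEMENT:
        # Reverse Complement Workflow: i5 read is reverse-complemented.
        index2 = reverse_complement(i5_in_primer)
    else:
        raise ValueError(f"Unknown machine: {machine!r}")
    return index1, index2


# Hypothetical 8-bp indices, used only to show the orientation logic.
print(expected_index_reads("AAGGCTAT", "GATTACAG", "MiSeq"))
print(expected_index_reads("AAGGCTAT", "GATTACAG", "NextSeq"))
```

On a Forward Strand Workflow machine only index1 changes orientation relative to the primer, whereas on a Reverse Complement Workflow machine both indices do. This is worth checking when filling in a sample sheet.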

Gene expression
Chromatin accessibility and protein-DNA interactions
Genomic DNA or DNA methylation
- scBS-seq
- MALBAC
- s3-WGS
- scRRBS
- LIANTI
Multi-Omics
TODO list:
- Trac-looping
- MATQ-seq
- ASTAR-seq
- Drop scChIP-seq
- sci-Plex
- sci-CAR-seq
- snmC2T-seq
- MALBAC-DT
- GRID-seq
- ZipSeq
- scCC
- scSPRITE
- TEA-seq/ICICLE-seq
- hsrChST-seq
- TIP-seq
- ECCITE-seq
- ASAP-seq/DOGMA-seq
- PHAGE-ATAC
- Spatial-ATAC-seq
- Spatial C&T
- scTEM-seq
- DBiT-seq
- scGET-seq
- Multi-CUT&Tag
- MulTI-Tag
- TIME-seq
- scSPLAT
- EpiDamID
- scRibo-seq
- sc-end5-seq
- Slide-seq / Slide-seqV2 / Slide-DNA-seq / Slide-tags
- CoTECH
- PairedTag
- GoT-ChA
- Methyl-HiC
- SNuBar-ATAC
- RAISIN RNA-seq & MIRACL-seq
- Microbe-seq
- SEC-seq
- scONE-seq
- BacDrop
- SPEAC-seq
- DisCo
- spinDrop
- sciPlex-ATAC-seq
- SCITO-seq
- snRandom-seq
- LAST-seq
- GAGE-seq
- scCARE-seq
- HiRES
- LiMCA
- nano-CT
- NTT-seq
- BuTT-seq
- M3-seq
- inDrops-2
- scRCAT-seq
- Phospho-seq
- RamDA-seq
- SIMPLE-seq
- scTAPS
- MUSIC
- Direct-seq
- Strand-seq
- Drop-BS
- CROPseq-multi
- ChAIR
- SPATAC-seq
- snapTotal-seq
- easySHARE-seq
- EasySci
- OAK
- BAG DNA RNA
- CAP-seq
- wellDA-seq
- microSPLiT <- need to check updates
- ProBac-seq

scRNA-seq technical comparisons

The basic chemistry is very similar across these scRNA-seq methods; the main differences among them are summarised in the table below. For a detailed discussion, check the text boxes from our review: From Tissues to Cell Types and Back: Single-Cell Gene Expression Analysis of Tissue Architecture

	Single cell isolation/capture	Where RT happens	2nd strand synthesis	Full-length cDNA synthesis	Barcode addition	Pooling before library	Library amplification	Gene coverage
10x Chromium Single Cell 3’	Droplet	In droplets	TSO	Yes	Barcoded RT primers	Yes	PCR	3’
10x Chromium Single Cell 5’	Droplet	In droplets	TSO	Yes	Barcoded TSO primers	Yes	PCR	5’
BD Rhapsody	Nanowells	In collection tubes	Random priming and primer extension	No	Barcoded RT primers	Yes	PCR	3’
CEL-seq/CEL-seq2	FACS	In 96w/384w wells	RNase H and DNA pol I	No	Barcoded RT primers	Yes	In vitro transcription	3’
Drop-seq	Droplet	In collection tubes	TSO	Yes	Barcoded RT primers	Yes	PCR	3’
Illumina Bio-Rad SureCell 3’ WTA	Droplet	In droplets	RNase H and DNA pol I	No	Barcoded RT primers	Yes	PCR	3’
inDrop	Droplet	In droplets	RNase H and DNA pol I	No	Barcoded RT primers	Yes	In vitro transcription	3’
MARS-seq/MARS-seq2.0	FACS	In 96w/384w wells	RNase H and DNA pol I	No	Barcoded RT primers	Yes	In vitro transcription	3’
Microwell-seq	Nanowells	In collection tubes	TSO	Yes	Barcoded RT primers	Yes	PCR	3’
Quartz-seq	FACS	In 96w/384w wells	PolyA tailing and primer ligation	Yes in principle	Ligation of barcoded Truseq adapters	No	PCR	3’
Quartz-seq2	FACS	In 96w/384w wells	PolyA tailing and primer ligation	Yes in principle	Barcoded RT primers	Yes	PCR	3’
sci-RNA-seq	Not needed	In situ	RNase H and DNA pol I	No	Barcoded RT primers and library PCR with barcoded primers	Yes	PCR	3’
sci-RNA-seq3	Not needed	In situ	RNase H and DNA pol I	No	Barcoded RT primers and hairpin adapters	Yes	PCR	3’
scifi-RNA-seq	Droplet multiple cells	In situ	TSO	Yes	Barcoded RT primers and gel bead barcodes	Yes	PCR	3’
SCRB-seq/mcSCRB-seq	FACS	In 96w/384w wells	TSO	Yes	Barcoded RT primers	Yes	PCR	3’
Seq-Well	Nanowells	In collection tubes	TSO	Yes	Barcoded RT primers	Yes	PCR	3’
Seq-Well S3	Nanowells	In collection tubes	Random priming and primer extension	No	Barcoded RT primers	Yes	PCR	3’
SMART-seq/SMART-seq2/SMART-seq3	FACS or Fluidigm C1	In 96w/384w wells	TSO	Yes	Library PCR with barcoded primers	No	PCR	full-length
SPLiT-seq	Not needed	In situ	TSO	Yes	Ligation of barcoded RT primers	Yes	PCR	3’
STRT-seq	FACS	In 96w/384w wells	TSO	Yes	Barcoded TSO primers	Yes	PCR	5’
STRT-seq-C1	Fluidigm C1	In microfluidic chambers	TSO	Yes	Barcoded Tn5 transposase	No	PCR	5’
STRT-seq-2i	FACS or dilution	In 9600w wells	TSO	Yes	Barcoded PCR primers and Tn5 transposase	Yes	PCR	5’
Tang 2009	FACS or manual	In 96w/384w wells	PolyA tailing and primer extension	Yes in principle	Ligation of barcoded adaptors	No	PCR	Biased to 3’

scATAC-seq technical comparisons

This is basically Table 1 from our scATAC-seq protocol: A plate-based single-cell ATAC-seq workflow for fast and robust profiling of chromatin accessibility

	Tn5 and adaptors	Starting cell number	Tagmentation	Single-cell/nucleus isolation	Library amplification	Barcode addition	Throughput
sci-ATAC-seq/snATAC-seq	Custom-made	500,000+	Bulk	FACS or dilution	PCR	Tn5 + PCR barcodes	10,000
scTHS-seq	Custom-made	500,000+	Bulk	FACS or dilution	IVT and PCR	Tn5 + PCR barcodes	10,000
Plate_scATAC-seq and Pi-ATAC-seq	Nextera	5,000+	Bulk	FACS	PCR	PCR barcodes	1,000
Fluidigm C1	Nextera	4,000-20,000	Single cells	Microfluidics	PCR	PCR barcodes	100
Takara ICELL8	Nextera	16,000	Single cells	Microfluidics	PCR	PCR barcodes	1,000
10x Chromium Single Cell ATAC	Nextera	800-15,000	Bulk	Droplets	PCR	PCR barcodes	10,000
Bio-Rad dscATAC-seq	Nextera	60,000+	Bulk	Droplets	PCR	PCR barcodes	10,000
Bio-Rad dsciATAC-seq	Custom-made	600,000+	Bulk	Droplets	PCR	Tn5 + PCR barcodes	100,000

Motivation

I was a little bit bombarded with all the single cell methods and got completely lost. To help myself understand all of them and to make future troubleshooting easier, I perform an on-paper library preparation whenever I see a new single cell method.

Why bother?

Here I borrow from Feynman:

What I cannot create, I do not understand.

Citation

If you find this repository useful and would like to cite this resource, please consider citing this repo and the seqspec preprint together:

@misc{xi_chen_teichlabscg_lib_structs_2023,
	title = {Teichlab/scg\_lib\_structs: {Release} 26th {Oct} 2023},
	copyright = {Creative Commons Attribution 4.0 International},
	shorttitle = {Teichlab/scg\_lib\_structs},
	url = {https://zenodo.org/doi/10.5281/zenodo.10042390},
	abstract = {This is the first release to get a DOI so that people can cite the repo.},
	urldate = {2023-10-26},
	publisher = {Zenodo},
	author = {Xi Chen and Patrick Roelli and Darío Hereñú and Pontus Höjer and Tim Stuart},
	month = oct,
	year = {2023},
	doi = {10.5281/ZENODO.10042390},
}

@article{booeshaghi.pachter.Bioinformatics2024,
  title = {A Machine-Readable Specification for Genomics Assays},
  author = {Booeshaghi, Ali Sina and Chen, Xi and Pachter, Lior},
  editor = {Kendziorski, Christina},
  year = {2024},
  month = mar,
  journal = {Bioinformatics},
  volume = {40},
  number = {4},
  pages = {btae168},
  issn = {1367-4811},
  doi = {10.1093/bioinformatics/btae168},
  urldate = {2024-05-01},
  abstract = {Motivation: Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries.},
  copyright = {https://creativecommons.org/licenses/by/4.0/},
  langid = {english}
}

Feedback

I would be very happy if you go through them and let me know what you think. If you spot any errors or find that I’ve missed some key methods, feel free to raise an issue in the GitHub repository, or contact me directly:

Xi Chen
chenx9@sustech.edu.cn
