Sensitivity of SNARE-seq chromatin data

Accessible sites and transcripts detected per nucleus by SNARE-seq compared with different single-cell/nucleus chromatin accessibility or RNA-seq methods. a, Histogram showing the numbers of accessible sites captured by SNARE-seq chromatin profiles. b, Histogram showing the numbers of accessible sites detected per nucleus with different single-cell/nucleus ATAC-seq methods. The processed peak count matrices of published reports were downloaded from GEO (scATAC, GSE65360; sci-ATAC, GSE68103; snATAC, GSE100033; sci-CAR, GSE117089) and binarized. c, Histogram showing the fraction of reads in peaks (FRiP) within GM12878 or postnatal day 0 mouse cerebral cortex SNARE-seq chromatin accessibility data. GM12878, GM; human cell line mixture (BJ, GM12878, H1 and K562) lysed with Triton-X, HuMix; human cell line mixture lysed with Nuclei EZ Prep, HuMix2; postnatal day 0 mouse cerebral cortex, P0-brain; adult mouse cerebral cortex, Ad-brain. d, Histogram showing the numbers of UMIs and genes captured by SNARE-seq expression profiles. e, Histogram showing the numbers of UMIs and genes detected per nucleus with different single-cell/nucleus RNA-seq methods. The UMI count matrices of published reports were downloaded from GEO (snDrop-seq, GSE97942; SPLiT-seq, GSE110823; sci-CAR, GSE117089). Adult human brain cortex, Brain (H); postnatal day 2 mouse cerebral cortex, Brain (M).

Supplementary Figure 3. SNARE-seq identified cell types within a human cell line mixture (n=1,047).
a, Feature plot showing the marker gene expression of individual cell lines within each cluster. b, Biplot showing the contribution of accessible peak topics (n=11) identified by cisTopic in classifying cell types with chromatin data. c, Dot plot showing the expression of transcription factors (TFs) in individual clusters. The size of the dot represents the percentage of nuclei within a cell type expressing the transcription factor and the color indicates the average expression level. d, Motif analysis showing the significance (p-value) of transcription factor binding within differentially accessible peak topics (n=404,665 fragments) as mentioned above. A one-tailed Fisher's exact test was used to calculate significance, and Bonferroni correction was applied for multiple testing. The p-value of the marker TF for each cell type is colored in red.

Supplementary Figure 4. Comparison of SNARE-seq dual-omics assay (n=1,043) with single-omic expression (snDrop-seq, n=591) and chromatin (chromatin only, n=494) methods. a, Clustering of snDrop-seq and SNARE-seq combined expression profiles of the human cell line mixture. Cells were labeled by cell type (left) or method (right). b, Clustering of SNARE-seq chromatin profiles (dual or chromatin-only assay) of the human cell line mixture. Cells were labeled by cell type (left) or method (right). c, Distribution of transcripts and accessible chromatin peaks detected by SNARE-seq in individual cell types. d, Pearson correlation of gene expression (n=34,828 genes) and chromatin profiles (n=309,891 genomic regions) between dual- and single-omic assays. Aggregated transcript reads and chromatin reads were log10 normalized. e, Distribution of transcripts and chromatin peaks detected by dual- and single-omic assays.
The median numbers of transcripts detected by snDrop-seq and SNARE-seq are 1,747 and 1,159, respectively, and the median numbers of chromatin peaks detected by the SNARE-seq single- and dual-omic assays are 2,254 and 1,960, respectively. In box plots, center lines show the median, box limits correspond to the first and third quartiles and whiskers show 1.5x interquartile range. f, Species-mixing experiment showing the transcript and chromatin reads detected by SNARE-seq and the proportion of human reads in each barcode.

Supplementary Figure 5. Reproducibility of SNARE-seq (n=5 replicates). a, Pair-wise correlation of gene expression profiles between individual replicates of the postnatal day 0 sample. Aggregated transcript reads were log10 normalized. b, Pair-wise correlation of chromatin accessibility profiles between individual replicates. Aggregated genome coverage was log10 normalized. c, Proportion of sequencing reads mapped to different genomic features. Top, mapping of reference expression reads, chromatin reads and accessible peaks. Bottom, mapping of SNARE-seq expression reads, chromatin reads and accessible peaks of mouse cerebral cortex data. For this analysis, total expression reads of snDrop-seq and SNARE-seq are 32,059,445 and 8,238,261, respectively. Total chromatin reads and peaks called are 180,548,727 and 140,102 for snATAC, and 428,942,515 and 175,298 for SNARE-seq, respectively.

Supplementary Figure 6. Robustness of SNARE-seq. a, Bar plot showing the numbers of nuclei recovered for each cell type. UMAP projection of mouse cerebral cortex expression data (n=5,081) as in Fig. 2a showing batch identity (b) and UMI read depth (c). UMAP projection of chromatin accessibility data (n=5,081) as in Fig. 2c showing batch identity (d) and peak read depth (e).

Supplementary Figure 7. Neonatal mouse cerebral cortex SNARE-seq profiles are correlated with published expression and chromatin data.
a, Pearson correlation heatmap of mouse cerebral cortex cell types identified with SNARE-seq expression data (n=4,768) compared with previously identified cell types using SPLiT-seq (n=28,384).
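
As a rough illustration of the correlation comparisons described in these legends (reads aggregated across nuclei, log10 normalized, then compared by Pearson correlation), the following Python sketch shows one way such a comparison could be computed. The function names, the pseudocount of 1 and the toy count matrices are assumptions made for illustration only; the legends do not state the pseudocount or the software used by the authors.

```python
import math


def aggregate(matrix):
    """Sum counts over nuclei (rows) to get one total per feature (column)."""
    return [sum(column) for column in zip(*matrix)]


def log10_normalize(counts, pseudocount=1.0):
    """log10-transform aggregated counts.

    The pseudocount (assumed here, not taken from the legends) avoids log10(0).
    """
    return [math.log10(c + pseudocount) for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: rows are nuclei, columns are genes (or peaks).
replicate_1 = [[3, 0, 12, 1], [5, 1, 9, 0], [2, 0, 15, 2]]
replicate_2 = [[4, 1, 10, 0], [6, 0, 11, 1]]

profile_1 = log10_normalize(aggregate(replicate_1))
profile_2 = log10_normalize(aggregate(replicate_2))
print(f"Pearson r between replicates: {pearson(profile_1, profile_2):.3f}")
```

The two replicates may contain different numbers of nuclei, because the comparison is made on per-feature totals and not on individual nuclei; only the feature order has to match.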