Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

scTSSR-D: Gene Expression Recovery by Two-side Self-Representation and Dropout Information for scRNA-seq Data

Author(s): Meng Liu, Wenhao Chen, Jianping Zhao*, Chunhou Zheng and Feilong Guo

Volume 18, Issue 4, 2023

Published on: 27 March, 2023

Pages: 285-295 (11 pages)

DOI: 10.2174/1574893618666230217085543

Abstract

Background: Single-cell RNA sequencing (scRNA-seq) is an advanced technology that makes it possible to unravel cellular heterogeneity and to analyze gene expression at single-cell resolution. However, owing to technical limitations, many dropout events occur during sequencing, and these adversely affect downstream analysis.

Methods: To address the dropout events in single-cell RNA sequencing data, we propose scTSSR-D, an imputation method that recovers gene expression using two-side self-representation and dropout information. scTSSR-D is the first global method to incorporate a partial imputation method for imputing dropout values. In other words, it makes full use of gene, cell, and dropout information when recovering gene expression.

Results: scTSSR-D outperforms existing methods in the following experiments: capturing the Gini coefficients and gene-to-gene correlations observed in single-molecule RNA fluorescence in situ hybridization, down-sampling experiments, differential expression analysis, and cell clustering accuracy.

Conclusion: scTSSR-D is a more stable and reliable method for recovering gene expression than existing methods, and its advantage over them is even more pronounced on large datasets.

Keywords: scRNA-seq, two-side self-representation, dropout information, cellular heterogeneity, gene expression, downstream analysis.
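
Two of the evaluation ideas named in the Results, the Gini coefficient and down-sampling, can be illustrated with a short sketch. This is not the authors' code or evaluation protocol: the function names `gini` and `downsample`, the binomial thinning scheme, the sampling rate of 0.2, and the toy Poisson count matrix are all assumptions chosen only to show the general idea.

```python
import numpy as np


def gini(values):
    """Gini coefficient of a non-negative 1-D array (0 = perfectly even)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard rank-based formula on the sorted values.
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)


def downsample(counts, rate, seed=0):
    """Binomial thinning: keep each counted molecule with probability `rate`.

    Assumed here as one simple way to mimic lower capture efficiency.
    """
    rng = np.random.default_rng(seed)
    return rng.binomial(np.asarray(counts, dtype=np.int64), rate)


# Toy genes-by-cells count matrix (hypothetical data, not from the paper).
counts = np.random.default_rng(1).poisson(5.0, size=(4, 6))
thinned = downsample(counts, rate=0.2)

# Thinning can only create zeros, never remove them.
zero_before = (counts == 0).mean()
zero_after = (thinned == 0).mean()
gini_per_gene = [gini(row) for row in thinned]
```

In a down-sampling experiment of this kind, an imputation method would be run on `thinned` and its output compared against `counts`; per-gene Gini coefficients of the recovered values could likewise be compared against a reference measurement.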

© 2024 Bentham Science Publishers