World Scientific

Optimized splitting of mixed-species RNA sequencing data

    https://doi.org/10.1142/S0219720022500019
    Cited by: 1 (Source: Crossref)

    Gene expression studies using xenograft transplants or co-culture systems, usually with mixed human and mouse cells, have proven valuable for uncovering cellular dynamics during development or in disease models. However, the mRNA sequence similarities among species present a challenge for accurate transcript quantification. To identify optimal strategies for analyzing mixed-species RNA sequencing data, we evaluate both alignment-dependent and alignment-independent methods. Alignment of reads to a pooled reference index is effective, particularly if optimal alignments are used to classify sequencing reads by species; the classified reads are then re-aligned to the individual genomes, yielding >97% accuracy across a range of species ratios. Alignment-independent methods, such as convolutional neural networks, which extract the conserved sequence patterns of the two species, classify RNA sequencing reads with over 85% accuracy. Importantly, both methods perform well with different ratios of human and mouse reads. While non-alignment strategies successfully partitioned reads by species, the more traditional approach of mixed-genome alignment followed by optimized separation of reads proved more successful, with lower error rates.
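
    The pooled-reference step described above can be illustrated with a minimal sketch. It assumes that each read has already been aligned to a combined human and mouse index and that every alignment carries a species label and a numeric alignment score. The function name `classify_reads`, the `min_margin` parameter, and the `"ambiguous"` label are hypothetical choices made for illustration; this is not the authors' implementation, and the abstract does not state how ties or near-ties are handled.

```python
from collections import defaultdict


def classify_reads(alignments, min_margin=1):
    """Assign each read to the species of its best-scoring alignment.

    alignments: iterable of (read_id, species, score) tuples taken from an
        alignment against a pooled (human + mouse) reference.
    min_margin: hypothetical threshold; the best score must exceed the
        runner-up species' score by at least this much, otherwise the
        read is labelled "ambiguous".
    """
    # Keep only the best score seen per read and species.
    best = defaultdict(dict)
    for read_id, species, score in alignments:
        previous = best[read_id].get(species)
        if previous is None or score > previous:
            best[read_id][species] = score

    calls = {}
    for read_id, scores in best.items():
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        if len(ranked) == 1 or ranked[0][1] - ranked[1][1] >= min_margin:
            calls[read_id] = ranked[0][0]
        else:
            calls[read_id] = "ambiguous"
    return calls


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the function.
    example = [
        ("r1", "human", 150),
        ("r1", "mouse", 120),
        ("r2", "mouse", 148),
        ("r3", "human", 140),
        ("r3", "mouse", 140),
    ]
    print(classify_reads(example))
```

    In a workflow of this kind, reads called for one species would then be written to a species-specific file and re-aligned to that species' genome alone, as the abstract describes; how ambiguous reads are treated is a design decision left open here.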
