Application of a bioinformatic pipeline to RNA-seq data identifies novel virus-like sequence in human blood

Marko Melnick, Patrick Gonzales, Thomas J. LaRocca, Yuping Song, Joanne Wuu, Michael Benatar, Björn Oskarsson, Leonard Petrucelli, Robin D. Dowell, Christopher D. Link, Mercedes Prudencio

Research output: Contribution to journal › Article › peer-review


Numerous reports have suggested that infectious agents could play a role in neurodegenerative diseases, but specific etiological agents have not been convincingly demonstrated. To search for candidate agents in an unbiased fashion, we have developed a bioinformatic pipeline that identifies microbial sequences in mammalian RNA-seq data, including sequences with no significant nucleotide similarity hits in GenBank. Effectiveness of the pipeline was tested using publicly available RNA-seq data and in a reconstruction experiment using synthetic data. We then applied this pipeline to a novel RNA-seq dataset generated from a cohort of 120 samples from amyotrophic lateral sclerosis patients and controls, and identified sequences corresponding to known bacteria and viruses, as well as novel virus-like sequences. The presence of these novel virus-like sequences, which were identified in subsets of both patients and controls, was confirmed by quantitative RT-PCR. We believe this pipeline will be a useful tool for the identification of potential etiological agents in the many RNA-seq datasets currently being generated.

Original language: English (US)
Article number: jkab141
Journal: G3: Genes, Genomes, Genetics
Issue number: 9
State: Published - 2021


  • ALS
  • Microbiome
  • RNA-seq
  • Transcriptomics
  • Virome

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
  • Genetics (clinical)

