Detecting transcription of ribosomal protein pseudogenes in diverse human tissues in RNA-seq data
This study reports evidence of transcription of ribosomal protein (RP) pseudogenes, obtained with a novel pipeline for RNA-seq expression data. A number of RP pseudogenes are transcribed at levels comparable to those of known functional genes. This page provides online access to the results of the associated publication.
Peter Tonner, Vinodh Srinivasasainagendra, Shaojie Zhang, and Degui Zhi, "Detecting transcription of ribosomal protein pseudogenes in diverse human tissues in RNA-seq data",
BMC Genomics (in press)
All Rights Reserved by the authors.
Shaojie Zhang (shzhang at cs.ucf.edu)
Background: Ribosomal proteins (RPs) have about 2000 pseudogenes in the human genome. While anecdotal reports of RP pseudogene transcription exist, it is unclear to what extent these pseudogenes are transcribed. RP pseudogene transcription is difficult to identify with microarrays because transcripts from the parent genes and the pseudogenes can cross-hybridize. Transcriptome sequencing (RNA-seq) now provides an opportunity to ascertain the transcription of pseudogenes. A challenge for discovering pseudogene expression in RNA-seq data is the difficulty of uniquely identifying reads mapped to pseudogene regions, because these regions are typically very similar in sequence to the parent genes.
Results: Here we developed a specialized pipeline for pseudogene transcription discovery. We first construct a "composite genome" that includes the entire human genome sequence as well as the mRNA sequences of real ribosomal protein genes. We then map all sequence reads to the composite genome and retain only exact matches. Moreover, we restrict our analysis to strictly defined mappable regions and calculate RPKM values as a measurement of pseudogene transcription levels.
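As a rough illustration of the last step, the standard RPKM definition (reads per kilobase of region per million mapped reads) can be sketched in Python as below. The function name, its parameters, and the choice to use the length of the mappable region as the denominator are assumptions made for this example; the exact implementation used in the pipeline is described in the publication. The numbers in the usage lines are made up.

```python
def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads.

    read_count         -- reads assigned to the region (here: exact matches only)
    region_length_bp   -- length of the region in base pairs (assumed here to be
                          the mappable part of the pseudogene, not its full length)
    total_mapped_reads -- total number of mapped reads in the sample
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical numbers, for illustration only: 50 reads on a 500 bp
# mappable region in a sample with 20 million mapped reads.
example_value = rpkm(50, 500, 20_000_000)
print(example_value)
```

Normalizing by the mappable length rather than the annotated length avoids diluting the signal over positions where no read could be placed uniquely, which is one plausible reading of the restriction to mappable regions described above.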
We report evidence for the transcription of RP pseudogenes in 16 human tissues. By analyzing the Human Body Map 2.0 RNA-seq data with our pipeline, we found that one RP pseudogene (PGOHUM00000249508) is transcribed at an RPKM of 170 in thyroid. Moreover, a total of four RP pseudogenes are transcribed with RPKM>10, a level similar to that of the normal RP genes. An additional thirteen RP pseudogenes have RPKM>5, corresponding to the 20th-30th percentile among all genes. Unlike ribosomal protein genes, which are constitutively expressed in almost all tissues, RP pseudogenes are differentially expressed, suggesting that they may contribute to tissue-specific biological processes.
Conclusions: Using a specialized read mapping method, we identified the transcription of ribosomal protein pseudogenes in human tissues from RNA-seq data.
Additional Supplementary Data