PAL: Pseudoknot Alignment for RNA sequences


PAL is an implementation of an efficient algorithm to compute an optimal structural alignment of an RNA sequence against a genomic substring. PAL is introduced in:

  • Buhm Han, Banu Dost, Vineet Bafna and Shaojie Zhang, "Structural alignment of pseudoknotted RNA", Journal of Computational Biology, 15(5), pages 489-504, 2008. PubMed
  • Banu Dost, Buhm Han, Shaojie Zhang and Vineet Bafna, "Structural alignment of pseudoknotted RNA", In RECOMB 2006: Proceedings of the 10th annual international conference on research in computational molecular biology, 2006. SpringerLink

  • All Rights Reserved by the authors. CONTACT: Shaojie Zhang (shzhang at cs.ucf.edu)

    Abstract

  • In this paper, we address the problem of discovering novel non-coding RNA (ncRNA) using primary sequence and secondary structure conservation, focusing on ncRNA families with pseudoknotted structures. Our main technical result is an efficient algorithm for computing an optimum structural alignment of an RNA sequence against a genomic substring. This algorithm has two applications. First, by scanning a genome, we can identify novel (homologous) pseudoknotted ncRNA, and second, we can infer the secondary structure of the target aligned sequence. We test an implementation of our algorithm (PAL) and show that it has near-perfect behavior for predicting the structure of many known pseudoknots. Additionally, it can detect the true homologs with high sensitivity and specificity in controlled tests. We also use PAL to search the entire viral genome and mouse genome for novel homologs of some viral and eukaryotic pseudoknots, respectively. In each case, we have found strong support for novel homologs.

    Download

  • PAL software package (ver 1.3)