Efficient alignment of RNA secondary structures using sparse dynamic programming

  • Cuncong Zhong, Department of EECS, University of Central Florida, Orlando, FL 32816-2362 USA, cczhong at eecs dot ucf dot edu
  • Shaojie Zhang*, Department of EECS, University of Central Florida, Orlando, FL 32816-2362 USA, shzhang at eecs dot ucf dot edu
  • *To whom correspondence should be addressed.

    Abstract

    Background:

    Recent advances in next-generation sequencing technology have revealed a large number of unannotated RNA transcripts. Comparative study of the RNA structurome is an important approach to assessing their biological function. Because these transcripts are large and abundant, an efficient and accurate RNA structure-structure alignment algorithm is urgently needed to facilitate such comparative studies. Despite the importance of the RNA secondary structure alignment problem, no available computational tool provides both high computational efficiency and high accuracy. Designing and implementing such an algorithm is therefore highly desirable.

    Results:

    In this work, by incorporating the sparse dynamic programming technique, we implemented an algorithm with O(n^3) expected time complexity, where n is the average number of base pairs in the RNA structures. This complexity, which can be shown under the assumption of the polymer-zeta property, is confirmed by our experiments. The resulting RNA secondary structure alignment tool is called ERA. Benchmark results indicate that ERA can significantly speed up RNA structure-structure alignment compared with other state-of-the-art RNA alignment tools while maintaining high alignment accuracy.
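
    The quantity n in the complexity bound above counts base pairs, not nucleotides. As a minimal sketch of how n could be obtained for a pair of input structures, the following assumes the structures are given as pseudoknot-free dot-bracket strings; that input format and the function names base_pairs and average_base_pairs are illustrative assumptions and are not taken from ERA.

```python
def base_pairs(dot_bracket):
    """Return the (i, j) index pairs of a pseudoknot-free dot-bracket string.

    Assumption: '(' opens a pair, ')' closes it, '.' is an unpaired base.
    This is an illustrative helper, not part of ERA.
    """
    stack = []   # positions of '(' that are still waiting for a partner
    pairs = []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def average_base_pairs(structures):
    """Average number of base pairs over the given structures (the n above)."""
    return sum(len(base_pairs(s)) for s in structures) / len(structures)
```

    For example, average_base_pairs(["((..))", "(...)"]) averages two base pairs and one base pair, giving n = 1.5.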

    Conclusions:

    Using the sparse dynamic programming technique, we have developed a new RNA secondary structure alignment tool that is both efficient and accurate. We anticipate that ERA will significantly promote comparative RNA structure studies. The program is freely available from this website.

    Data sets used for benchmarking:

  • BraliBase II
  • Individual structures from RFAM

    Download ERA:

  • ERA_v1.0