Motivation

High-quality research on RNA structures depends on access to accurate data. Although the amount of available RNA structure data has grown rapidly over the past few years, maintaining its quality and integrity has become a greater challenge. Because the structures in the PDB come from many independent studies, a newly deposited structure may be highly similar to one submitted earlier. To remove this RNA chain redundancy, we have introduced a non-redundant dataset of RNA chains, known as RNA-NRD. Here, a pair of RNA chains from the same organism is considered redundant if it has sequence identity ≥ 80%, RMSD less than 4 Å, and Alignment Ratio ≥ 80%. Because the definition of redundant RNA structures can vary with the application, we have also generated a variant of the RNA-NRD dataset in which RNA chains are not divided by source organism. We refer to this dataset as RNA-NRD-without-Organism-Division.
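The redundancy criterion above can be written as a small predicate. The following is a minimal sketch, not the pipeline used to build RNA-NRD: it assumes the three pairwise metrics have already been computed by some alignment tool, and the names `PairComparison`, `is_redundant`, and `divide_by_organism` are hypothetical, introduced only for illustration. How redundant pairs are then grouped into clusters is not described here and is left out.

```python
from dataclasses import dataclass

# Thresholds taken from the redundancy definition in the text.
MIN_SEQUENCE_IDENTITY = 80.0  # percent, inclusive
MAX_RMSD = 4.0                # angstroms, exclusive ("less than 4 Å")
MIN_ALIGNMENT_RATIO = 80.0    # percent, inclusive


@dataclass(frozen=True)
class PairComparison:
    """Hypothetical container for precomputed metrics of one chain pair."""
    sequence_identity: float  # percent, 0-100
    rmsd: float               # angstroms
    alignment_ratio: float    # percent, 0-100
    same_organism: bool       # True if both chains share a source organism


def is_redundant(pair: PairComparison, divide_by_organism: bool = True) -> bool:
    """Return True if the pair meets the redundancy thresholds.

    With divide_by_organism=True (the RNA-NRD setting), chains from
    different organisms are never treated as redundant. With False
    (the RNA-NRD-without-Organism-Division setting), organism is ignored.
    """
    if divide_by_organism and not pair.same_organism:
        return False
    return (
        pair.sequence_identity >= MIN_SEQUENCE_IDENTITY
        and pair.rmsd < MAX_RMSD
        and pair.alignment_ratio >= MIN_ALIGNMENT_RATIO
    )
```

For example, a pair with 85% identity, 2.5 Å RMSD, and 90% alignment ratio from two different organisms would be kept as distinct under the first setting and treated as redundant under the second.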

RNA-NRD Dataset Features

The dataset is updated every three months. It contains the following features:

  1. Cluster ID
  2. Representative
  3. Redundant Cluster
  4. Organism
  5. Macromolecule Name
  6. Rfam Family Name

Versions of RNA-NRD Dataset

  1. Version 1 [Supplementary Table S3] (View / Download; Data Collection Date: 03/17/2021)
  2. Version 2 [Supplementary Table S5] (View / Download; Data Collection Date: 02/23/2022)

Versions of RNA-NRD-without-Organism-Division Dataset

  1. Version 1 [Supplementary Table S4] (View / Download; Data Collection Date: 03/17/2021)
  2. Version 2 [Supplementary Table S6] (View / Download; Data Collection Date: 02/23/2022)

About Us

We are a research group within the Department of Computer Science at the University of Central Florida. This group was founded in 2007.

Contact

Contact Person: Nabila Shahnaz Khan
E-mail: nabilakhan@knights.ucf.edu