Protein-RNA Binding Affinity Dataset

Protein-RNA Binding Affinity Benchmark 2.0 (PRBAB2.0) is a non-redundant dataset of protein-RNA binding affinities, derived from the protein-RNA structures available in the Protein Data Bank. It comprises the previously published dataset PRBAB1.0 [1] and 72 new protein-RNA complexes with binding affinity data, and can be downloaded as PRBAB2.0.

How was PRBAB2.0 built?

  First, we built a set of protein-RNA complex structures that satisfy the following conditions:
     (1) The RNA sequence has at least five nucleotides, and the protein sequence has at least twenty amino acids.
     (2) Large ribosome complexes and virus structures were removed.
     (3) The protein sequences in these complexes were grouped at a sequence identity cutoff of 70% to reduce redundancy.
   The Python scripts used to construct the dataset are available:
     construct_dataset.zip

This step was semi-automatic; for detailed information, please click here.
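
Conditions (1) and (2) of the first step can be sketched in Python as below. This is only an illustration, not the code in construct_dataset.zip: the ComplexEntry record, its field names, the is_ribosome_or_virus flag (assumed to be annotated beforehand) and the example identifiers are all hypothetical. Condition (3), the 70% sequence identity grouping, is not covered here.

```python
from dataclasses import dataclass

MIN_RNA_LENGTH = 5       # condition (1): at least five nucleotides
MIN_PROTEIN_LENGTH = 20  # condition (1): at least twenty amino acids


@dataclass
class ComplexEntry:
    """Hypothetical record describing one protein-RNA complex."""
    entry_id: str
    rna_sequence: str
    protein_sequence: str
    # Assumed to be set during an earlier (possibly manual) annotation step.
    is_ribosome_or_virus: bool = False


def passes_filters(entry: ComplexEntry) -> bool:
    """Return True if the entry meets conditions (1) and (2)."""
    if len(entry.rna_sequence) < MIN_RNA_LENGTH:
        return False
    if len(entry.protein_sequence) < MIN_PROTEIN_LENGTH:
        return False
    return not entry.is_ribosome_or_virus


def select_complexes(entries):
    """Keep only the entries that pass the length and exclusion filters."""
    return [entry for entry in entries if passes_filters(entry)]


if __name__ == "__main__":
    # Placeholder entries, not real PDB records.
    candidates = [
        ComplexEntry("ex01", "ACGUACGU", "M" * 25),
        ComplexEntry("ex02", "ACG", "M" * 25),          # RNA too short
        ComplexEntry("ex03", "ACGUA", "M" * 19),        # protein too short
        ComplexEntry("ex04", "ACGUA", "M" * 20, True),  # ribosome or virus
    ]
    for entry in select_complexes(candidates):
        print(entry.entry_id)
```

In this sketch the thresholds are inclusive, so an RNA of exactly five nucleotides and a protein of exactly twenty amino acids are kept.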

Second, we manually searched the scientific literature for binding affinity data on the protein-RNA complexes selected above.
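
Binding affinities in the literature are often reported as dissociation constants (Kd). If a binding free energy is needed instead, the standard relation ΔG = RT ln(Kd) can be applied. The helper below is a minimal sketch under the assumptions that Kd is given in molar units (1 M standard state) and that the temperature defaults to 298.15 K. The function name and defaults are illustrative and are not taken from PRBAB2.0 or PRdeltaGPred.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def kd_to_delta_g(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Convert a dissociation constant (in M) to a binding free energy (kcal/mol).

    Uses delta_G = R * T * ln(Kd) with a 1 M standard state, so tighter
    binding (smaller Kd) gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (Kelvin)")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # A 1 nM binder at 298.15 K comes out at roughly -12.3 kcal/mol.
    print(f"{kd_to_delta_g(1e-9):.2f} kcal/mol")
```

The temperature used in the original measurement should be passed in when it is known, since the assumed default may not match the experimental conditions.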

PRdeltaGPred: a program for protein-RNA binding affinity prediction

PRdeltaGPred is a protein-RNA binding affinity prediction program based on structural features of the complex, including non-interaction surface, desolvation energy, hydrogen bond energy, salt bridge energy and so on. The source code of PRdeltaGPred can be downloaded from here.

If you find it useful, please cite:
1. Yang, X., H. Li, et al. (2013). "The dataset for protein-RNA binding affinity." Protein Science [PubMed] [PDF]
2. Xu Hong, Xiaoxue Tong, Juan Xie, Pinyu Liu, Xudong Liu, Qi Song, Sen Liu & Shiyong Liu. (2023). "An updated dataset and a structure-based prediction model for protein-RNA binding affinity" Proteins: Structure, Function, and Bioinformatics [PDF]

Last modified: Wed May. 19 12:10:00 CST 2021