Liu Lab at Huazhong University of Science and Technology


CPPred-sORF: Coding Potential Prediction of sORF based on non-AUG

CPPred-sORF is a prediction tool that distinguishes sORFs from lncRNAs. In addition to the classical start codon AUG, it uses CUG and GUG as start codons.
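To illustrate what allowing non-AUG start codons means in practice, the following is a minimal sketch of a forward-strand ORF scan. It is not the CPPred-sORF implementation; the function name `find_orfs` and its parameters are hypothetical, and the stop codons are assumed to be the standard UAA, UAG and UGA.

```python
# Illustrative sketch only; not taken from the CPPred-sORF package.
START_CODONS = {"ATG", "CTG", "GTG"}  # AUG, CUG, GUG written in the DNA alphabet
STOP_CODONS = {"TAA", "TAG", "TGA"}   # assumed standard stop codons


def find_orfs(seq, starts=START_CODONS, min_codons=1):
    """Return (start, end, orf_sequence) tuples for forward-strand ORFs.

    In each of the three reading frames, an ORF begins at the first start
    codon seen after the previous stop and ends at the next in-frame stop.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in starts:
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop codon.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs
```

With this sketch, a sequence such as "AACUGGCCUAAGG" yields one ORF starting at CUG, while restricting `starts` to {"ATG"} yields none.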

CPPred-sORF is available:
  • CPPred-sORF (CPPred-sORF.tar.gz, 2.70M)

  • Training sets:
  • Training: sORFs (16938, 7.2M) + lncRNAs (13138, 16M)

  • Testing sets:
  • Independent_Testing_set: 8469 sORFs + 6569 lncRNAs
  • Maize-Testing: 315 sORFs + 281 lncRNAs
  • Soybean-Testing: 315 sORFs + 110 lncRNAs


  • Unpacking and using CPPred-sORF:

    Download the CPPred-sORF package.

    (1) Type "tar -zxvf CPPred-sORF.tar.gz" to uncompress the package

    (2) Type "cd CPPred-sORF/bin" to change the current directory

    (3) Run "python CPPred-sORF.py -i input_RNA.fa -hex Hexamer.tsv -r range -mol model -spe species -o result" to predict. Here, "input_RNA.fa" is the input RNA file in FASTA format, "Hexamer.tsv" is a pre-built hexamer frequency table, "range" is the pre-built training range file, "model" is the pre-built trained model, and "species" selects the species model (Human or Integrated). "result" is the output file containing the final result for each prediction.

    Example :

    python CPPred-sORF.py -i data/sORF_testing.fa -hex Hexamer/Integrated_Hexamer.txv -r Model/Integrated.range -mol Model/Integrated.model -spe Integrated -o sORF.result

    python CPPred-sORF.py -i data/lncRNA_testing.fa -hex Hexamer/Integrated_Hexamer.txv -r Model/Integrated.range -mol Model/Integrated.model -spe Integrated -o lncRNA.result

    Programs and modules used by CPPred-sORF:

  • LIBSVM: A Library for Support Vector Machines (Version 3.22, December 2016). It was downloaded from http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz. For more information, please see https://www.csie.ntu.edu.tw/~cjlin/libsvm/
  • FrameKmer.py and fickett.py: These Python scripts from CPAT (Version 1.2.2) were downloaded from https://sourceforge.net/projects/rna-cpat/files/v1.2.2/. For more information, please see http://rna-cpat.sourceforge.net/
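As background for the hexamer frequency table mentioned in the usage instructions, the following is a minimal sketch of how in-frame hexamer frequencies can be counted for a single sequence. It is an illustration only, not the code in CPAT's FrameKmer.py, and the function name `inframe_hexamer_freqs` is hypothetical.

```python
from collections import Counter


def inframe_hexamer_freqs(seq, frame=0):
    """Return the relative frequency of each hexamer read in steps of 3 nt.

    Starting at `frame`, the window advances one codon at a time so that
    every counted hexamer covers two consecutive in-frame codons.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    counts = Counter(seq[i:i + 6] for i in range(frame, len(seq) - 5, 3))
    total = sum(counts.values())
    if total == 0:
        return {}  # sequence shorter than one hexamer in this frame
    return {hexamer: n / total for hexamer, n in counts.items()}
```

A real hexamer table is built from many coding and non-coding training sequences; this sketch only shows the per-sequence counting step.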
  • Contact us:

    For any questions about CPPred-sORF, please email liushiyong@gmail.com.

    Reference:

    If you use CPPred-sORF, please cite:

    1. Xiaoxue Tong and Shiyong Liu. CPPred: Coding Potential Prediction based on the global description of RNA sequence. Nucleic Acids Research, 1 February 2019.

    2. Xiaoxue Tong, Xu Hong, Juan Xie and Shiyong Liu. CPPred-sORF: Coding Potential Prediction of sORF based on non-AUG. (submitted)



    Last modified: Sun. Mar. 12 20:00:00 CST 2020