Frequently Asked Questions about the OrfPredictor Server
Who are we?
The OrfPredictor (ORF-Predictor) server was implemented by Dr. Xiangjia (Jack) Min while he worked at Concordia University, Montreal, Quebec, Canada, with the principal investigators of the fungal genomics project, including Drs. A. Tsang, R. Storms and G. Butler. Alex Spurmanis designed the logos and Wei Ding assisted in the development of the server interface. The web server was originally installed at Concordia University. The current site is supported by Youngstown State University.
Generating expressed sequence tags (ESTs) remains a primary method for gene discovery in most organisms. Predicting open reading frames (ORFs) and coding regions in EST/cDNA sequences is essential for functionally annotating them. Our server is designed for predicting the ORFs of a batch of EST/cDNA sequences. Note: (1) This server is NOT designed for identifying exons (or protein-coding genes) in genomic sequences. For gene prediction from genomic sequences, please use other tools such as GeneMark or GenScan. (2) Sequences generated by next-generation sequencers are suitable for this tool; however, sequences shorter than 60 bp may NOT be predicted correctly.
How does it work?
If the user provides a BLASTX output file, then for each sequence with a BLASTX hit, the frame reported by BLASTX is used to identify the coding region of the EST/cDNA sequence. For sequences without a BLASTX hit, or when no BLASTX output file is provided, the coding regions are predicted from the intrinsic signals of the sequences.
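The two-way decision described above can be sketched in a few lines of Python. This is only an illustration, not the server's code: the FAQ does not say which intrinsic signals are used, so the sketch substitutes a simple stand-in (the frame with the longest stop-free stretch). The function names, the dictionary layout and the example translations are all hypothetical.

```python
def longest_open_stretch(protein):
    """Return the longest stop-free segment of a translated frame ('*' marks a stop)."""
    return max(protein.split("*"), key=len)


def choose_frame(frame_translations, blastx_frame=None):
    """Pick a reading frame for one sequence.

    frame_translations: dict mapping a frame number (+1..+3, -1..-3) to its translation.
    blastx_frame: frame reported by BLASTX for this sequence, or None if there was no hit.
    """
    if blastx_frame is not None:
        # A BLASTX hit exists: trust the frame it reports.
        return blastx_frame, "blastx"
    # No hit: fall back to an intrinsic criterion.
    # ASSUMPTION: longest stop-free stretch, used here only as a stand-in.
    best = max(frame_translations,
               key=lambda f: len(longest_open_stretch(frame_translations[f])))
    return best, "intrinsic"


# Hypothetical translations for a single query (not real data).
frames = {1: "MKTAYIAKQR*", 2: "*RLHTSPNK", 3: "EDCI*HRQT",
          -1: "LL*", -2: "SA*GV", -3: "*"}
print(choose_frame(frames, blastx_frame=-2))  # (-2, 'blastx')
print(choose_frame(frames))                   # (1, 'intrinsic')
```

Returning the source of the decision alongside the frame makes it easy to see afterwards which predictions were backed by a BLASTX hit and which were not.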
A total of four files are generated. The first file (OrfPredictor.pep) is in FASTA format: the definition line contains the query identifier, the frame, and the beginning and end positions of the predicted coding region, and it is followed by the predicted peptide sequence. An 'FS' flag in the definition line indicates a frame shift in the query sequence that was detected by the BLASTX program. Only the most likely open reading frame and one coding region are predicted for a given sequence. The second file lists the query identifiers for which no coding region was predicted, i.e., the sequence contains only the 5' or 3' untranslated region. In response to users' requests, two additional output files are generated: (1) a file containing the 6-frame translation of the sequences; (2) a file containing the protein-coding DNA sequences extracted from the original sequences.
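For readers unfamiliar with 6-frame translation, the following Python sketch shows the general idea: the sequence is translated in three offsets on the forward strand and three on the reverse complement. It is not the server's implementation and does not reproduce its output format. The assumptions are the standard genetic code, frame labels +1..+3 and -1..-3, 'X' for codons containing ambiguous bases, and the function names, all chosen for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return a dict mapping frame (+1..+3, -1..-3) to its translation."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


for frame, peptide in six_frame_translation("ATGGCCTAA").items():
    print(frame, peptide)
```

For the toy input `ATGGCCTAA`, frame +1 reads ATG GCC TAA and yields `MA*`, while the other five frames give unrelated short peptides.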
Security of user submitted data
The data submitted to our server will be automatically deleted after they are processed. We do not keep data submitted by a user.
How to obtain your results
The results can be downloaded from the server web site. They are kept on the site for only 2 days after processing and are then deleted.
How to cite us
Min, X.J., Butler, G., Storms, R. and Tsang, A. OrfPredictor: predicting protein-coding regions in EST-derived sequences. Nucleic Acids Res., 2005, Web Server Issue W677-W680. Please include the server URL (http://proteomics.ysu.edu/tools/OrfPredictor.html) in your paper, as the original server site has been terminated.
Standalone OrfPredictor availability
The standalone version of the OrfPredictor software is available free of charge for academic use only. It is written in Perl and is easy to run on any operating system. Please contact Dr. Min in the YSU Bioinformatics Lab.
Comments and suggestions
Please contact Dr. Min in the YSU Bioinformatics Lab.