About ProtSecKB


Who are we?

The Protist Secretome and Subcellular Proteome KnowledgeBase (ProtSecKB) was created by Dr. Xiangjia (Jack) Min, Vamshi Amerishetty, and Brian Powell at Youngstown State University (YSU). Dr. Min collected the data and designed the prediction algorithms. Vamshi Amerishetty and Brian Powell, with assistance from John Meinken, implemented the database and built the website. The work was supported by a grant from the YSU University Research Council, a Research Professorship award, and STEM Dean's reassigned time to Dr. Min. Vamshi Amerishetty was supported by a graduate assistantship from the YSU Center for Applied Chemical Biology. This server is supported by YSU.


Our Motivation

Dramatic increases in the number of protein sequences and full proteomes have led to an increased need for computational tools that can automate analysis of proteins based on the protein sequence. One area where automated analysis has shown considerable promise is in the prediction of protein subcellular location. Many publicly available tools have been developed to analyze a protein sequence for information related to its subcellular location.

The core goal of this project is to combine information from multiple tools in order to produce aggregate predictions that are more accurate than the predictions made by any individual tool alone. Our website offers a single location where researchers can view our predictions alongside all of the data we have collected from the individual tools. In addition to making predictions, the knowledgebase also serves as a testing site where we can compare the prediction accuracies of different tools.


Our Process

The data for this website was retrieved from the UniProtKB 2016-02 release. The dataset includes 1,970,022 proteins (8,661 from UniProt/Swiss-Prot; 1,961,361 from UniProt/TrEMBL). We analyze each protein using SignalP3, SignalP4, TMHMM, Phobius, TargetP, and WoLF PSORT.
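
As a rough illustration of what combining calls from several tools can look like, the sketch below applies a simple majority vote. This is not the ProtSecKB prediction algorithm, which is not described on this page. The function name aggregate_calls, the min_votes threshold, and the location labels are all hypothetical placeholders, and the labels are not the actual output format of any of the tools listed above.

```python
from collections import Counter


def aggregate_calls(tool_calls, min_votes=2):
    """Return the location label(s) supported by the most tools.

    tool_calls maps a tool name to a location label, or to None when the
    tool made no call. The result is an empty list when no label reaches
    min_votes, and contains several labels when there is a tie.
    NOTE: hypothetical voting rule, not the ProtSecKB algorithm.
    """
    votes = Counter(label for label in tool_calls.values() if label is not None)
    if not votes:
        return []
    top = max(votes.values())
    if top < min_votes:
        return []
    return sorted(label for label, count in votes.items() if count == top)


# Placeholder labels for illustration only.
calls = {
    "SignalP4": "secreted",
    "Phobius": "secreted",
    "TargetP": "mitochondrial",
    "WoLF PSORT": "secreted",
}
print(aggregate_calls(calls))
```

A voting rule of this kind can return no label or more than one label, which is one way an aggregate predictor can end up with zero or multiple predictions for a protein.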


Further Reading

Min XJ. (2010) Evaluation of computational methods for secreted protein prediction in different eukaryotes. J. Proteomics Bioinform. 3:143-147.
Lum G, Min XJ. (2011) FunSecKB: the Fungal Secretome KnowledgeBase. Database - the Journal of Biological Databases and Curation. Vol. 2011. bar001. doi: 10.1093/database/bar001.
Meinken J, Min XJ. (2012) Computational prediction of protein subcellular locations in eukaryotes: an experience report. Computational Molecular Biology. 2(1): 1-7.
Lum G, Meinken J, Orr J, Frazier S, Min XJ. (2014) PlantSecKB: the Plant Secretome and Subcellular Proteome KnowledgeBase. Computational Molecular Biology. 4(1).
Meinken J, Asch DK, Neizer-Ashun KA, Chang GH, Cooper JR CR, Min XJ. (2014) FunSecKB2: a fungal protein subcellular location knowledgebase. Computational Molecular Biology. 4(7):1-17.

Using This Website

The home page has four different search options:

Search By ID - Use this option if you have a protein ID from UniProt or NCBI or you know the gene name of the protein you are interested in.

Search By Subcellular Location - Use this option to get a list of all proteins for a species that are predicted to be in a specific subcellular location. The species can be selected from a list of common species or entered manually.

Search By Protein Keywords or Function - Use this option to get a list of all proteins for a species that match a protein name, function or keyword. For example, to get a list of all proteins involved in amino acid transport, enter the search text "amino acid transport" (word order does not matter). The species can be selected from a list of common species or entered manually.

BLAST Search - This will take you to our BLAST search page where you can search against this database as well as several other databases we maintain.
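
The keyword search option above states that word order does not matter. One way to picture order-independent matching is sketched below: a record matches when every query term appears somewhere in its text. This is only an assumption about how such a search could behave, not the site's actual implementation, and matches_keywords is a hypothetical helper name.

```python
def matches_keywords(query, text):
    """Return True when every whitespace-separated term in query occurs in text.

    Matching is case-insensitive and ignores the order of the terms.
    NOTE: illustrative sketch, not the ProtSecKB search code.
    """
    terms = query.lower().split()
    haystack = text.lower()
    return all(term in haystack for term in terms)


description = "Transport protein for amino acid uptake"
print(matches_keywords("amino acid transport", description))
print(matches_keywords("transport acid amino", description))
```

Both calls give the same answer because the check is applied term by term, regardless of how the terms are ordered in the query.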

Searches:

Get a FASTA formatted list of search results:
When doing a search by subcellular location or protein keyword/function, use the "FASTA Download" button to get the results in FASTA format. The results can be copied and pasted into a text file if needed. For individual proteins, the FASTA formatted protein sequence is included at the bottom of the results page.
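
Once the FASTA results are saved to a text file, they can be read with a few lines of code. The sketch below is a minimal parser for standard FASTA text. The function name parse_fasta and the sample records are made up for illustration, and the header layout of the site's downloads is not assumed.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples.

    The header is the line after '>' and the sequence is the
    concatenation of the lines that follow it.
    """
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up example records, not real ProtSecKB entries.
sample = """>example_1 hypothetical protein
MKTLLVA
GGHW
>example_2 another hypothetical protein
MSSPQ
"""
for header, sequence in parse_fasta(sample):
    print(header, len(sequence))
```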

Download the search results as a text file:
When doing a search by subcellular location or protein keyword/function, use the "Search" button to get a paginated list of results. At the top of the page, you can click the link to "Download result set as a tab delimited text file".
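
A tab delimited download can be loaded with Python's standard csv module. The sketch below assumes the file begins with a header row. The column names and values in the sample are hypothetical, since this page does not list the actual columns.

```python
import csv
import io


def read_tsv(text):
    """Read tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical columns and values, for illustration only.
sample_tsv = (
    "accession\tspecies\tlocation\n"
    "EX0001\tExample species\tsecreted\n"
    "EX0002\tExample species\tmitochondrial\n"
)
rows = read_tsv(sample_tsv)
print(len(rows))
print(rows[0]["location"])
```

For a downloaded file, the same reader can be applied to an open file handle instead of io.StringIO.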

Get the count of proteins in a search result set:
When doing a search by subcellular location or protein keyword/function, use the "Search" button to get a paginated list of results. The number of results returned along with a description of the search parameters will be included at the top of the page.

Results:

Get our prediction for subcellular location:
The results page contains a summary section at the top and a details section at the bottom. Our prediction can be found in the summary section under "Predicted Subcellular Location(s)". Note that our prediction algorithms can sometimes produce no prediction or more than one prediction. The logic behind each prediction is shown next to it.

Get results from individual computational tools:
All of the data we collected from the individual computational tools is included in the details section on the results page.

Get our annotated data:
When available, our curated annotation will be included at the bottom of the details section on the results page. However, most proteins do not have local curated annotations. UniProt annotations for subcellular location are included in the summary table on the results page when available. If you want to see the supporting reference for a UniProt annotation, click the UniProt AC value to view that entry in the UniProtKB.

Submit an Annotation:

This database accepts public annotations of subcellular location that are based on experimental evidence. Submissions will be added to the database after being reviewed by our curator. We have an online form for submitting protein annotations one at a time. If you have a large number of proteins to submit, you can contact us directly.