Bioinformatics Tools

Saturday, September 15, 2007

Predicting Subcellular Localization of Proteins

Studying where proteins localize within the cell is of interest for several reasons. Below is a collection of tools available online for predicting the subcellular localization of proteins. The predictions come from programs trained for this purpose, and they help considerably when selecting a protein to work on. There are more such tools; I have listed some commonly used ones.
CELLO: CELLO is a multi-class SVM classification system. It uses four types of sequence coding schemes: the amino acid composition, the di-peptide composition, the partitioned amino acid composition and the sequence composition based on the physico-chemical properties of amino acids. The votes from these classifiers are combined, and the jury vote determines the final assignment. Reference: Yu CS, Lin CJ, Hwang JK. Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions. Protein Science 2004, 13:1402-1406.
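The first two coding schemes and the jury vote mentioned for CELLO can be illustrated with a minimal Python sketch. This is not CELLO's code: the function names are hypothetical, the 20-letter alphabet and its ordering are assumptions, and a simple majority vote stands in for whatever voting rule CELLO actually applies.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return a 20-element vector of amino acid frequencies."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-element vector of adjacent-pair frequencies."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DIPEPTIDES)
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def jury_vote(votes):
    """Pick the most frequent label; ties go to the label seen first.

    Assumption: a plain majority vote, used here only for illustration.
    """
    counts = Counter(votes)
    return max(counts, key=counts.get)


if __name__ == "__main__":
    seq = "MKKLLAAC"  # made-up sequence, for illustration only
    print(amino_acid_composition(seq)[:3])
    print(len(dipeptide_composition(seq)))
    print(jury_vote(["cytoplasm", "periplasm", "cytoplasm", "outer membrane"]))
```

In a real classifier, each feature vector would be fed to its own trained SVM, and the labels those SVMs return would be the votes passed to the final step.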
PSORTb: Based on a study last performed in 2010, PSORTb v3.0.2 is reported to be the most precise bacterial localization prediction tool available. PSORTb v3.0.2 has a number of improvements over PSORTb v2.0.4, and version 2 of PSORTb is still maintained. The server currently accepts one or more Gram-positive or Gram-negative bacterial sequences or archaeal sequences in FASTA format, either pasted into the submission form or uploaded as a file.


TMHMM Server: This server predicts transmembrane helices in proteins. Many proteins can be submitted at once in a single FASTA file, with at most 4000 proteins per submission. The server asks users to tick the 'One line per protein' option and to leave time between large submissions. Reference: S. Moller, M.D.R. Croning, R. Apweiler. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics, 17(7):646-653, July 2001.
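For a proteome with more than 4000 proteins, the FASTA file has to be split before submission. The following is a minimal sketch of one way to do that in plain Python; the function names are hypothetical, and the parser assumes a simple FASTA layout (a `>` header line followed by one or more sequence lines).

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batch_records(records, max_per_batch=4000):
    """Split records into consecutive batches of at most max_per_batch."""
    return [
        records[i:i + max_per_batch]
        for i in range(0, len(records), max_per_batch)
    ]


def to_fasta(records):
    """Write records back out as FASTA text, one sequence line per record."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


if __name__ == "__main__":
    # Made-up input, for illustration only.
    text = ">protA\nMKLV\nAAGG\n>protB\nMTTS\n"
    for n, batch in enumerate(batch_records(read_fasta(text), max_per_batch=1)):
        print(f"batch {n}:")
        print(to_fasta(batch), end="")
```

Each batch can then be saved to its own file and submitted separately, leaving time between submissions as the server requests.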

SignalP 3.0 Server: The SignalP 3.0 server predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. The method incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks and hidden Markov models. Reference: Olof Emanuelsson, Søren Brunak, Gunnar von Heijne, Henrik Nielsen. Locating proteins in the cell using TargetP, SignalP, and related tools. Nature Protocols 2, 953-971 (2007).


LOCtree: LOCtree can predict the subcellular localization and DNA-binding propensity of non-membrane proteins in non-plant and plant eukaryotes as well as prokaryotes. LOCtree classifies eukaryotic animal proteins into one of five subcellular classes, while plant proteins are classified into one of six classes and prokaryotic proteins into one of three classes. The novel feature of its hierarchical architecture is the ability to make intermediate localization class predictions at much higher accuracies. Another source of improvement is the use of 'noisy' training data: 'noisy' predictions from LOCKey (SWISS-PROT keyword based annotations) and LOCHom (annotations using sequence homology) are used to train the hierarchical SVMs.


PredictProtein: PredictProtein integrates feature prediction for secondary structure, solvent accessibility, transmembrane helices, globular regions, coiled-coil regions, structural switch regions, B-values, disorder regions, intra-residue contacts, protein-protein and protein-DNA binding sites, sub-cellular localization, domain boundaries, beta-barrels, cysteine bonds, metal binding sites and disulphide bridges.