Pred-TMBB (2004): http://bioinformatics.biol.uoa.gr/PRED-TMBB/input.jsp
Saturday, August 29, 2015
Pred-TMBB (2004): http://bioinformatics.biol.uoa.gr/PRED-TMBB/input.jsp
PRED-TMBB: a web server for predicting the topology of beta-barrel outer membrane proteins. The beta-barrel outer membrane proteins constitute one of the two known structural classes of membrane proteins. Whereas there are several different web-based predictors for alpha-helical membrane proteins, currently there is no freely available prediction method for beta-barrel membrane proteins, at least with an acceptable level of accuracy. We present here a web server (PRED-TMBB, http://bioinformatics.biol.uoa.gr/PRED-TMBB) which is capable of predicting the transmembrane strands and the topology of beta-barrel outer membrane proteins of Gram-negative bacteria. The method is based on a Hidden Markov Model, trained according to the Conditional Maximum Likelihood criterion. The model was retrained and the training set now includes 16 non-homologous outer membrane proteins with structures known at atomic resolution. The user may submit one sequence at a time and has the option of choosing between three different decoding methods. The server reports the predicted topology of a given protein, a score indicating the probability of the protein being an outer membrane beta-barrel protein, posterior probabilities for the transmembrane strand prediction and a graphical representation of the assumed position of the transmembrane strands with respect to the lipid bilayer. http://nar.oxfordjournals.org/content/32/suppl_2/W400.long
BOCTOPUS: improved topology prediction of transmembrane β barrel proteins
Transmembrane β barrel proteins (TMBs) are found in the outer membrane of Gram-negative bacteria, chloroplast and mitochondria. They play a major role in the translocation machinery, pore formation, membrane anchoring and ion exchange. TMBs are also promising targets for antimicrobial drugs and vaccines. Given the difficulty in membrane protein structure determination, computational methods to identify TMBs and predict the topology of TMBs are important. Results: Here, we present BOCTOPUS; an improved method for the topology prediction of TMBs by employing a combination of support vector machines (SVMs) and Hidden Markov Models (HMMs). The SVMs and HMMs account for local and global residue preferences, respectively. Based on a 10-fold cross-validation test, BOCTOPUS performs better than all existing methods, reaching a Q3 accuracy of 87%. Further, BOCTOPUS predicted the correct number of strands for 83% proteins in the dataset. BOCTOPUS might also help in reliable identification of TMBs by using it as an additional filter to methods specialized in this task. http://bioinformatics.oxfordjournals.org/content/28/4/516.long
4. BETAWARE (2013): http://www.biocomp.unibo.it/~savojard/betawarecl/
BETAWARE: a machine-learning tool to detect and predict transmembrane beta-barrel proteins in prokaryotes. The annotation of membrane proteins in proteomes is an important problem of Computational Biology, especially after the development of high-throughput techniques that allow fast and efficient genome sequencing. Among membrane proteins, transmembrane β-barrels (TMBBs) are poorly represented in the database of protein structures (PDB) and difficult to identify with experimental approaches. They are, however, extremely important, playing key roles in several cell functions and bacterial pathogenicity. TMBBs are included in the lipid bilayer with a β-barrel structure and are presently found in the outer membranes of Gram-negative bacteria, mitochondria and chloroplasts. Recently, we developed two top-performing methods based on machine-learning approaches to tackle both the detection of TMBBs in sets of proteins and the prediction of their topology. Here, we present our BETAWARE program that includes both approaches and can run as a standalone program on a linux-based computer to easily address in-home massive protein annotation or filtering. http://bioinformatics.oxfordjournals.org/content/29/4/504.abstract
TMBETA-NET: discrimination and prediction of membrane spanning β-strands in outer membrane proteins. We have developed a web-server, TMBETA-NET for discriminating outer membrane proteins and predicting their membrane spanning β-strand segments. The amino acid compositions of globular and outer membrane proteins have been systematically analyzed and a statistical method has been proposed for discriminating outer membrane proteins. The prediction of membrane spanning segments is mainly based on feed forward neural network and refined with β-strand length. Our program takes the amino acid sequence as input and displays the type of the protein along with membrane-spanning β-strand segments as a stretch of highlighted amino acid residues. Further, the probability of residues to be in transmembrane β-strand has been provided with a coloring scheme. We observed that outer membrane proteins were discriminated with an accuracy of 89% and their membrane spanning β-strand segments at an accuracy of 73% just from amino acid sequence information. The prediction server is available at http://psfs.cbrc.jp/tmbeta-net/
7. TMB-HUNT (2005): http://www.bioinformatics.leeds.ac.uk/betaBarrel/
TMB-Hunt: a web server to screen sequence sets for transmembrane β-barrel proteins. TMB-Hunt is a program that uses a modified k-nearest neighbour (k-NN) algorithm to classify protein sequences as transmembrane β-barrel (TMB) or non-TMB on the basis of whole sequence amino acid composition. By including differentially weighted amino acids, evolutionary information and by calibrating the scoring, a discrimination accuracy of 92.5% was achieved, as tested using a rigorous cross-validation procedure. The TMB-Hunt web server, available at www.bioinformatics.leeds.ac.uk/betaBarrel, allows screening of up to 10 000 sequences in a single query and provides results and key statistics in a simple colour coded format. http://nar.oxfordjournals.org/content/33/suppl_2/W188.long
8. TMBPro (2008): suite of specialized predictors for predicting secondary structure, beta-contacts, and tertiary structure of Transmembrane Beta-Barrel (TMB) proteins. http://tmbpro.ics.uci.edu/ TMBpro: secondary structure, β-contact and tertiary structure prediction of transmembrane β-barrel proteins. Transmembrane β-barrel (TMB) proteins are embedded in the outer membranes of mitochondria, Gram-negative bacteria and chloroplasts. These proteins perform critical functions, including active ion-transport and passive nutrient intake. Therefore, there is a need for accurate prediction of secondary and tertiary structure of TMB proteins. Traditional homology modeling methods, however, fail on most TMB proteins since very few non-homologous TMB structures have been determined. Yet, because TMB structures conform to specific construction rules that restrict the conformational space drastically, it should be possible for methods that do not depend on target-template homology to be applied successfully.Results: We develop a suite (TMBpro) of specialized predictors for predicting secondary structure (TMBpro-SS), β-contacts (TMBpro-CON) and tertiary structure (TMBpro-3D) of transmembrane β-barrel proteins. We compare our results to the recent state-of-the-art predictors transFold and PRED-TMBB using their respective benchmark datasets, and leave-one-out cross-validation. Using the transFold dataset TMBpro predicts secondary structure with per-residue accuracy (Q2) of 77.8%, a correlation coefficient of 0.54, and TMBpro predicts β-contacts with precision of 0.65 and recall of 0.67. Using the PRED-TMBB dataset, TMBpro predicts secondary structure with Q2 of 88.3% and a correlation coefficient of 0.75. All of these performance results exceed previously published results by 4% or more. Working with the PRED-TMBB dataset, TMBpro predicts the tertiary structure of transmembrane segments with RMSD <6.0 Å for 9 of 14 proteins. For 6 of 14 predictions, the RMSD is <5.0 Å, with a GDT_TS score greater than 60.0. http://bioinformatics.oxfordjournals.org/content/24/4/513.long
TMB-Hunt: a web server to screen sequence sets for transmembrane β-barrel proteins
TMB-Hunt is a program that uses a modified k-nearest neighbour (k-NN) algorithm to classify protein sequences as transmembrane β-barrel (TMB) or non-TMB on the basis of whole sequence amino acid composition. By including differentially weighted amino acids, evolutionary information and by calibrating the scoring, a discrimination accuracy of 92.5% was achieved, as tested using a rigorous cross-validation procedure. The TMB-Hunt web server, available at www.bioinformatics.leeds.ac.uk/betaBarrel, allows screening of up to 10 000 sequences in a single query and provides results and key statistics in a simple colour coded format. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1160145/
transFold: a web server for predicting the structure and residue contacts of transmembrane beta-barrels. Transmembrane β-barrel (TMB) proteins are embedded in the outer membrane of Gram-negative bacteria, mitochondria and chloroplasts. The cellular location and functional diversity of β-barrel outer membrane proteins makes them an important protein class. At the present time, very few non-homologous TMB structures have been determined by X-ray diffraction because of the experimental difficulty encountered in crystallizing transmembrane (TM) proteins. The transFold web server uses pairwise inter-strand residue statistical potentials derived from globular (non-outer-membrane) proteins to predict the supersecondary structure of TMB. Unlike all previous approaches, transFold does not use machine learning methods such as hidden Markov models or neural networks; instead, transFold employs multi-tape S-attribute grammars to describe all potential conformations, and then applies dynamic programming to determine the global minimum energy supersecondary structure. The transFold web server not only predicts secondary structure and TMB topology, but is the only method which additionally predicts the side-chain orientation of transmembrane β-strand residues, inter-strand residue contacts and TM β-strand inclination with respect to the membrane. The program transFold currently outperforms all other methods for accuracy of β-barrel structure prediction. Available at http://bioinformatics.bc.edu/clotelab/transFold. http://nar.oxfordjournals.org/content/34/suppl_2/W189.full
BOMP: a program to predict integral β-barrel outer membrane proteins encoded within genomes of Gram-negative bacteria. This work describes the development of a program that predicts whether or not a polypeptide sequence from a Gram-negative bacterium is an integral β-barrel outer membrane protein. The program, called the β-barrel Outer Membrane protein Predictor (BOMP), is based on two separate components to recognize integral β-barrel proteins. The first component is a C-terminal pattern typical of many integral β-barrel proteins. The second component calculates an integral β-barrel score of the sequence based on the extent to which the sequence contains stretches of amino acids typical of transmembrane β-strands. The precision of the predictions was found to be 80% with a recall of 88% when tested on the proteins with SwissProt annotated subcellular localization in Escherichia coli K 12 (788 sequences) and Salmonella typhimurium (366 sequences). When tested on the predicted proteome of E.coli, BOMP found 103 of a total of 4346 polypeptide sequences to be possible integral β-barrel proteins. Of these, 36 were found by BLAST to lack similarity (E-value score < 1e−10) to proteins with annotated subcellular localization in SwissProt. BOMP predicted the content of integral β-barrels per predicted proteome of 10 different bacteria to range from 1.8 to 3%. BOMP is available at http://www.bioinfo.no/tools/bomp http://nar.oxfordjournals.org/content/32/suppl_2/W394.full
TMBETA-NET: discrimination and prediction of membrane spanning beta-strands in outer membrane proteins. We have developed a web-server, TMBETA-NET for discriminating outer membrane proteins and predicting their membrane spanning beta-strand segments. The amino acid compositions of globular and outer membrane proteins have been systematically analyzed and a statistical method has been proposed for discriminating outer membrane proteins. The prediction of membrane spanning segments is mainly based on feed forward neural network and refined with beta-strand length. Our program takes the amino acid sequence as input and displays the type of the protein along with membrane-spanning beta-strand segments as a stretch of highlighted amino acid residues. Further, the probability of residues to be in transmembrane beta-strand has been provided with a coloring scheme. We observed that outer membrane proteins were discriminated with an accuracy of 89% and their membrane spanning beta-strand segments at an accuracy of 73% just from amino acid sequence information. The prediction server is available at http://psfs.cbrc.jp/tmbeta-net/. http://nar.oxfordjournals.org/content/33/suppl_2/W164.long
14. TMBB-DB (2012): http://beta-barrel.tulane.edu/index.html
TMBB-DB: a transmembrane β-barrel proteome database. We previously reported the development of a highly accurate statistical algorithm for identifying β-barrel outer membrane proteins or transmembrane β-barrels (TMBBs), from genomic sequence data of Gram-negative bacteria (Freeman,T.C. and Wimley,W.C. (2010) Bioinformatics, 26, 1965–1974). We have now applied this identification algorithm to all available Gram-negative bacterial genomes (over 600 chromosomes) and have constructed a publicly available, searchable, up-to-date, database of all proteins in these genomes. For each protein in the database, there is information on (i) β-barrel membrane protein probability for identification of β-barrels, (ii) β-strand and β-hairpin propensity for structure and topology prediction, (iii) signal sequence score because most TMBBs are secreted through the inner membrane translocon and, thus, have a signal sequence, and (iv) transmembrane α-helix predictions, for reducing false positive predictions. This information is sufficient for the accurate identification of most β-barrel membrane proteins in these genomes. In the database there are nearly 50 000 predicted TMBBs (out of 1.9 million total putative proteins). Of those, more than 15 000 are ‘hypothetical’ or ‘putative’ proteins, not previously identified as TMBBs. This wealth of genomic information is not available anywhere else. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3463127/
Thursday, July 16, 2015
Another recent publication from our lab.
Sharma, A. K., Gupta, A., Kumar, S., Dhakan, D. B., & Sharma, V. K. (2015). Woods: A fast and accurate functional annotator and classifier of genomic and metagenomic sequences. Genomics.
Functional annotation of the gigantic metagenomic data is one of the major time-consuming and computationally demanding tasks, which is currently a bottleneck for the efficient analysis. The commonly used homology-based methods to functionally annotate and classify proteins are extremely slow. Therefore, to achieve faster and accurate functional annotation, we have developed an orthology-based functional classifier 'Woods' by using a combination of machine learning and similarity-based approaches. Woods displayed a precision of 98.79% on independent genomic dataset, 96.66% on simulated metagenomic dataset and >97% on two real metagenomic datasets. In addition, it performed >87 times faster than BLAST on the two real metagenomic datasets. Woods can be used as a highly efficient and accurate classifier with high-throughput capability which facilitates its usability on large metagenomic datasets.
The Woods web server is freely accessible at http://metagenomics.iiserb.ac.in/woods/index.php and http://metabiosys.iiserb.ac.in/woods/index.php. The standalone version of Woods can be downloaded from the above web servers and usage instructions are provided in Text S1 and also in the Tutorial section of the web server.
For complete article click here: Woods: A fast and accurate functional annotator and classifier of genomic and metagenomic sequences
For further information and quarries please directly contact Ashok K. Sharma (firstname.lastname@example.org).
Thursday, February 5, 2015
Another work from our group which came sometime back. Please do write to (email@example.com) in case of any problem. Comments are welcome.
The identification of virulent proteins in any de-novo sequenced genome is useful in estimating its pathogenic ability and understanding the mechanism of pathogenesis. Similarly, the identification of such proteins could be valuable in comparing the metagenome of healthy and diseased individuals and estimating the proportion of pathogenic species. However, the common challenge in both the above tasks is the identification of virulent proteins since a significant proportion of genomic and metagenomic proteins are novel and yet unannotated. The currently available tools which carry out the identification of virulent proteins provide limited accuracy and cannot be used on large datasets.
Therefore, we have developed an MP3 standalone tool and web server for the prediction of pathogenic proteins in both genomic and metagenomic datasets. MP3 is developed using an integrated Support Vector Machine (SVM) and Hidden Markov Model (HMM) approach to carry out highly fast, sensitive and accurate prediction of pathogenic proteins. It displayed Sensitivity, Specificity, MCC and accuracy values of 92%, 100%, 0.92 and 96%, respectively, on blind dataset constructed using complete proteins. On the two metagenomic blind datasets (Blind A: 51–100 amino acids and Blind B: 30–50 amino acids), it displayed Sensitivity, Specificity, MCC and accuracy values of 82.39%, 97.86%, 0.80 and 89.32% for Blind A and 71.60%, 94.48%, 0.67 and 81.86% for Blind B, respectively. In addition, the performance of MP3 was validated on selected bacterial genomic and real metagenomic datasets.
To our knowledge, MP3 is the only program that specializes in fast and accurate identification of partial pathogenic proteins predicted from short (100–150 bp) metagenomic reads and also performs exceptionally well on complete protein sequences. MP3 is publicly available at http://metagenomics.iiserb.ac.in/mp3/index.php.