Bioinformatics has evolved into a great tool for molecular biologists. Various tools are available that save time when analyzing biological material, be it DNA, RNA, proteins, etc. I wish to list here a few of the commonly used tools. Please send me suggestions to improve the content. If you like or dislike something, let me know; your input matters. Contact me: drsanjivk[at]gmail[dot]com
Friday, November 30, 2012
Sunday, March 18, 2012
Research resources for tuberculosis
TubercuList: http://tuberculist.epfl.ch/
The TubercuList knowledge base integrates genome details, protein information, drug
and transcriptome data, mutant and operon annotation, bibliography, structural
views and comparative genomics, in a structured manner required for the
rational development of new diagnostic, therapeutic and prophylactic measures
against tuberculosis. With the means of expert curation and continuous updates,
we deliver a broad view of the Mycobacterium tuberculosis genome.
TB Database: http://www.tbdb.org/
Tuberculosis
(TB) is a public health challenge of paramount importance. Control of TB will
require a multifaceted approach integrating efficient public health
interventions with the discovery and use of new vaccines and drugs. TBDatabase
(TBDB) makes available the tools and resources available at the Stanford Microarray Database and the Broad Institute. Anyone is welcome to access the
published data available on the TBDB site without signing in. Some data in TBDB
are unpublished and can therefore only be accessed by the authors and their
collaborators after they sign in. The "Access Policies" page provides more
information about TBDB accounts. A grant from the Bill & Melinda Gates Foundation has enabled
us to create an integrated software platform for tuberculosis drug discovery
and research.
webTB.org: http://www.webtb.org/
A resource for Mycobacterium tuberculosis researchers, brought to you by the TB Structural Genomics Consortium, with many tools:
Gene expression correlation display: This server presents the pairwise gene expression correlation for two or more genes. For a set of genes you can get all the pairwise correlations, express them as a matrix and display it graphically.
Operon Search: Search for operons and directions in the TB genome. Known operons are listed by name. These data will be expanded as more sources are found.
BLAST the TB genome: BLAST a sequence against the TB genome and other NCBI databases.
Target Explorer: Target Explorer is intended to be an interactive web site that allows researchers in the tuberculosis research community to experiment with different target selection criteria and explore alternative ways of prioritizing gene targets for experimentation (crystallization, structure solution, high-throughput screening for inhibitor discovery, etc.).
Mycobacterial Genome DataBase: A database of genomes of mycobacterial strains sequenced in the lab of James C. Sacchettini at Texas A&M University. It provides access to sequence data (including coverage statistics) and comparison of polymorphisms among various strains of tuberculosis, with a focus on drug resistance. The sequencing is done on an Illumina Genome Analyzer II (short reads). The data were analyzed using customized sequence-assembly methods written by Tom Ioerger and his group at Texas A&M.
Genome browser: Graphically scan the entire TB genome for information on each ORF, with indications of any predicted operons and links back to the quick search page for more gene information.
ORF Progress search tool: Search the status and progress of MTb ORFs that are targeted and pursued by consortium members.
Structure Gallery: See the protein structures determined by members of the consortium.
Structure Summary pages: A front-page portal to the structure information on known TB proteins. All other WebTB servers can be accessed from this page, including the Gallery and Jmol viewer.
MTBreg Database: A database of proteins up- and down-regulated in Mycobacterium tuberculosis grown under conditions mimicking infection, as well as information on proteins that are regulated by selected transcription factors or other regulatory proteins.
TBDB Legacy Tools: Legacy tools from the former TBSGC site to search and browse the TB genome.
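To make the idea of a pairwise gene expression correlation matrix concrete, here is a minimal sketch in Python. It assumes Pearson correlation (the statistic webTB actually uses is not stated here), and the gene names and expression values are hypothetical, not real data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(profiles):
    """Return sorted gene names and the symmetric matrix of pairwise correlations."""
    genes = sorted(profiles)
    matrix = [[pearson(profiles[g], profiles[h]) for h in genes] for g in genes]
    return genes, matrix

# Hypothetical expression values across four conditions (not real data).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
genes, matrix = correlation_matrix(profiles)
for gene, row in zip(genes, matrix):
    print(gene, " ".join(f"{value:+.2f}" for value in row))
```

Each row of the printed matrix lists how strongly one gene's profile tracks every other gene's profile; values near +1 or -1 suggest co-regulation or opposite regulation under the sampled conditions.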
TBDreamDB: http://www.tbdreamdb.com/index.html
Exciting news at TBDreamDB! The database is currently undergoing a complete
redesign and data evaluation. It will be converted to a fully
searchable relational database with a new-look front-end website as well. We
hope to have the beta design up and running towards the end of 2011 or early 2012.
Any feedback would be great; simply email the curators. Also
please keep sending in any errors you find in the dataset. The data are being
evaluated and corrected as they are inserted into the new database format.
-Your emails are much appreciated!
Wellcome Trust Sanger Institute: http://www.sanger.ac.uk/resources/downloads/bacteria/mycobacterium.html
The Wellcome Trust and DEFRA have funded the Sanger Institute to sequence reference
genomes for Mycobacterium africanum.
Mycobacterium is a genus within the order Actinomycetales that
comprises a large number of well characterised species, several of which are
associated with human and animal disease such as tuberculosis and leprosy.
MGDD: http://mirna.jnu.ac.in/mgdd/index.html
(Mycobacterial Genome Divergence Database) is a repository of genetic
differences among different strains and species of organisms belonging to Mycobacterium tuberculosis complex. The
differences are based on comparison of user chosen organisms. The query
sequences are used to compare against subject sequences. The users can also
choose the type of genetic divergence, that is, SNPs (Single Nucleotide
Polymorphism), insertions, repeat expansion and divergent sequences that they
are interested in. The results from a specific region (based on boundary
defined by nucleotide sequence) or a specific gene can be displayed based on
user's choice. Presently, the database has precomputed analysis from three
different fully sequenced genomes of this complex. These are Mycobacterium
tuberculosis H37Rv, Mycobacterium tuberculosis CDC1551 and Mycobacterium bovis
AF2122/97. In future it will be updated with more strains and species as fully
sequenced genomes become available.
MTBreg:
http://www.doe-mbi.ucla.edu/Services/MTBreg/
Proteins up- and down- regulated in Mycobacterium tuberculosis grown
under conditions mimicking infection are included in this database. It also
includes information on proteins that are regulated by selected transcription
factors or other regulatory proteins. The literature data provided here are
complementary to the databases provided by Michael Strong that include recent TB
computational functional linkages and the Prolinks Database by Peter
Bowers.
MycoperonDB: http://cdfd.org.in/mycoperondb/home.html is a database of computationally predicted operons and transcriptional units of mycobacteria. MycoperonDB is set up to provide operon and transcriptional unit information for different mycobacterial species in one place. At present, the database covers five mycobacterial species and consists of an in silico model of the operon organization of 18,053 genes. The operon information provides a basis and a reference for a comprehensive understanding of how transcriptional control is encoded in the genome. The database has a user-friendly web interface that takes a simple sequence, gene name or ORF ID as input and reports the transcription unit and operon associated with the query.
Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
TB Drug Target Database: http://www.bioinformatics.org/tbdtdb/
TB Drug Target Database contains information on antitubercular drugs and
their target proteins for the treatment of TB. Information is available on the
drugs and other possible inhibitors, including their structural details, along
with the analyses carried out on the target proteins.
MIRU-VNTRplus web application: http://www.miru-vntrplus.org/MIRU/index.faces
Molecular typing of bacteria from the Mycobacterium tuberculosis complex
(MTBC) is essential for epidemiological purposes such as investigating the
spreading of specific genotypes. Recently, mycobacterial interspersed
repetitive units (MIRU) typing has become an important method, as it allows
high-throughput, discriminatory and reproducible analysis of clinical isolates.
MIRU is an MTBC-specific name for a multiple-locus VNTR [variable number of
tandem repeats] analysis (MLVA) bacterial typing scheme. Because of its
portable data format, MIRU typing has the potential to be a versatile tool for
individual strain identification based on large reference databases. However,
specialized bioinformatic web tools to analyze MIRU data and public reference
databases are not available. To meet this need, a collection of 186 strains
representing the major MTBC lineages was used for implementing a web server,
MIRU-VNTRplus (http://www.miru-vntrplus.org/). For each strain, species,
lineage, and epidemiologic information were stored together with copy numbers of
24 MIRU loci, spoligotyping patterns, regions of difference (RD) profiles,
single nucleotide polymorphisms (SNPs), susceptibility data, and IS6110 RFLP
fingerprint images. Via the freely accessible MIRU-VNTRplus service users can
compare their strain(s) with the reference strains for the assignment of MTBC
species, lineages, and genotypes. For easier scientific communication a
universal expanding nomenclature (MLVA MtbC15-9) to name different MIRU
genotypes is maintained at the server. Comparisons can be based on MIRU-,
spoligo-, RD-, SNP-, susceptibility-typing data, or by a combination of
different data types. Several distance coefficients are available, including
Jaccard's and categorical. Based upon the respective distance matrix, a
dendrogram can be calculated using UPGMA or neighbor-joining clustering
algorithms. The resulting trees may be exported in various data formats.
MIRU-VNTRplus also provides functions for users to analyze their own strains
without interrogating the reference database. Extensive documentation (manual
and tutorials) of the service is available to make best use of all features.
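The distance-matrix-then-dendrogram workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MIRU-VNTRplus implementation: the categorical distance is taken to be the fraction of loci whose copy numbers differ, the strain names and four-locus profiles are hypothetical (the real scheme uses up to 24 loci), and the tree is returned as nested tuples rather than in an export format.

```python
from itertools import combinations

def categorical_distance(a, b):
    """Fraction of loci at which two repeat-copy-number profiles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma(profiles):
    """Cluster strains with UPGMA and return the tree as nested tuples."""
    names = list(profiles)
    # Cluster id -> (subtree, number of strains in the cluster).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances are keyed by (smaller id, larger id).
    dist = {
        (i, j): categorical_distance(profiles[names[i]], profiles[names[j]])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        for k in clusters:
            # UPGMA: size-weighted average of the two old distances.
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del dist[(i, j)]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree

# Hypothetical copy numbers at four MIRU loci (not real typing data).
profiles = {
    "strainA": [2, 3, 4, 2],
    "strainB": [2, 3, 4, 3],
    "strainC": [5, 1, 2, 2],
    "strainD": [5, 1, 2, 3],
}
print(upgma(profiles))
```

Strains that share most of their copy numbers end up in the same subtree; swapping `categorical_distance` for another coefficient changes the matrix but not the clustering code.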
Web resources for Mycobacterium tuberculosis
It is remarkable how robustly Mycobacterium tuberculosis, the causative agent
of tuberculosis (TB) and still a major killer of humans, has co-evolved with
human civilization. The war against tuberculosis is on, and here we try to
bring the major web resources for the disease and the organism together in a
single place. This is an open article that will be updated from time to time
to reflect recent developments in the field, so that you as a user need not
spend much time on data collection and integration. Hope this helps.
General information about tuberculosis:
Wikipedia page on tuberculosis: http://en.wikipedia.org/wiki/Tuberculosis
Tuberculosis, MTB, or TB (short for tubercle bacillus) is a common, and in many cases
lethal, infectious disease caused by various strains of mycobacteria,
usually Mycobacterium tuberculosis.
Tuberculosis usually attacks the lungs but can also affect other parts of the body. It is spread
through the air when people who have an active MTB infection cough, sneeze, or otherwise
transmit their saliva through the air. Most infections in humans result in an asymptomatic,
latent infection, and about one in ten latent infections eventually progress to
active disease, which, if left untreated, kills more than 50% of those
infected.
Wikipedia page on M. tuberculosis: Mycobacterium
tuberculosis (MTB) is a pathogenic bacterial species in the genus Mycobacterium
and the causative agent of most cases of tuberculosis
(TB). First discovered in 1882 by Robert Koch,
M. tuberculosis has an unusual, waxy coating on its cell surface
(primarily mycolic acid), which makes the cells impervious to Gram staining,
so acid-fast
detection techniques are used, instead. The physiology of M. tuberculosis
is highly aerobic and requires high levels of oxygen.
Primarily a pathogen of the mammalian respiratory system, MTB infects the lungs. The
most frequently used diagnostic methods for TB are the tuberculin skin test,
acid-fast stain, and chest radiographs. The M. tuberculosis genome was sequenced in
1998.
Centers for Disease Control and Prevention (CDC) page on tuberculosis:
Tuberculosis (TB) is a disease caused by a bacterium called Mycobacterium
tuberculosis. The bacteria usually attack the lungs, but TB bacteria can
attack any part of the body such as the kidney, spine, and brain. If not
treated properly, TB disease can be fatal. TB disease was once the leading
cause of death in the United States.
Global TB database: http://www.who.int/tb/country/global_tb_database/en/index2.html
Access the database to make data queries,
interactive maps and country profiles. Country profiles: Country
profiles provide key indicators, notification and treatment outcome data, and
budget and financing graphs (for a subset of countries, including all
high-burden countries). For high-burden countries, there is also a brief
overview of TB control and epidemiology in the country, and a summary of
achievements, challenges and planned activities related to implementing the
first 5 components of the Stop TB Strategy.
U.S. National Library of Medicine resource on tuberculosis: Has resources on various aspects of the
disease, from the basics to the advanced, including disease data.
WHO page on tuberculosis: Too many people have undetected TB
for too long; late detection of TB increases their risk of transmitting the
disease to others, having poor health outcomes, or that they and their family
will suffer distress and economic hardship. WHO has produced an overview of
approaches, guidelines and tools to improve early detection of TB. It presents
a framework to assess barriers for early detection and helps identify
appropriate actions.
The Stop TB Partnership: The Stop TB Partnership is leading
the way to a world without tuberculosis (TB), a disease that is curable but
still kills three people every minute. Founded in 2001, the Partnership's
mission is to serve every person who is vulnerable to TB and ensure that
high-quality treatment is available to all who need it. Together our nearly
1000 partners are a collective force that is transforming the fight against TB
in more than 100 countries. They include international and technical
organizations, government programmes, research and funding agencies,
foundations, NGOs, civil society and community groups and the private sector. We
operate through a secretariat hosted by the World Health Organization (WHO) in
Geneva, Switzerland and seven working groups whose role is to accelerate
progress on access to TB diagnosis and treatment; research and development for
new TB diagnostics, drugs and vaccines; and tackling drug-resistant and
HIV-associated TB. The secretariat is governed by a coordinating board that
sets strategic direction for the global fight against TB.
Tuesday, March 6, 2012
Binding free energy estimation
X-Score:
http://sw16.im.med.umich.edu/software/xtool/
is basically a "scoring
function", which computes the binding affinities of the given ligand
molecules to their target protein. It can be applied to structure-based drug
design studies in combination with molecular docking or de novo
structure generation programs. X-Score is developed by Dr. Renxiao Wang in Dr.
Shaomeng Wang's group at the Department of Internal Medicine, University of
Michigan Medical School. The first paper reporting X-Score was published in the
Journal of Computer-Aided Molecular Design, 16: 11–26, 2002. Note that
X-Score was formerly known as X-CScore for a short while. To learn more about
the X-Score program, please read the X-Score
on-line manual. X-Score is released to the public for free. The latest
release is X-Score v1.2. To download it, you go through a license agreement
and fill in some registration information on the X-Score site. Once the
developers have received your signed license agreement, they send
you instructions on how to log on to their server and download the X-Score package.
The X-Score v1.2 package includes the program (executable and source code),
user manual, examples, references and the protein-ligand complex data set
originally used for developing X-Score.
eHiTS 2009 Binding Affinity Prediction: http://www.simbiosys.ca/ehits/ehits_score.html
eHiTS has a novel scoring function that takes
advantage of temperature factor information provided in PDB files to give a
more complete picture of interactions. All atoms in a PDB file have a
temperature factor (B) associated with them. This temperature factor
indicates how much the atom varies from its mean position. Some atom
positions are very precisely defined while others vary greatly, and this has a very
strong influence on the weight that should be assigned to the position. The
novel approach in eHiTS uses the probability of the atom position during the
statistic collection to create a statistically derived empirical scoring
function. The eHiTS scoring function is
smooth and accurately represents a wide variety of problems at hand. One of
the most recent studies with eHiTS Score 2009 was done using the PDBBind-2008
dataset; the eHiTS page shows the correlation of eHiTS Score with the
experimental binding affinity.
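To see what the temperature factor looks like in practice, the following Python sketch reads B-factors from PDB ATOM records and converts them to a root-mean-square displacement using the standard isotropic relation B = 8π²⟨u²⟩. It does not reproduce the eHiTS scoring function, whose weighting scheme is not described here; `atom_record` is a hypothetical helper that only builds demo records, and the coordinates and B values are made up.

```python
from math import pi, sqrt

def atom_record(serial, name, res, chain, resseq, x, y, z, occ, b):
    """Format one fixed-width PDB ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}")

def b_factors(pdb_lines):
    """Yield (atom name, B-factor) for each ATOM/HETATM record."""
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 13-16 hold the atom name, columns 61-66 the B-factor.
            yield line[12:16].strip(), float(line[60:66])

def rms_displacement(b):
    """RMS displacement (angstroms) from an isotropic B-factor, B = 8*pi^2*<u^2>."""
    return sqrt(b / (8 * pi ** 2))

# Made-up records for illustration only.
records = [
    atom_record(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67),
    atom_record(2, " CA", "MET", "A", 1, 26.266, 25.413, 2.842, 1.00, 39.48),
]
for name, b in b_factors(records):
    print(f"{name:>3s}  B = {b:6.2f}  rms displacement = {rms_displacement(b):.2f} A")
```

An atom with a large B-factor has a poorly defined position, which is why a scoring function may want to give its contacts less weight than those of a well-ordered atom.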
DrugScore-Online (DSX):
http://pc1664.pharmazie.uni-marburg.de/drugscore/
DSXONLINE is a web-based
user interface for the knowledge-based scoring function DSX.
DSXONLINE enables you to score (putative) protein-ligand complexes
of your interest, to browse and download the scoring results, and to visualize
the per-atom score contributions (see section Visualization).
DSX: DSX pair potentials are derived in analogy to the
DrugScore formalism developed by Gohlke et al. However, another set of atom
types is used and contact types are clustered to circumvent problems with the
reference state. Torsion potentials and solvent accessible surface ratio potentials
are derived using the same formalism. For more details see the upcoming
publication, which is currently in preparation. For consistency, DSX
always assigns its own atom types and hydrogens are not regarded. If you have
ligand poses from GOLD docking where water molecules were included it is
possible to consider the corresponding ON-marked waters in the solutions file.
Please note that there are even more options (like considering solutions from a
docking with flexible receptor residues) available in the DSX standalone
version, which will be freely available after publication. Visualization
of the per-atom score contributions: The visualization of the per-atom
score contributions is an intuitive way to learn about differences between
putative ligand geometries, the effects of scaffold modifications or about the
importance of certain binding regions.
BAPPL server:
http://www.scfbio-iitd.res.in/software/drugdesign/bappl.jsp
The Binding Affinity Prediction of Protein-Ligand (BAPPL) server computes the
binding free energy of a non-metallo protein-ligand complex using an all-atom
energy-based empirical scoring function. The BAPPL server provides two methods as
options:
Method 1: Input should be an energy-minimized protein-ligand complex with hydrogens added and with protonation states, partial atomic charges and van der Waals parameters (R* and ε) assigned for each atom. The server directly computes the binding affinity of the complex using the assigned parameters.
Method 2: Input should be an energy-minimized protein-ligand complex with hydrogens added and protonation states assigned. The net charge on the ligand should be specified. The server derives the partial atomic charges of the ligand using the AM1-BCC procedure and takes van der Waals parameters from the GAFF force field. The Cornell et al. force field is used to assign partial atomic charges and van der Waals parameters for the protein.
For format specifications on the input for either method, please refer to the README file.
PreDDICTA:
http://www.scfbio-iitd.res.in/software/drugdesign/preddicta.jsp
employs an all-atom energy-based function for computing the binding affinity of
a DNA oligomer with a non-covalently bound drug. The function has been validated
against experimental binding free energies, ΔG°bind, and the change in
melting temperature of the DNA oligomer upon drug binding, ΔTm, for
50 DNA-drug complexes. The DNA-drug complex dataset is available from the
PreDDICTA site. DNA is an important anticancer/antibiotic target, and PreDDICTA
can be employed to aid and expedite rational drug design attempts for DNA.
How to use PreDDICTA:
1. Tool 1 incorporates the PreDDICTA energy function, which calculates the electrostatics, van der Waals, rotational and translational entropy and hydration free energy change for the DNA-drug complex. These are summed to yield the total calculated binding energy, which is converted to the binding free energy and ΔTm using reported relations. Input for this tool is a PDB file for any DNA-minor groove binder complex, conforming to the standard PDB format.
2. Tool 2 converts any number input as ΔTm to the corresponding expected binding free energy, using the reported relation between these two quantities.
3. Tool 3 converts any number input as binding free energy to the corresponding expected ΔTm value, using the reported relation between these two quantities.
PharmaGist: http://bioinfo3d.cs.tau.ac.il/pharma/about.html
Predicting molecular interactions
is a major goal in rational drug design. Pharmacophore, which is the spatial
arrangement of features that is essential for a molecule to interact with a
specific target receptor, is important for achieving this goal. PharmaGist is a
freely available web server for pharmacophore detection. The employed method is
ligand based. It does not require the structure of the target receptor. Instead,
the input is a set of structures of drug-like molecules that are known to bind
to the receptor. We compute candidate pharmacophores by multiple flexible
alignments of the input ligands. The main innovation of this approach is that
the flexibility of the input ligands is handled explicitly and in a deterministic
manner within the alignment process. The method is highly efficient: a
typical run with up to 32 drug-like molecules takes seconds to a few minutes on
a standard PC. Another important characteristic of the method is the capability
of detecting pharmacophores shared by different subsets of input molecules.
This capability is a key advantage when the ligands belong to different binding
modes or when the input contains outliers. The download version includes
virtual screening capability. The performance of PharmaGist for virtual
screening was successfully evaluated on a commonly used data set of G-Protein
Coupled Receptor alpha1A. Additionally, a large-scale evaluation using the DUD
(directory of useful decoys) data set was performed. DUD contains 2950 active
ligands for 40 different receptors, with 36 decoy compounds for each active
ligand. PharmaGist enrichment rates are comparable with other state-of-the-art
tools for virtual screening.
IC50-to-Ki converter: http://botdb.abcc.ncifcrf.gov/toxin/kiConverter.jsp
The IC50-to-Ki converter is a web-server tool that computes Ki values from
experimentally determined IC50 values for inhibitors of enzymes that obey
classic Michaelis-Menten kinetics and of binding reactions between
macromolecules (e.g. proteins, polynucleic acids) and ligands. This converter was developed to
enable end users to help gauge the quality of the underlying assumptions used
in these calculations which depend on the type of mechanism of inhibitor action
and the concentrations of the interacting molecular species. Additional
calculations are performed for nonclassical, tightly bound inhibitors of
enzyme-substrate or of macromolecule-ligand systems in which free, rather than
total concentrations of the reacting species are required. Required
user-defined input values include the total enzyme (or another target molecule)
and substrate (or ligand) concentrations, the Km of the
enzyme-substrate (or the Kd of the target-ligand) reaction,
and the IC50 value. Assumptions and caveats for these
calculations are discussed along with examples taken from the literature. The
host database for this converter contains kinetic constants and other data for
inhibitors of the proteolytic clostridial neurotoxins (http://botdb.abcc.ncifcrf.gov/toxin/kiConverter.jsp).
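For the simplest case, a competitive inhibitor of an enzyme obeying classic Michaelis-Menten kinetics, the conversion is usually done with the Cheng-Prusoff relation, Ki = IC50 / (1 + [S]/Km). The Python sketch below covers only that case and is not the web server's own code; the tight-binding corrections mentioned above, which need free rather than total concentrations, are not handled, and the assay values are hypothetical.

```python
def ki_competitive(ic50, substrate, km):
    """Cheng-Prusoff estimate of Ki for a competitive inhibitor.

    All three arguments must be given in the same concentration unit.
    """
    if ic50 <= 0 or substrate < 0 or km <= 0:
        raise ValueError("ic50 and km must be positive, substrate non-negative")
    return ic50 / (1.0 + substrate / km)

# Hypothetical assay: IC50 = 10 uM measured at [S] = 50 uM with Km = 25 uM.
ki = ki_competitive(ic50=10.0, substrate=50.0, km=25.0)
print(f"Ki = {ki:.2f} uM")
```

The formula makes the dependence on assay conditions explicit: the same inhibitor gives a higher IC50 when the substrate concentration is raised, while Ki stays the same.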
Sunday, February 26, 2012
In-silico Binding Site Prediction in Proteins
CASTp: http://sts.bioengr.uic.edu/castp/
Computed Atlas of Surface Topography
of proteins (CASTp) provides an online resource for locating, delineating and
measuring concave surface regions on three-dimensional structures of proteins.
These include pockets located on protein surfaces and voids buried in the
interior of proteins. The measurement includes the area and volume of pocket or
void by solvent accessible surface model (Richards' surface) and by molecular
surface model (Connolly's surface), all calculated analytically. CASTp can be
used to study surface features and functional regions of proteins. CASTp
includes a graphical user interface, flexible interactive visualization, as
well as on-the-fly calculation for user uploaded structures. CASTp is updated
daily and can be accessed at http://cast.engr.uic.edu.
LigASite:
http://www.bigre.ulb.ac.be/Users/benoit/LigASite/index.php?home
is a gold-standard dataset of biologically relevant binding sites in protein
structures. It consists of proteins with one unbound structure and at least one
structure of the protein-ligand complex. Both redundant and non-redundant
(sequence identity lower than 25%) versions are available. Quaternary structures
proposed by PISA (3)
are used for all structures in the dataset.
PDBeMotif:
http://www.ebi.ac.uk/pdbe-site/pdbemotif/
is an extremely fast and powerful search tool that facilitates exploration
of the Protein Data Bank (PDB) by combining protein sequence, chemical
structure and 3D data in a single search. Currently it is the only tool that
offers this kind of integration at this speed. PDBeMotif can be used to examine
the characteristics of the binding sites of single proteins or classes of
proteins such as kinases and the conserved structural features of their
immediate environments either within the same species or across different
species. For example, it can highlight a conserved activation loop common to
protein kinases, which is important in regulating activity and is marked by
conserved DFG and APE motifs at the start and end of the loop, respectively.
The prediction of the effect of modifications to small molecules that bind to
the active and/or regulatory sites of proteins on their efficacy can be based
on the outcome of analytic work done using PDBeMotif.
fPOP: http://pocket.uchicago.edu/fpop/
fPOP (footprinting Pockets Of Proteins) is a database of the protein
functional surfaces identified by shape analysis. In this relational database,
we collected the spatial patterns of protein binding sites including both holo
and apo forms from more than 40,000 structures. To identify protein binding
sites, we model the shape of a split pocket induced by a binding ligand(s).
Essentially, we use a purely geometric method to extract site-specific spatial
patterns of split pockets as templates to match those from unbound structures.
To perform an effective shape comparison, we utilize the Smith-Waterman
algorithm to footprint an unbound pocket fragment with those selected from the
canonical functional surfaces of >19,000 structures in the SplitPocket
(http://pocket.uchicago.edu/). The pairwise alignment of the unbound and
split-pocket fragments is superimposed to evaluate the local structural
similarity for detecting the unbound split characteristic through the RMSD
measurement. Furthermore, we conduct a large-scale computation to
systematically identify binding sites of proteins. In addition to the geometric
measurements, we extensively measure the propensity of surface conservation
encapsulated in the evolutionary history.
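The Smith-Waterman step can be illustrated with a minimal local-alignment scorer in Python. This is a generic sketch, not fPOP's code: the match, mismatch and gap values are arbitrary assumptions, gaps are penalized linearly, and the two residue strings standing in for pocket fragments are hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# Hypothetical residue strings for an unbound pocket and a split-pocket template.
unbound_pocket = "HDSGKLE"
template_fragment = "DSGK"
print(smith_waterman_score(unbound_pocket, template_fragment))
```

A high score flags a stretch of the unbound pocket that resembles the template fragment; in the workflow described above, such aligned fragments would then be superimposed and compared by RMSD.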
metaPocket: http://metapocket.eml.org/ is a meta server that identifies pockets on
protein surfaces to predict ligand-binding sites. The identification of
ligand-binding sites is often the starting point for protein function
annotation and structure-based drug design. Many computational methods for the
prediction of ligand-binding sites have been developed in recent decades. Here
we present a consensus method metaPocket, in which the predicted sites from
four methods: LIGSITEcs, PASS, Q-SiteFinder, and SURFNET are
combined together to improve the prediction success rate. All these methods are
evaluated on two datasets of 48 unbound/bound structures and 210 bound
structures. The comparison results show that metaPocket improves the success
rate from 70 to 75% at the
top 1 prediction. MetaPocket is available at http://metapocket.eml.org.
PocketQuery:
http://pocketquery.csb.pitt.edu/
is a web service for interactively
exploring not only hot spot and anchor residues, but hot regions,
defined by clusters of residues, at the interface of protein-protein
interactions. An assortment of metrics, including changes in solvent accessible
surface area, energy-based scores, and sequence conservation, are available to
screen and sort clusters of residues. PocketQuery was developed by David Koes from the Camacho Lab in the Department of Computational and Systems Biology
at the University of Pittsburgh.
IBIS: http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi
is
the NCBI Inferred Biomolecular Interactions Server. For a given protein
sequence or structure query, IBIS reports physical interactions observed in
experimentally-determined structures for this protein. IBIS also
infers/predicts interacting partners and binding sites by homology, by
inspecting the protein complexes formed by close homologs of a given query. To
ensure biological relevance of inferred binding sites, the IBIS algorithm
clusters binding sites formed by homologs based on binding site sequence and
structure conservation.
3DLigandSite: http://www.sbg.bio.ic.ac.uk/~3dligandsite/
is
an automated method for the prediction of ligand binding sites. Users can
either submit a sequence or a protein structure. If a sequence is submitted
then Phyre is run to predict the structure. The structure is then used to
search a structural library to identify homologous structures with bound
ligands. These ligands are superimposed onto the protein structure to predict a
ligand binding site.
SitesBase:
http://www.modelling.leeds.ac.uk/sb/
is a database of known ligand binding sites within the PDB which is navigable
by PDB identifier or ligand three-letter code, e.g. NAD. Each binding site has a
frequently updated register of structurally similar binding sites sharing
atomic similarity detected by geometric hashing (Brakoulias and Jackson 2004).
Multiple alignments, structural superpositions and links to other structural
databases are also available enabling further analysis.
PROSURFER:
http://163.43.140.95/top contains
information about structural similarities with respect to the query surfaces. A
pocket search algorithm detected 48,347 potential ligand binding sites from the
9,708 non-redundant protein entries in the PDB database. All-against-all
structural comparison was performed for the predicted sites, and the similar
sites with the Z-score ≥ 2.5 were selected. These results can be accessed by
the PDB code or ligand name.
KBDOCK:
http://kbdock.loria.fr/index.php
is a 3D database system that defines and spatially clusters protein binding
sites for knowledge-based protein docking. KBDOCK integrates protein
domain-domain interaction information from 3DID and sequence alignments
from PFAM together with structural
information from the PDB in order to analyse the
spatial arrangements of DDIs by Pfam family, and to propose structural
templates for protein docking.
Pocketome:
http://www.pocketome.org/
The Pocketome is an encyclopedia of conformational ensembles of all
druggable binding sites that can be identified experimentally from co-crystal
structures in the Protein Data
Bank.
sc-PDB:
http://cheminfo.u-strasbg.fr:8080/scPDB/2011/db_search/about_scpdb.html To assist structure-based approaches in
drug design, we have processed the PDB to identify binding sites suitable for
the docking of a drug-like ligand and have thus created a database called
sc-PDB. The sc-PDB database provides separated MOL2 files for the ligand, its
binding site and the corresponding protein chain(s). Ions and cofactors in the
vicinity of the ligand are included in the protein. More details about the
sc-PDB scope, its content and its evolution during the 2004-2009 period are
provided in a PDF document.
The FunFOLD
Binding Site Residue Prediction Server: BACKGROUND: The accurate prediction of ligand binding residues from amino
acid sequences is important for the automated functional annotation of novel
proteins. In the previous two CASP experiments, the most successful methods in
the function prediction category were those which used structural
superpositions of 3D models and related templates with bound ligands in order
to identify putative contacting residues. However, whilst most of this
prediction process can be automated, visual inspection and manual adjustments
of parameters, such as the distance thresholds used for each target, have often
been required to prevent over prediction. Here we describe a novel method
FunFOLD, which uses an automatic approach for cluster identification and
residue selection. The software provided can easily be integrated into existing
fold recognition servers, requiring only a 3D model and list of templates as
inputs. A simple web interface is also provided allowing access to non-expert
users. The method has been benchmarked against the top servers and manual
prediction groups tested at both CASP8 and CASP9. RESULTS: The FunFOLD method
shows a significant improvement over the best available servers and is shown to
be competitive with the top manual prediction groups that were tested at CASP8.
The FunFOLD method is also competitive with both the top server and manual
methods tested at CASP9. When tested using common subsets of targets, the
predictions from FunFOLD are shown to achieve significantly higher mean
Matthews Correlation Coefficient (MCC) scores and Binding-site Distance Test
(BDT) scores than all server methods that were tested at CASP8. Testing on the
CASP9 set showed no statistically significant separation in performance between
FunFOLD and the other top server groups tested. CONCLUSIONS: The FunFOLD
software is freely available as both a standalone package and a prediction
server, providing competitive ligand binding site residue predictions for
expert and non-expert users alike. The software provides a new fully automated
approach for structure based function prediction using 3D models of proteins.
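For readers unfamiliar with the Matthews Correlation Coefficient mentioned above, here is a short Python sketch of how it can be computed for a binding-site residue prediction. The function name and the use of residue-number sets are assumptions for illustration; this is the standard MCC formula, not code from the FunFOLD package.

```python
import math


def binding_site_mcc(predicted, observed, n_residues):
    """Matthews Correlation Coefficient for predicted binding residues.

    predicted, observed: iterables of residue numbers.
    n_residues: total number of residues in the protein.
    """
    predicted, observed = set(predicted), set(observed)
    tp = len(predicted & observed)   # predicted and truly binding
    fp = len(predicted - observed)   # predicted but not binding
    fn = len(observed - predicted)   # binding but missed
    tn = n_residues - tp - fp - fn   # correctly left out
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    print(round(binding_site_mcc({1, 2, 3}, {2, 3, 4}, 10), 3))
```

An MCC of 1.0 means a perfect prediction, 0.0 is no better than chance, and negative values indicate anti-correlation.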
ProBiS: http://probis.cmm.ki.si/index.php is an algorithm for detection of structurally
similar protein binding sites by local structural alignment. Motivation:
Exploitation of locally similar 3D patterns of physicochemical properties on
the surface of a protein for detection of binding sites that may lack sequence
and global structural conservation. Results: An algorithm, ProBiS, is
described that detects structurally similar sites on protein surfaces by local
surface structure alignment. It compares the query protein to members of a
database of protein 3D structures and detects with sub-residue precision,
structurally similar sites as patterns of physicochemical properties on the
protein surface. Using an efficient maximum clique algorithm, the program
identifies proteins that share local structural similarities with the query
protein and generates structure-based alignments of these proteins with the
query. Structural similarity scores are calculated for the query protein's
surface residues, and are expressed as different colors on the query protein
surface. The algorithm has been used successfully for the detection of
protein–protein, protein–small ligand and protein–DNA binding sites. Availability:
The software is available, as a web tool, free of charge for academic users at http://probis.cmm.ki.si
Active Site
prediction: http://www.scfbio-iitd.res.in/dock/ActiveSite_new.jsp
Active Site Prediction of Protein server computes the cavities in a given
protein.
DEPTH:
http://mspc.bii.a-star.edu.sg/tankp/run_depth.html
Depth measures the closest distance of a residue/atom to bulk solvent. Accessible
surface area is a parameter that is widely used in analyses of protein
structure and stability. However, accessible surface area does not distinguish
between atoms just below the protein surface and those in the core of the
protein. In order to differentiate between such buried residues, we describe a
computational procedure for calculating the depth of a residue from the protein
surface. A detailed description of the computation of depth can be found here.
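The core idea, distance from an atom to the nearest bulk solvent, can be sketched in a few lines of Python. This is a simplified illustration only: it assumes the bulk-water positions have already been determined (the DEPTH server handles solvation itself), and averaging atom depths to get a residue depth is an assumption made here for the example.

```python
import math


def atom_depths(atoms, bulk_waters):
    """Return, for each atom, the distance to the nearest bulk-water position.

    atoms, bulk_waters: lists of (x, y, z) coordinates in angstroms.
    bulk_waters is assumed to contain only waters already classified as bulk.
    """
    if not bulk_waters:
        raise ValueError("at least one bulk-water position is required")
    return [min(math.dist(a, w) for w in bulk_waters) for a in atoms]


def residue_depth(atom_depth_values):
    """Average the depths of a residue's atoms (an assumed convention)."""
    return sum(atom_depth_values) / len(atom_depth_values)


if __name__ == "__main__":
    depths = atom_depths([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)])
    print(depths, residue_depth(depths))
```

Unlike accessible surface area, this value keeps growing as an atom moves further into the core, which is what lets it separate shallow buried atoms from deep ones.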
FINDSITE:
http://cssb.biology.gatech.edu/findsite
FINDSITE is a threading-based
binding site prediction/protein functional inference/ligand screening algorithm
that detects common ligand binding sites in a set of evolutionarily related
proteins. Crystal structures as well as protein models can be used as the
target structures.
PocketDepth:
http://proline.physics.iisc.ernet.in/pocketdepth/ A new depth-based algorithm for
identification of ligand binding sites. Abstract: Computational methods for
identifying and predicting functional sites in protein structures are
increasingly becoming important in structural biology and bioinformatics not
only for understanding the function of the molecule in detail but also for
structure-based design of possible ligands and potential drugs as well as
modified protein molecules. While there are a few structure based prediction
methods already available, given the complexity and diversity of protein
structural types, there is still a great need to explore newer methods and
concepts to develop accurate, versatile and efficient binding site prediction
algorithms. We have developed a new method, PocketDepth, for identification of
binding sites in proteins. The method is purely geometry-based and proceeds in
two stages, labeling of grid cells with depth factors followed by a depth based
clustering that uses neighbourhood information. Depth is an important parameter
considered during protein structure visualization and analysis but has been
used more often intuitively than systematically. Our current implementation of
depth reflects how central a given sub-space is to a putative pocket rather
than reflecting merely how far away it is situated from the nearest external
surface of the protein. We have tested the algorithm against PDBbind, a large
curated set of 1091 proteins obtained from PDB. A prediction was considered a
true-positive if the predicted pocket had at least 10% overlap with the actual
ligand. The prediction accuracy using this set was about 96%. Moreover, 87% of
the true-positives were identified within the first five ranks for each
protein, of which 55% are in the first rank itself. 77% of the predictions had
at least 50% overlap with the experimentally observed ligand. High prediction
rates were again observed, when the method was tested against a data-set of
apo-proteins and compared with their respective ligand complexes. A comparison
of our method with four other widely used methods for a chosen representative
set is also presented.
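The true-positive criterion above can be illustrated with a short Python sketch. The abstract does not define how overlap is measured, so the definition used here (the fraction of ligand atoms lying within a fixed radius of any pocket grid point) and the 2.0 radius are assumptions for illustration only.

```python
import math


def ligand_overlap(pocket_points, ligand_atoms, radius=2.0):
    """Fraction of ligand atoms within `radius` of any pocket point.

    This overlap definition and the default radius are hypothetical.
    """
    if not ligand_atoms:
        return 0.0
    covered = sum(
        1
        for atom in ligand_atoms
        if any(math.dist(atom, p) <= radius for p in pocket_points)
    )
    return covered / len(ligand_atoms)


def is_true_positive(pocket_points, ligand_atoms, min_overlap=0.10, radius=2.0):
    """Apply the 10% overlap threshold quoted in the abstract."""
    return ligand_overlap(pocket_points, ligand_atoms, radius) >= min_overlap


if __name__ == "__main__":
    pocket = [(0.0, 0.0, 0.0)]
    ligand = [(0.0, 0.0, 1.0), (0.0, 0.0, 5.0), (0.0, 0.0, 9.0), (0.0, 0.0, 20.0)]
    print(ligand_overlap(pocket, ligand), is_true_positive(pocket, ligand))
```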
GHECOM 1.0 :
http://strcomp.protein.osaka-u.ac.jp/ghecom/ Grid-based HECOMi finder. A program for finding multi-scale pockets on
protein surfaces using mathematical morphology
Pocket-Finder: http://www.modelling.leeds.ac.uk/pocketfinder/
is based on the Ligsite algorithm written by Hendlich
et al. (1997). Pocket-Finder was written to compare pocket detection
with our new ligand binding site detection algorithm Q-SiteFinder.
Screen2:
http://luna.bioc.columbia.edu/honiglab/screen2/cgi-bin/screen2.cgi
is a tool for identifying protein cavities and
computing cavity attributes that can be applied for classification and
analysis. The original Screen, written by Murad Nayal, was dependent on the
obsolete Irix platform and is no longer available. Screen2 was reengineered by
Brian Y. Chen for efficiency and compatibility, and made accessible as a web
service by Raquel Norel.
ConCavity: http://compbio.cs.princeton.edu/concavity/
Identifying a protein's functional sites is an
important step towards characterizing its molecular function. Numerous
structure- and sequence-based methods have been developed for this problem.
Here we introduce ConCavity, a small molecule binding site prediction
algorithm that integrates evolutionary sequence conservation estimates with
structure-based methods for identifying protein surface cavities. In
large-scale testing on a diverse set of single- and multi-chain protein
structures, we show that ConCavity substantially outperforms existing
methods for identifying both 3D ligand binding pockets and individual ligand
binding residues. As part of our testing, we perform one of the first direct
comparisons of conservation-based and structure-based methods. We find that the
two approaches provide largely complementary information, which can be combined
to improve upon either approach alone. We also demonstrate that ConCavity
has state-of-the-art performance in predicting catalytic sites and drug binding
pockets. Overall, the algorithms and analysis presented here significantly
improve our ability to identify ligand binding sites and further advance our
understanding of the relationship between evolutionary sequence conservation
and structural and functional attributes of proteins. Data, source code, and
prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).
MultiBind and MAPPIS:
http://bioinfo3d.cs.tau.ac.il/MultiBind/index.html
Web servers for multiple alignment of protein 3D
binding sites and their interactions. Analysis of
protein–ligand complexes and recognition of spatially conserved
physico-chemical properties is important for the prediction of binding and
function. Here, we present two webservers for multiple alignment and
recognition of binding patterns shared by a set of protein structures. The
first webserver, MultiBind (http://bioinfo3d.cs.tau.ac.il/MultiBind),
performs multiple alignment of protein binding sites. It recognizes the common
spatial chemical binding patterns even in the absence of similarity of the
sequences or the folds of the compared proteins. The input to the MultiBind
server is a set of protein-binding sites defined by interactions with small
molecules. The output is a detailed list of the shared physico-chemical binding
site properties. The second webserver, MAPPIS (http://bioinfo3d.cs.tau.ac.il/MAPPIS),
aims to analyze protein–protein interactions. It performs multiple alignment of
protein–protein interfaces (PPIs), which are regions of interaction between two
protein molecules. MAPPIS recognizes the spatially conserved physico-chemical
interactions, which often involve energetically important hot-spot residues
that are crucial for protein–protein associations. The input to the MAPPIS
server is a set of protein-protein complexes. The output is a detailed list of
the shared interaction properties of the interfaces.
MolAxis:
http://bioinfo3d.cs.tau.ac.il/MolAxis/ is
a tool for the identification of high clearance pathways or corridors
which represent molecular channels in the complement space of proteins. It is
extremely efficient because it samples the medial axis of the complement of the
molecule, reducing the problem dimension to two, since the medial axis is
composed of surface patches. It is designed to analyze protein channels,
calculate pore dimensions and analyze atom accessibility. MolAxis reads files
in the standard Protein Data Bank format (PDB) containing a single frame or
multiple frames generated by molecular dynamics (MD) simulations. MolAxis
handles two distinct scenarios: It computes channels that connect a single
point (like an inner chamber) to the bulk solvent, and it also computes
transmembrane (TM) channels. MolAxis has a friendly web interface (see the Web
Server tab). It also has a stand-alone version, statically compiled for
Linux, which can be downloaded from the Download tab.
fpocket: http://fpocket.sourceforge.net/ fpocket is a very fast open source protein pocket
(cavity) detection algorithm based on Voronoi tessellation. It was developed in
the C programming language and is currently available as a command-line-driven
program. A GUI is in development and mdpocket (fpocket on MD trajectories) is
out now. fpocket includes two other programs (dpocket & tpocket) that allow
you to extract pocket descriptors and test your own scoring functions respectively.
Furthermore a nifty druggability prediction score has been added to fpocket
recently. As the algorithm is very fast it can be used on a large scale level
(PDB size for instance). If you use fpocket for publication, please cite: Vincent
Le Guilloux, Peter Schmidtke and Pierre Tuffery, "Fpocket: An open
source platform for ligand pocket detection", BMC Bioinformatics, 2009,
10:168.
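Since fpocket is a command-line program, it is easy to drive from a script when screening many structures. The Python sketch below wraps a single run. It assumes fpocket is installed and on your PATH, uses the `-f` option to pass the input PDB file, and assumes the results land in a folder named after the input with an `_out` suffix; please check this against the documentation of your installed version. The helper names are made up for this example.

```python
import subprocess
from pathlib import Path


def fpocket_command(pdb_path):
    """Build the command line for one fpocket run on a PDB file."""
    return ["fpocket", "-f", str(pdb_path)]


def expected_output_dir(pdb_path):
    """Folder where results are assumed to be written, e.g. 1abc_out."""
    p = Path(pdb_path)
    return p.with_name(p.stem + "_out")


def run_fpocket(pdb_path):
    """Run fpocket (must be installed) and return the assumed output folder."""
    subprocess.run(fpocket_command(pdb_path), check=True)
    return expected_output_dir(pdb_path)


if __name__ == "__main__":
    # Only prints the command; call run_fpocket() to actually execute it.
    print(" ".join(fpocket_command("1abc.pdb")))
```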
SuMo: http://sumo-pbil.ibcp.fr/cgi-bin/sumo-welcome
allows you to screen the Protein Data Bank (PDB) for
finding ligand binding sites matching your protein structure or inversely, for
finding protein structures matching a given site in your protein. This method
is neither based on amino acid sequence nor on fold comparisons. Priority is
given to biological relevance. SuMo uses its own heuristics for defining ligand
binding sites. Automatically selected ligand binding sites are extracted from
PDB structure files and stored into SuMo's own database.
CAVER: http://www.caver.cz/ CAVER is a software
tool for analysis and visualization of tunnels and channels in protein
structures. Tunnels are void pathways leading from a cavity buried in a protein
core to the surrounding solvent. Unlike tunnels, channels lead through the
protein structure, and both of their ends are open to the surrounding solvent.
Studying these pathways is highly important for drug design and molecular
enzymology.
SiteHound:
http://scbx.mssm.edu/sitehound/sitehound-download/download.html SiteHound identifies protein regions
that are likely to interact with ligands. The only input files required by
SiteHound are the PDB file of the protein and the Molecular Interaction Field
(MIF) or affinity map for that protein structure. EasyMIFs is
provided as a tool to calculate MIFs; alternatively, AutoGrid (part of the
AutoDock suite developed by Arthur Olson's group at The Scripps Research
Institute) or the SiteHound-web server can be used to produce affinity maps or
MIFs. A Python script named 'auto.py' is provided in the package and can
be used to perform binding site identification in a fully automated fashion.
The script will prepare the protein PDB file, compute a Molecular Interaction
Fields map with EasyMIFs and carry out binding site identification using
SiteHound. It is also possible to use EasyMIFs and SiteHound separately.
SURFNET: http://www.biochem.ucl.ac.uk/~roman/surfnet/surfnet.html
The SURFNET program generates surfaces and void
regions between surfaces from coordinate data supplied in a PDB file.
MSPocket: http://appserver.biotec.tu-dresden.de/MSPocket/
is an orientation independent program for the
detection and graphical analysis of protein surface pockets [Zhu2011]. The
approach is based on the solvent excluded surfaces generated by MSMS [Sanner1996].
Pfinder
: http://pdbfun.uniroma2.it/pfinder/index.html Pfinder is a bioinformatic method for the
prediction of phosphate-binding sites in protein structures. Given a protein
structure, Pfinder compares it with a set of 215 highly conserved structural
motifs known to bind the phosphate moiety of phosphorylated ligands.
VOIDOO:
http://xray.bmc.uu.se/usf/voidoo.html
is
a program for detection of cavities in macromolecular structures. It uses an
algorithm that makes it possible to detect even certain types of cavities that
are connected to "the outside world". Three different types of cavity
can be handled by VOIDOO: van der Waals cavities (the complement of the
molecular van der Waals surface), probe-accessible cavities (the cavity volume
that can be occupied by the centres of probe atoms) and MS-like probe-occupied
cavities (the volume that can be occupied by probe atoms, i.e. including
their radii).
PocketPicker:
http://gecco.org.chemie.uni-frankfurt.de/pocketpicker/index.html
Background: Identification and
evaluation of surface binding-pockets and occluded cavities are initial steps
in protein structure-based drug design. Characterizing the active site's shape
as well as the distribution of surrounding residues plays an important role for
a variety of applications such as automated ligand docking or in situ modeling.
Comparing the shape similarity of binding site geometries of related proteins
provides further insights into the mechanisms of ligand binding. Results: We present PocketPicker, an
automated grid-based technique for the prediction of protein binding pockets
that specifies the shape of a potential binding-site with regard to its
buriedness. The method was applied to a representative set of protein-ligand
complexes and their corresponding apo-protein structures to evaluate the
quality of binding-site predictions. The performance of the pocket detection
routine was compared to results achieved with the existing methods CAST,
LIGSITE, LIGSITEcs, PASS and SURFNET. Success rates of PocketPicker
were comparable to those of LIGSITEcs and outperformed the other
tools. We introduce a descriptor that translates the arrangement of grid points
delineating a detected binding-site into a correlation vector. We show that
this shape descriptor is suited for comparative analyses of similar
binding-site geometry by examining induced-fit phenomena in aldose reductase. This
new method uses information derived from calculations of the buriedness of
potential binding-sites. Conclusion: The
pocket prediction routine of PocketPicker is a useful tool for identification
of potential protein binding-pockets. It produces a convenient representation
of binding-site shapes including an intuitive description of their
accessibility. The shape-descriptor for automated classification of
binding-site geometries can be used as an additional tool complementing
elaborate manual inspections.
McVol:
http://www.bisb.uni-bayreuth.de/index.php?page=data/mcvol/mcvol This
program was developed to integrate the molecular volume, solvent accessible
volume and van der Waals volume of proteins using a Monte Carlo algorithm. Based
on these calculations, McVol is also able to identify internal cavities as well
as surface clefts and fill these cavities with water molecules. Additionally, a
membrane of dummy atoms can be placed as a disc around the protein. The program
is available under the GNU Public Licence. A precompiled binary (X86) can be
downloaded free of charge from here (when the associated paper is published).
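To show what Monte Carlo volume integration means in practice, here is a minimal Python sketch that estimates the volume of a set of overlapping atomic spheres by random sampling. It is a generic textbook approach written for illustration, not McVol's code; the sphere input format, sample count and seed are assumptions.

```python
import random


def monte_carlo_volume(spheres, samples=20000, seed=0):
    """Estimate the volume of a union of spheres by random sampling.

    spheres: list of (x, y, z, radius) tuples, e.g. atoms with their radii.
    """
    rng = random.Random(seed)
    # Axis-aligned bounding box that encloses every sphere.
    lo = [min(s[i] - s[3] for s in spheres) for i in range(3)]
    hi = [max(s[i] + s[3] for s in spheres) for i in range(3)]
    box_volume = 1.0
    for a, b in zip(lo, hi):
        box_volume *= b - a
    hits = 0
    for _ in range(samples):
        p = [rng.uniform(a, b) for a, b in zip(lo, hi)]
        # A sample counts as a hit if it falls inside at least one sphere.
        if any(sum((p[i] - s[i]) ** 2 for i in range(3)) <= s[3] ** 2 for s in spheres):
            hits += 1
    # Fraction of hits times the box volume approximates the union volume.
    return box_volume * hits / samples


if __name__ == "__main__":
    # A single unit sphere; the exact volume is 4/3 * pi.
    print(monte_carlo_volume([(0.0, 0.0, 0.0, 1.0)]))
```

The estimate becomes more precise as the number of samples grows, at the cost of run time.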