Bioinformatics Tools: March 2012

Sunday, March 18, 2012

Research resources for tuberculosis

TubercuList: http://tuberculist.epfl.ch/ The TubercuList knowledge base integrates genome details, protein information, drug and transcriptome data, mutant and operon annotation, bibliography, structural views and comparative genomics in the structured manner required for the rational development of new diagnostic, therapeutic and prophylactic measures against tuberculosis. Through expert curation and continuous updates, we deliver a broad view of the Mycobacterium tuberculosis genome.

TB Database: http://www.tbdb.org/ Tuberculosis (TB) is a public health challenge of paramount importance. Control of TB will require a multifaceted approach integrating efficient public health interventions with the discovery and use of new vaccines and drugs. TBDatabase (TBDB) provides access to the tools and resources of the Stanford Microarray Database and the Broad Institute. Anyone is welcome to access the published data on the TBDB site without signing in. Some data in TBDB are unpublished and can therefore be accessed only by the authors and their collaborators after they sign in. The "Access Policies" page provides more information about TBDB accounts. A grant from the Bill & Melinda Gates Foundation has enabled us to create an integrated software platform for tuberculosis drug discovery and research.

webTB.org: http://www.webtb.org/ A resource for Mycobacterium tuberculosis researchers, brought to you by the TB Structural Genomics Consortium, with many tools:
Gene expression correlation display: This server presents the pairwise gene expression correlation for two or more genes. For a set of genes you can get all the pairwise correlations, express them as a matrix and display the matrix graphically.
Operon Search: Search for operons and directions in the TB genome. Known operons are listed by name. These data will be expanded as more sources are found.
BLAST the TB genome: BLAST a sequence against the TB genome and other NCBI databases.
Target Explorer: Target Explorer is intended to be an interactive web site that allows researchers in the tuberculosis research community to experiment with different target selection criteria and explore alternative ways of prioritizing gene targets for experimentation (crystallization, structure solution, high-throughput screening for inhibitor discovery, etc.).
Mycobacterial Genome DataBase: A database of genomes of mycobacterial strains sequenced in the lab of James C. Sacchettini at Texas A&M University. It provides access to sequence data (including coverage statistics) and comparison of polymorphisms among various strains of tuberculosis, with a focus on drug resistance. The sequencing is done on an Illumina GenomeAnalyzer II (short reads). The data were analyzed using customized sequence-assembly methods written by Tom Ioerger and his group at Texas A&M.
Genome browser: Graphically scan the entire TB genome for information on each ORF, with indications of any predicted operons and links back to the quick search page for more gene information.
ORF Progress search tool: Search the status and progress of MTb ORFs that are targeted and pursued by consortium members.
Structure Gallery: See the protein structures determined by members of the consortium.
Structure Summary pages: A front-page portal to the structure information on known TB proteins. All other WebTB servers can be accessed from this page, including the Gallery and JMOL viewer.
MTBreg Database: A database of proteins up- and down-regulated in Mycobacterium tuberculosis grown under conditions mimicking infection, as well as information on proteins that are regulated by selected transcription factors or other regulatory proteins.
TBDB Legacy Tools: Legacy tools from the former TBSGC site to search and browse the TB genome.
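
The pairwise correlation matrix described for the gene expression correlation display can be illustrated with a small sketch. This is not the webTB.org implementation: the Pearson coefficient is an assumed choice of correlation measure, and the gene names and expression values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)

def correlation_matrix(profiles):
    """All pairwise correlations as a nested dict: matrix[gene_a][gene_b]."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names}
            for a in names}

# Hypothetical expression values (one list per gene, one value per condition).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises together with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}
matrix = correlation_matrix(profiles)
for gene, row in matrix.items():
    print(gene, " ".join(f"{value:+.2f}" for value in row.values()))
```

Each row of the printed matrix gives the correlation of one gene with every gene in the set, so the diagonal is always +1.00.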

TBDreamDB: http://www.tbdreamdb.com/index.html Exciting news at TBDreamDB! The database is currently undergoing a complete redesign and data evaluation. It will be converted to a fully searchable relational database with a newly designed front-end website as well. We hope to have the beta design up and running towards the end of 2011 or early 2012. Any feedback would be great; simply email the curators. Also, please keep sending in any errors you find in the dataset. The data are being evaluated and corrected as they are inserted into the new database format. Your emails are much appreciated!

Wellcome Trust Sanger Institute: http://www.sanger.ac.uk/resources/downloads/bacteria/mycobacterium.html The Wellcome Trust and DEFRA have funded the Sanger Institute to sequence reference genomes for Mycobacterium africanum. Mycobacterium is a genus within the order Actinomycetales that comprises a large number of well characterised species, several of which are associated with human and animal diseases such as tuberculosis and leprosy.

MGDD: http://mirna.jnu.ac.in/mgdd/index.html The Mycobacterial Genome Divergence Database (MGDD) is a repository of genetic differences among strains and species of organisms belonging to the Mycobacterium tuberculosis complex. The differences are based on a comparison of user-chosen organisms, in which query sequences are compared against subject sequences. Users can also choose the type of genetic divergence they are interested in: SNPs (single nucleotide polymorphisms), insertions, repeat expansions or divergent sequences. Results can be displayed for a specific region (with boundaries defined by nucleotide sequence) or for a specific gene, according to the user's choice. Presently, the database has precomputed analyses from three fully sequenced genomes of this complex: Mycobacterium tuberculosis H37Rv, Mycobacterium tuberculosis CDC1551 and Mycobacterium bovis AF2122/97. In future it will be updated with more strains and species as fully sequenced genomes become available.

MTBreg: http://www.doe-mbi.ucla.edu/Services/MTBreg/ Proteins up- and down-regulated in Mycobacterium tuberculosis grown under conditions mimicking infection are included in this database. It also includes information on proteins that are regulated by selected transcription factors or other regulatory proteins. The literature data provided here are complementary to the databases provided by Michael Strong, which include recent TB computational functional linkages, and to the Prolinks Database by Peter Bowers.

MycoperonDB: http://cdfd.org.in/mycoperondb/home.html A database of computationally predicted operons and transcriptional units of mycobacteria. MycoperonDB was set up to provide operon and transcriptional unit information for different mycobacterial species in one place. At present, the database covers five mycobacterial species and consists of an in silico model of the operon organization of 18,053 genes. The operon information provides a basis and a reference for a comprehensive understanding of how transcriptional control is encoded in the genome. The database has a user-friendly web interface that takes a simple sequence, gene name or ORF ID as input and reports the transcription unit and operon associated with the query.

GenoMycDB Browser: http://157.86.176.108/~catanho/genomycdb/

Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.

TB Drug Target Database: http://www.bioinformatics.org/tbdtdb/ The TB Drug Target Database contains information on antitubercular drugs and their target proteins for the treatment of TB. Information is available on the drugs and other possible inhibitors, including their structural details, and the analyses performed on the target proteins are also provided.

MIRU-VNTRplus web application: http://www.miru-vntrplus.org/MIRU/index.faces Molecular typing of bacteria from the Mycobacterium tuberculosis complex (MTBC) is essential for epidemiological purposes such as investigating the spread of specific genotypes. Recently, mycobacterial interspersed repetitive units (MIRU) typing has become an important method, as it allows high-throughput, discriminatory and reproducible analysis of clinical isolates. MIRU is the MTBC-specific name for a multiple-locus VNTR (variable number of tandem repeats) analysis (MLVA) bacterial typing scheme. Because of its portable data format, MIRU typing has the potential to be a versatile tool for individual strain identification based on large reference databases. However, specialized bioinformatic web tools to analyze MIRU data and public reference databases are not available. To meet this need, a collection of 186 strains representing the major MTBC lineages was used to implement a web server, MIRU-VNTRplus (http://www.miru-vntrplus.org/). For each strain, species, lineage, and epidemiologic information were stored together with copy numbers of 24 MIRU loci, spoligotyping patterns, regions of difference (RD) profiles, single nucleotide polymorphisms (SNPs), susceptibility data, and IS6110 RFLP fingerprint images. Via the freely accessible MIRU-VNTRplus service, users can compare their strain(s) with the reference strains for the assignment of MTBC species, lineages, and genotypes. For easier scientific communication, a universal expanding nomenclature (MLVA MtbC15-9) to name different MIRU genotypes is maintained at the server. Comparisons can be based on MIRU, spoligotyping, RD, SNP or susceptibility typing data, or on a combination of different data types. Several distance coefficients are available, including Jaccard's and categorical. Based upon the respective distance matrix, a dendrogram can be calculated using UPGMA or neighbor-joining clustering algorithms. The resulting trees may be exported in various data formats. MIRU-VNTRplus also provides functions for users to analyze their own strains without interrogating the reference database. Extensive documentation (manual and tutorials) is available to help make the best use of all features.
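
To make the distance-and-dendrogram step concrete, here is a minimal sketch of a categorical distance between MIRU copy-number profiles followed by UPGMA (average-linkage) clustering. It is not the MIRU-VNTRplus code: the strain names and the six-locus profiles are hypothetical (the server stores 24 loci), and the categorical coefficient is assumed here to be the fraction of loci that differ.

```python
from itertools import combinations

def categorical_distance(profile_a, profile_b):
    """Fraction of loci at which two copy-number profiles differ."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(profile_a, profile_b))
    return differing / len(profile_a)

def upgma(profiles):
    """Cluster strains by UPGMA and return the tree as nested tuples."""
    # Each cluster is (subtree, list of member strain names).
    clusters = [(name, [name]) for name in profiles]

    def average_distance(members_a, members_b):
        # UPGMA distance: mean over all pairs drawn from the two clusters.
        total = sum(categorical_distance(profiles[p], profiles[q])
                    for p in members_a for q in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: average_distance(clusters[pair[0]][1],
                                                     clusters[pair[1]][1]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical repeat copy numbers at six loci for three strains.
profiles = {
    "strain1": [2, 3, 4, 2, 5, 1],
    "strain2": [2, 3, 4, 2, 5, 2],
    "strain3": [4, 1, 4, 3, 2, 2],
}
tree = upgma(profiles)
print(tree)
```

The two strains that differ at a single locus are joined first, and the more divergent strain attaches to that pair afterwards.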

Web resources for Mycobacterium tuberculosis

Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), is still a major killer of humans, and it is remarkable how closely and robustly it has co-evolved with human civilization. The war against tuberculosis is on, and here we try to bring together in a single place the major resources for the disease and the organism that are available across the web. This is an open article that will be updated from time to time to reflect recent developments in the field, so that you as a user need not spend much time on data collection and integration. Hope this helps.

General information about tuberculosis:

Wiki page on tuberculosis on Wikipedia: http://en.wikipedia.org/wiki/Tuberculosis Tuberculosis, MTB, or TB (short for tubercle bacillus) is a common, and in many cases lethal, infectious disease caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. Tuberculosis usually attacks the lungs but can also affect other parts of the body. It is spread through the air when people who have an active MTB infection cough, sneeze, or otherwise transmit their saliva through the air. Most infections in humans result in an asymptomatic, latent infection, and about one in ten latent infections eventually progress to active disease, which, if left untreated, kills more than 50% of those infected.

Wiki page on M. tuberculosis: Mycobacterium tuberculosis (MTB) is a pathogenic bacterial species in the genus Mycobacterium and the causative agent of most cases of tuberculosis (TB). First discovered in 1882 by Robert Koch, M. tuberculosis has an unusual, waxy coating on its cell surface (primarily mycolic acid), which makes the cells impervious to Gram staining, so acid-fast detection techniques are used, instead. The physiology of M. tuberculosis is highly aerobic and requires high levels of oxygen. Primarily a pathogen of the mammalian respiratory system, MTB infects the lungs. The most frequently used diagnostic methods for TB are the tuberculin skin test, acid-fast stain, and chest radiographs. The M. tuberculosis genome was sequenced in 1998.

Centers for Disease Control and Prevention (CDC) page on tuberculosis: Tuberculosis (TB) is a disease caused by a bacterium called Mycobacterium tuberculosis. The bacteria usually attack the lungs, but TB bacteria can attack any part of the body such as the kidney, spine, and brain. If not treated properly, TB disease can be fatal. TB disease was once the leading cause of death in the United States.

Global TB database: http://www.who.int/tb/country/global_tb_database/en/index2.html Access the database to make data queries, interactive maps and country profiles. Country profiles: Country profiles provide key indicators, notification and treatment outcome data, and budget and financing graphs (for a subset of countries, including all high-burden countries). For high-burden countries, there is also a brief overview of TB control and epidemiology in the country, and a summary of achievements, challenges and planned activities related to implementing the first 5 components of the Stop TB Strategy.

U.S. National Library of Medicine resource on tuberculosis: Offers resources on various aspects of the disease, from the basics to the advanced, including disease data.

WHO page on tuberculosis: Too many people have undetected TB for too long; late detection of TB increases their risk of transmitting the disease to others and of having poor health outcomes, and the risk that they and their family will suffer distress and economic hardship. WHO has produced an overview of approaches, guidelines and tools to improve early detection of TB. It presents a framework to assess barriers to early detection and helps identify appropriate actions.

The Stop TB Partnership: The Stop TB Partnership is leading the way to a world without tuberculosis (TB), a disease that is curable but still kills three people every minute. Founded in 2001, the Partnership's mission is to serve every person who is vulnerable to TB and ensure that high-quality treatment is available to all who need it. Together our nearly 1000 partners are a collective force that is transforming the fight against TB in more than 100 countries. They include international and technical organizations, government programmes, research and funding agencies, foundations, NGOs, civil society and community groups and the private sector. We operate through a secretariat hosted by the World Health Organization (WHO) in Geneva, Switzerland and seven working groups whose role is to accelerate progress on access to TB diagnosis and treatment; research and development for new TB diagnostics, drugs and vaccines; and tackling drug resistant- and HIV-associated TB. The secretariat is governed by a coordinating board that sets strategic direction for the global fight against TB.

Tuesday, March 6, 2012

Binding free energy estimation

X-Score: http://sw16.im.med.umich.edu/software/xtool/ X-Score is basically a "scoring function" that computes the binding affinities of given ligand molecules to their target protein. It can be applied to structure-based drug design studies in combination with molecular docking or de novo structure generation programs. X-Score was developed by Dr. Renxiao Wang in Dr. Shaomeng Wang's group at the Department of Internal Medicine, University of Michigan Medical School. The first paper reporting X-Score was published in the Journal of Computer-Aided Molecular Design, 16: 11–26, 2002. Note that X-Score was formerly known as X-CScore for a short while. To learn more about the X-Score program, please read the X-Score on-line manual. X-Score is released to the public for free. The latest release is X-Score v1.2. To download it, you will go through a license agreement and fill in some necessary registration information. Once we have received your signed license agreement, we will send you instructions on how to log on to our server and download the X-Score package. The X-Score v1.2 package includes the program (executable and source code), user manual, examples, references and the protein-ligand complex data set originally used for developing X-Score.

eHiTS 2009 Binding Affinity Prediction: http://www.simbiosys.ca/ehits/ehits_score.html eHiTS has a novel scoring function that takes advantage of the temperature factor information provided in PDB files to give a more complete picture of interactions. All atoms in a PDB file have a temperature factor (B) associated with them. This temperature factor indicates how much the atom varies from its mean position. Some atom positions are very precisely defined while others vary greatly, and this has a very strong influence on the weight that should be assigned to the position. The novel approach in eHiTS uses the probability of the atom position during the statistics collection to create a statistically derived empirical scoring function. The resulting eHiTS scoring function is smooth and accurately represents a wide variety of problems. One of the most recent studies with eHiTS Score 2009 was done using the PDBBind-2008 dataset; the eHiTS page shows the correlation of the eHiTS score with the experimental binding affinity.
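
As a rough illustration of what a temperature factor says about positional spread, the sketch below converts an isotropic B-factor into a root-mean-square displacement using the standard crystallographic relation B = 8π²⟨u²⟩. This is only the textbook conversion, not the eHiTS scoring function, and the function name and example B value are hypothetical.

```python
from math import pi, sqrt

def rms_displacement(b_factor):
    """RMS atomic displacement (in angstroms) for an isotropic B-factor.

    Uses B = 8 * pi^2 * <u^2>, where <u^2> is the mean-square displacement
    of the atom from its mean position (B in square angstroms).
    """
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return sqrt(b_factor / (8 * pi ** 2))

# Hypothetical B-factors: a well-ordered atom and a more mobile one.
for b in (20.0, 80.0):
    print(f"B = {b:5.1f} A^2 -> rms displacement = {rms_displacement(b):.2f} A")
```

Quadrupling B doubles the displacement, which is why atoms with high temperature factors are natural candidates for down-weighting in a position-based statistic.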

DrugScore-Online (DSX): http://pc1664.pharmazie.uni-marburg.de/drugscore/ DSX^ONLINE is a web-based user interface for the knowledge-based scoring function DSX. DSX^ONLINE enables you to score (putative) protein-ligand complexes of interest, to browse and download the scoring results, and to visualize the per-atom score contributions (see section Visualization). DSX: DSX pair potentials are derived in analogy to the DrugScore formalism developed by Gohlke et al. However, another set of atom types is used, and contact types are clustered to circumvent problems with the reference state. Torsion potentials and solvent-accessible surface ratio potentials are derived using the same formalism. For more details see the upcoming publication, which is currently in preparation. For consistency, DSX always assigns its own atom types, and hydrogens are not considered. If you have ligand poses from GOLD docking where water molecules were included, it is possible to consider the corresponding ON-marked waters in the solutions file. Please note that even more options (such as considering solutions from a docking with flexible receptor residues) are available in the DSX standalone version, which will be freely available after publication. Visualization of the per-atom score contributions: The visualization of the per-atom score contributions is an intuitive way to learn about differences between putative ligand geometries, the effects of scaffold modifications, or the importance of certain binding regions.

BAPPL server: http://www.scfbio-iitd.res.in/software/drugdesign/bappl.jsp The Binding Affinity Prediction of Protein-Ligand (BAPPL) server computes the binding free energy of a non-metallo protein-ligand complex using an all-atom energy-based empirical scoring function. The BAPPL server provides two methods as options. Method 1: Input should be an energy-minimized protein-ligand complex with hydrogens added and with protonation states, partial atomic charges and van der Waals parameters (R* and ε) assigned for each atom. The server directly computes the binding affinity of the complex using the assigned parameters. Method 2: Input should be an energy-minimized protein-ligand complex with hydrogens added and protonation states assigned. The net charge on the ligand should be specified. The server derives the partial atomic charges of the ligand using the AM1-BCC procedure and takes the van der Waals parameters from the GAFF force field. The Cornell et al. force field is used to assign partial atomic charges and van der Waals parameters for the protein. For format specifications on the input for either method, please refer to the README file.
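
A predicted binding free energy is often easier to interpret once it is related to a dissociation constant. The sketch below applies the standard thermodynamic relation ΔG° = RT ln(K_d / 1 M); it does not reproduce BAPPL's scoring function, and the function names, the temperature of 298.15 K and the example value are assumptions made for illustration.

```python
from math import exp, log

GAS_CONSTANT = 1.987204e-3  # kcal / (mol * K)

def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT * temperature * log(kd_molar)

def dissociation_constant(delta_g, temperature=298.15):
    """Dissociation constant (M) from a standard binding free energy in kcal/mol."""
    return exp(delta_g / (GAS_CONSTANT * temperature))

# Hypothetical example: a ligand with a 1 nM dissociation constant.
dg = binding_free_energy(1e-9)
print(f"Kd = 1 nM -> binding free energy = {dg:.2f} kcal/mol")
print(f"back-converted Kd = {dissociation_constant(dg):.2e} M")
```

At room temperature each tenfold improvement in K_d corresponds to roughly 1.4 kcal/mol, which gives a sense of the accuracy a scoring function needs in order to rank ligands reliably.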

PreDDICTA: http://www.scfbio-iitd.res.in/software/drugdesign/preddicta.jsp PreDDICTA employs an all-atom energy-based function for computing the binding affinity of a DNA oligomer with a non-covalently bound drug. The function has been validated against experimental binding free energies, ΔG°_bind, and the change in melting temperature of the DNA oligomer upon drug binding, ΔT_m, for 50 DNA-drug complexes. DNA is an important anticancer/antibiotic target, and PreDDICTA can be employed to aid and expedite rational drug design attempts for DNA. How to use PreDDICTA:
1. Tool 1 incorporates the PreDDICTA energy function, which calculates the electrostatics, van der Waals, rotational and translational entropy, and hydration free energy change for the DNA-drug complex. These are summed to yield the total calculated binding energy, which is converted to the binding free energy and ΔT_m based on previously reported relations. Input for this tool is a PDB file for any DNA-minor groove binder complex, conforming to the standard PDB format.
2. Tool 2 converts any number input as ΔT_m to the corresponding expected binding free energy, using the previously reported relation between these two quantities.
3. Tool 3 converts any number input as binding free energy to the corresponding expected ΔT_m value, using the same relation.

PharmaGist: http://bioinfo3d.cs.tau.ac.il/pharma/about.html Predicting molecular interactions is a major goal in rational drug design. A pharmacophore, which is the spatial arrangement of features that is essential for a molecule to interact with a specific target receptor, is important for achieving this goal. PharmaGist is a freely available web server for pharmacophore detection. The employed method is ligand based: it does not require the structure of the target receptor. Instead, the input is a set of structures of drug-like molecules that are known to bind to the receptor. We compute candidate pharmacophores by multiple flexible alignments of the input ligands. The main innovation of this approach is that the flexibility of the input ligands is handled explicitly and in a deterministic manner within the alignment process. The method is highly efficient: a typical run with up to 32 drug-like molecules takes seconds to a few minutes on a standard PC. Another important characteristic of the method is its capability of detecting pharmacophores shared by different subsets of input molecules. This capability is a key advantage when the ligands belong to different binding modes or when the input contains outliers. The download version includes virtual screening capability. The performance of PharmaGist for virtual screening was successfully evaluated on a commonly used data set for the G-protein-coupled receptor alpha1A. Additionally, a large-scale evaluation using the DUD (directory of useful decoys) data set was performed. DUD contains 2950 active ligands for 40 different receptors, with 36 decoy compounds for each active ligand. PharmaGist enrichment rates are comparable with those of other state-of-the-art tools for virtual screening.
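
Enrichment in a virtual screen is commonly summarized by an enrichment factor: the proportion of actives found in the top-ranked fraction of the library divided by the proportion of actives in the whole library. The sketch below uses that common definition, which is an assumption and not necessarily the exact measure used in the PharmaGist evaluation; the ranked list is hypothetical and only mimics the DUD ratio of 36 decoys per active.

```python
def enrichment_factor(ranked_is_active, fraction):
    """Enrichment factor for the top `fraction` of a ranked screening list.

    `ranked_is_active` lists the compounds from best to worst score,
    with True for a known active and False for a decoy.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    total = len(ranked_is_active)
    total_actives = sum(ranked_is_active)
    if total == 0 or total_actives == 0:
        raise ValueError("the list must contain at least one active")
    top_n = max(1, round(total * fraction))
    hits = sum(ranked_is_active[:top_n])
    return (hits / top_n) / (total_actives / total)

# Hypothetical screen: 2 actives and 72 decoys (36 decoys per active),
# with the actives ranked 1st and 11th.
ranking = [False] * 74
ranking[0] = True
ranking[10] = True

ef_top10 = enrichment_factor(ranking, 0.10)
print(f"enrichment factor in the top 10%: {ef_top10:.2f}")
```

A value of 1 corresponds to random ranking, so values well above 1 indicate that the method concentrates actives near the top of the list.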

IC₅₀-to-K_i converter: http://botdb.abcc.ncifcrf.gov/toxin/kiConverter.jsp The IC₅₀-to-K_i converter computes K_i values from experimentally determined IC₅₀ values for inhibitors of enzymes that obey classic Michaelis-Menten kinetics and of protein-ligand interactions. This web-server tool estimates K_i values from experimentally determined IC₅₀ values for inhibitors of enzymes and of binding reactions between macromolecules (e.g. proteins, polynucleic acids) and ligands. The converter was developed to help end users gauge the quality of the underlying assumptions used in these calculations, which depend on the mechanism of inhibitor action and the concentrations of the interacting molecular species. Additional calculations are performed for nonclassical, tightly bound inhibitors of enzyme-substrate or macromolecule-ligand systems, in which free rather than total concentrations of the reacting species are required. Required user-defined input values include the total enzyme (or other target molecule) and substrate (or ligand) concentrations, the K_m of the enzyme-substrate (or the K_d of the target-ligand) reaction, and the IC₅₀ value. Assumptions and caveats for these calculations are discussed along with examples taken from the literature. The host database for this converter contains kinetic constants and other data for inhibitors of the proteolytic clostridial neurotoxins (http://botdb.abcc.ncifcrf.gov/toxin/kiConverter.jsp).
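
For the classic Michaelis-Menten case, the conversion can be sketched with the Cheng-Prusoff relations. This is only the textbook calculation under the usual assumptions (enzyme concentration far below K_i); it is not the web server's code, it omits the tight-binding corrections mentioned above, and the function name and example concentrations are hypothetical.

```python
def ki_from_ic50(ic50, substrate, km, mechanism="competitive"):
    """Estimate K_i from IC50 with the Cheng-Prusoff relations.

    All concentrations must share one unit; the result is in that unit.
    Assumes classic Michaelis-Menten kinetics without tight binding.
    """
    if ic50 <= 0 or substrate <= 0 or km <= 0:
        raise ValueError("IC50, substrate concentration and Km must be positive")
    if mechanism == "competitive":
        return ic50 / (1 + substrate / km)
    if mechanism == "uncompetitive":
        return ic50 / (1 + km / substrate)
    if mechanism == "noncompetitive":
        return ic50  # pure noncompetitive inhibition: K_i equals IC50
    raise ValueError(f"unknown mechanism: {mechanism}")

# Hypothetical assay: IC50 = 10 uM measured at [S] = 50 uM with Km = 25 uM.
for mechanism in ("competitive", "uncompetitive", "noncompetitive"):
    ki = ki_from_ic50(10.0, substrate=50.0, km=25.0, mechanism=mechanism)
    print(f"{mechanism:15s} K_i = {ki:.2f} uM")
```

The same IC₅₀ gives different K_i values depending on the assumed mechanism, which is why the mechanism and the assay concentrations have to be stated alongside any converted value.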