Symposium on Computational Science of Biomolecules: Applications in Medicine and Therapeutics

University of Illinois at Chicago

Friday, October 8, 2004, 8:30 AM - 5:00 PM.

Abstracts

Dr. Eric Jakobsson

Computational biology and the NIH roadmap

TBA

Dr. Andrzej Joachimiak

The universe of protein structures and functions by structural genomics

The Midwest Center for Structural Genomics (MCSG) is rapidly determining the structures of large numbers of strategically selected proteins in order to elucidate the entire protein folding space. Our targets include biomedically important and high-value proteins from human pathogens. The MCSG has assembled the core components of a protein crystallography-based structural genomics pipeline. The creation of this pipeline required the development, optimization, parallelization and integration of both experimental and computational processes. Efficient, large-scale and cost-effective multi-step manipulations of genes, proteins, crystals, diffraction data and structures have been made possible with the use of automation. The MCSG integrated pipeline generates well-characterized protein targets and expression strains, and produces milligram quantities of proteins and heavy-atom labeled crystals that are tested for diffraction using synchrotron beamlines. Crystals of X-ray quality are used for structure determination using the SAD/MAD approach. Structures are being determined using semi-automated approaches. The majority of the steps in the MCSG pipeline are tracked in databases. The combination of the MCSG production pipeline, data collection facilities at third-generation synchrotrons, advanced software and computing resources has resulted in a significant acceleration in the rate of protein structure determination. Several recently determined protein structures will be presented. This work was supported by National Institutes of Health Grant GM62414 and by the U.S. Department of Energy, Office of Biological and Environmental Research, under contract W-31-109-Eng-38.

Dr. Jie Liang

Computational topographics of protein surfaces: structure, function, and evolution

Proteins are the working molecules of cells and function through interactions with other molecules. Because the binding sites of proteins involve only a small number of key residues dispersed in diverse regions of the primary sequence, it is challenging to study protein function from sequences alone. In this presentation, we show that computational topographics can play important roles in understanding protein functions. We describe methods we developed for identifying biologically important surface patterns and motifs in residue composition, spatial arrangement, and spatial orientation. With statistical models to evaluate the significance of these patterns, we show results of a global analysis from exhaustive computation of all protein pockets and voids on 12,177 known structures in the Protein Data Bank. We give examples of how these patterns can be useful in discovering deep functional and evolutionary relationships among proteins. We further describe how the evolutionary history of binding surfaces can be summarized using a Bayesian Markov chain Monte Carlo method, and how computational topographics combined with evolutionary models can lead to prediction of the functions of proteins whose only homologs are hypothetical proteins of unknown function (see cast.engr.uic.edu and pvsoar.bioengr.uic.edu).

Dr. Klaus Schulten

Physical Bioinformatics - A Case Study

Sequence and structure information are the bedrock on which an understanding of cellular functions and the underlying physical mechanisms can be built. This lecture illustrates how the two sources of information are combined to investigate, by means of the program VMD, the function and mechanism of the aquaporin family of membrane channels that transport water and certain small solutes across cell walls. After first introducing the key architectural features of a single aquaporin, the lecture aligns the structures and sequences of four aquaporins and identifies their common features. The shared and distinct features are examined closely and used as guideposts leading quickly to key questions regarding the mechanism underlying the aquaporins' efficient conduction and selection. The questions are addressed by means of molecular dynamics simulations using the program NAMD, which reveal the physical principles behind water transport and highly selective solute co-transport in aquaporins. Sequence-structure information is then revisited to elucidate tetramer binding and pathologies connected with certain aquaporin mutants. The lecture introduces the concepts behind the programs employed and emphasizes those aspects of the case study that can be applied to investigations of other protein families.

Dr. Michael Johnson

Systems biology strategies for new antibiotic discovery against B. anthracis

We are utilizing an integrated approach toward the development of new antimicrobials, new potentiators of existing antimicrobials, and direct inhibitors of the anthrax toxin as strategies to combat natural and bioengineered forms of B. anthracis. The development of drug-resistant strains of B. anthracis is technically quite feasible, and could be a substantial threat in future terrorist attacks. Thus, the development of new antimicrobials that inhibit new molecular targets will be important to biodefense, as well as a strategy for combating bacterial drug resistance. We are using a combination of strategies, beginning with genetic and bioinformatic identification and validation of novel bacterial targets, determination of target 3D molecular structures, utilization of diverse chemical libraries for high-throughput screening, structure-based drug design, and synthesis of lead compounds and their optimization, followed by macrophage and animal testing. Our initial target is the glutamate racemase enzyme, which converts L-Glu to D-Glu, and which other studies have shown to be essential for the survival of bacilli. Additionally, B. anthracis utilizes D-Glu as an essential building block of the peptidoglycan layer in its cell wall, which, in turn, protects the bacilli from macrophage destruction during infection. Thus, glutamate racemase inhibitors will potentially exhibit direct antibiotic action, complemented by facilitating macrophage destruction. Strategies toward glutamate racemase inhibitor design will be described, along with initial directions toward other targets.

Dr. Hui Lu

Structural bioinformatics study of protein binding

Protein binding, including protein-protein and protein-DNA interactions, plays key roles in protein function. For protein-protein interactions, we have developed a statistical binding potential and a dynamics-based binding site prediction method to distinguish correct binding complexes from incorrect ones. We have also correlated protein interaction data with gene expression profiles from microarray experiments. For protein-DNA interactions, a three-step protocol is used to model the interactions between the transcription factor and the promoter sequences. Each step includes data mining of protein structures and a statistical learning procedure.

Dr. Wen-Hsiung Li

TBA

Dr. Tanya Berger-Wolf

Computational Phylogenetic Reconstruction: How Good is Good Enough?

The study of the evolutionary relationships between living organisms, or phylogeny, is central to biology. Relationships among the organisms (or taxa) are modeled as a phylogenetic tree. Scientists have undertaken the ambitious task of reconstructing the Tree of Life, which contains millions of species. This poses numerous and diverse computational challenges. We will briefly mention some major problems and will discuss one of them in more detail. Current phylogenetic reconstruction methods are based on computationally hard optimization problems and do not scale to the size of the Tree of Life. But do we really need the "optimal" answer? Our recent experiments indicate that maybe not. How good is "good enough"? How do we know that we have reached the "good enough" point? We will present possible answers to these questions.

Dr. Bhaskar DasGupta

Randomized approximation algorithms for a combinatorial problem with applications to reverse engineering of gene and protein networks

In this talk we investigate the computational complexities of a combinatorial problem that arises in the reverse engineering of protein and gene networks. We abstract a combinatorial version of the problem and observe that this is "equivalent" to the set multicover problem when the "coverage" factor k is a function of the number of elements n of the universe. An important special case for our application is the case in which k=n-1. Let 1<a<n denote the maximum number of elements in any given set in our set multicover problem. Then, we show that a non-trivial analysis of a simple randomized polynomial-time approximation algorithm for this problem yields an expected approximation ratio E[r(a,k)] that is an increasing function of a/k approaching 1. We also comment about the impossibility of a significantly faster convergence of E[r(a,k)] towards 1 for any polynomial-time approximation algorithm.
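
To make the set multicover formulation concrete, the following minimal Python sketch checks whether a choice of sets covers every element at least k times and builds a cover with a plain greedy heuristic. It is an illustration only: the greedy rule is a deterministic baseline swapped in for simplicity, not the randomized algorithm analyzed in the talk, and the function names and the toy instance are assumptions made for this example.

```python
def is_multicover(universe, sets, chosen, k):
    """Return True if every element of `universe` lies in at least k chosen sets."""
    counts = {e: 0 for e in universe}
    for i in chosen:
        for e in sets[i]:
            if e in counts:
                counts[e] += 1
    return all(c >= k for c in counts.values())


def greedy_multicover(universe, sets, k):
    """Greedy baseline (NOT the randomized algorithm from the talk).

    Repeatedly picks the unused set that contains the most elements whose
    coverage demand is still unmet. Each set may be chosen at most once.
    """
    need = {e: k for e in universe}  # residual coverage demand per element
    unused = list(range(len(sets)))
    chosen = []

    def gain(i):
        # Number of elements in set i that still need more coverage.
        return sum(1 for e in sets[i] if need.get(e, 0) > 0)

    while any(d > 0 for d in need.values()):
        best = max(unused, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("no feasible multicover with the given sets")
        unused.remove(best)
        chosen.append(best)
        for e in sets[best]:
            if need.get(e, 0) > 0:
                need[e] -= 1
    return chosen


# Hypothetical toy instance: n = 4 elements, coverage factor k = n - 1 = 3,
# and every set has at most a = 3 elements (so 1 < a < n).
universe = {1, 2, 3, 4}
sets = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2}, {3, 4}]
cover = greedy_multicover(universe, sets, k=3)
print(cover, is_multicover(universe, sets, cover, 3))
```

The indices in `cover` refer to positions in `sets`; `is_multicover` can be used to validate the output of any other selection rule on the same instance.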

Dr. Bellur Prabhakar

Development of immunotherapeutic interventions against SARS

The recent discovery that a coronavirus is the causative agent of SARS, together with the rapid sequencing of its genome, has created opportunities to develop effective modalities of prevention and treatment against this virus. Of particular interest is the spike glycoprotein (S protein), which is expressed as a surface projection. It is likely that the M and E proteins are also involved in virus neutralization. Work in my laboratory is focused on developing immunotherapeutic interventions against SARS-CoV. We have already generated a panel of mouse monoclonal antibodies (mAbs) that can neutralize the virus. We will test them against clinical isolates, the S, M and E proteins and their fragments, and overlapping peptides that span the S, M and E proteins to identify neutralizing epitopes. We have also generated virus-neutralizing human mAbs using the XenoMouse from Abgenix Inc. Neutralizing Abs will be tested against clinical isolates and for antibody-dependent enhancement of SARS-CoV infection, if any. Abs that show the broadest reactivity will then be used to identify the regions of the S, M and E proteins with which they react, and thereby the functionally most important domains. Broadly neutralizing mAbs and relevant recombinant proteins will be tested for their ability to confer immunity against SARS-CoV challenge in mice. These studies are likely to result in the development of immunological interventions against SARS-CoV and help delineate structure-function relationships of proteins relevant for viral infection and anti-viral immunity.

Dr. Robert Grossman

Unique Chemical Keys for biomolecules and integration of distributed data

SMILES strings are commonly used to identify chemical compounds. It is well known that SMILES strings as originally defined are not unique. Having unique identifiers is important when developing systems for integrating bioinformatics data, especially distributed bioinformatics data. In this talk we survey variants of SMILES strings and related algorithms for assigning unique identifiers to biomolecules, including an algorithm called the Unique Chemical Key or UCK algorithm. We describe some empirical studies comparing different identifiers using over 230,000 compounds obtained from the National Cancer Institute (NCI) database of chemical compounds. We also demonstrate systems for integrating distributed data about biomolecules using UCK and related algorithms.
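
The non-uniqueness problem can be illustrated with a small Python sketch. Ethanol can be written as either CCO or OCC, so the two atom orderings below describe the same molecule. The sketch derives an ordering-independent key by brute force over all atom relabelings and hashing the smallest encoding. This is not the UCK algorithm, whose details are not given here; the function name, the graph encoding (heavy atoms only, hydrogens implicit) and the use of SHA-256 are assumptions for illustration, and the factorial running time makes it suitable only for very small molecules.

```python
import hashlib
from itertools import permutations


def canonical_key(atoms, bonds):
    """Ordering-independent key for a small molecular graph.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, order) tuples indexing into `atoms`
    Brute force: try every relabeling, keep the smallest encoding, hash it.
    """
    best = None
    for perm in permutations(range(len(atoms))):
        # perm[new_index] == old_index; pos maps old index -> new index.
        pos = {old: new for new, old in enumerate(perm)}
        labels = tuple(atoms[old] for old in perm)
        edges = tuple(sorted(
            (min(pos[i], pos[j]), max(pos[i], pos[j]), order)
            for i, j, order in bonds
        ))
        candidate = (labels, edges)
        if best is None or candidate < best:
            best = candidate
    return hashlib.sha256(repr(best).encode("utf-8")).hexdigest()


# Ethanol with atoms listed as in the SMILES string "CCO" ...
ethanol_a = canonical_key(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
# ... and as in the equally valid SMILES string "OCC".
ethanol_b = canonical_key(["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])
# Dimethyl ether ("COC") has the same atoms but different connectivity.
ether = canonical_key(["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])

print(ethanol_a == ethanol_b, ethanol_a == ether)
```

Because the key depends only on the labeled graph, records for the same compound held in different databases would map to the same identifier regardless of how each source ordered the atoms.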

Dr. Yang Dai

Machine Learning Approach for Prediction of T Cell Epitopes

In silico T cell epitope identification currently relies on the prediction of peptide binding to major histocompatibility complex (MHC) molecules. Antigens are degraded into a set of peptide fragments through the action of the proteasome, and the resulting peptides presented by MHCs are recognized by one or a few of a large set of T cell receptors. Methods for the prediction of MHC-binding peptides have been developed based on structural binding motifs or quantitative matrices. These techniques, however, do not discriminate between T cell epitopes and non-epitopes, both of which are MHC binders. In this study we propose a new peptide encoding scheme, used in conjunction with a machine learning method, for the direct recognition of T cell epitopes. The method enables the presentation of information on both (1) amino acid positions in peptides and (2) the similarity between amino acids through the use of sparse indicator vectors and the BLOSUM50 matrix. A feature selection procedure is also introduced. The computational results demonstrate superior performance over previous techniques.
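
As a rough illustration of the two ingredients named above, the Python sketch below encodes a peptide with sparse indicator vectors (one 20-dimensional block per position) and, separately, with rows of a substitution matrix. How the study actually combines the two is not described in the abstract, so this is only an assumed, simplified reading. The function names are hypothetical, and the matrix used in the example is a placeholder with 1 on the diagonal and 0 elsewhere, not the real BLOSUM50 values, which would need to be loaded from a trusted source.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def indicator_encode(peptide):
    """Concatenate one 20-dimensional indicator vector per peptide position."""
    width = len(AMINO_ACIDS)
    vec = [0] * (width * len(peptide))
    for pos, aa in enumerate(peptide):
        vec[pos * width + INDEX[aa]] = 1  # exactly one nonzero per position
    return vec


def similarity_encode(peptide, matrix):
    """Replace each residue by its row of substitution scores in `matrix`.

    `matrix[a][b]` is the score for residues a and b; a BLOSUM50 table
    could be supplied here once loaded from a trusted source.
    """
    vec = []
    for aa in peptide:
        vec.extend(matrix[aa][other] for other in AMINO_ACIDS)
    return vec


# Placeholder matrix for illustration only (NOT BLOSUM50 values).
toy_matrix = {a: {b: int(a == b) for b in AMINO_ACIDS} for a in AMINO_ACIDS}

x_indicator = indicator_encode("AC")
x_similarity = similarity_encode("AC", toy_matrix)
print(len(x_indicator), sum(x_indicator))
```

With the placeholder matrix the two encodings coincide; with a real substitution matrix, the second encoding would place similar amino acids close together in feature space while still keeping positional information.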