Project

Visualization of biomolecular channels and cavities

People involved: Talha Bin Masood and Raghavendra G.S.

Fundamentally, all biological processes are molecular in nature, so understanding biomolecules and their interactions is essential for gaining better insight into living systems. Proteins consist of chains of small building blocks called amino acids. These chains fold in 3D space to define the structure of a protein, and the structure of a biomolecule is known to play an important role in defining its function. Biomolecular structures contain complex features such as pockets and protrusions on the surface, internal cavities and voids, and channel- and tunnel-like structures connecting the external surface to functional sites buried deep inside the molecule. Analysis of these features is very important for understanding structure-function relationships, engineering new proteins with required functional properties, and designing inhibitors for existing proteins.

Proteins are often represented in the space-fill model as a union of balls, where each ball corresponds to an atom (see Figure 1). This model is well suited to the application of geometric and topological techniques for detailed analysis. For example, geometric algorithms have been developed for extraction of the molecular surface, which is extremely important in the study of any protein. Similarly, methods from computational geometry and topology are applied for accurate measurement of molecular volumes and for identification and characterization of empty space within a molecule. This project contributes to this area of research, with a special focus on integrated geometric and topological methods for visual analysis of cavities and channels in biomolecules.
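To make the union-of-balls model concrete, the following minimal sketch tests whether a point lies inside a molecule and estimates the molecular volume by sampling a regular grid. The atom format (x, y, z, radius), the function names, and the grid-sampling estimate are assumptions made for illustration; they are not the geometric methods used in this project, which compute such quantities more accurately.

```python
import math


def inside_union(point, atoms):
    """Return True if `point` lies inside at least one atom ball.

    `atoms` is a list of hypothetical (x, y, z, radius) tuples.
    """
    px, py, pz = point
    return any(
        (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 <= r * r
        for x, y, z, r in atoms
    )


def estimate_volume(atoms, step=0.1):
    """Estimate the volume of the union of balls by grid sampling.

    The bounding box of all balls is divided into cubes of side `step`;
    a cube is counted when its center lies inside the union.
    """
    lo = [min(a[i] - a[3] for a in atoms) for i in range(3)]
    hi = [max(a[i] + a[3] for a in atoms) for i in range(3)]
    counts = [max(1, math.ceil((hi[i] - lo[i]) / step)) for i in range(3)]
    filled = 0
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                center = (
                    lo[0] + (i + 0.5) * step,
                    lo[1] + (j + 0.5) * step,
                    lo[2] + (k + 0.5) * step,
                )
                if inside_union(center, atoms):
                    filled += 1
    return filled * step ** 3
```

Overlapping balls are counted only once, which is why the union, rather than the sum of the individual atom volumes, is the relevant quantity. Empty space inside the molecule corresponds to grid points that lie outside every ball but are enclosed by them.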

With the increasing availability of structures of large proteins and protein complexes at atomic detail through advancements in the field of crystallography, there is a need for faster and more space-efficient algorithms for their analysis. Another driver for efficient geometric algorithms is the availability of larger molecular dynamics trajectories, which are essentially time-varying molecular structures. Designing algorithms to address these challenges is the second major focus of this project.

Figure 1. ACH receptor transmembrane protein (PDB id: 1OED). (Left) The space-fill model. (Middle) The molecular surface. (Right) The central transmembrane pore through this protein.

In the first part, we describe two methods: one for extraction and visualization of biomolecular channels, and the other for extraction of cavities in uncertain data. We also describe two software tools based on the proposed methods and targeted at end users, the biologists. These two publicly available web server tools are called ChExVis and RobustCavities. In the second part, we describe efficient parallel algorithms for two geometric structures widely used in the study of biomolecules. The first is the discrete Voronoi diagram, which finds applications in channel visualization. The second is the alpha complex, which is extremely useful in studying geometric and topological properties of biomolecules.
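As a reminder of what a discrete Voronoi diagram is, the sketch below labels every cell of a small 2D grid with the index of its nearest site by brute force. This is only the definition written out as code, with a hypothetical function name and input format; it is not the parallel algorithm developed in this project, which is designed to avoid this exhaustive search.

```python
def discrete_voronoi(sites, width, height):
    """Label each grid cell (x, y) with the index of its nearest site.

    `sites` is a list of (x, y) positions. Ties are resolved in favor
    of the site with the smaller index.
    """
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            nearest = min(
                range(len(sites)),
                key=lambda i: (sites[i][0] - x) ** 2 + (sites[i][1] - y) ** 2,
            )
            row.append(nearest)
        labels.append(row)
    return labels
```

The cost of this version grows with the number of cells times the number of sites, which is what motivates faster approaches for large molecules and fine grids.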

Extraction and visualization of channels

A channel is a pathway through empty space within the molecule. Understanding channels that lead to active sites or traverse the molecule is important in the study of molecular functions such as ion, ligand, and small molecule transport. Efficient methods for extracting, storing, and analysing protein channels are required to support such studies. We develop an integrated framework that supports computation of the channels, interactive exploration of their structure, and detailed visual analysis of their properties [1]. Key contributions are summarized below:

We describe a method for extraction of channels in biomolecules based on a representation of the molecule using the alpha complex. This is exploited to capture all geometrically feasible channels in a concise representation called channel network that supports querying for specific channels. The extracted channels are represented as a set of connected tetrahedra.
Novel methods are developed to automatically identify important channels within the network and rank them based on their significance.
The channel extraction method was compared with existing software tools. The quality of the results was observed to be better than or comparable to that of other tools: Mole, Caver, MolAxis, and PoreWalker.
Novel visualization methods are proposed to facilitate detailed study of the extracted channels. Figure 2 shows a visualization of the potassium channel in a transmembrane protein.
The integrated channel extraction and visualization framework was successfully used to study multiple transmembrane pores and channels leading to active sites.
These methods are implemented as a web server called ChExVis which is available for public use.
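The idea of querying a channel network for a specific channel can be illustrated with a toy example. The sketch below assumes a hypothetical network in which nodes are tetrahedra and each edge carries the radius of the largest ball that can pass between two neighbouring tetrahedra. It then finds the path whose narrowest point is as wide as possible, using a standard widest-path variant of Dijkstra's algorithm. The graph format, the function name, and the choice of bottleneck radius as the criterion are assumptions made for illustration and are not taken from the ChExVis implementation.

```python
import heapq


def widest_channel(edges, source, target):
    """Find the path from `source` to `target` with the largest bottleneck.

    `edges` is a list of (node_a, node_b, radius) triples with positive
    radii. Returns (path, bottleneck_radius), or (None, 0.0) if the
    target cannot be reached.
    """
    graph = {}
    for u, v, radius in edges:
        graph.setdefault(u, []).append((v, radius))
        graph.setdefault(v, []).append((u, radius))

    best = {source: float("inf")}  # widest bottleneck found so far
    parent = {source: None}
    heap = [(-float("inf"), source)]  # max-heap via negated widths
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(node, 0.0):
            continue  # stale heap entry
        if node == target:
            break
        for nxt, radius in graph.get(node, []):
            candidate = min(width, radius)
            if candidate > best.get(nxt, 0.0):
                best[nxt] = candidate
                parent[nxt] = node
                heapq.heappush(heap, (-candidate, nxt))

    if target not in best:
        return None, 0.0
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1], best[target]
```

For example, with edges t0-t1 (1.5), t1-t3 (0.4), t0-t2 (1.0) and t2-t3 (0.9), the route through t2 is preferred because its narrowest point (0.9) is wider than the narrowest point of the route through t1 (0.4).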

Figure 2. Transmembrane pore identified in PDB structure 1K4C using our channel extraction method. The 3D view of the channel is shown on the left. Conservation and hydrophobicity profiles are shown using a blue to red color map in the middle. Four different 2D box representations of the channel are shown on the right. From left to right, boxes are labelled by amino acid type, atom type, structure and chemical properties of the lining atoms.

Extraction of robust voids and pockets in proteins

A cavity in a protein molecule refers to both voids (without openings) and pockets (with openings). These cavities play a key role in determining the stability and function of proteins. The existing methods for detection of cavities take protein structures determined from X-ray crystallography data or other lower resolution data as input. These methods are sensitive to inaccuracies that are inherent in the crystallographic measurements. While the measurements may guarantee high resolution, even small inaccuracies may change the reported number of cavities. Inaccuracies may also arise from fundamental limitations such as the notion of atomic radii, which are determined empirically. Such inaccuracies may cause a cavity detection method to report two distinct but large cavities in place of one, or to report cavities of very small volume. Figure 3 illustrates the problem as it occurs in a lysozyme protein. Key contributions of this work [2] include:

We develop an interactive method to compute robust cavities in proteins where the goal is to enable the user to reduce, if not completely eliminate, the inaccuracies mentioned earlier.
We provide a novel definition for robustness in the presence of inaccuracies in the measured radii.
We propose a method for computing robust and stable cavities in proteins. This is accomplished through the use of a simple and succinct structure called the alpha complex to represent protein molecules. In order to identify the set of cavities that are stable with respect to small perturbations in the atom radii, our method symbolically modifies the radii of a select set of atoms by systematically processing and modifying the filtration. The method is efficient in terms of running time performance and also supports the elimination of very small or insignificant voids as measured by the notion of topological persistence.
We also develop software to visualize the stable cavities together with the molecule, and to calculate cavity volumes and surface areas. This software provides an interactive framework that a biologist can use to decide which cavities are more relevant and what mutations to perform.
We use this software to demonstrate the applicability of the notion of robust voids and pockets and apply it to detect potential channels and pockets in several proteins.
These methods are also implemented as a web server called RobustCavities which is available for public use.
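The sensitivity described above can be reproduced on a toy example. The sketch below assumes a hypothetical, heavily simplified alpha complex in which each tetrahedron and each triangle shared by two tetrahedra has a filtration value, and a simplex is present once the chosen alpha reaches that value. Empty space is the set of absent tetrahedra, and two empty tetrahedra belong to the same cavity when the triangle between them is also absent. The data layout and function name are assumptions for illustration, the outside region is ignored, and this is not the symbolic radius modification used in the method itself.

```python
def count_cavities(tetra_alpha, tri_alpha, alpha):
    """Count connected components of empty tetrahedra at a given alpha.

    `tetra_alpha` maps a tetrahedron id to its filtration value.
    `tri_alpha` maps a pair (tetra_a, tetra_b) to the filtration value
    of the triangle the two tetrahedra share.
    """
    empty = [t for t, value in tetra_alpha.items() if value > alpha]
    parent = {t: t for t in empty}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in tri_alpha.items():
        # An absent triangle leaves an opening between two empty tetrahedra.
        if value > alpha and a in parent and b in parent:
            parent[find(a)] = find(b)

    return len({find(t) for t in empty})
```

With two tetrahedra of filtration value 5.0 separated by a triangle of value 2.0, the count is two cavities at alpha = 2.1 but only one at alpha = 1.9. A small change in the effective atom radii is therefore enough to change the reported number of cavities, which is the situation that the robust cavity computation is meant to handle.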

Figure 3. The two cavities that appear very near to each other in a lysozyme protein (200L) may be a single cavity. The solid surface represents cavities while the protein is shown as a cartoon to provide context.

Connecting cavities in biomolecules

Parallel computation of alpha complex

Publications

[1] Talha Bin Masood, Sankaran Sandhya, Nagasuma Chandra and Vijay Natarajan. ChExVis: a tool for molecular channel extraction and visualization.
[2] Raghavendra Sridharamurthy, Talha Bin Masood, Harish Doraiswamy, Siddharth Patel, Raghavan Varadarajan and Vijay Natarajan. Extraction of robust voids and pockets in proteins.
[3] Talha Bin Masood and Vijay Natarajan. An integrated geometric and topological approach to connecting cavities in biomolecules.
[4] Talha Bin Masood, Hari Krishna Malladi and Vijay Natarajan. Facet-JFA: Faster computation of discrete Voronoi diagrams.

Software

ChExVis
A web server for molecular Channel Extraction and Visualization.
RobustCavities
A web portal for software that computes cavities in proteins robustly, taking into account uncertainties in the atomic radii.

Contact

Contact: talha [at] iisc [dot] ac [dot] in.