Method Overview

Abstract

Voids and pockets in a protein, collectively called cavities, are empty spaces enclosed by the protein molecule. Existing methods to compute, measure, and visualize the cavities in a protein molecule are sensitive to inaccuracies in the empirically determined atomic radii. We present a topological framework that enables robust computation and visualization of these structures. Given a fixed set of atoms, cavities are represented as subsets of the weighted Delaunay triangulation of atom centres. A novel notion of (ε, π)-stable cavities helps identify cavities that remain stable even after the atom radii are perturbed by a small value. This approach is used to identify potential pockets and channels in protein structures.

Method




Graphical Abstract: Illustration of the extraction of robust cavities in a 2D example.

Algorithm
Input: Molecule (as a set of spheres), ε, π.
Step A: Compute the weighted Delaunay triangulation of the set of spheres representing the molecule.
Step B: Compute the alpha complex at α = 0.
Step C: Depending on the value of ε, identify edges (triangles in 3D) in the alpha complex that can be safely moved to the end of the filtration. As a consequence, nearby cavities merge into a single cavity.
Step D: Identify connected components in the complement of the alpha complex to determine the Robust Cavities. Applying this step before Step C yields the Original Cavities.
Cavity pruning: Robust cavities are pruned further based on topological persistence. The parameter π controls this step: cavities with persistence below π are removed, which eliminates insignificant cavities.
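
The steps above can be illustrated with a minimal Python sketch. It is not the RobustCavities implementation: it assumes the triangulation and alpha complex (Steps A and B) are already available as a toy dual graph, and it takes the set of simplices that Step C may move as a given input, because the ε criterion itself is not spelled out here. The names connected_cavities, prune, walls, movable and the persistence values are all hypothetical. Mouths and the molecular exterior are not modelled.

```python
from collections import defaultdict


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def connected_cavities(empty_cells, faces, walls):
    """Step D (sketch): group empty cells into connected components.

    empty_cells: ids of cells that are NOT in the alpha complex
    faces:       face id -> pair of cells sharing that face
    walls:       ids of faces that ARE in the alpha complex (they separate cells)
    """
    parent = {c: c for c in empty_cells}
    for face, (a, b) in faces.items():
        if face in walls or a not in parent or b not in parent:
            continue  # a wall, or a neighbour that is not empty space
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[rb] = ra
    groups = defaultdict(list)
    for c in parent:
        groups[find(parent, c)].append(c)
    return sorted(sorted(g) for g in groups.values())


def prune(cavities, persistence, pi):
    """Cavity pruning (sketch): keep cavities whose persistence is at least pi."""
    return [c for c in cavities if persistence[tuple(c)] >= pi]


# Toy input: four empty cells in a row, separated by three faces.
empty_cells = [0, 1, 2, 3]
faces = {"f01": (0, 1), "f12": (1, 2), "f23": (2, 3)}
walls = {"f01", "f23"}   # faces present in the alpha complex at alpha = 0
movable = {"f01"}        # assumed outcome of Step C for some epsilon

original = connected_cavities(empty_cells, faces, walls)           # before Step C
robust = connected_cavities(empty_cells, faces, walls - movable)   # after Step C

# Hypothetical persistence values, keyed by the cells of each robust cavity.
persistence = {(0, 1, 2): 1.5, (3,): 0.2}
kept = prune(robust, persistence, pi=0.5)
```

In this toy input, moving face f01 out of the complex merges cell 0 with the cavity formed by cells 1 and 2, and pruning with π = 0.5 then discards the small cavity formed by cell 3.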

Definition of terms
Cavity Map Diagram: We compute cavities both before and after applying Step C. Thus, we have two sets of cavities, called Original Cavities (before Step C) and Robust Cavities (after Step C and cavity pruning). The mapping between these sets is represented as a cavity map diagram (shown at the top right of the figure).
Cavity: Maximally connected empty region within a molecule.
Void: A buried cavity. It has no mouths (openings to molecular exterior).
Pocket: A cavity with openings, i.e. number of mouths > 0.
Channel: A cavity with at least two openings, i.e. number of mouths > 1.
Potential Channel: A robust cavity with more than one mouth that is formed by merging cavities, each of which has at most one mouth.
Potential Pocket: A robust pocket that has at least one void among its constituents.
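
The definitions above translate directly into a few predicates on mouth counts. The following Python sketch is only an illustration: the function names, the representation of a cavity as a list of cell ids, and the example data are assumptions, not part of the RobustCavities server. Since every channel also satisfies the pocket definition, label returns the most specific term.

```python
def label(num_mouths):
    """Most specific term for a cavity with the given number of mouths."""
    if num_mouths == 0:
        return "void"
    if num_mouths == 1:
        return "pocket"
    return "channel"  # two or more mouths; also a pocket by definition


def constituents(robust_cells, original_cavities):
    """One row of the cavity map: original cavities contained in a robust cavity."""
    robust = set(robust_cells)
    return [name for name, cells in original_cavities.items()
            if set(cells) <= robust]


def is_potential_channel(robust_mouths, constituent_mouths):
    """More than one mouth, formed by merging cavities with at most one mouth each."""
    return (robust_mouths > 1
            and len(constituent_mouths) > 1
            and all(m <= 1 for m in constituent_mouths))


def is_potential_pocket(robust_mouths, constituent_mouths):
    """A robust pocket with at least one void among its constituents."""
    return robust_mouths > 0 and any(m == 0 for m in constituent_mouths)


# Hypothetical example: two pockets (A, C) and a void (B) merge into one cavity.
original_cavities = {"A": [0], "B": [1, 2], "C": [3]}
mouths = {"A": 1, "B": 0, "C": 1}

parts = constituents([0, 1, 2, 3], original_cavities)
part_mouths = [mouths[name] for name in parts]

channel = is_potential_channel(2, part_mouths)
pocket = is_potential_pocket(2, part_mouths)
```

Under these definitions a single robust cavity can be both a potential channel and a potential pocket, as in the example, where two one-mouth pockets merge through a void.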

Use cases

Our method helps identify potential pockets and channels in proteins, which traditional cavity computation algorithms cannot do. By potential pocket, we mean a void that can 'open up' under a minor perturbation of the atomic radii. Similarly, a potential channel is a set of nearby pockets that can merge after a minor perturbation to form a channel structure.


PDB-id: 2OAR — Left: Cavities detected by traditional algorithm, Right: RobustCavities successfully detects the potential channel.


PDB-id: 2YXR — Left: Cavities detected by traditional algorithm, Right: RobustCavities successfully detects the potential channel.

References

  1. Raghavendra Sridharamurthy, Harish Doraiswamy, Siddharth Patel, Raghavan Varadarajan and Vijay Natarajan. Extraction of robust voids and pockets in proteins. EuroVis 2013: Eurographics Conference on Visualization (Short Paper), 2013.
  2. Raghavendra Sridharamurthy, Harish Doraiswamy, Siddharth Patel, Raghavan Varadarajan and Vijay Natarajan. Extraction of robust voids and pockets in proteins. Technical Report, May 2013.

Precomputed Examples

Click the "View Results" button to open the precomputed results page for the corresponding PDB structure. The images shown for each structure were obtained using the PyMOL script generated by the RobustCavities web-server.

Potential Channels


Example Images Remarks
Original Robust
Gramicidin A 1GRM
  1. The transmembrane channel in Gramicidin is correctly detected as a potential channel.
View Results
Mechanosensitive Channel of Large Conductance (MscL) 2OAR
  1. The transmembrane channel in 2OAR is correctly detected as a potential channel.
View Results
SecY protein translocation channel 1RHZ
  1. 1RHZ is the closed structure of the SecY protein, while 2YXQ and 2YXR are mutants with half and full plug deletions, respectively.
  2. With traditional alpha-shape-based cavity detection, two disconnected cavities are detected, one on either side of the membrane, in all three structures.
  3. With RobustCavities, however, potential channels are detected in the mutants (2YXQ, 2YXR), while the two cavities remain disconnected in the wild type (1RHZ), revealing that the channel is more tightly shut in 1RHZ than in the two mutants.
View Results
2YXQ
View Results
2YXR
View Results

Other Examples


Example Images Remarks
Original Robust
T4 Lysozyme 200L
View Results
Hemoglobin 1HGA
  1. 1HGA is hemoglobin in the low-affinity T-state, while 1BBB is hemoglobin in the high-affinity R-state.
  2. In the high-affinity structure (1BBB), the heme sites in chains B and D (yellow and purple) merge with the central cavity (red) after the filtration modification is applied.
  3. This is not observed in the low-affinity structure, revealing the relaxed structure of 1BBB compared to the tight structure of 1HGA.
View Results
1BBB
View Results
Hydrolase 4B87
View Results
Heterodimeric complex of RAR and RXR ligand-binding domains 1DKF
View Results

Benchmarking

To validate whether the proposed method is able to correctly identify cavities, a set of 138 model mutants was created with known cavity locations. Given a protein, a model mutant was created by replacing a buried hydrophobic residue in the protein core with Alanine. This replacement results in the creation of an artificial cavity in the mutant at a known location (say p). Cavities were then extracted using RobustCavities in both the wild-type and the mutant structures. If a cavity was found closer to p in the mutant than in the wild-type, then it was considered a success. We observed a success rate of 99% in our experiments.
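
The success criterion can be written down in a few lines. The Python sketch below rests on assumptions not stated above: each cavity is reduced to a single representative point (for example a centroid), and "closer" means a smaller Euclidean distance from p to the nearest such point. The function names and coordinates are hypothetical.

```python
import math


def nearest_cavity_distance(p, cavity_points):
    """Distance from p to the closest cavity point (infinity if there are none)."""
    return min((math.dist(p, c) for c in cavity_points), default=math.inf)


def is_success(p, wild_type_cavities, mutant_cavities):
    """Success if a cavity lies closer to p in the mutant than in the wild type."""
    return (nearest_cavity_distance(p, mutant_cavities)
            < nearest_cavity_distance(p, wild_type_cavities))


def success_rate(cases):
    """Fraction of (p, wild_type_cavities, mutant_cavities) cases that succeed."""
    results = [is_success(*case) for case in cases]
    return sum(results) / len(results)


# Hypothetical cases: p is the location of the residue replaced by alanine.
p = (0.0, 0.0, 0.0)
far = (5.0, 0.0, 0.0)
near = (1.0, 0.0, 0.0)

cases = [
    (p, [far], [far, near]),  # a new cavity appears near p: success
    (p, [far], [far]),        # nothing changes near p: failure
]
rate = success_rate(cases)
```

A strict inequality is used so that a mutant whose nearest cavity is no closer than in the wild type does not count as a success.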

The complete results of benchmarking can be downloaded here.

Demo Video

The following video demonstrates typical usage of the RobustCavities web-server, from submitting a job to exploring the cavities.

FAQs

Q1 : How is protein input supplied to the RobustCavities server?
Ans: RobustCavities requires the protein structure as a PDB file in plain text format. You can either upload the PDB file manually or specify just the PDB ID, in which case RobustCavities automatically downloads the structure from the RCSB database.

Q2 : Is there an upper limit on PDB file size?
Ans: Yes. For security reasons, files larger than 5 MB cannot be uploaded.

Q3 : I don't understand all the parameters in the job submission form.
Ans: Hovering over a field in the submission form displays a brief description of that parameter. If you are not sure, you can leave the parameters unchanged, because they are already set to good default values.

Q4 : I don't want to give my email-id to this website. Can I still view the results?
Ans: Yes, you can. Specifying an email-id is optional. After you submit your job, a link to the results page will appear. You can bookmark that link and view the results when they are ready. Specifying your email-id is helpful when the web-server has a long queue of jobs and the waiting period is long.

Contact

Suggestions, comments and feedback: Send email to talha@iisc.ac.in