From Protein Structures to Function Prediction

The goal of our project is to develop new computational methods for predicting the biochemical function of a protein from its 3D structure. Given a PDB file, we compute a multivariate field that represents the geometric and chemical properties of the protein's binding site(s). A shape descriptor then encodes those properties in a concise representation suitable for indexing and for rapid matching with algorithms based on the spherical harmonic and inverse Wigner-D transforms. Finally, the similarity between shape descriptors drives a nearest-neighbor classifier that predicts the biochemical function of the protein from its similarity to proteins of known function. For proteins with dissimilar sequences, this approach achieves higher classification rates than competing methods.
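The final classification step can be illustrated with a minimal sketch. It assumes each binding site has already been reduced to a fixed-length numeric descriptor, and it uses plain Euclidean distance as a stand-in for the spherical harmonic and Wigner-D based matching described above. The function names, the descriptor values, and the function labels are all hypothetical and serve only to show how a nearest-neighbor vote works.

```python
import math
from collections import Counter


def l2_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors.

    Assumption: this simple metric replaces the project's actual
    descriptor matching, which is not specified in detail here.
    """
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_function(query, labeled, k=1):
    """Predict a function label for `query` by k-nearest-neighbor vote.

    `labeled` is a list of (descriptor, function_label) pairs for
    proteins whose function is known.
    """
    if not labeled:
        raise ValueError("need at least one labeled descriptor")
    # Rank known proteins from most to least similar to the query.
    ranked = sorted(labeled, key=lambda item: l2_distance(query, item[0]))
    neighbors = ranked[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    # On a tie, prefer the tied label whose neighbor is closest.
    for _, label in neighbors:
        if votes[label] == top:
            return label


# Toy descriptors and labels, invented purely for illustration.
known = [
    ([0.9, 0.1, 0.0], "hydrolase"),
    ([0.8, 0.2, 0.1], "hydrolase"),
    ([0.1, 0.9, 0.3], "kinase"),
    ([0.0, 0.8, 0.4], "kinase"),
]

if __name__ == "__main__":
    print(predict_function([0.85, 0.15, 0.05], known))
    print(predict_function([0.05, 0.85, 0.35], known, k=3))
```

With k=1 the prediction is simply the label of the single most similar known protein; a larger k trades sensitivity to individual neighbors for robustness to outliers.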

This is joint work with Janet Thornton, Fabian Glaser, Roman Laskowski, Richard Morris, Irilenia Nobeli, and Gareth Stockwell from EBI, and Michael Kazhdan from Princeton.