Protein Symmetry using BioJava

BioJava can be used to detect, analyze, and visualize symmetry and pseudo-symmetry in the quaternary (biological assembly) and tertiary (internal) structural levels of proteins.

Quaternary Symmetry

The quaternary symmetry of a structure defines the relation and arrangement of the individual chains or groups of chains that are part of a biological assembly. For a more exhaustive explanation of protein quaternary symmetry and its different types, visit the PDB help page.

In the quaternary symmetry detection problem, the input is a set of chains (subunits) that are part of a biological assembly, defined by their atomic coordinates, and the required output is the highest overall symmetry group that relates them. The solution is divided into the following steps:

  1. First, we need to identify the chains that are identical (or similar in the pseudo-symmetry case). For that purpose, we perform a pairwise alignment of all chains and identify clusters of identical or similar subunits.
  2. Next, we reduce each polypeptide chain to a single point, its centroid (center of mass).
  3. Afterwards, we try different symmetry operations using a grid search to superimpose the chain centroids and score them using the RMSD.
  4. Finally, based on the parameters (cutoffs), we determine the overall symmetry of the structure, with the symmetry relations obtained in the previous step.
  5. If the structure is asymmetric, we combinatorially discard a number of chains and try to detect any local symmetries present (symmetry that does not involve all subunits of the biological assembly).
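Steps 2 and 3 can be illustrated with a small, self-contained sketch. It is not the BioJava implementation: the class and method names are hypothetical, the trial axis is fixed to the z-axis instead of being found by a grid search over orientations, the centroids are assumed to be centered on the origin, and each rotated centroid is simply matched to its nearest original centroid.

```java
/** Toy illustration of steps 2 and 3: chain centroids and RMSD scoring of trial rotations. */
class CentroidSymmetrySketch {

    /** Arithmetic mean of a set of 3D points (each row is x, y, z). */
    static double[] centroid(double[][] atoms) {
        double[] c = new double[3];
        for (double[] a : atoms) {
            for (int i = 0; i < 3; i++) {
                c[i] += a[i] / atoms.length;
            }
        }
        return c;
    }

    /** Rotate a point about the z-axis by the given angle in radians. */
    static double[] rotateZ(double[] p, double angle) {
        double cos = Math.cos(angle);
        double sin = Math.sin(angle);
        return new double[] { cos * p[0] - sin * p[1], sin * p[0] + cos * p[1], p[2] };
    }

    /** RMSD between the rotated centroids and their nearest original centroids. */
    static double rmsdAfterRotation(double[][] centroids, double angle) {
        double sum = 0;
        for (double[] c : centroids) {
            double[] r = rotateZ(c, angle);
            double best = Double.MAX_VALUE;
            for (double[] o : centroids) {
                double d2 = 0;
                for (int i = 0; i < 3; i++) {
                    d2 += (r[i] - o[i]) * (r[i] - o[i]);
                }
                best = Math.min(best, d2);
            }
            sum += best;
        }
        return Math.sqrt(sum / centroids.length);
    }

    /** Highest rotation order (2..maxOrder) with RMSD below the cutoff; 1 means no rotation fits. */
    static int bestOrder(double[][] centroids, int maxOrder, double cutoff) {
        for (int n = maxOrder; n >= 2; n--) {
            if (rmsdAfterRotation(centroids, 2 * Math.PI / n) < cutoff) {
                return n;
            }
        }
        return 1;
    }

    public static void main(String[] args) {
        // Three toy "chains" of two atoms each, placed 120 degrees apart around the z-axis
        double[][] centroids = new double[3][];
        for (int k = 0; k < 3; k++) {
            double angle = 2 * Math.PI * k / 3;
            double[][] chain = { rotateZ(new double[] { 9, 0, 0 }, angle),
                                 rotateZ(new double[] { 11, 0, 0 }, angle) };
            centroids[k] = centroid(chain);
        }
        System.out.println("Detected rotation order: " + bestOrder(centroids, 6, 0.5));
    }
}
```

For the toy trimer, only the threefold rotation maps the centroids onto each other within the cutoff, which corresponds to a C3 arrangement. The cutoff of 0.5 is an arbitrary value chosen for the example.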

The quaternary symmetry detection algorithm is implemented in the BioJava class QuatSymmetryDetector. An example of how to use it programmatically is shown below:

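// Imports (package names as of BioJava 5; verify them against your version)
import java.util.List;
import org.biojava.nbio.structure.Structure;
import org.biojava.nbio.structure.StructureIO;
import org.biojava.nbio.structure.cluster.SubunitClustererParameters;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryDetector;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryParameters;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryResults;

// First download the structure in the biological assembly form
// (4HHB, assembly 1, is only an example choice)
Structure s = StructureIO.getBiologicalAssembly("4HHB", 1);

// Set some parameters if values other than the defaults are needed - see descriptions
QuatSymmetryParameters parameters = new QuatSymmetryParameters();
SubunitClustererParameters clusterParams = new SubunitClustererParameters();

// No detector instance is needed: static methods in QuatSymmetryDetector perform the calculation
// (named calcGlobalSymmetry and calcLocalSymmetries as of BioJava 5)
QuatSymmetryResults globalResults = QuatSymmetryDetector.calcGlobalSymmetry(s, parameters, clusterParams);
List<QuatSymmetryResults> localResults = QuatSymmetryDetector.calcLocalSymmetries(s, parameters, clusterParams);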

See also the demo provided in BioJava for a real case working example.

The returned QuatSymmetryResults object contains all the information about the subunit clustering and the structural symmetry. This object will be used later to obtain the axes of symmetry, the point group name, and the stoichiometry, or even to display the results in Jmol. For an asymmetric structure, the result is a C1 point group. The local symmetry is returned as a List because there can be multiple valid local symmetries. The list is empty if the structure has no local symmetries.

Global Symmetry

In the global symmetry mode all chains have to be part of the symmetry result.

Point Group

In a point group, one or more rotation axes define the overall symmetry operations, with the property that all the axes intersect at a single point.

PDB ID 1VYM

Helical

In helical symmetry there is a single axis with rotation and translation components.

PDB ID 4UDV
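The difference from a point group can be seen in a minimal sketch of a helical (screw) operation. The class name, the choice of the z-axis as the helix axis, and the angle and rise values are assumptions made only for illustration.

```java
/** Toy illustration of a helical operation: a rotation combined with a translation along the axis. */
class HelicalOperationSketch {

    /** Rotate a point about the z-axis by angleDeg degrees, then shift it by rise along z. */
    static double[] screw(double[] p, double angleDeg, double rise) {
        double a = Math.toRadians(angleDeg);
        return new double[] {
            Math.cos(a) * p[0] - Math.sin(a) * p[1],
            Math.sin(a) * p[0] + Math.cos(a) * p[1],
            p[2] + rise
        };
    }

    public static void main(String[] args) {
        // Generate successive subunit positions from a starting centroid
        double[] p = { 10, 0, 0 };
        for (int i = 0; i < 4; i++) {
            System.out.printf("subunit %d: (%.2f, %.2f, %.2f)%n", i, p[0], p[1], p[2]);
            p = screw(p, 90, 5);
        }
    }
}
```

With a pure rotation of 90 degrees, the fourth application would return a subunit to its starting position. With the translation component it ends up one full turn further along the axis, so the arrangement never closes on itself.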

Local Symmetry

In local symmetry a number of chains is left out, so that the symmetry only applies to a subset of chains.

PDB ID 4F88

Pseudo-Symmetry

In pseudo-symmetry the chains related by the symmetry are not completely identical, but they share a sequence or structural similarity above the pseudo-symmetry similarity threshold.

If we consider hemoglobin, at a 95% sequence identity threshold the alpha and beta subunits are considered different, which corresponds to an A2B2 stoichiometry and a C2 point group. At the structural similarity level, all four chains are considered homologous (~45% sequence identity), which gives an A4 pseudostoichiometry and D2 pseudosymmetry.

PDB ID 4HHB
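The effect of the similarity threshold on the stoichiometry can be sketched with a toy clustering. This is not the BioJava subunit clusterer: the class is hypothetical, it uses a simple greedy assignment to cluster representatives, and a single identity matrix with a lowered threshold stands in for the structural similarity criterion. The 0.45 value is the approximate alpha/beta identity quoted above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of how a similarity threshold changes the stoichiometry. */
class StoichiometrySketch {

    /**
     * Assign each subunit to the first cluster whose representative is at least
     * as similar as the threshold, then write cluster sizes as A2B2-style text.
     */
    static String stoichiometry(double[][] identity, double threshold) {
        List<Integer> representatives = new ArrayList<>();
        List<Integer> sizes = new ArrayList<>();
        for (int i = 0; i < identity.length; i++) {
            boolean placed = false;
            for (int c = 0; c < representatives.size() && !placed; c++) {
                if (identity[i][representatives.get(c)] >= threshold) {
                    sizes.set(c, sizes.get(c) + 1);
                    placed = true;
                }
            }
            if (!placed) {
                representatives.add(i);
                sizes.add(1);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c < sizes.size(); c++) {
            sb.append((char) ('A' + c));
            if (sizes.get(c) > 1) {
                sb.append(sizes.get(c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hemoglobin-like toy matrix: chains ordered alpha, beta, alpha, beta
        double[][] identity = {
            { 1.00, 0.45, 1.00, 0.45 },
            { 0.45, 1.00, 0.45, 1.00 },
            { 1.00, 0.45, 1.00, 0.45 },
            { 0.45, 1.00, 0.45, 1.00 }
        };
        System.out.println("Threshold 0.95: " + stoichiometry(identity, 0.95));
        System.out.println("Threshold 0.40: " + stoichiometry(identity, 0.40));
    }
}
```

At the strict threshold the alpha and beta chains fall into separate clusters, and at the relaxed threshold all four chains are grouped together.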

Internal Symmetry

Internal symmetry refers to the symmetry present in a single chain, that is, in the tertiary structure. The algorithm implemented in BioJava to detect internal symmetry is called CE-Symm.

CE-Symm

The CE-Symm algorithm was originally developed by Myers-Turnbull D., Bliven SE., Rose PW., Aziz ZK., Youkharibache P., Bourne PE. & Prlić A. in 2014 (http://www.sciencedirect.com/science/article/pii/S0022283614001557). As the name of the algorithm explicitly states, CE-Symm uses the Combinatorial Extension (CE) algorithm to generate an alignment of the structure chain to itself, disabling the identity alignment (the diagonal of the DotPlot representation of a structure alignment). This allows the identification of alternative self-alignments, which are related to symmetry and/or structural repeats inside the chain.
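The reason for disabling the diagonal can be shown with a toy self-comparison of a string. This is not the CE algorithm and does not use structural data: the class is hypothetical, and a shifted character match stands in for a structural self-alignment.

```java
/** Toy illustration of why the identity diagonal is excluded from a self-alignment. */
class SelfAlignmentSketch {

    /** Fraction of positions i where s[i] equals s[i + offset]. */
    static double score(String s, int offset) {
        int n = s.length() - offset;
        if (n <= 0) {
            return 0;
        }
        int matches = 0;
        for (int i = 0; i < n; i++) {
            if (s.charAt(i) == s.charAt(i + offset)) {
                matches++;
            }
        }
        return (double) matches / n;
    }

    /** Best scoring offset between minOffset and half the length (first best wins on ties). */
    static int bestOffset(String s, int minOffset) {
        int best = -1;
        double bestScore = -1;
        for (int k = minOffset; k <= s.length() / 2; k++) {
            double sc = score(s, k);
            if (sc > bestScore) {
                bestScore = sc;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String repeats = "ABCDABCDABCD";
        // Offset 0 is the identity alignment, which always scores perfectly
        System.out.println("Diagonal allowed:  offset " + bestOffset(repeats, 0));
        // Excluding it reveals the alternative self-alignment at the repeat length
        System.out.println("Diagonal disabled: offset " + bestOffset(repeats, 1));
    }
}
```

When the zero offset is allowed, the trivial identity match wins and hides the repeat. Once it is excluded, the best alternative offset corresponds to the length of the repeated unit.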

By a procedure called refinement, the subunits of the chain that are part of the symmetry are defined and a multiple alignment is created. This process can be thought of as dividing the chain into subchains and then superimposing the subchains on each other to create a multiple alignment of the subunits that respects the symmetry axes.

The internal symmetry detection algorithm is implemented in the BioJava class CeSymm. It returns a MultipleAlignment object (see the explanation of the model in Data Models) that describes the similarity of the internal repeats. If no symmetry is detected, the returned alignment represents the optimal self-alignment produced by the first step of the CE-Symm algorithm.

//Input the atoms in a chain as an array
Atom[] atoms = StructureTools.getRepresentativeAtomArray(chain);

//Initialize the algorithm
CeSymm ceSymm = new CeSymm();

//Choose some parameters
CESymmParameters params = ceSymm.getParameters();
params.setRefineMethod(RefineMethod.SINGLE);
params.setOptimization(true);
params.setMultipleAxes(true);

//Run the symmetry analysis - alignment as an output
MultipleAlignment symmetry = ceSymm.analyze(atoms, params);

//Test whether the returned alignment was refined
boolean refined = SymmetryTools.isRefined(symmetry);

//Get the axes of symmetry from the aligner
SymmetryAxes axes = ceSymm.getSymmetryAxes();

//Display the results in jmol with the SymmetryDisplay
SymmetryDisplay.display(symmetry, axes);

//Show the point group, if any of the internal symmetry
QuatSymmetryResults pg = SymmetryTools.getQuaternarySymmetry(symmetry);
System.out.println(pg.getSymmetry());

To enable some extra features in the display, a SymmetryDisplay class has been created. The MultipleAlignmentDisplay class can also be used for that purpose, but it will not show symmetry axes or symmetry menus.

Lastly, the SymmetryGUI class in the structure-gui package provides a GUI to trigger internal symmetry analysis, equivalent to the GUI to trigger structure alignments.

Symmetry Display

The symmetry display is similar to the quaternary symmetry display, because part of the code is shared. See for example this beta-propeller (1U6D), where the repeated beta-sheets are connected by a linker, forming an internal symmetry with a C6 point group:

PDB ID 1U6D

Hierarchical Symmetry

One additional feature of the internal symmetry display is the representation of hierarchical symmetries and repeats. In contrast to point groups, some structures have different levels of symmetry. That is, the whole structure has, e.g., C2 symmetry and, at the same time, each of the two parts has C2 symmetry, but the axes of both levels are not related by a point group (i.e. they do not intersect at a single point).

A very clear example is the beta-gamma-crystallins, like 4GCR:

PDB ID 4GCR

Subunit Multiple Alignment

Another feature of the display is the option to show the multiple alignment of the symmetry-related subunits created during the refinement process. Search for the option Subunit Superposition in the symmetry menu of the Jmol window. For the previous example, the display looks like this:

PDB ID 4GCR

The subunit display highlights the differences and similarities between the symmetry-related subunits of the chain, and helps the user to identify conserved and divergent regions with the help of the Sequence Alignment Panel.

Quaternary + Internal Overall Symmetry

Finally, the internal and quaternary symmetries can be merged to obtain the overall combined symmetry. As we have seen before, the protein 1VYM is a DNA-clamp that has three chains arranged in a C3 symmetry. Each chain is internally fourfold symmetric with two levels of symmetry. We can analyze the overall symmetry of the structure by considering together the C3 quaternary symmetry and the fourfold internal symmetry. In this case, the internal symmetry augments the point group of the quaternary symmetry to a D6 overall symmetry, as we can see in the figure below:

PDB ID 1VYM
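A quick consistency check on this result: a cyclic group Cn has n operations and a dihedral group Dn has 2n, so a D6 point group relates 12 equivalent units, which matches 3 chains with 4 internal repeats each. The sketch below only performs that arithmetic. The class is hypothetical and handles only C and D point group names.

```java
/** Toy check that the number of repeats matches the order of the combined point group. */
class PointGroupOrderSketch {

    /** Number of symmetry operations in a cyclic (Cn) or dihedral (Dn) point group. */
    static int order(String pointGroup) {
        char type = pointGroup.charAt(0);
        int n = Integer.parseInt(pointGroup.substring(1));
        switch (type) {
            case 'C':
                return n;
            case 'D':
                return 2 * n;
            default:
                throw new IllegalArgumentException("Only Cn and Dn are handled: " + pointGroup);
        }
    }

    public static void main(String[] args) {
        int chains = 3;          // quaternary C3 arrangement
        int repeatsPerChain = 4; // fourfold internal symmetry of each chain
        int totalRepeats = chains * repeatsPerChain;
        System.out.println("Total repeats: " + totalRepeats);
        System.out.println("Order of D6:   " + order("D6"));
        System.out.println("Consistent:    " + (totalRepeats == order("D6")));
    }
}
```

The check assumes that every internal repeat of every chain acts as one symmetry-equivalent unit of the combined symmetry.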

An example of how to enable the combined symmetry (quaternary + internal symmetries) programmatically is shown below:

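// Imports (package names as of BioJava 5; verify them against your version)
import org.biojava.nbio.structure.Structure;
import org.biojava.nbio.structure.StructureIO;
import org.biojava.nbio.structure.cluster.SubunitClustererMethod;
import org.biojava.nbio.structure.cluster.SubunitClustererParameters;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryDetector;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryParameters;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryResults;

// First download the structure in the biological assembly form
// (1VYM is the example discussed above; assembly number 1 is an assumption)
Structure s = StructureIO.getBiologicalAssembly("1VYM", 1);

// Initialize default parameters
QuatSymmetryParameters parameters = new QuatSymmetryParameters();
SubunitClustererParameters clusterParams = new SubunitClustererParameters();

// In SubunitClustererParameters set the clustering method to STRUCTURE and the internal symmetry option to true
clusterParams.setClustererMethod(SubunitClustererMethod.STRUCTURE);
clusterParams.setInternalSymmetry(true);

// You can lower the default structural coverage to improve the recall
clusterParams.setStructureCoverageThreshold(0.75);

// No detector instance is needed: a static method in QuatSymmetryDetector performs the calculation
// (named calcGlobalSymmetry as of BioJava 5)
QuatSymmetryResults overallResults = QuatSymmetryDetector.calcGlobalSymmetry(s, parameters, clusterParams);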

See also the test provided in BioJava for a real case working example.

