

Result analysis

Protein models generated by CS-GAMDy and their GeNMR scores are located in the sub-directory "results" inside the project directory. PDB files with model coordinates are placed in the folder "final_models". The file listing the model names together with the corresponding GeNMR scores, genetic algorithm iterations, generation times and, optionally, RMSD to the reference model can be found in the folder "final_scores".


Evaluating success of individual CS-GAMDy runs:

To assess the success of a CS-GAMDy simulation, users are advised to apply two criteria commonly used for experimentally restrained protein modelling in CS-Rosetta: the RMSD criterion and the score-drop criterion (1-3).

1) RMSD test:

a) Rank all output models by the GeNMR score and take a cluster of ten models with the best (i.e. lowest) GeNMR score.

b) Identify the best-scoring model in the cluster and calculate the CA RMSD over non-coil regions of each of the remaining 9 models with respect to that model. If the average CA RMSD of the 9 models to the best-scoring model is 1.5 Å or less, we consider the RMSD criterion satisfied (black lines, Figure 1).
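The RMSD test above can be sketched in Python as follows. This is only an illustration, not part of CS-GAMDy: the function name rmsd_criterion, the mapping of model names to GeNMR scores, and the user-supplied rmsd callable (assumed to return the non-coil CA RMSD in Å between two named models, for example computed with an external structure-comparison tool) are all hypothetical.

```python
from typing import Callable, Mapping, Tuple


def rmsd_criterion(
    scores: Mapping[str, float],
    rmsd: Callable[[str, str], float],
    cluster_size: int = 10,
    cutoff: float = 1.5,
) -> Tuple[str, float, bool]:
    """Apply the RMSD test to a set of scored models.

    scores: model name -> GeNMR score (lower is better).
    rmsd:   hypothetical helper returning the non-coil CA RMSD (in Å)
            between two models identified by name.
    """
    # Rank models from best (lowest) to worst GeNMR score.
    ranked = sorted(scores, key=lambda name: scores[name])
    cluster = ranked[:cluster_size]
    if len(cluster) < 2:
        raise ValueError("need at least two models to compare")

    best, rest = cluster[0], cluster[1:]
    # Average RMSD of the remaining cluster members to the best-scoring model.
    mean_rmsd = sum(rmsd(best, other) for other in rest) / len(rest)
    return best, mean_rmsd, mean_rmsd <= cutoff
```

The function returns the name of the best-scoring model, the average RMSD of the other cluster members to it, and whether the 1.5 Å threshold is met.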


2) Score-drop test:

Repeat the simulations with the same parameters and inputs but without the experimental data. If the GeNMR score of the simulations with the experimental data is better (i.e. the GeNMR score drops) than the GeNMR score of the simulations without the experimental data, the score-drop criterion is satisfied (green lines, Figure 1).
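A minimal sketch of the score-drop comparison is given below. The text does not state which statistic of the two score sets should be compared; this example assumes the best (lowest) GeNMR score of each set, and the function name score_drop_criterion is hypothetical.

```python
from typing import Sequence, Tuple


def score_drop_criterion(
    scores_with_data: Sequence[float],
    scores_without_data: Sequence[float],
) -> Tuple[float, bool]:
    """Compare GeNMR scores from runs with and without experimental data.

    Assumption: the best (lowest) score of each set is compared.
    """
    if not scores_with_data or not scores_without_data:
        raise ValueError("both score lists must be non-empty")

    best_with = min(scores_with_data)
    best_without = min(scores_without_data)
    # A positive drop means the run with experimental data scored lower (better).
    drop = best_without - best_with
    return drop, drop > 0
```

For example, score_drop_criterion([-30.0, -25.0], [-10.0, -5.0]) gives a drop of 20.0 and reports the criterion as satisfied.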


Both the RMSD and the score-drop criteria need to be met for a simulation to be considered successful.


3) Additional correlation test for severely distorted models:

For severely distorted models with poor GeNMR scores (above 0), an additional indication of success or failure can be obtained from the Pearson correlation coefficient between the GeNMR score and the CA RMSD to the best-scoring model (red lines, Figure 1). Successful simulations often have correlation coefficients above 0.5, whereas failed simulations have correlation coefficients near 0. While this criterion can be useful for evaluating CS-GAMDy success for models with significant 3D distortions (non-coil CA RMSD to the reference model > 3 Å), it frequently fails for the refinement of near-native models (non-coil CA RMSD to the reference model < 2 Å).
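The correlation test can be illustrated with a short Python sketch. The function name pearson and the input lists are hypothetical; the lists are assumed to hold, in matching order, the GeNMR score of each model and its CA RMSD to the best-scoring model.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

With genmr_scores and rmsd_to_best as the two lists, a value of pearson(genmr_scores, rmsd_to_best) above 0.5 would point towards a successful simulation and a value near 0 towards a failed one, following the guidance above.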


Evaluating uncertainty of CS-GAMDy results:

Run 10 or more independent CS-GAMDy simulations. Among the successful runs (see the success criteria above), select the 5 with the best average GeNMR scores and form an ensemble from their best-scoring models. If the CA RMSD of this ensemble to the ensemble mean is within 2 Å, the uncertainty of the CS-GAMDy results can be considered acceptable.
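The selection step can be sketched as follows. The Run record, the function name uncertainty_acceptable and the user-supplied rmsd_to_ensemble_mean callable are hypothetical; the callable is assumed to return, for the best-scoring models of the named runs, the CA RMSD (in Å) of each model to the ensemble mean. The sketch also assumes that every ensemble member must fall within the 2 Å threshold, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Run:
    name: str          # identifier of one independent CS-GAMDy run
    successful: bool   # outcome of the RMSD and score-drop tests
    mean_score: float  # average GeNMR score of the run (lower is better)


def uncertainty_acceptable(
    runs: Sequence[Run],
    rmsd_to_ensemble_mean: Callable[[List[str]], Sequence[float]],
    n_select: int = 5,
    cutoff: float = 2.0,
) -> Tuple[List[str], bool]:
    """Pick the best successful runs and check their spread around the mean."""
    successful = [run for run in runs if run.successful]
    if len(successful) < n_select:
        raise ValueError("not enough successful runs")

    # Keep the runs with the lowest (best) average GeNMR scores.
    chosen = sorted(successful, key=lambda run: run.mean_score)[:n_select]
    names = [run.name for run in chosen]

    # Assumption: all ensemble members must lie within the cutoff.
    rmsds = rmsd_to_ensemble_mean(names)
    return names, max(rmsds) <= cutoff
```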

References:

1. Y. Shen, O. Lange, F. Delaglio et al., Proc Natl Acad Sci U S A 105 (12), 4685 (2008).
2. S. Raman, O. F. Lange, P. Rossi et al., Science 327 (5968), 1014 (2010).
3. J. M. Thompson, N. G. Sgourakis, G. Liu et al., Proc Natl Acad Sci U S A 109 (25), 9875 (2012).


Problems? Suggestions? Please contact the Wishart group.