Announcing a fundamentally new way to superimpose structures: Maximum likelihood instead of least squares.

http://www.theseus3d.org/

The Program:

THESEUS is a unix command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While all conventional superpositioning methods use ordinary least-squares as the optimization criterion, THESEUS uses maximum likelihood, which provides superpositions with substantially improved accuracy (see the figure at http://www.theseus3d.org/ for an example). When superpositioning macromolecules with different residue sequences, other programs and algorithms currently discard residues that are aligned with gaps. THESEUS, however, uses a novel ML algorithm that includes all of the available data.

The Rationale:

Over 30 years ago, Cox, Diamond, McLachlan, Kabsch, and others investigated and solved the least-squares superposition problem for macromolecular structures (Flower 1999), and the least-squares method has been used effectively ever since for comparing structures. However, least-squares is not ideal. As a fitting criterion, least-squares is based theoretically on two strong assumptions: (1) that all atoms in a structure have the same variability and (2) that all atoms are independent and uncorrelated. We know that both of these assumptions are false. Some regions of a structure are more variable than others, and atoms are connected to each other via chemical bonds. The ML method used by THESEUS properly down-weights variable structural regions and corrects for correlations among atoms.

The Benefits:

ML superpositioning is robust and insensitive to the specific atoms included in the analysis. In current practice, regions of structures that are considered "unsuperimposable" or divergent are subjectively excluded from the superposition.
However, when doing an ML superposition, you do not need to hand-prune selected variable atomic coordinates, since the variability is already accounted for in the ML method.

ML superpositioning will greatly improve our ability to accurately compare biological macromolecules in many applications, including analysis of NMR families, alternate crystal structures, evolutionarily homologous molecules, molecular dynamics simulations, and de novo structure predictions.

Output from THESEUS includes both likelihood-based and frequentist statistics for evaluating the adequacy of a superposition and for reliable analysis of structural similarities and differences. Residue ranges to exclude from or include in the superposition can be specified on the command line. For ease of comparison, THESEUS also calculates least-squares superpositions. Additionally, THESEUS performs principal components analysis (PCA) for analyzing the complex correlations found among the atoms and residues within a structural ensemble.

Source code and binaries for several platforms are available from:

http://www.theseus3d.org/

Refs:

Theobald, D. L. and Wuttke, D. S. (2006) "THESEUS: Maximum likelihood superpositioning and analysis of macromolecular structures." Bioinformatics 22(17):2171
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/22/17/2171

Overview of mathematical results and algorithm (supplementary materials from Theobald & Wuttke 2006):
http://www.theseus3d.org/pdfs/Theobald_Wuttke_2006_Bioinformatics_THESEUS_SuppMat.pdf

Theobald, D. L. and Wuttke, D. S. (2006) "Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem." PNAS, in press

Cox, J. M. (1967) "Mathematical methods used in the comparison of the quaternary structures." J Mol Biol, 28, 151–156.

Diamond, R. (1966) "A mathematical model-building procedure for proteins." Acta Crystallogr, 21, 253–266.

Diamond, R. (1976) "On the comparison of conformations using linear and quadratic transformations." Acta Crystallogr A, 32, 1–10.

Flower, D. R. (1999) "Rotational superposition: A review of methods." J Mol Graph Model, 17, 238–244.

Kabsch, W. (1978) "A discussion of the solution for the best rotation to relate two sets of vectors." Acta Crystallogr A, 34, 827–828.

McLachlan, A. (1972) "A mathematical procedure for superimposing atomic coordinates of proteins." Acta Crystallogr A, 28, 656–657.
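
Illustration (not part of THESEUS): to make the equal-variance assumption discussed under "The Rationale" concrete, here is a minimal Python/NumPy sketch. It is not the THESEUS algorithm. THESEUS estimates variances and correlations from the data by maximum likelihood, whereas this toy example assumes the per-atom variances are already known, ignores correlations entirely, and simply uses inverse-variance weights in a Kabsch-style SVD fit. The function names (superpose, rmsd, demo) and all numerical settings are hypothetical choices made for the example.

```python
import numpy as np


def superpose(mobile, target, weights=None):
    """Rigidly fit `mobile` onto `target` (both N x 3) by weighted least squares.

    weights=None gives every atom equal weight, i.e. ordinary least squares.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    w = np.ones(len(mobile)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Remove the (weighted) centroids so that only a rotation remains to be found.
    mobile_center = w @ mobile
    target_center = w @ target
    x = mobile - mobile_center
    y = target - target_center

    # Kabsch-style solution: the optimal rotation comes from the SVD of the
    # 3 x 3 weighted cross-covariance matrix.
    u, _, vt = np.linalg.svd(x.T @ (w[:, None] * y))
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return x @ rotation.T + target_center


def rmsd(a, b):
    """Root-mean-square distance between two matched N x 3 coordinate sets."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


def demo(seed=0):
    """Return (core RMSD with equal weights, core RMSD with 1/variance weights)."""
    rng = np.random.default_rng(seed)
    n_core, n_loop = 40, 10
    reference = rng.normal(scale=10.0, size=(n_core + n_loop, 3))

    # Assumed per-atom standard deviations: a tight core and a floppy loop.
    sigma = np.concatenate([np.full(n_core, 0.05), np.full(n_loop, 5.0)])
    noisy = reference + rng.normal(size=reference.shape) * sigma[:, None]

    # Move the noisy copy by an arbitrary rotation about z plus a translation.
    c, s = np.cos(1.0), np.sin(1.0)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    moved = noisy @ rot_z.T + np.array([3.0, -2.0, 1.0])

    fit_ls = superpose(moved, reference)                      # equal weights
    fit_w = superpose(moved, reference, weights=sigma ** -2)  # inverse-variance weights

    core = slice(0, n_core)
    return rmsd(fit_ls[core], reference[core]), rmsd(fit_w[core], reference[core])


if __name__ == "__main__":
    core_ls, core_w = demo()
    print(f"core RMSD, ordinary least squares:   {core_ls:.3f}")
    print(f"core RMSD, inverse-variance weights: {core_w:.3f}")
```

With equal weights, the few highly variable atoms pull the whole fit away from the well-ordered core. Down-weighting them keeps the core tightly superposed without removing any atoms by hand, which is the behavior the announcement attributes to the ML method.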