
    High-throughput computational methods in X-ray protein crystallography are indispensable for meeting the goals of structural genomics. In particular, automated interpretation of electron density maps, especially those at mediocre resolution, can significantly speed up protein structure determination. TEXTAL™ is a software application that uses pattern recognition, case-based reasoning and nearest neighbor learning to produce reasonably refined molecular models, even from average-quality data. In this work, we discuss a key issue in enabling fast and accurate interpretation of typically noisy electron density data: what features should be used to characterize the density patterns, and how relevant is each of them? We discuss the challenges of constructing features in this domain, and describe SLIDER, an algorithm that determines the weights of these features. SLIDER searches a space of weights, using the ranking of matching patterns (relative to mismatching ones) as its evaluation function. Because exhaustive search is intractable, SLIDER adopts a greedy approach that restricts the search to only those weight values that cause the ranking of good matches to change. We show that SLIDER contributes significantly to finding the similarity between density patterns, and discuss the sensitivity of feature relevance to the underlying similarity metric.
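
The idea of searching only those weight values at which a ranking changes can be illustrated with a minimal Python sketch. This is not the authors' implementation: the weighted squared Euclidean distance, the (query, match, mismatch) triples, the weight range [0, 1], the starting weight of 0.5, and all function names are assumptions made for illustration. Under this distance, each pattern's distance to the query is linear in any single weight, so a match and a mismatch can swap rank at only one value of that weight; the ranking is constant between consecutive crossover values, and testing one point per interval is enough.

```python
def weighted_sq_dist(w, q, x):
    """Weighted squared Euclidean distance (an assumed similarity metric)."""
    return sum(wi * (qi - xi) ** 2 for wi, qi, xi in zip(w, q, x))


def ranking_score(w, triples):
    """Count (query, match, mismatch) triples in which the match ranks closer."""
    return sum(
        1 for q, m, n in triples
        if weighted_sq_dist(w, q, m) < weighted_sq_dist(w, q, n)
    )


def crossover_weights(w, i, triples, lo=0.0, hi=1.0):
    """Values of w[i] in (lo, hi) at which some match/mismatch pair swaps rank."""
    points = set()
    for q, m, n in triples:
        a_m = (q[i] - m[i]) ** 2
        a_n = (q[i] - n[i]) ** 2
        if a_m == a_n:
            continue  # feature i cannot change the order of this pair
        rest_m = weighted_sq_dist(w, q, m) - w[i] * a_m
        rest_n = weighted_sq_dist(w, q, n) - w[i] * a_n
        c = (rest_n - rest_m) / (a_m - a_n)  # solves rest_m + c*a_m == rest_n + c*a_n
        if lo < c < hi:
            points.add(c)
    return sorted(points)


def slide_feature(w, i, triples, lo=0.0, hi=1.0):
    """Try one weight value per interval between crossovers; keep the best."""
    cuts = [lo] + crossover_weights(w, i, triples, lo, hi) + [hi]
    best_w, best_score = list(w), ranking_score(w, triples)
    for a, b in zip(cuts, cuts[1:]):
        trial = list(w)
        trial[i] = (a + b) / 2  # ranking is constant inside (a, b)
        score = ranking_score(trial, triples)
        if score > best_score:
            best_w, best_score = trial, score
    return best_w, best_score


def greedy_weights(triples, n_features, max_sweeps=10):
    """Greedy coordinate search: adjust one weight at a time until no gain."""
    w = [0.5] * n_features
    score = ranking_score(w, triples)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n_features):
            new_w, new_score = slide_feature(w, i, triples)
            if new_score > score:
                w, score, improved = new_w, new_score, True
        if not improved:
            break
    return w, score


# Toy data (hypothetical): feature 0 separates matches from mismatches,
# feature 1 is misleading.
triples = [
    ((0.0, 0.0), (0.1, 1.0), (1.0, 0.1)),
    ((1.0, 1.0), (0.9, 0.0), (0.0, 0.9)),
]
weights, score = greedy_weights(triples, n_features=2)
```

On this toy data the search should raise the weight of the informative feature relative to the misleading one. A different similarity metric would generally change where the crossovers fall, and therefore the weights found.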

