A study of combined structure/sequence profiles

1996

https://doi.org/10.1016/S1359-0278(96)00061-2

Abstract

Background: For genome sequencing projects to achieve their full impact on biology and medicine, each protein sequence must be identified with its three-dimensional structure. Fold assignment methods (also called profile and threading methods) attempt to assign sequences to known protein folds by computing the compatibility of sequence to fold.
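
As a rough illustration of what "computing the compatibility of sequence to fold" can mean in a profile-style method, the sketch below aligns a sequence to a position-specific score table with a simple dynamic program and picks the best-scoring fold. It is not the scoring scheme studied in this paper: the profile values, the linear gap penalty, the default score for unlisted residues, and the names `compatibility`, `assign_fold` and `library` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one residue -> score table per fold position.
Profile = List[Dict[str, float]]


def compatibility(seq: str, profile: Profile,
                  gap: float = -1.0, default: float = -0.5) -> float:
    """Best global alignment score of seq against profile.

    Uses a linear gap penalty (an assumption; real methods often use
    affine or position-dependent penalties).
    """
    n, m = len(seq), len(profile)
    # dp[i][j] is the best score for aligning seq[:i] with profile[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Score for placing residue seq[i-1] at fold position j-1.
            match = dp[i - 1][j - 1] + profile[j - 1].get(seq[i - 1], default)
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


def assign_fold(seq: str,
                library: Dict[str, Profile]) -> Tuple[str, Dict[str, float]]:
    """Score seq against every fold profile and return the top-scoring fold."""
    scores = {name: compatibility(seq, prof) for name, prof in library.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Toy fold library with made-up scores, for illustration only.
library = {
    "helix_like": [{"A": 1.0, "L": 1.0, "E": 0.8}] * 4,
    "strand_like": [{"V": 1.0, "I": 1.0, "T": 0.8}] * 4,
}

best, scores = assign_fold("ALEA", library)
print(best, scores)
```

In practice the raw score is usually normalized (for example as a Z-score against the scores of unrelated sequences) before folds of different lengths are compared; the sketch omits that step.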
