Protein Disorder Prediction

2003, Structure

https://doi.org/10.1016/J.STR.2003.10.002

Biocomputing Unit, Meyerhofstr 1, D-69117 Heidelberg, Germany
2 Max-Delbrück-Centre für Molecular Medicine, Robert-Rössle-Strasse 10, D-13092 Berlin, Germany
3 CellZome GmbH, Meyerhofstr 1, D-69117 Heidelberg, Germany

Summary

A great challenge in the proteomics and structural genomics era is to predict protein structure and function, including identification of those proteins that are partially or wholly unstructured. Disordered regions in proteins often contain short linear peptide motifs (e.g., SH3 ligands and targeting signals) that are important for protein function. We present here DisEMBL, a computational tool for prediction of disordered/unstructured regions within a protein sequence. As no clear definition of disorder exists, we have developed parameters based on several alternative definitions and introduced a new one based on the concept of "hot loops," i.e., coils with high temperature factors. Avoiding potentially disordered segments in protein expression constructs can increase expression, foldability, and stability of the expressed protein. DisEMBL is thus useful for target selection and the design of constructs as needed for many biochemical studies, particularly structural biology and structural genomics projects. The tool is freely available via a web interface (http://dis.embl.de) and can be downloaded for use in large-scale studies.

Introduction

It is becoming increasingly clear that many functionally important protein segments occur outside of globular domains (Wright and Dyson, 1999; Dunker et al., 2002). Protein structure and function space is partitioned in two subspaces. The first consists of globular units with binding pockets, active sites, and interaction surfaces. The second subspace contains nonglobular segments such as sorting signals, posttranslational modification sites, and protein ligands (e.g., SH3 ligands). Globular units are built of regular secondary structure elements and contribute the majority of the structural data deposited in PDB. In contrast, the nonglobular subspace encompasses disordered, unstructured and flexible regions without regular secondary structure. Functional sites within the nonglobular space are known as linear motifs (cataloged by ELM [http://elm.eu.org]) (Puntervoll et al., 2003).

There are also many recent reports of Intrinsically Disordered Proteins (IDPs, also known as Intrinsically Unstructured Proteins). These are proteins or domains that, in their native state, are either completely disordered or contain large disordered regions. More than 100 such proteins are known including Tau, Prions, Bcl-2, p53, 4E-BP1, and eIF1A (see Figure 4) (Tompa, 2002; Uversky, 2002).

Protein disorder is important for understanding protein function as well as protein folding pathways (Plaxco and Gross, 2001; Verkhivker et al., 2003). Although little is understood about the cellular and structural meaning of IDPs, they are thought to become ordered only when bound to another molecule (e.g., CREB-CBP complex [Radhakrishnan et al., 1997]) or owing to changes in the biochemical environment (Dunker et al., 2001, 2002; Uversky, 2002).

The current view on disorder is that disordered proteins are disordered to allow for more interaction partners and modification sites (Wright and Dyson, 1999; Liu et al., 2002; Tompa, 2002). It has also been suggested that disordered proteins exist to provide a simple solution to having large intermolecular interfaces while keeping smaller protein, genome and cell sizes (Gunasekaran
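The "hot loops" definition given in the Summary (coils with high temperature factors) can be illustrated with a minimal sketch. This is not the DisEMBL implementation: the function name `hot_loop_mask`, the choice of DSSP codes treated as regular secondary structure (`H`, `G`, `E`), the per-chain z-score normalization of B-factors, and the cutoff value are all assumptions made here for illustration.

```python
from statistics import mean, pstdev

# Assumption: DSSP codes counted as regular secondary structure;
# every other code (including "-" or blank) is treated as coil/loop.
ORDERED_STATES = {"H", "G", "E"}

# Hypothetical cutoff on the per-chain B-factor z-score.
HOT_Z_CUTOFF = 1.0


def hot_loop_mask(dssp_states, b_factors, z_cutoff=HOT_Z_CUTOFF):
    """Return one boolean per residue: True where the residue is a coil
    whose B-factor z-score exceeds z_cutoff (a "hot loop" in this sketch)."""
    if len(dssp_states) != len(b_factors):
        raise ValueError("need one DSSP state and one B-factor per residue")
    if not b_factors:
        return []

    mu = mean(b_factors)
    sigma = pstdev(b_factors)

    mask = []
    for state, b in zip(dssp_states, b_factors):
        is_coil = state not in ORDERED_STATES
        # With zero spread no residue can stand out as "hot".
        z = (b - mu) / sigma if sigma > 0 else 0.0
        mask.append(is_coil and z > z_cutoff)
    return mask


if __name__ == "__main__":
    states = "HHHH--TTEEEE"
    bfac = [10, 10, 10, 10, 60, 65, 20, 15, 10, 10, 10, 10]
    print(hot_loop_mask(states, bfac))
```

In this toy input only the two coil residues with markedly elevated B-factors are flagged; a helix residue with a high B-factor is never flagged, because the definition requires both conditions.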

References

  1. structure captures protein flexibility. Structure 10, 175-184.
  2. Aviles, F., Chapman, G., Kneale, G., Crane-Robinson, C., and Bradbury, E. (1978). The conformation of histone H5. Isolation and characterisation of the globular segment. Eur. J. Biochem. 88, 363-371.
  3. Bates, G. (2003). Huntingtin aggregation and toxicity in Huntington's disease. Lancet 361, 1642-1644.
  4. Battiste, J., Pestova, T., Hellen, C., and Wagner, G. (2000). The eIF1A solution structure reveals a large RNA-binding surface important for scanning function. Mol. Cell 5, 109-119.
  5. Brenner, S. (2000). Target selection for structural genomics. Nat. Struct. Biol. Suppl. 7, 967-969.
  6. Brooks, B., and Karplus, M. (1985). Normal modes for specific motions of macromolecules: application to the hinge-bending mode of lysozyme. Proc. Natl. Acad. Sci. USA 82, 4995-4999.
  7. Cornilescu, G., Delaglio, F., and Bax, A. (1999). Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13, 289-302.
  8. Dedmon, M., Patel, C., Young, G., and Pielak, G. (2002). FlgM gains structure in living cells. Proc. Natl. Acad. Sci. USA 99, 12681-12684.
  9. Demarest, S., Martinez-Yamout, M., Chung, J., Chen, H., Xu, W., Dyson, H., Evans, R., and Wright, P. (2002). Mutual synergistic folding in recruitment of CBP/p300 by p160 nuclear receptor coactivators. Nature 415, 549-553.
  10. Dunker, A., Brown, C., Lawson, J., Iakoucheva, L., and Obradovic, Z. (2002). Intrinsic disorder and protein function. Biochemistry 41, 6573-6582.
  11. Dunker, A., Garner, E., Guilliot, S., Romero, P., Albrecht, K., Hart, J., Obradovic, Z., Kissinger, C., and Villafranca, J. (1998). Protein disorder and the evolution of molecular recognition: theory, predictions and observations. Pac. Symp. Biocomput., 473-484.
  12. Dunker, A., Lawson, J., Brown, C., Williams, R., Romero, P., Oh, J., Oldfield, C., Campen, A., Ratliff, C., Hipps, K., et al. (2001). Intrinsically disordered protein. J. Mol. Graph. Model. 19, 26-59.
  13. Evans, P., and Owen, D. (2002). Endocytosis and vesicle trafficking. Curr. Opin. Struct. Biol. 12, 814-821.
  14. Garner, E., Cannon, P., Romero, P., Obradovic, Z., and Dunker, A. Predicting disordered regions from amino acid sequence. Common themes despite differing structural characterization. Genome Inform. Ser. Workshop Genome Inform. 9, 201-213.
  15. Garner, E., Romero, P., Dunker, A., Brown, C., and Obradovic, Z. Predicting binding regions within disordered proteins. Genome Inform. Ser. Workshop Genome Inform. 10, 41-50.
  16. Gunasekaran, K., Tsai, C., Kumar, S., Zanuy, D., and Nussinov, R. Extended disordered proteins: targeting function with less scaffold. Trends Biochem. Sci. 28, 81-85.
  17. Hegger, R., Kantz, H., and Schreiber, T. (1999). Practical implementation of nonlinear time series methods: The tisean package. CHAOS 9.
  18. Jensen, L.J., Gupta, R., Blom, N., Devos, D., Tamames, J., Kesmir, C., Nielsen, H., Staerfeldt, H.H., Rapacki, K., Workman, C., et al. Prediction of human protein function from post-translational modifications and localization features. J. Mol. Biol. 319, 1257-1265.
  19. Kabsch, W., and Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577-2637.
  20. Kaplan, B., Ratner, V., and Haas, E. (2003). alpha-Synuclein: Its biological function and role in neurodegenerative diseases. J. Mol. Neurosci. 20, 83-92.
  21. Klein-Seetharaman, J., Oikawa, M., Grimshaw, S., Wirmer, J., Duchardt, E., Ueda, T., Imoto, T., Smith, L., Dobson, C., and Schwalbe, H. (2002). Long-range interactions within a nonnative protein. Science 295, 1719-1722.
  22. Li, X., Obradovic, Z., Brown, C., Garner, E., and Dunker, A. (2000). Comparing predictors of disordered protein. Genome Inform. Ser. Workshop Genome Inform. 11, 172-184.
  23. Linding, R., Russell, R.B., Neduva, V., and Gibson, T.J. (2003). GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res. 31, 3701-3708.
  24. Liu, J., Tan, H., and Rost, B. (2002). Loopy proteins appear conserved in evolution. J. Mol. Biol. 322, 53-64.
  25. Plaxco, K., and Gross, M. (2001). Unfolded, yes, but random? Never! Nat. Struct. Biol. 8, 659-660.
  26. Press, W., Teukolsky, S., Vetterling, W., and Flannery, B. (2002). Numerical Recipes in C++ The Art of Scientific Computing. Cambridge University Press, second edition.
  27. Promponas, V., Enright, A., Tsoka, S., Kreil, D., Leroy, C., Hamodrakas, S., Sander, C., and Ouzounis, C. (2000). CAST: an iterative algorithm for the complexity analysis of sequence tracts. Complexity analysis of sequence tracts. Bioinformatics 16, 915-922.
  28. Puntervoll, P., Linding, R., Gemund, C., Chabanis-Davidson, S., Mattingsdal, M., Cameron, S., Martin, D., Ausiello, G., Brannetti, B., Costantini, A., Ferre, F., Maselli, V., Via, A., Cesareni, G., Diella, F., et al. (2003). ELM server: a new resource for investigating short functional sites in modular eukaryotic proteins. Nucleic Acids Res. 31, 3625-3630.
  29. Radhakrishnan, I., Perez-Alvarado, G., Parker, D., Dyson, H., Montminy, M., and Wright, P. (1997). Solution structure of the KIX domain of CBP bound to the trans-activation domain of CREB: a model for activator:coactivator interactions. Cell 91, 741-752.
  30. Romero, P., Obradovic, Z., Kissinger, C.R., Villafranca, J., and Dunker, A. (1997). Identifying disordered proteins from amino acid sequences. Proc. IEEE Int. Conf. Neural Networks 1, 90-95.
  31. Saqi, M., and Sternberg, M. (1994). Identification of sequence motifs from a set of proteins with related function. Protein Eng. 7, 165-171.
  32. Schweers, O., Schonbrunn-Hanebeck, E., Marx, A., and Mandelkow, E. (1994). Structural studies of tau protein and Alzheimer paired helical filaments show no evidence for beta-structure. J. Biol. Chem. 269, 24290-24297.
  33. Shortle, D., and Ackerman, M. (2001). Persistence of native-like topology in a denatured protein in 8 M urea. Science 293, 487-489.
  34. Smith, D., Radivojac, P., Obradovic, Z., Dunker, A., and Zhu, G. (2003). Improved amino acid flexibility parameters. Protein Sci. 12, 1060-1072.
  35. Smyth, E., Syme, C., Blanch, E., Hecht, L., Vasak, M., and Barron, L. (2001). Solution structure of native proteins with irregular folds from Raman optical activity. Biopolymers 58, 138-151.
  36. Tompa, P. (2002). Intrinsically unstructured proteins. Trends Biochem. Sci. 27, 527-533.
  37. Uversky, V. (2002). Natively unfolded proteins: a point where biology waits for physics. Protein Sci. 11, 739-756.
  38. Verkhivker, G., Bouzida, D., Gehlhaar, D., Rejto, P., Freer, S., and Rose, P. (2003). Simulating disorder-order transitions in molecular recognition of unstructured proteins: where folding meets binding. Proc. Natl. Acad. Sci. USA 100, 5148-5153.
  39. Vihinen, M., Torkkila, E., and Riikonen, P. (1994). Accuracy of protein flexibility predictions. Proteins 19, 141-149.
  40. Wootton, J. (1994). Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput. Chem. 18, 269-285.
  41. Wright, P., and Dyson, H. (1999). Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol. 293, 321-331.
  42. Zoete, V., Michielin, O., and Karplus, M. (2002). Relation between sequence and structure of HIV-1 protease inhibitor complexes: a model system for the analysis of protein flexibility. J. Mol. Biol. 315, 21-52.