Papers by Alexey Bochkarev

Glyoxylate and Pyruvate Are Antagonistic Effectors of the Escherichia coli IclR Transcriptional Regulator
Journal of Biological Chemistry, 2007
The Escherichia coli isocitrate lyase regulator (IclR) regulates the expression of the glyoxylate bypass operon (aceBAK). The founding member of a large family of common-fold transcriptional regulators, IclR comprises a DNA binding domain that interacts with the operator sequence and a C-terminal domain (C-IclR) that binds a hitherto unknown small molecule. We screened a chemical library of more than 150 metabolic scaffolds using a high-throughput protein stability assay to identify molecules that bind IclR and then tested the active compounds in in vitro assays of operator binding. Glyoxylate and pyruvate, identified by this method, bound the C-IclR domain with KD values of 0.9 ± 0.2 and 156.2 ± 7.9 µM, respectively, as defined by isothermal titration calorimetry. Both compounds altered IclR interactions with operator DNA in electrophoretic mobility shift assays but showed an antagonistic effect. Glyoxylate disrupted the formation of the IclR/operator complex in vitro by favoring the inactive dimeric state of the protein, whereas pyruvate increased the binding of IclR to the aceBAK promoter by stabilizing the active tetrameric form of the protein. Structures of the C-IclR domain alone and in complex with each effector were determined at 2.3 Å, confirming the binding of both molecules in the effector recognition site previously characterized for the other representative of the family, the E. coli AllR regulator. Site-directed mutagenesis demonstrated the importance of the hydrophobic patch formed by Met-146, Leu-154, Leu-220, and Leu-143 in interactions with effector molecules. In general, our strategy of combining chemical screens with functional assays and structural studies has uncovered two small molecules with antagonistic effects that regulate the IclR-dependent transcription of the aceBAK operon.

Membrane proteins constitute 30% of prokaryotic and eukaryotic genomes but comprise a small fraction of the entries in protein structural databases. A number of features of membrane proteins render them challenging targets for the structural biologist, among which the most important is the difficulty in obtaining sufficient quantities of purified protein. We have developed robust procedures to express and purify large numbers of prokaryotic membrane proteins. Using a set of standard conditions, expression can be detected in the membrane fraction for approximately 30% of cloned targets. To date, over 30 membrane proteins have been purified in quantities sufficient for structural studies, typically in just two chromatographic steps. These include several transporters/channels, sensor kinases, and rhomboid intramembrane proteases. Using this system, we have recently crystallized and solved the structure of the CorA magnesium transporter, the primary Mg2+ uptake system of most prokaryotes. Crystal structures of the full-length Thermotoga maritima CorA in an apparent closed state and its isolated cytoplasmic domain were determined at 3.9 Å and 1.85 Å resolution, respectively. Our HTP strategy for membrane proteins, and the first structure from this effort, will be discussed.

Nature, 2006
The magnesium ion, Mg2+, is essential for myriad biochemical processes and remains the only major biological ion whose transport mechanisms remain unknown. The CorA family of magnesium transporters is the primary Mg2+ uptake system of most prokaryotes [1-3] and a functional homologue of the eukaryotic mitochondrial magnesium transporter [4]. Here we determine crystal structures of the full-length Thermotoga maritima CorA in an apparent closed state and its isolated cytoplasmic domain at 3.9 Å and 1.85 Å resolution, respectively. The transporter is a funnel-shaped homopentamer with two transmembrane helices per monomer. The channel is formed by an inner group of five helices and putatively gated by bulky hydrophobic residues. The large cytoplasmic domain forms a funnel whose wide mouth points into the cell and whose walls are formed by five long helices that are extensions of the transmembrane helices. The cytoplasmic neck of the pore is surrounded, on the outside of the funnel, by a ring of highly conserved positively charged residues. Two negatively charged helices in the cytoplasmic domain extend back towards the membrane on the outside of the funnel and abut the ring of positive charge. An apparent Mg2+ ion was bound between monomers at a conserved site in the cytoplasmic domain, suggesting a mechanism to link gating of the pore to the intracellular concentration of Mg2+.

From RPA to BRCA2: lessons from single-stranded DNA binding by the OB-fold
Current Opinion in Structural Biology, 2004
Recent years have witnessed tremendous progress in our structural and biophysical understanding of how replication protein A (RPA), a major nuclear ssDNA-binding protein (SSB), binds DNA. The four ssDNA-binding domains of RPA have the characteristic OB (oligonucleotide/oligosaccharide-binding) fold and contact DNA with specific polarity via a hierarchy-driven dynamic pathway. A growing mass of data suggests that many aspects of the ssDNA binding mechanism are conserved among SSBs of different origin. However, this conservation is not restricted to the SSB class. The concepts of ssDNA binding by the OB-fold, first derived from the RPA structure, have been successfully applied to the functional characterization of the BRCA2 (breast cancer susceptibility gene 2) protein. The BRCA2 structure, in its turn, has helped to better understand RPA function.

Crystal Structure of the DNA-Binding Domain of the Epstein–Barr Virus Origin-Binding Protein, EBNA1, Bound to DNA
The crystal structure of the DNA-binding and dimerization domains of the Epstein-Barr virus nuclear antigen 1 (EBNA1), which binds to and activates DNA replication from the latent origin of replication in Epstein-Barr virus, was solved at 2.5 Å resolution. EBNA1 appears to bind DNA via two independent regions termed the core and the flanking DNA-binding domains. The core DNA-binding domain, which comprises both the dimerization domain and a helix predicted to bind the inner portion of the EBNA1 DNA recognition element, was remarkably similar to the structure of the papillomavirus E2 protein, despite a complete lack of sequence conservation. The flanking DNA-binding domain, only a portion of which is contained in the current structure, consists in part of an alpha helix whose N-terminus contacts the outer regions of the EBNA1 DNA recognition element.

High Throughput Crystallography at SGC Toronto: an Overview
Methods in Molecular Biology, 2008
The completion of the human genome allows the analysis, for the first time, of biological systems in the context of entire gene families. For enzymes, this approach permits the exploration of complex substrate specificity networks that often exhibit considerable overlap within and between protein families. The case for a family-based approach to protein studies is compelling, given the prospect of exploiting these specificities for various purposes, such as the development of therapeutic reagents. The Structural Genomics Consortium (SGC) was created to determine the structures of proteins with relevance to human health and place the structures into the public domain without restriction on use. The SGC operates out of the Universities of Toronto and Oxford, and Karolinska Institutet, each working on nonoverlapping protein target lists. The SGC focus on human protein families requires a repertoire of crystallography methods that differ from those adopted by structural genomics projects that are focused on filling out protein fold space. The key differences are heavier reliance on in-house X-ray sources for diffraction data collection and predominant use of molecular replacement for phase determination. As projects such as the US Protein Structure Initiative and others fill the PDB with representatives of most major fold families, the SGC approach will become an increasingly useful model for many structural biology laboratories in the future. Technical details of the flow of samples and data within the high throughput (HTP) environment at SGC Toronto are presented, and provide a useful paradigm for the organization of collaborative or shared X-ray instrumentation facilities.

Proteins: Structure, Function, and Bioinformatics, 2007
Human thiopurine S-methyltransferase (TPMT) exhibits considerable person-to-person variation in activity to thiopurine drugs. We have produced an N-terminal truncation of human TPMT protein, crystallized the protein in complex with the methyl donor product S-adenosyl-L-homocysteine, and determined the atomic structure to the resolution of 1.58 and 1.89 Å, respectively, for the seleno-methionine incorporated and wild type proteins. The structure of TPMT indicates that the naturally occurring amino acid polymorphisms scatter throughout the structure, and that the amino acids whose alteration has the most influence on function are those that form intra-molecular stabilizing interactions (mainly van der Waals contacts). Furthermore, we have produced four TPMT mutant proteins containing variant alleles of TPMT*2, *3A, *3B, and *3C and examined the structure-function relationship of the mutant proteins based on their expression and solubility in bacteria and their thermostability profile. Proteins 2007; 67:198-208. © 2007 Wiley-Liss, Inc.
First crystallographic models of human TBC domains in the context of a family-wide structural analysis
Proteins: Structure, Function, and Bioinformatics, 2008
Wolfram Tempel, Yufeng Tong, Svetoslav Dimov, Alexey Bochkarev, Heewon Park.
An intact SAM-dependent methyltransferase fold is encoded by the human endothelin-converting enzyme-2 gene
Proteins: Structure, Function, and Bioinformatics, 2009
... 30 Vagin AA, Steiner RA, Lebedev AA, Potterton L, McNicholas S, Long F, Murshudov GN. REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use. Acta Crystallogr D 2004; 60: 2184-2195. ...
Proteins: Structure, Function, and Bioinformatics, 2006
Proteins: Structure, Function, and Bioinformatics, 2007

Proceedings of the National Academy of Sciences, 2005
One of many protein-protein interactions modulated upon DNA damage is that of the single-stranded DNA-binding protein, replication protein A (RPA), with the p53 tumor suppressor. Here we report the crystal structure of RPA residues 1-120 (RPA70N) bound to the N-terminal transactivation domain of p53 (residues 37-57; p53N) and, by using NMR spectroscopy, characterize two mechanisms by which the RPA/p53 interaction can be modulated. RPA70N forms an oligonucleotide/oligosaccharide-binding fold, similar to that previously observed for the ssDNA-binding domains of RPA. In contrast, the N-terminal p53 transactivation domain is largely disordered in solution, but residues 37-57 fold into two amphipathic helices, H1 and H2, upon binding with RPA70N. The H2 helix of p53 structurally mimics the binding of ssDNA to the oligonucleotide/oligosaccharide-binding fold. NMR experiments confirmed that both ssDNA and an acidic peptide mimicking a phosphorylated form of RPA32N can independently compete the acidic p53N out of the binding site. Taken together, our data suggest a mechanism for DNA damage signaling that can explain a threshold response to DNA damage.
PLOS Biology, 2007
DNA replication is initiated upon binding of “initiators” to origins of replication. In simian virus 40 (SV40), the core origin contains four pentanucleotide binding sites organized as pairs of inverted repeats. Here we describe the crystal structures of the origin binding domain (obd) of the SV40 large T-antigen (T-ag) both with and without a subfragment of origin-containing DNA. In the ...

PLoS Biology, 2007
The human cytosolic sulfotransferases (hSULTs) comprise a family of 12 phase II enzymes involved in the metabolism of drugs and hormones, the bioactivation of carcinogens, and the detoxification of xenobiotics. Knowledge of the structural and mechanistic basis of substrate specificity and activity is crucial for understanding steroid and hormone metabolism, drug sensitivity, pharmacogenomics, and response to environmental toxins. We have determined the crystal structures of five hSULTs for which structural information was lacking, and screened nine of the 12 hSULTs for binding and activity toward a panel of potential substrates and inhibitors, revealing unique "chemical fingerprints" for each protein. The family-wide analysis of the screening and structural data provides a comprehensive, high-level view of the determinants of substrate binding, the mechanisms of inhibition by substrates and environmental toxins, and the functions of the orphan family members SULT1C3 and SULT4A1. Evidence is provided for structural "priming" of the enzyme active site by cofactor binding, which influences the spectrum of small molecules that can bind to each enzyme. The data help explain substrate promiscuity in this family and, at the same time, reveal new similarities between hSULT family members that were previously unrecognized by sequence or structure comparison alone. Citation: Allali-Hassani A, Pan PW, Dombrovski L, Najmanovich R, Tempel W, et al. (2007) Structural and chemical profiling of the human cytosolic sulfotransferases. PLoS Biol 5(5): e97.

Journal of Molecular Biology, 2006
The interaction of Escherichia coli AllR regulator with operator DNA is disrupted by the effector molecule glyoxylate. This is a general, yet uncharacterized regulatory mechanism for the large IclR family of transcriptional regulators to which AllR belongs. The crystal structures of the C-terminal effector-binding domain of AllR regulator and its complex with glyoxylate were determined at 1.7 and 1.8 Å, respectively. Residues involved in glyoxylate binding were explored in vitro and in vivo. Altering the residues Cys217, Ser234 and Ser236 resulted in glyoxylate-independent repression by AllR. Sequence analysis revealed low conservation of amino acid residues participating in effector binding among IclR regulators, which reflects the potential chemical diversity of effector molecules recognized by members of this family. Comparing the AllR structure to that of Thermotoga maritima TM0065, the other representative of the IclR family that has been structurally characterized, indicates that both proteins assume similar quaternary structures as a dimer of dimers. Mutations in the tetramerization region, which in AllR involves the Cys135-Cys142 region, resulted in dissociation of the AllR tetramer to dimers in vitro and were functionally inactive in vivo. Glyoxylate does not appear to function through the inhibition of tetramerization. Using sedimentation velocity, glyoxylate was shown to conformationally change the AllR tetramer as well as monomer and dimer, resulting in an altered outline of AllR molecules.

Journal of Molecular Biology, 2006
The crystal structure of the complex between neuraminidase (NA) of influenza virus A/Memphis/31/98 (H3N2) and Fab of monoclonal antibody Mem5 has been determined at 2.1 Å resolution and shows a novel pattern of interactions compared to other NA-Fab structures. The interface buries a large area of 2400 Å² and the surfaces have high complementarity. However, the interface is also highly hydrated. There are 33 water molecules in the interface that are ≥95% buried from bulk solvent, but only 13 of these are isolated from other water molecules. The rest are involved in an intricate network of water-mediated hydrogen bonds throughout the interface, stabilizing the complex. Glu199 on NA, the most critical side-chain to the interaction as previously determined by escape mutant analysis and site-directed mutation, is located in a non-aqueous island. Glu199 and three other residues that contribute the major part of the antigen buried surface of the complex have mutated in human influenza viruses isolated after 1998, confirming that Mem5 identifies an epidemiologically important antigenic site. We conclude that antibody selection of NA variants is a significant component of recent antigenic drift in human H3N2 influenza viruses, supporting the idea that influenza vaccines should contain NA in addition to hemagglutinin.
Journal of Molecular Biology, 1998
... Epstein-Barr nuclear antigen 1 (EBNA1) activates the initiation of DNA replication once every cell cycle from the Epstein-Barr virus (EBV) latent origin of DNA replication, oriP (reviewed by Yates, 1996). ... EBNA1 protein (pink) is shown as a ribbon. ...

Journal of Molecular Biology, 1995
Two-dimensional (2D) crystals of proteins on lipid monolayers can initiate the formation of large three-dimensional (3D) crystals suitable for X-ray diffraction studies. The role of the 2D crystals in this process has not been firmly established. While it is likely that the 2D crystals serve as nuclei for epitaxial crystal growth, other mechanisms, such as non-specific nucleation induced by the high local concentration of the protein at the surface of the lipid layer, cannot be excluded. Using streptavidin as a model system, we have now firmly established that 3D crystal growth from 2D crystals on lipid layers occurs by epitaxy. We show that 2D crystals of streptavidin (space group C222) on biotinated lipid layers nucleate the growth of a 3D crystal form (space group I4(1)22) that possesses a structural similarity with the 2D crystal, but have no effect on the growth of 3D crystal forms (I222 and P2(1)) that are unrelated to the 2D crystal. At lower pH, a new 3D crystal form (space group P1), unrelated to the previously described 2D crystals, grew from lipid layers. This discovery initially raised concern about the validity of the epitaxial mechanism, but these concerns were alleviated with the subsequent discovery of a structurally related 2D P1 crystal that grew in similar solution conditions. Some parameters affecting epitaxial growth of both the P1 and I4(1)22 crystals were investigated, revealing several noteworthy features of the epitaxial growth. (1) 2D crystals are very effective nucleating agents; for instance, the P1 2D crystals can direct the growth of P1 3D crystals even under conditions that favour the growth of other crystal forms. (2) The epitaxial 3D crystals grow very rapidly and at amazingly low protein concentrations; P1 3D crystals can be grown from solutions as low as 10 µM streptavidin. (3) There is no obligate requirement for the deposition of pre-formed 2D crystals; lipid layers alone are equally effective at promoting epitaxial crystal growth.

Human HDAC7 Harbors a Class IIa Histone Deacetylase-specific Zinc Binding Motif and Cryptic Deacetylase Activity
Journal of Biological Chemistry, 2008
Histone deacetylases (HDACs) are protein deacetylases that play a role in repression of gene transcription and are emerging targets in cancer therapy. Here, we characterize the structure and enzymatic activity of the catalytic domain of human HDAC7 (cdHDAC7). Although HDAC7 normally exists as part of a multiprotein complex, we show that cdHDAC7 has a low level of deacetylase activity which can be inhibited by known HDAC inhibitors. The crystal structures of human cdHDAC7 and its complexes with two hydroxamate inhibitors are the first structures of the catalytic domain of class IIa HDACs and demonstrate significant differences with previously reported class I and class IIb-like HDAC structures. We show that cdHDAC7 has an additional class IIa HDAC-specific zinc binding motif adjacent to the active site which is likely to participate in substrate recognition and protein-protein interaction and may provide a site for modulation of activity. Furthermore, a different active site topology results in modified catalytic properties and in an enlarged active site pocket. Our studies provide mechanistic insights into class IIa HDACs and facilitate the design of specific modulators.
