Exportin-t (Xpot) transports mature 5'- and 3'-end processed tRNA from the nucleus to the cytoplasm by associating with the small G-protein Ran (Ras-related nuclear protein) in the nucleus. The release of tRNA in the cytoplasm involves RanGTP hydrolysis. Despite the availability of crystal structures of nuclear and cytosolic forms of Xpot, the molecular details regarding the sequential events leading to tRNA release and subsequent conformational changes occurring in Xpot remain unknown. We have performed a combination of classical all-atom and accelerated molecular dynamics simulations on a set of complexes involving Xpot to study a range of features including conformational flexibility of free and cargo-bound Xpot and functionally critical contacts between Xpot and its cargo. The systems investigated include free Xpot and its different complexes, bound either to Ran (GTP/GDP) or tRNA or both. This approach provided a statistically reliable estimate of structural dynamics of Xpot...
Journal of Biomolecular Structure and Dynamics, 2001
Nanosecond-scale molecular dynamics simulations have been performed on antiparallel Greek key type d(G7) quadruplex structures with different coordinated ions, namely Na+ and K+ ions, with water and Na+ counterions, using the AMBER force field and the Particle Mesh Ewald technique for electrostatic interactions. Antiparallel structures are stable during the simulation, with root mean square deviation values of ~1.5 Å from the initial structures. Hydrogen bonding patterns within the G-tetrads depend on the nature of the coordinated ion, with the G-tetrad undergoing local structural variation to accommodate different cations. However, the alternating syn-anti arrangement of bases along a chain as well as in a quartet is maintained throughout the MD simulation. Coordinated Na+ ions within the quadruplex cavity are quite mobile within the central channel and can even enter or exit from the quadruplex core, whereas coordinated K+ ions are quite immobile. MD studies at 400 K indicate that the K+ ion cannot come out from the quadruplex core without breaking the terminal G-tetrads. Smaller grooves in antiparallel structures are better binding sites for hydrated counterions, while a string of hydrogen bonded water molecules is observed within both the small and large grooves. The hydration free energy for the K+ ion coordinated structure is more favourable than that for the Na+ ion coordinated antiparallel quadruplex structure.
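The stability measure quoted in this abstract, root mean square deviation from the initial structure, can be illustrated with a minimal sketch. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z).

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy example (made-up coordinates in Å): every atom is displaced by 1.5 Å along z.
initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
later = [(0.0, 0.0, 1.5), (1.0, 0.0, 1.5)]
print(rmsd(initial, later))
```

In practice a trajectory frame is first fitted onto the reference structure before this quantity is computed; that fitting step is deliberately left out of the sketch.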
A large number of bacteria have been found to govern virulence and heat shock responses using temperature-sensing RNAs known as RNA thermometers (RNATs). They repress translation initiation by base pairing to the Shine-Dalgarno (SD) sequence at low temperature. Increasing the temperature induces the RNA duplex to unfold and expose the SD sequence for translation. A prime example is the ROSE thermometer module known to regulate the production of the ROSE heat shock protein in Bradyrhizobium japonicum. The unfolding of a 29-nucleotide long MicroROSE RNA element, which forms a critical component encompassing the SD sequence, and of three mutants that differ from it by deletion of a guanine nucleotide or by mutations near the SD and stem regions has been studied using high temperature molecular dynamics simulations. The simulations reveal the progressive manner in which a biologically functional RNA thermometer unfolds. Our simulations reveal that deletion of the highly conserved G10 residue, opposite to the SD region, leads to the formation of a stable RNA helix that has lost its thermosensing ability. Mutations of bases A5/U5 and U25/A25 near the stem increase the thermosensing ability due to an allosteric effect that leads to global destabilization of the structure. The temperature-dependent regulation of this thermometer has been investigated by estimating differences in the unfolding paths through calculation of individual residue fluctuations, stacking energy, the contact map plot and the lifetime dynamics plot of non-Watson-Crick hydrogen bonds at three different temperatures. Results reveal that partial unfolding at higher temperature starts from the hairpin tetraloop end and terminates at the stem region through the SD associated region. Two canonical hydrogen bonds between U9-A22 and four non-canonical hydrogen bonds between G10-G21 and U6-U24 around the internal loop play an important role in partial melting of the RNA helix. These results demonstrate how small alterations in RNA structure can regulate gene expression and illuminate the molecular basis of the function of an important bacterial regulatory motif.
Energy minimization has been carried out on three poly(purine)·poly(pyrimidine) sequences, d(G)n·d(C)n, d(A)n·d(T)n, and d(AG)n·d(CT)n, using the molecular mechanics program AMBER (Assisted Model Building and Energy Refinement). In order to extensively scan the conformational space available, five different helical models were studied, three of them being right-handed helices while the other two were left-handed. For all three sequences the right-handed A- and B-type helices are energetically slightly preferred over the left-handed helices, but the energy difference between the various right-handed helices is only marginal. A detailed analysis has been carried out to characterize the local structural variability in the refined structures, both in terms of torsion angles as well as other parameters such as base-pair tilt, wedge roll, and wedge tilt. All three sequences exhibit similar structural features for a particular form, but both the forms A and B show significant deviations from fiber models. In particular, the A-form structures have higher unit rise (2.7 Å), and lower unit twist (31.0°) and base-pair tilt (12°), compared to the fiber model, which has corresponding values of 2.56 Å, 32.7°, and 20°, respectively. All these changes indicate that the refined models are closer to the A-form structure observed in crystals of oligonucleotides. In the refined B-form models, the helical parameters are close to the fiber B-form, although the torsion angles show considerable variations. None of the three sequences examined, including the d(A)n·d(T)n sequence, show any pronounced curvature for the B-form structure.
Bioinformatics in the Era of Post Genomics and Big Data
Regulation of gene expression is achieved by the presence of cis regulatory elements; these signatures are interspersed in the noncoding region and also situated in the coding region of the genome. These elements orchestrate the gene expression process by regulating the different steps involved in the flow of genetic information. Transcription (DNA to RNA) and translation (RNA to protein) are controlled at different levels by different regulatory elements present in the genome. The current chapter describes the structural and functional elements present in the coding and noncoding regions of the genome. Further, we discuss the role of regulatory elements in the regulation of gene expression in prokaryotes and eukaryotes. Finally, we also discuss DNA structural properties of regulatory regions and their role in gene expression. Identification and characterization of cis regulatory elements would be useful to engineer the regulation of gene expression.
With almost no consensus promoter sequence in prokaryotes, recruitment of RNA polymerase (RNAP) to precise transcriptional start sites (TSSs) has remained an unsolved puzzle. Uncovering the underlying mechanism is critical for understanding the principle of gene regulation. We searched for the hidden code in the structure and energetics of ~16500 promoters from twelve prokaryotes representing two kingdoms. Twenty-eight fundamental parameters of DNA structure, including backbone angles, base pair axis, inter base pair and intra base pair parameters, were used and information was extracted from X-ray crystallography (XRC) data. Three parameters (solvation energy, hydrogen bond energy and stacking energy) were selected for creating energetics profiles using in-house programs. DNA was found to be inherently designed to undergo a change in every parameter undertaken, from some distance upstream of TSSs, to adopt a signature state at these locations in all prokaryotes. These signatu...
In higher eukaryotes, gene architecture and structural properties of promoters have emerged as significant factors influencing variation in the number of transcripts (expression level) and specificity of gene expression in a tissue (expression breadth), which eventually shape the phenotype. In this study, transcriptome data of different tissue types at various developmental stages of A. thaliana, O. sativa, S. bicolor and Z. mays have been used to understand the relationship between properties of gene components and gene expression. Our findings indicate that in plants, among all gene architecture and structural properties of promoters, compactness of genes in terms of intron content is significantly linked to gene expression level and breadth, whereas in humans an exactly opposite scenario is seen. For the first time in plants, we have carried out a quantitative estimation of the effect of a particular trait on expression level and breadth by using multiple regression analysis, and it confirms that intron content of the primary transcript (as %) is a powerful determinant of expression breadth. Similarly, further regression analysis revealed that among structural properties of the promoters, stability is negatively linked to expression breadth, while DNase I sensitivity strongly governs gene expression breadth in monocots and gene expression level in dicots. In addition, promoter regions of tissue specific genes are found to be enriched with TATA box and Y-patch motifs. Finally, multi copy orthologous genes in plants are found to be longer, highly regulated and tissue specific.
DNA is a complex molecule with phenomenal inherent plasticity and the ability to form different hydrogen bonding patterns of varying stabilities. These properties enable DNA to attain a variety of structural and conformational polymorphic forms. Structurally, DNA can exist in single-stranded form or as higher-order structures, which include the canonical double helix as well as the noncanonical duplex, triplex and quadruplex species. Each of these structural forms in turn encompasses an ensemble of dynamically heterogeneous conformers depending on the sequence composition and environmental context. In vivo, the widely populated canonical B-DNA attains these noncanonical polymorphs during important cellular processes. While several investigations have focused on the structure of these noncanonical DNA, studying their dynamics has remained nontrivial. Here, we outline findings from some recent advanced experimental and molecular simulation techniques that have significantly contribute...
Transcription is an intricate mechanism and is orchestrated at the promoter region. The cognate motifs in the promoters are observed in only a subset of total genes across different domains of life. Hence, sequence-motif based promoter prediction may not be a holistic approach for whole genomes. Conversely, the DNA structural property of duplex stability is a characteristic of promoters and can be used to delineate them from other genomic sequences. In this study, we have used a DNA duplex stability based algorithm, 'PromPredict', for promoter prediction in a broad range of eukaryotes, representing various species of yeast, worm, fly, fish, and mammal. Efficiency of the software has been tested in promoter regions of 48 eukaryotic systems. PromPredict achieves recall values that range from 68 to 92% in various eukaryotes. PromPredict performs well in mammals, although their core promoter regions are GC rich. 'PromPredict' has also been tested for its ability to predict...
The self-complementary DNA fragment CCGGCGCCGG crystallizes in the rhombohedral space group R3 with unit cell parameters a = 54.07 Å and c = 44.59 Å. The structure has been determined by X-ray diffraction methods at 2.2 Å resolution and refined to an R value of 16.7%. In the crystal, the decamer forms B-DNA double helices with characteristic groove dimensions: compared with B-DNA of random sequence, the minor groove is wide and deep and the major groove is rather shallow. Local base pair geometries and stacking patterns are within the range commonly observed in B-DNA crystal structures. The duplex bears no resemblance to A-form DNA as might have been expected for a sequence with only GC base pairs. The shallow major groove permits an unusual crystal packing pattern with several direct intermolecular hydrogen bonds between phosphate oxygens and cytosine amino groups. In addition, decameric duplexes form quasi-infinite double helices in the crystal by end-to-end stacking. The groove geometries and accessibilities of this molecule as observed in the crystal may be important for the mode of binding of both proteins and drug molecules to G/C stretches in DNA.
Understanding dinucleotide sequence directed structures of nucleic acids and their variability from experimental observation has remained ineffective due to the unavailability of statistically meaningful data. We have attempted to understand this from an energy scan along the twist, roll, and slide degrees of freedom, which are mostly dependent on dinucleotide sequence, using ab initio density functional theory. We have carried out stacking energy analysis in this dinucleotide parameter phase space for all ten unique dinucleotide steps in DNA and RNA using DFT‐D at the ωB97X‐D/6‐31G(2d,2p) level, which appears to satisfactorily explain conformational preferences for the AU/AU step in our recent study. We show that values of roll, slide, and twist of most of the dinucleotide sequences in crystal structures fall in the low energy region. The minimum energy regions with large twist values are associated with the roll and slide values of B‐DNA, whereas, smaller twist values correspond to higher stability to R...
DNA sequences containing a stretch of several A:T basepairs without a 5′-TA-3′ step are known as A-tracts and have been the subject of extensive investigation because of their unique structural features such as a narrow minor groove and their crucial role in several biological processes. One of the aspects under investigation has been the influence of the 5-methyl group of thymine on the properties of A-tracts. Detailed molecular dynamics simulation studies of the sequences d(CGCAAAUUUGCG) and d(CGCAAATTTGCG) indicate that the presence of the 5-methyl group in thymine increases the frequency of a narrow minor groove conformation, which could facilitate its specific recognition by proteins, and reduce its susceptibility to cleavage by DNase I. The bias toward a wider minor groove in the absence of the thymine 5-methyl group is a static structural feature. Our results also indicate that the presence of the thymine 5-methyl group is necessary for calibrating the backbone conformation and the basepair and dinucleotide step geometry of the core A-tract as well as the flanking CA/TG and the neighboring GC/GC steps, as observed in free and protein-bound DNA. As a consequence, it also fine-tunes the curvature of the longer DNA fragment in which the A-tract is embedded.
Guanine tetrads are formed spontaneously by guanine rich sequences in the presence of certain cations. Various quadruplex helical structures, stabilized by such tetrads, apparently play an important biological role in vivo. To understand the importance of the cations, a 6 ns molecular dynamics simulation has been performed on a 7-mer G-quadruplex, surrounded by Na+ counterions and explicit water molecules, but without any ions in the initial structure. Interestingly, the quadruplex structure does not fall apart, but undergoes small structural changes, which enable the solvent molecules, including Na+ ions, to enter the empty central channel of the structure. This channel is fully hydrated within the first 100 ps and two ions move into the central channel between 0.5 and 2 ns of MD simulation, by replacing some of the water molecules. The ions once trapped within the quadruplex channel are not expelled even during 1.5 ns of MD at 400 K. In fact they penetrate deeper into the channel to facilitate entry of additional ions, though not all coordination sites within the quadruplex are occupied even after 6.1 ns of MD simulation. The entry of cations into the central channel leads to a quadruplex structure with more favorable free energy of hydration, which is comparable to that of a fully coordinated quadruplex.
Proteins: Structure, Function, and Bioinformatics, 2003
The C protein, a middle gene product of bacteriophage Mu, is the determinant of the transition from middle to late gene expression. C activates transcription from four late gene promoters, Plys, PI, PP, and Pmom by binding to a site overlapping their −35 elements. Site‐specific, high‐affinity binding of C to its recognition sequence results in both axial and torsional distortion of DNA at Pmom, which appears to play a role in recruitment of RNA polymerase to the promoter for mom gene transactivation. To identify the regions of C protein important for its function, deletion and site‐directed mutagenesis were carried out. We demonstrate here that a helix‐turn‐helix (HTH) motif located toward the carboxy terminal end of the protein is the DNA‐binding domain and amino acid residues involved in transactivation overlap the HTH motif. Mutagenesis studies also aided in the identification of the region important for dimerization. Structure‐based sequence alignment and molecular modeling in c...
The cis-regulatory regions on DNA serve as binding sites for proteins such as transcription factors and RNA polymerase. The combinatorial interaction of these proteins plays a crucial role in transcription initiation, which is an important point of control in the regulation of gene expression. We present here an analysis of the performance of an in silico method for predicting cis-regulatory regions in the plant genomes of Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) on the basis of free energy of DNA melting. For protein-coding genes, we achieve recall and precision of 96% and 42% for Arabidopsis and 97% and 31% for rice, respectively. For noncoding RNA genes, the program gives recall and precision of 94% and 75% for Arabidopsis and 95% and 90% for rice, respectively. Moreover, 96% of the false-positive predictions were located in noncoding regions of primary transcripts, out of which 20% were found in the first intron alone, indicating possible regulatory roles. The ...
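The recall and precision figures in this abstract follow the standard definitions, recall = TP / (TP + FN) and precision = TP / (TP + FP). The sketch below only illustrates those formulas; the function name and the counts are hypothetical, and the abstract does not say how true and false positives were counted in the study itself.

```python
def recall_and_precision(true_pos, false_pos, false_neg):
    """Return (recall, precision) as fractions from raw prediction counts."""
    if true_pos + false_neg == 0 or true_pos + false_pos == 0:
        raise ValueError("recall or precision is undefined for these counts")
    recall = true_pos / (true_pos + false_neg)
    precision = true_pos / (true_pos + false_pos)
    return recall, precision


# Made-up counts for illustration only: 3 promoters found, 1 missed, 1 spurious call.
recall, precision = recall_and_precision(true_pos=3, false_pos=1, false_neg=1)
print(f"recall = {recall:.0%}, precision = {precision:.0%}")
```

High recall combined with lower precision, as reported for protein-coding genes, corresponds to few missed promoters but a larger share of predictions that fall outside annotated promoter regions.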
Guanine rich sequences adopt a variety of four stranded structures, which differ in strand orientation and conformation about the glycosidic bond even though they are all stabilised by Hoogsteen hydrogen bonded guanine tetrads. Detailed model building and molecular mechanics calculations have been carried out to investigate various possible conformations of guanines along a strand and different possible orientations of guanine strands in a G-tetraplex structure. It is found that for an oligo G stretch per se, a parallel four stranded structure with all guanines in anti conformation is favoured over other possible tetraplex structures. Hence an alternating syn-anti arrangement of guanines along a strand is likely to occur only in folded back tetraplex structures with antiparallel G strands. Our study provides a theoretical rationale for the observed alternation of glycosidic conformation and the inverted stacking arrangement arising from base flipover, in antiparallel G-tetraplex structures and also highlights the various structural features arising due to different types of strand orientations. The molecular mechanics calculations help in elucidating the various interactions which stabilize different G-tetraplex structures and indicate that screening of phosphate charge by counterions could have a dramatic effect on groove width in these four stranded structures.
Analysis of available B-DNA type oligomeric crystal structures as well as protein-bound DNA fragments (solved using data with resolution <2.6 Å) indicates that in both data sets, a majority of the (3′-Ade)H2..O2(3′-Thy/Cyt) distances in AA.TT and GA.TC dinucleotide steps are considerably shorter than their values in a uniform fibre model, and are smaller than their optimum separation distance. Since the electropositive C2-H2 group of adenine is in close proximity to the electronegative keto oxygen atoms of both pyrimidine bases in the antiparallel strand of the double-helical DNA structures, it suggests the possibility of intrabase-pair as well as cross-strand C-H..O hydrogen bonds in the minor groove. The C2-H2..O2 hydrogen bonds within the A.T base-pairs could be a natural consequence of Watson-Crick pairing. However, the close cross-strand interactions between the bases at the 3′-ends of the AA.TT and GA.TC steps arise due to the local sequence-dependent geometry of these steps. While the base-pair propeller twist in these steps is comparable to the fibre model, some of the other local parameters such as base-pair opening angle and inter-base-pair slide show coordinated changes, leading to these shorter C2-H2..O2 distances. Hence, in addition to the well-known minor groove hydration, it appears that favourable C2-H2..O2 cross-strand interactions may play a role in imparting a characteristic geometry to AA.TT and GA.TC steps, as well as An.Tn and GAn.TnC tracts, which leads to a narrow minor groove in these regions.
Journal of Biomolecular Structure and Dynamics, 2005
Deciphering sequence information from the sugar-phosphate backbone is finely tuned through the conformational substates of DNA. BII conformation, one of the conformational substates of B-DNA, is known to play a key role in DNA-protein recognition. BI and BII are identified by the ε-ζ difference, which is negative in BI and positive in BII. Our analysis of MD and crystal structures shows that BII conformation is sequence specific and dinucleotides GC, CG, CA, TG, TA show high preference to take up BII conformation, while TT, TC, CT, CC dinucleotides rarely take up this conformation. Significant changes were observed in the dinucleotide parameters, viz. twist, roll, and slide, for the steps having BII conformation. Interestingly, the magnitude of variation in the dinucleotide parameters is seen to depend mainly on two factors, the magnitude of the ε-ζ difference and the presence or absence of BII conformation in the second strand, across the WC base-paired dinucleotide step. Based on these two factors, the conformational substate of a dinucleotide step can be further classified as BI.BI (BI conformation in both strands), BI.BII (BI conformation in one strand and BII conformation in the other), and BII.BII (BII conformation in both strands). The occurrence of BII in both strands was found to be quite rare and thus, it can be concluded that BI.BI and BI.BII hybrid steps are more favorable than a BII.BII step. In conformity with the sequence preference seen for dinucleotides in each strand, the BII.BII combination of backbone conformation was observed only for GC, CG, CA, and TG containing dinucleotide steps. We further classified the BII.BII step as strong BII and weak BII depending on the magnitude of the average ε-ζ difference. The dinucleotide steps which belong to the category of strong BII have large twist, high positive slide and negative roll values, while those in the weak BII group have roll, twist, and slide values similar to those of hybrid BI.BII steps. This conformational property could be contributing to the groove opening/closing and thus can modulate protein-DNA interaction.
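The BI/BII rule stated in this abstract (ε-ζ negative in BI, positive in BII) and the BI.BI, BI.BII and BII.BII step labels can be sketched as below. This is a minimal illustration under stated assumptions: angles are in degrees, the difference is wrapped to (-180, 180] so that either angle convention works, a difference of exactly zero is arbitrarily assigned to BI, and all function names and example angles are hypothetical rather than taken from the paper. The strong/weak BII split is not sketched because the abstract gives no threshold for it.

```python
def wrap_degrees(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def backbone_substate(epsilon, zeta):
    """Label one strand of a step as 'BI' or 'BII' from the sign of epsilon - zeta."""
    difference = wrap_degrees(epsilon - zeta)
    # Assumption: a difference of exactly zero is assigned to BI.
    return "BII" if difference > 0 else "BI"


def step_substate(strand_one, strand_two):
    """Combine two (epsilon, zeta) pairs into 'BI.BI', 'BI.BII' or 'BII.BII'."""
    labels = sorted([backbone_substate(*strand_one), backbone_substate(*strand_two)])
    return ".".join(labels)


# Illustrative angles only: strand one has epsilon - zeta = -80, strand two +90 after wrapping.
print(step_substate((190.0, 270.0), (-90.0, 180.0)))
```

Wrapping the difference matters because torsion angles are periodic: ε = -90° and ζ = 180° differ by -270° numerically, which is the same as +90° on the circle.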
Papers by Manju Bansal