Species-independent MicroRNA Gene Discovery

2012

https://doi.org/10.25781/KAUST-W6777

Timothy Kevin Kuria Kamanu

Abstract

MicroRNAs (miRNAs) are a class of small endogenous non-coding RNAs that act mainly as negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNAs are involved in different types of cancer and in other incurable diseases such as autism and Alzheimer's. Functional miRNAs are excised from hairpin-like sequences known as miRNA genes. About 21,000 miRNA genes are known, most of them determined by experimental methods, and they are classified into groups called miRNA families. This study reports about 19,000 previously unknown miRNA genes in nine species, of which approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy that relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions which

List of figures and tables

  7. Genomic organization of palindromic miRNA sequences and their subsequences.
  8. Shared motif structures between the let-7 and mir-3596 miRNA families.
  9. Visualization of the cross tabular distribution of miRNA gene in 193 species and 23 miRNA families.
  10. Chromosomal distribution of family-annotated miRNA genes in the human, monkey and orangutan genomes (miRBase R18).
  11. Computation time (hours) during species-independent miRNA gene discovery.
  12. Illustration of kernel density estimation during estimation of implicit or 'practical' miRNA gene boundaries.
  13. Comparison of estimated and explicit theory-based (miRBase) miRNA gene boundaries.
  14. Predicted miRNA family mir-515 genes on chromosome 19 in five primate genomes.
  15. Chromosomal distribution of CTP in miRNA families mir-548 and mir-467.
  16. Putative cis-natural sense/antisense transcript (cis-NATs) miRNA genes.
  17. Dragon miRNA Discovery System (DMDS v1.0) web interface.
  A1. Populous paralogous miRNA genes in different species.
  A2. Distribution of family sizes in miRBase R18 and R19.
  A3. Computational variables that summarize a digital feature during genetic algorithms (GA) optimization.

Keywords

  1. Genetic algorithm, evolutionary computing
  2. Pattern matching and discovery
  3. Regular expressions
  4. Gene boundaries
  5. RNA secondary structures
  6. Visualization, multiple category data
  7. Performance, sensitivity, specificity, positive predictive value (PPV)

JEL Classification: C13, C14, C22, C32, C61, C63

Mathematical Subject Classification: 62P10, 92B05
  20. Abraham, A., Nedjah, N., and de Macedo Mourelle, L. (2006). Evolutionary computation: from genetic algorithms to genetic programming. Studies in Computational Intelligence, 13:1-20.
  21. Akutsu, T., Arimura, H., and Shimozono, S. (2000). On approximation algorithms for local multiple alignment. New York, USA, http://dl.acm.org/citation.cfm.
  22. Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., and Walter, P. (2008). Molecular biology of the cell. Garland Science, Taylor & Francis, Newyork, USA.
  23. Altman, D. G. and Bland, M. J. (1994). Diagnostic tests 2: predictive values. British Medical Journal, 309(6947):102.
  24. Altuvia, Y., Landgraf, P., Lithwick, G., Elefant, N., Pfeffer, S., Aravin, A., Brownstein, M. J., Tuschl, T., and Margalit, H. (2005). Clustering and conservation patterns of human microRNAs. Nucleic Acids Research, 33(8):2697-2706.
  25. Ambros, V., Bartel, B., Bartel, D. P., Burge, C. B., Carrington, J. C., Chen, X., Dreyfuss, G., Eddy, S. R., Griffiths-Jones, S., Marshall, M., Matzke, M., Ruvkun, G., and Tuschl, T. (2003). A uniform system for microRNA annotation. RNA, 9(3):277-279.
  26. Artzi, S., Kiezun, A., and Shomron, N. (2008). miRNAminer: A tool for homologous microRNA gene search. BMC Bioinformatics, 9(39):1-7.
  27. Bajić, V. B. (2000). Comparing the success of different prediction software in sequence analysis: A review. Briefings in Bioinformatics, 1(3):214-228.
  28. Bajić, V. B., Tan, S. L., Suzuki, Y., and Sugano, S. (2004). Promoter prediction analysis on the whole human genome. Nature Biotechnology, 22(11):1467-1473.
  29. Bandyopadhyay, S. and Bhattacharyya, M. (2009). Analyzing miRNA co-expression networks to explore tf-miRNA regulation. BMC Bioinformatics, 10(163):1-16.
  30. Barrera, L. O., Li, Z., Smith, A. D., Arden, K. C., Cavenee, W. K., Zhang, M. Q., Green, R. D., and Ren, B. (2008). Genome-wide mapping and analysis of active promoters in mouse embryonic stem cells and adult organs. Genome Research, 18:46-59.
  31. Bartel, D. P. (2004). MicroRNAs: Genomics, biogenesis, mechanism, and function. Cell, 116(2):281-297.
  32. Bartel, D. P. (2009). MicroRNAs: Target recognition and regulatory functions. Cell, 136(2):215-233.
  33. Batuwita, R. and Palade, V. (2009). micropred: effective classification of pre-mirnas for human mirna gene prediction. Bioinformatics, 25(8):989-995.
  34. Berezikov, E., Chung, W., Willis, J., Cuppen, E., and Lai, E. C. (2007). Mammalian mirtron genes. Molecular Cell, 28(2):328-336.
  35. Berezikov, E., Cuppen, E., and Plasterk, R. H. A. (2006). Approaches to microRNA discovery. Nature Genetics Supplement, 38(S2-7):1-7.
  36. Berezikov, E., Guryev, V., van de Belt, J., Wienholds, E., Plasterk, R. H. A., and Cuppen, E. (2005). Phylogenetic shadowing and computational identification of human microRNA genes. Cell, 120(1):21-24.
  37. Borchert, G. M., Holton, N. W., Williams, J. D., Hernan, W. L., Bishop, I. P., Dembosky, J. A., Elste, J. E., Gregoire, N. S., Kim, J., Koehler, W. W., Lengerich, J. C., Medema, A. A., Nguyen, M. A., Ower, G. D., Rarick, M. A., Strong, B. N., Tardi, N. J., Tasker, N. M., Wozniak, D. J., Gatto, C., and Larson, E. D. (2011). Comprehensive analysis of microRNA genomic loci identifies pervasive repetitive-element origins. Mobile Genetic Elements, 1(1):8-17.
  38. Brantl, S. (2007). Regulatory mechanisms employed by cis-encoded antisense RNAs. Current Opinion in Microbiology, 10(2):102-109.
  39. Buck, M. J. and Lieb, J. D. (2004). ChIP-chip: Considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics, 83:349-360.
  40. Celniker, S. E., Dillon, L. A. L., Gerstein, M. B., Gunsalus, K. C., Henikoff, S., Karpen, G. H., Kellis, M., Lai, E. C., Lieb, J. D., MacAlpine, D. M., Gos, M., Piano, F., Snyder, M., Stein, L., White, K. P., Waterston, R. H., and the modENCODE Consortium (2009). Unlocking the secrets of the genome. Nature, 459(7249):927-930.
  41. Chan, S. and Slack, F. J. (2007). And now introducing mammalian mirtrons. Developmental Cell, 13(5):605-607.
  42. Chen, J., Sun, M., Hurst, L. D., Carmichael, G. G., and Rowley, J. (2005). Genome-wide analysis of coordinate expression and evolution of human cis-encoded sense-antisense transcripts. Trends in Genetics, 21(6):326-329.
  43. Cherry, J. M., Ball, C., Weng, S., Juvik, G., Schmidt, R., Adler, C., Dunn, B., Dwight, S., Riles, L., Mortimer, R. K., and Botstein, D. (1997). Genetic and physical maps of Saccharomyces cerevisiae. Nature, 387(6632):67-73.
  44. Chiang, H. R., Schoenfeld, L. W., Ruby, J. G., Auyeung, V. C., Spies, N., Baek, D., Johnston, W. K., Russ, C., Luo, S., Babiarz, J. E., Blelloch, R., Schroth, G. P., Nusbaum, C., and Bartel, D. P. (2010). Mammalian microRNAs: Experimental evaluation of novel and previously annotated genes. Genes and Development, 24(10):992-1009.
  45. Chimpanzee Sequencing and Analysis Consortium, T. (2005). Initial sequence of the chimpanzee genome and comparison with the human genome. Nature, 437(7055):69-87.
  46. Christiansen, T. and Torkington, N. (2007). Perl Cookbook. O'Reilly, Sebastopol, California, USA, Second edition.
  47. Claverie, J. and Audic, S. (1996). The statistical significance of nucleotide position-weight matrix matches. Bioinformatics, 12(5):431-439.
  48. Crawley, M. J. (2007). The R Book. John Wiley & Sons, Inc, West Sussex, England.
  49. Cullen, B. R. (2006). Viruses and micrornas. Nature Genetics, 38(S25-30):S25-S30.
  50. Das, M. K. and Dai, H. (2007). A survey of DNA motif finding algorithms. BMC Bioinformatics, 8(S21):1-13.
  51. Daub, J., Gardner, P. P., Tate, J., Ramskld, D., Manske, M., Scott, W. G., Weinberg, Z., Griffiths-Jones, S., and Bateman, A. (2008). The RNA WikiProject: Community annotation of RNA families. RNA, 14(12):2462-2464.
  52. De Jong, K. A. (2006). Evolutionary computation: a unified approach. The MIT Press, Hayward Street, Cambridge, Massachusetts, USA, First edition.
  53. Delisi, C. and Crothers, D. M. (1971). Prediction of RNA secondary structure. PNAS, 68(11):2682-2685.
  54. Dinel, S., Bolduc, C., Belleau, P., Boivin, A., Yoshioka, M., Calvo, E., Piedboeuf, B., Snyder, E. E., Labrie, F., and St-Amand, J. (2005). Reproducibility, bioinformatic analysis and power of SAGE method to evaluate changes in transcriptome. Nucleic Acids Research, 33(3):1-7.
  55. Ding, J., Zhou, S., and Guan, J. (2011). miRFam: An effective automatic miRNA classification method based on n-grams and a multiclass SVM. BMC Bioinformatics, 12(216):1-11.
  56. Duvenage, E. (2008). miRNAMatcher-high throughput miRNA discovery using regular expressions obtained via a genetic algorithm. Master's thesis, University of the Western Cape, Cape Town, South Africa.
  57. Ecker, J. R. and Davis, R. W. (1986). Inhibition of gene expression in plant cells by expression of antisense rna. Proceedings of the National Academy of Sciences of USA, 83(15):5372- 5376.
  58. Eddy, S. R. (2004). How do RNA folding algorithms work? Nature Biotechnology, 22(11):1457-1458.
  59. Eiben, A. E. and Smith, J. E. (2007). Introduction to Evolutionary Computing. Springer, Natural Computing Series, Berlin, Germany, second edition.
  60. Elliott, W. H. and Elliott, D. C. (2009). Biochemistry and molecular biology. Oxford University Press, Newyork, USA.
  61. ENCODE Project Consortium, T. (2007). Identification and analysis of functional elements in 1genome by the ENCODE pilot project. Nature, 447(7146):799-816.
  62. ENCODE Project Consortium, T. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414):57-74.
  63. Erson, A. E. and Petty, E. M. (2008). MicroRNAs in development and disease. Clinical Genetics, 74(4):296-306.
  64. Flicek, P., Amode, M. R., Barrell, D., Beal, K., Brent, S., Chen, Y., Clapham, P., Coates, G., Fairley, S., Fitzgerald, S., Gordon, L., Hendrix, M., Hourlier, T., Johnson, N., Andreas, K., Keefe, D., Keenan, S., Kinsella, R., Kokocinski, F., Kulesha, E., Larsson, P., Longden, I., McLaren, W., Overduin, B., Pritchard, B., Riat, H. S., Rios, D., Ritchie, G. R. S., Ruffier, M., Schuster, M., Sobral, D., Spudich, G., Tang, Y. A., Trevanion, S., Vandrovcova, J., Vilella, A. J., White, S., Wilder, S. P., Zadissa, A., Zamora, J., Aken, B. L., Birney, E., Cunningham, F., Dunham, I., Durbin, R., Fernández-Suarez, X. M., Herrero, J., Hubbard, T. J. P., Parker, A., Proctor, G., Vogel, J., and Searle, S. M. J. (2011). Ensembl 2011. Nucleic Acids Research, 39(1):D800-D806.
  65. Friedl, J. E. F. (2002). Mastering regular expressions. O'Reilly, Sebastopol, California, USA, Second edition.
  66. Fujita, P. A., Rhead, B., Zweig, A. S., Hinrichs, A. S., Karolchik, D., Cline, M. S., Goldman, M., Barber, G. P., Clawson, H., Coelho, A., Diekhans, M., Dreszer, T. R., Giardine, B. M., Harte, R. A., Hillman-Jackson, J., Hsu, F., Kirkup, V., Kuhn, R. M., Learned, K., Li, C. H., Meyer, L. R., Pohl, A., Raney, B. J., Rosenbloom, K. R., Smith, K. E., Haussler, D., and Kent, W. J. (2011). The UCSC Genome Browser database: update 2011. Nucleic Acids Research, 39(Database):D876-D882.
  67. Gardner, P. P., Daub, J., Tate, J. G., Nawrocki, E. P., Kolbe, D. L., Lindgreen, S., Wilkinson, A. C., Finn, R. D., Griffiths-Jones, S., Eddy, S. R., and Bateman, A. (2009). Rfam: Updates to the RNA families database. Nucleic Acids Research, 37(D):D136-D140.
  68. Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B., and Dunson, D. B. (2004). Bayesian data analysis. Chapman & Hall, Florida, USA, Second edition.
  69. Gibson, G. and Muse, S. V. (2009). A primer of genome science. Sinauer Associates, Massachusetts, USA, third edition.
  70. Gilson, E., Clément, J., Brutlag, D., and Hofnung, M. (1984). A family of dispersed repetitive extragenic palindromic DNA sequences in e. coli. The EMBO Journal, 3(6):1417-1421.
  71. Glaros, A. G. and Kline, R. B. (1988). Understanding the accuracy of tests with cutting scores: the sensitivity, specificity, and predictive value model. Journal of Clinical Psychology, 44(6):1013-1023.
  72. Goh, C., Ong, Y., and Tan, K. C. (2009). Multi-Objective Memetic Algorithms. Springer- Verlag, Berlin, Germany, Studies in Computational Intelligence, Vol. 171 edition.
  73. Goyvaerts, J. and Levithan, S. (2009). Regular expressions cookbook. O'Reilly, Sebastopol, California, USA, first edition.
  74. Gregory, R. I. and Shiekhattar, R. (2005). microRNA biogenesis and cancer. Cancer Research, 65(9):3509-3512.
  75. Griffiths-Jones, S., Grocock, R. J., van Dongen, S., Bateman, A., and Enright, A. J. (2006). miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Research, 34(D):140-144.
  76. Griffiths-Jones, S., Saini, H. K., van Dongen, S., and Enright, A. J. (2008). miRBase: tools for microRNA genomics. Nucleic Acids Research, 36(D):D154-D158.
  77. Ha, M., Pang, M., Agarwal, V., and Chen, Z. J. (2008). Interspecies regulation of microRNAs and their targets. BBA: Biochimica et Biophysica Acta, 1779(11):735-742.
  78. Hamada, M., Kiryu, H., Sato, K., Mituyama, T., and Asai, K. (2009). Prediction of RNA secondary structure using generalized centroid estimators. Bioinformatics, 25(4):465-473.
  79. Hannon, G. J. (2002). RNA interference. Nature, 418:244-251.
  80. Harbers, M. and Carninci, P. (2005). Tag-based approaches to transcriptome research and genome annotation. Nature Methods, 2(7):495-502.
  81. Harmanci, A. O., Sharma, G., and Mathews, D. H. (2007). Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign. BMC Bioinformatics, 8(130):1-21.
  82. Hartwell, L. H., Hood, L., Goldberg, M. L., Reynolds, A. E., Silver, L. M., and Veres, R. C. (2008). Genetics: From genes to genomes. McGraw-Hill, Newyork, USA, Third edition.
  83. Hastie, T., Tibshirani, R., and Friedman, J. (2009). The elements of statistical learning: Data mining, inference and prediction. Springer, Newyork, USA.
  84. Helvik, S. A., Snøve Jr, O., and Saerom, P. (2007). Reliable prediction of drosha processing sites improves microRNA gene prediction. Bioinformatics, 23(2):142-149.
  85. Hertel, J., Lindemeyer, M., Missal, K., Fried, C., Tanzer, A., Flamm, C., Hofacker, I. L., Stadler, P. F., and Students of Bioinformatics Computer Labs 2004 and 2005, T. (2006). The expansion of the metazoan microRNA repertoire. BMC Genomics, 7(25):1-15.
  86. Hertel, J. and Stadler, P. F. (2006). Hairpins in a haystack: Recognizing microRNA precursors in comparative genomics data. Bioinformatics, 22(14):197-202.
  87. Hertz, G. Z. and Stormo, G. D. (1999). Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics, 15(7/8):563-577.
  88. Hestand, M. S., Klingenhoff, A., Scherf, M., Ariyurek, Y., Ramos, Y., van Workum, W., Suzuki, M., Werner, T., van Ommen, G. B., den Dunnen, J. T., Harbers, M., and tHoen, P. A. C. (2010). Tissue specific transcript annotation and expression profiling with complementary next-generation sequencing technologies. Nucleic Acids Research, 38(16):e165.
  89. Hill, M. M., Broman, K. W., Stupka, E., Smith, W. C., Jiang, D., and Sidow, A. (2008). The C. savignyi genetic map and its integration with the reference sequence facilitates insights into chordate genome evolution. Genome Research, 18(8):1369-1379.
  90. Huang, E., Liang, Y., Chowdhary, R., and Kassim, A. (2005). An algorithm for ab-initio DNA motif detection, chapter 4, pages 611-614. World Scientific, Imperial College Press, London, UK.
  91. International Chicken Genome Sequencing Consortium, T. (2004). Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature, 432(7018):695-716.
  92. Jeanmougina, F., Thompsona, J. D., Gouyb, M., Higginsc, D. G., and Gibsond, T. J. (1998). Multiple sequence alignment with clustal x. Trends in Biochemical Sciences, 23(10):403- 405.
  93. Jiang, Q., Wang, Y., Hao, Y., Juan, L., Teng, M., Zhang, X., Li, M., Wang, G., and Liu, Y. (2009). miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Research, 37(S1):98-104.
  94. Jones-Rhoades, M. W., P., B. D., and Bartel, B. (2006). MicroRNAs and their regulatory roles in plants. Annual Review of Plant Biology, 57:19-53.
  95. Jurka, J. (2000). Repbase update: a database and an electronic journal of repetitive elements. Trends in Genetics, 16(9):418-420.
  96. Kaczkowski, B., Torarinsson, E., Reiche, K., Havgaard, J. H., Stadler, P. F., and and, G. J. (2009). Structural profiles of human miRNA families from pairwise clustering. Bioinformatics, 25(3):291-294.
  97. Kai, Z. S. and Pasquinelli, A. E. (2010). MicroRNA assassins: factors that regulate the disappearance of miRNAs. Nature Structural & Molecular Biology, 17(1):5-10.
  98. Kamanu, T. K. K. (2006). Location-based estimation of the autoregressive coefficient in ARX(1) models. Master's thesis, University of the Western Cape, Cape Town, South Africa.
  99. Kernighan, B. W. and Ritchie, D. M. (1988). The C programming language. Prentice Hall PTR, Upper Saddle River, New Jersey, USA, Second edition.
  100. Kiriakidou, M., Nelson, P. T., Kouranov, A., Fitziev, P., Bouyioukos, C., Mourelatos, Z., and Hatzigeorgiou, A. (2004). A combined computational-experimental approach predicts human microRNA targets. Genes and Development, 18(18):1165-1178.
  101. Kleinberg, J. and Tardos, E. (2006). Algorithm Design. Pearson Addison Wesley, Boston, USA, international edition.
  102. Kloosterman, W. P., Lagendijk, A. K., Ketting, R. F., Moulton, J. D., and Plasterk, R. H. A. (2007). Targeted inhibition of miRNA maturation with morpholinos reveals a role for miR-375 in pancreatic islet development. PLoS Biology, 5(8):1738-1749.
  103. Kozomara, A. and Griffiths-Jones, S. (2011). miRBase: Integrating microRNA annotation and deep-sequencing data. Nucleic Acids Research, 39(D):D152-D157.
  104. Krek, A., Grün, D., Poy, M. N., Wolf, R., Rosenberg, L., Epstein, E. J., MacMenamin, P., da Piedade, I., Gunsalus, K. C., Stoffel, M., and Rajewsky, N. (2005). Combinatorial microRNA target predictions. Nature Genetics, 37(5):495-500.
  105. Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. (2001). Identification of novel genes coding for small expressed RNAs. Science, 294(5543):853-858.
  106. Lai, E. C., Tomancak, P., Williams, R. W., and Rubin, G. M. (2003). Computational identification of drosophila microRNA genes. Genome Biology, 4(7):1-20.
  107. Lau, N. C., Lim, L. P., Weinstein, E. G., and Bartel, D. P. (2001). An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science, 294(5543):858- 862.
  108. Lawrence, C. E. and Reilly, A. A. (1990). An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins, 7(1):41-51.
  109. Lee, P. M. (2004). Bayesian statistics: An introduction. Hodder Arnold, London, UK, Second edition.
  110. Lee, R. C. and Ambros, V. (2001). An extensive class of small RNAs in Caenorhabditis elegans. Science, 294(5543):862-864.
  111. Lee, R. C., Hammell, C. M., and Ambros, V. (2006). Interacting endogenous and exogenous RNAi pathways in Caenorhabditis elegans. RNA, 12(4):589-597.
  112. Lesk, A. M. (2007). Introduction to Genomics. Oxford University Press, Newyork, USA.
  113. Lewis, B. P., Shih, I., Jones-Rhoades, M. W., Bartel, D. P., and Burge, C. B. (2003). Prediction of mammalian microRNA targets. Cell, 115(7):787-798.
  114. Lim, L. P., Glasner, M. E., Yekta, S., Burge, C. B., and Bartel, D. P. (2003a). Vertebrate microRNA genes. Science, 299(5612):1540.
  115. Lim, L. P., Lau, N. C., Garrett-Engele, P., Grimson, A., Schelter, J. M., Castle, J., Bartel, D. P., Linsley, P. S., and Johnson, J. M. (2005). Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature, 433(7027):769-773.
  116. Lim, L. P., Lau, N. C., Weinstein, E. G., Abdelhakim, A., Yekta, S., Rhoades, M. W., Burge, C. B., and Bartel, D. P. (2003b). The microRNAs of Caenorhabditis elegans. Genes and Development, 17(8):991-1008.
  117. Lindblad-Toh, K., Wade, C. M., Mikkelsen, T. S., Karlsson, E. K., Jaffe, D. B., Kamal, M., Clamp, M., Chang, J. L., Kulbokas III, E. J., Zody, M. C., Mauceli, E., Xie, X., Breen, M., Wayne, R. K., Ostrander, E. A., Ponting, C. P., Galibert, F., Smith, D. R., deJong, P. J., Kirkness, E., Alvarez, P., Biagi, T., Brockman, W., Butler, J., Chin, C., Cook, A., Cuff, J., Daly, M. J., DeCaprio, D., Gnerre, S., Grabherr, M., Kellis, M., Kleber, M., Bardeleben, C., Goodstadt, L., Heger, A., Hitte, C., Kim, L., Koepfli, K., Parker, H. G., Pollinger, J. P., Searle, S. M. J., Sutter, N. B., Thomas, R., Webber, C., Platform, B., and Lander, E. S. (2005). Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature, 438(7069):803-819.
  118. Locke, D. P., Hillier, L. W., Warren, W. C., Worley, K. C., Nazareth, L. V., Muzny, D. M., Yang, S., Wang, Z., Chinwalla, A. T., Minx, P., Mitreva, M., Cook, L., Delehaunty, K. D., Fronick, C., Schmidt, H., Fulton, L. A., Fulton, R. S., Nelson, J. O., Magrini, V., Pohl, C., Graves, T. A., Markovic, C., Cree, A., Dinh, H. H., Hume, J., Kovar, C. L., Fowler, G. R., Lunter, G., Meader, S., Heger, A., Ponting, C. P., Marques-Bonet, T., Alkan, C., Chen, L., Cheng, Z., Kidd, J. M., Eichler, E. E., White, S., Searle, S., Vilella, A. J., Chen, Y., Flicek, P., Ma, J., Raney, B., Suh, B., Burhans, R., Herrero, J., Haussler, D., Faria, R., Fernando, O., Darre, F., Farre, D., Gazave, E., Oliva, M., Navarro, A., Roberto, R., Capozzi, O., Archidiacono, N., Valle, G. D., Purgato, S., Rocchi, M., Konkel, M. K., Walker, J. A., Ullmer, B., Batzer, M. A., Smit, A. F. A., Hubley, R., Casola, C., Schrider, D. R., Hahn, M. W., Quesada, V., Puente, X. S., Ordonez, G. R., Lopez-Otin, C., Vinar, T., Brejova, B., Ratan, A., Harris, R. S., Miller, W., Kosiol, C., Lawson, H. A., Taliwal, V., Martins, A. L., Siepel, A., RoyChoudhury, A., Ma, X., Degenhardt, J., Bustamante, C. D., Gutenkunst, R. N., Mailund, T., Dutheil, J. Y., Hobolth, A., Schierup, M. H., Ryder, O. A., Yoshinaga, Y., de, J. P. J., Weinstock, G. M., Rogers, J., Mardis, E. R., Gibbs, R. A., and Wilson, R. K. (2011). Comparative and demographic analysis of orang-utan genomes. Nature, 469(7331):529-533.
  119. Lu, M., Zhang, Q., Deng, M., Miao, J., Guo, Y., Gao, W., and Cui, Q. (2008). An analysis of human microRNA and disease associations. PLoS ONE, 3(10):e3420.
  120. MacKay, D. J. C. (2004). Information theory, inference and learning algorithms. Cambridge University Press, Cambridge, UK.
  121. Man, M. Z., Wang, X., and Wang, Y. (2000). Power SAGE: Comparing statistical tests for SAGE experiments. Bioinformatics, 16(11):953-959.
  122. Manoharan, M. (2003). RNA interference and chemically modified siRNAs. Nucleic Acids Symposium Series, 3(1):115-116.
  123. Marchand, B., Bajić, V. B., and Kaushik, D. K. (2011). Highly scalable ab-initio genomic motif identification, pages 1-10. Association for Computing Machinery, Inc, New York, USA.
  124. Maziére, P. and Enright, A. J. (2007). Prediction of microRNA targets. Drug Discovery Today, 12(11-12):452-458.
  125. McCaskill, J. S. (1990). The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers, 6(7):1105-1119.
  126. Mendes, N. D., Freitas, A. T., and Sagot, M. F. (2009). Current tools for the identification of miRNA genes and their targets. Nucleic Acids Research, 37(8):2419-2433.
  127. Metzker, M. L. (2010). Sequencing technologies -the next generation. Nature Reviews Genetics, 11:31-45.
  128. Murphy, W. J., Agarwala, R., Schäffer, A. A., Stephens, R., Smith Jr., C., Crumpler, N. J., David, V. A., and O'Brien, S. J. (2005). A rhesus macaque radiation hybrid map and comparative analysis with the human genome. Genomics, 86(4):383-395.
  129. Nam, J., Shin, K., Han, J., Lee, Y., Kim, N. V., and Zhang, B. (2005). Human microRNA prediction through a probabilistic co-learning model of sequence and structure. Nucleic Acids Research, 33(11):3570-3581.
  130. Namsrai, O., Jung, K. S., Kim, S., and Ryu, K. H. (1999). An improved algorithm for RNA secondary structure prediction. Technical report, School of Electrical & Computer Engineering, Weizmann Institute of Science, Cheongju, Chungbuk Korea.
  131. Nobel Assembly, T. (2006). RNA interference. Technical report, Kaloliska Institute, Stockholm, Sweden.
  132. Nozawa, M., Miura, S., and Nei, M. (2010). Origins and evolution of microRNA genes in Drosophila species. Genome Biology Evolution, 2:180-189.
  133. Oulas, A., Reczko, M., and Poirazi, P. (2009). MicroRNAs and cancer -the search begins! IEEE Transactions on Information Technology in Biomedicine, 13(1):67-77.
  134. Pavesi, G., Mauri, G., and Pesole, G. (2001). Methods for pattern discovery in unaligned biological sequences. Briefings in Bioinformatics, 2(4):1-14.
  135. Piriyapongsa, J. and Jordan, I. K. (2007). A family of human microRNA genes from miniature inverted-repeat transposable elements. PLoS ONE, 2(2):1-11.
  136. Pollard, T. D., Earnshaw, W. C., and Lippincott-Schwartz, J. (2008). Cell biology. Saunders Elsevier, Philadelphia, USA, Second edition.
  137. Rao, S. R., Trivedi, S., Emmanuel, D., and Merita, K. (2010a). DNA repetitive sequences- types, distribution and function: A review. Journal of Cell and Molecular Biology, 7/8(1/2):1-11.
  138. Rao, S. R., Trivedi, S., Emmanuel, D., Merita, K., and Hynniewta, M. (2010b). DNA repetitive sequences-types, distribution and function: A review. Journal of Cell and Molecular Biology, 7(2):1-11.
  139. Riley, K. F., Hobson, M. P., and Bence, S. J. (2003). Mathematical methods for Physics and Engineering. Cambridge University Press, Cambridge, UK, Second edition.
  140. Robert, C. P. (2007). The Bayesian choice: From decision-theoretic foundation to computational implementation. Springer, Newyork, USA.
  141. Royce, T. E., Rozowsky, J. S., Bertone, P., Samanta, M., Stolc, V., Weissman, S., Snyder, M., and Gerstein, M. (2005). Issues in the analysis of oligonucleotide tilling microarrays for transcription mapping. Trends in Genetics, 21(8):466-475.
  142. Ruby, J. G., Jan, C. H., and Bartel, D. P. (2007). Intronic microRNA precursors that bypass drosha processing. Nature, 448(7149):83-86.
  143. Sandve, G. K. and Drabløs, F. (2006). A survey of motif discovery methods in an integrated framework. Biology Direct, 1:1-16.
  144. Scally, A., Dutheil, J. Y., Hillier, L. W., Jordan, G. E., Goodhead, I., Herrero, J., Hobolth, A., Lappalainen, T., Mailund, T., Marques-Bonet, T., McCarthy, S., Montgomery, S. H., Schwalie, P. C., Tang, Y. A., Ward, M. C., Xue, Y., Yngvadottir, B., Alkan, C., Andersen, L. N., Ayub, Q., Ball, E. V., Beal, K., Bradley, B. J., Chen, Y., Clee, C. M., Fitzgerald, S., Graves, T. A., Gu, Y., Heath, P., Heger, A., Karakoc, E., Kolb-Kokocinski, A., Laird, G. K., Lunter, G., Meader, S., Mort, M., Mullikin, J. C., Munch, K., O'Connor, T. D., Phillips, A. D., Prado-Martinez, J., Rogers, A. S., Sajjadian, S., Schmidt, D., Shaw, K., Simpson, J. T., Stenson, P. D., Turner, D. J., Vigilant, L., Vilella, A. J., Whitener, W., Zhu, B., Cooper, D. N., P. de J., Dermitzakis, E. T., Eichler, E. E., Flicek, P., Goldman, N., Mundy, N. I., Ning, Z., Odom, D. T., Ponting, C. P., Quail, M. A., Ryder, O. A., Searle, S. M., Warren, W. C., Wilson, R. K., Schierup, M. H., Rogers, J., Tyler-Smith, C., and Durbin, R. (2012). Insights into hominid evolution from the gorilla genome sequence. Nature, 483(7388):169-175.
  145. Schleif, R. F. (1993). Genetics and Molecular Biology. The Johns Hopkins University Press, London, UK, second edition.
  146. Schroeder, S. J. (2009). Advances in RNA structure prediction from sequence: New tools for generating hypotheses about viral RNA structure-function relationships. Journal of Virology, 83(13):6326-6334.
  147. Schulze, A. and Downward, J. (2001). Navigating gene expression using microarrays -a technology review. Nature Cell Biology, 3:190-195.
  148. Sewer, A., Paul, N., Landgraf, P., Aravin, A., Pfeffer, S., Brownstein, M. J., Tuschl, T., van Nimwegen, E., and Zavolan, M. (2005). Identification of clustered microRNAs using an ab initio prediction method. BMC Bioinformatics, 6(267):1-15.
  149. Shivdasani, R. A. (2006). MicroRNAs: regulators of gene expression and cell differentiation. Blood, 108(12):3646-3653.
  150. Sivanandam, S. N. and Deepa, S. N. (2008). Introduction to Genetic Algorithms. Springer- Verlag, Berlin, Germany.
  151. Skaletsky, H., Kuroda-Kawaguchi, T., Minx, P. J., Cordum, H. S., Hillier, L., Brown, L. G., Repping, S., Pyntikova, T., Ali, J., Bieri, T., Chinwalla, A., Delehaunty, A., Delehaunty, K., Du, H., Fewell, G., Fulton, L., Fulton, R., Graves, T., Hou, S., Latrielle, P., Leonard, S., Mardis, E., Maupin, R., McPherson, J., Miner, T., Nash, W., Nguyen, C., Ozersky, P., Pepin, K., Rock, S., Rohlfing, T., Scott, K., Schultz, B., Strong, C., Tin-Wollam, A., Yang, S., Waterston, R. H., Wilson, R. K., Rozen, S., and Page, D. C. (2003). The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature, 423(6942):825-837.
  152. Skalsky, R. L. and Cullen, B. R. (2010). Viruses, micrornas, and host interactions. Annual Review of Microbiology, 64:123-141.
  153. Smalheiser, N. R. and Torvik, V. I. (2005). Mammalian microRNAs derived from genomic repeats. TRENDS in Genetics, 21(6):322-326.
  154. Srikantan, S., Marasa, B. S., Becker, K. G., Gorospe, M., and Abdelmohsen, K. (2011). Paradoxical microRNAs: Individual gene repressors, global translation enhancers. Cell Cycle, 10(5):751-759.
  155. Stark, A., Brennecke, J., Russell, R. B., and Cohen, S. M. (2003). Identification of drosophila microRNA targets. PLoS Biology, 1(3):397-409.
  156. Stubblebine, T. (2007). Regular expression pocket reference. O'Reilly, Sebastopol, California, USA, second edition.
  157. Tanzer, A., Amemiya, C. T., Kim, C., and Stadler, P. F. (2005). Evolution of microRNAs located within Hox gene clusters. Journal of Experimental Zoology: Molecular and Developmental Evolution, 304B(1):75-85.
  158. Tanzer, A. and Stadler, P. F. (2004). Molecular evolution of a microRNA cluster. JMB: Journal of Molecular Biology, 339(2):327-335.
  159. 't Hoen, P. A. C., Ariyurek, Y., Thygesen, H. H., Vreugdenhil, E., Vossen, R. H. A. M., de Menezes, R. X., Boer, J. M., van Ommen, G. B., and den Dunnen, J. T. (2008). Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Research, 36(21):e141.
  160. Torarinsson, E. and Lindgreen, S. (2008). WAR: Webserver for aligning structural RNAs. Nucleic Acids Research, 36(W79-W84):1-2.
  161. van der Burgt, A., Fiers, M. W. J. E., Nap, J., and van Ham, R. C. H. J. (2009). In silico miRNA prediction in metazoan genomes: Balancing between sensitivity and specificity. BMC Genomics, 10(204):1-24.
  162. van Helden, J. (2003). Prediction of transcriptional regulation by analysis of the non-coding genome. Current Genomics, 4:1-8.
  163. van Helden, J. (2005). The analysis of regulatory sequences. Libre University, Belgium.
  164. van Helden, J., Rios, A. F., and Collado-Vides, J. (2000). Discovering regulatory elements in non-coding sequences by analysis of spaced dyads. Nucleic Acids Research, 28(8):1808-1818.
  165. Wand, M. P. and Jones, M. C. (1995). Kernel smoothing. Chapman & Hall, London, UK.
  166. Wang, G., Wang, Y., Feng, W., Wang, X., Yang, J. Y., Zhao, Y., Wang, Y., and Liu, Y. (2009). Transcription factor and microRNA regulation in androgen-dependent and -independent prostate cancer cells. BMC Bioinformatics, 9(S22):1-12.
  167. Wang, X., Zhang, J., Li, F., Gu, J., He, T., Zhang, X., and Li, Y. (2005). MicroRNA identification based on sequence and structure alignment. Bioinformatics, 21(18):3610-3614.
  168. Wheeler, B. M., Heimberg, A. M., Moy, V. N., Sperling, E. A., Holstein, T. W., Heber, S., and Peterson, K. J. (2009). The deep evolution of metazoan microRNAs. Evolution & Development, 11(1):50-68.
  169. Will, S., Reiche, K., Hofacker, I. L., Stadler, P. F., and Backofen, R. (2007). Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering. PLoS Computational Biology, 3(4):680-691.
  170. Williams, T. and Kelley, C. (2011). gnuplot 4.5: An interactive plotting program. Web address, http://gnuplot.sourceforge.net/.
  171. Xue, C., Li, F., He, T., Liu, G., Li, Y., and Zhang, X. (2005). Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine. BMC Bioinformatics, 6(310):1-7.
  172. Yoon, B. and Vaidyanathan, P. P. (2007). Computational identification and analysis of noncoding RNAs. IEEE Signal Processing Magazine, 65:1-11.
  173. Yoon, S. and De Micheli, G. (2006). Computational identification of microRNAs and their targets. Birth Defects Research (Part C), 78:118-128.
  174. Yu, X., Lin, J., Zack, D. J., Mendell, J. T., and Qian, J. (2008). Analysis of regulatory network topology reveals functionally distinct classes of microRNAs. Nucleic Acids Research, 36(20):6494-6503.
  175. Yuan, Z., Sun, X., Hongde, L., and Xie, J. (2011). microRNA genes derived from repetitive elements and expanded by segmental duplication events in mammalian genomes. PLoS ONE, 6(3):1-13.
  176. Zaslavsky, E. and Singh, M. (2006). A combinatorial optimization approach for diverse motif finding applications. Algorithms for Molecular Biology, 1:1-13.
  177. Zhang, L., Zhou, W., Velculescu, V. E., Kern, S. E., Hruban, R. H., Hamilton, S. R., Vogelstein, B., and Kinzler, K. W. (1997). Gene expression profiles in normal and cancer cells. Science, 276:1268-1272.
  178. Zhang, Y., Liu, X. S., Liu, Q., and Wei, L. (2006). Genome-wide in silico identification and analysis of cis natural antisense transcripts (cis-NATs) in ten species. Nucleic Acids Research, 34(12):3465-3475.
  179. Zhou, Y., Ferguson, J., Chang, J. T., and Kluger, Y. (2007). Inter- and intra-combinatorial regulation of transcription factors and microRNAs. BMC Bioinformatics, 8(396):1-10.
  180. Zuker, M., Mathews, D. H., and Turner, D. H. (1999). Algorithms and thermodynamics for RNA secondary structure prediction: A practical guide. NATO ASI Series, Kluwer Academic Publishers (now Springer-Verlag), Dordrecht, The Netherlands.