Quantitative analyses of typological data

2006

Abstract

The current computational era heralds a multitude of challenges for linguists, mathematicians and computer scientists alike. Investigating the consistency of linguistic features and its implications for historical linguistics, and detecting possible geographical and phylogenetic influences on feature behavior, will steadily reveal the importance of each language characteristic and its historical role in language evolution. The wish to develop new methods has made the integration of various disciplines necessary. Consequently, the role of computer science, and more specifically bioinformatics, has grown in the study of linguistic processes. By combining the strengths of bioinformatics algorithms, mathematical procedures, statistical knowledge and database development methodologies, various strategies have been developed and applied to the phylogenetic reconstruction of language evolution, but so far only on small scales. The analysis of a worldwide ty...

Declaration of Authorship

I hereby declare that I have written this dissertation independently and without inadmissible outside help. I have used no sources or aids other than those cited, and I have marked as such all passages taken verbatim or in substance from published or unpublished writings, as well as all statements based on oral information. Likewise, all materials provided or services rendered by other persons are identified as such.

Leipzig, 11 September 2006

Mihai Albu