Computational grammar induction for linguists

Menno van Zaanen

2004

Abstract

Computational Grammar Induction (CGI) aims to build computational models that identify infinite languages from finite sets of examples. The field sits at the intersection of linguistics, cognitive neuroscience, and computation. This article focuses on first language acquisition. It notes that interaction between linguistics and CGI has so far been limited, and it argues for collaboration on three topics: the learnability of grammars, the analysis of real-life data, and probabilistic approaches. It concludes that integrating research across these disciplines could lead to significant advances.

