The origins of human language, with its extraordinarily complex structure and multitude of functions, remain among the most challenging problems for evolutionary biology and the cognitive sciences. Although many will agree that progress on this issue would have important consequences for linguistic theory, many remain sceptical about whether the topic is amenable to rigorous, scientific research at all. Complementing recent developments toward better empirical validation, this thesis explores how formal models from both linguistics and evolutionary biology can help to constrain the many theories and scenarios in this field. I first review a number of foundational mathematical models from three branches of evolutionary biology -- population genetics, evolutionary game theory and social evolution theory -- and discuss the relation between them. This discussion yields a list of ten requirements on evolutionary scenarios for language, and highlights the assumptions implicit in the various formalisms. I then look in more detail at one specific step-by-step scenario, proposed by Ray Jackendoff, and consider the linguistic formalisms that could be used to characterise the evolutionary transitions from one stage to the next. I conclude from this review that the main challenges in evolutionary linguistics are to explain how three major linguistic innovations -- combinatorial phonology, compositional semantics and hierarchical phrase-structure -- could have spread through a population where they are initially rare. In the second part of the thesis, I critically evaluate some existing formal models of each of these major transitions and present three novel alternatives.
In an abstract model of the evolution of speech sounds (viewed as trajectories through an acoustic space), I show that combinatorial phonology is a solution for robustness against noise and the only evolutionarily stable strategy (ESS). In a model of the evolution of simple lexicons in a noisy environment, I show that the optimal lexicon uses a structured mapping from meanings to sounds, providing a rudimentary compositional semantics. Lexicons with this property are also ESSs. Finally, in a model of the evolution and acquisition of context-free grammars, I evaluate the conditions under which hierarchical phrase-structure will be favoured by natural selection, or will be the outcome of a process of cultural evolution. In the last chapter of the thesis, I discuss the implications of these models for the debates in linguistics on innateness and learnability, and on the nature of language universals. A mainly negative point is that formal learnability results cannot be used as evidence for an innate, language-specific specialisation for language. A positive point is that with the evolutionary models of language, we can begin to understand how universal properties and tendencies in natural languages can result from the intricate interaction between innate learning biases and a process of cultural evolution over many generations.
A fundamental, universal property of human language is that its phonology is combinatorial. That is, one can identify a set of basic, distinct units (phonemes, syllables) that can be productively combined in many different ways. In this paper, we review a number of theories and models that have been developed to explain the evolutionary transition from holistic to combinatorial signal systems, but find that in all of them either problematic linguistic assumptions are made or crucial components of evolutionary explanations are omitted. We present a novel model to investigate the hypothesis that combinatorial phonology results from optimising signal systems for perceptual distinctiveness. Our model differs from previous models in two important respects. First, signals are modelled as trajectories through acoustic space. Hence, both holistic and combinatorial signals have a temporal structure. Second, we use the methodology of evolutionary game theory. Crucially, we show a path of ever-increasing fitness from holistic to combinatorial signals, where every innovation represents an advantage even if no-one else in the population has yet obtained it.
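As a rough illustration of the game-theoretic criterion invoked above, the following sketch checks Maynard Smith's standard condition for a pure strategy to be evolutionarily stable against other pure strategies. It is not the model from the paper: the function name `is_pure_ess`, the two-strategy setup and all payoff values are invented for illustration, and mixed-strategy mutants are not considered.

```python
from typing import Sequence


def is_pure_ess(payoff: Sequence[Sequence[float]], i: int) -> bool:
    """Check whether pure strategy i is an ESS against all other pure strategies.

    payoff[a][b] is the payoff to an individual playing a against one playing b.
    Strategy i is an ESS if, for every mutant j != i, either
      (1) payoff[i][i] > payoff[j][i], or
      (2) payoff[i][i] == payoff[j][i] and payoff[i][j] > payoff[j][j].
    """
    for j in range(len(payoff)):
        if j == i:
            continue
        strictly_better = payoff[i][i] > payoff[j][i]
        tie_broken = payoff[i][i] == payoff[j][i] and payoff[i][j] > payoff[j][j]
        if not (strictly_better or tie_broken):
            return False
    return True


# Hypothetical payoffs (assumption, not taken from the paper):
# index 0 = holistic signalling, index 1 = combinatorial signalling.
PAYOFF = [
    [1.0, 1.0],  # holistic vs. holistic, holistic vs. combinatorial
    [1.5, 2.0],  # combinatorial vs. holistic, combinatorial vs. combinatorial
]

print(is_pure_ess(PAYOFF, 0))  # holistic can be invaded in this toy matrix
print(is_pure_ess(PAYOFF, 1))  # combinatorial resists invasion in this toy matrix
```

In this toy matrix a lone combinatorial signaller already does better than its holistic neighbours (1.5 versus 1.0), which mirrors the requirement that an innovation be advantageous even when it is still rare.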
According to a controversial hypothesis, a characteristic unique to human language is recursion. Contradicting this hypothesis, it has been claimed that the starling, one of the two animal species tested for this ability to date, is able to distinguish acoustic stimuli based on the presence or absence of a center-embedded recursive structure. In our experiment we show that another songbird species, the zebra finch, can also discriminate between artificial song stimuli with these structures. Zebra finches are able to generalize this discrimination to new songs constructed using novel elements belonging to the same categories, similar to starlings. However, to demonstrate that this is based on the ability to detect the putative recursive structure, it is critical to test whether the birds can also distinguish songs with the same structure consisting of elements belonging to unfamiliar categories. We performed this test and show that seven out of eight zebra finches failed it. This suggests that the acquired discrimination was based on phonetic rather than syntactic generalization. The eighth bird, however, must have used more abstract, structural cues. Nevertheless, further probe testing showed that the results of this bird, as well as those of others, could be explained by simpler rules than recursive ones. Although our study casts doubts on whether the rules used by starlings and zebra finches really provide evidence for the ability to detect recursion as present in “context-free” syntax, it also provides evidence for abstract learning of vocal structure in a songbird.
Proceedings of the 12th Conference of the …, Jan 1, 2009
We present several algorithms for assigning heads in phrase structure trees, based on different linguistic intuitions on the role of heads in natural language syntax. The starting point of our approach is the observation that a head-annotated treebank defines a unique lexicalized tree ...
We develop an approach to automatically identify the most probable multi-word constructions used in children’s utterances, given syntactically annotated utterances from the Brown corpus of CHILDES. The constructions found cover many interesting linguistic phenomena from the language acquisition literature, and show a progression from very concrete towards abstract constructions. We show quantitatively that, for all children in the Brown corpus, grammatical abstraction, defined as the relative number of variable slots in the productive units of their grammar, increases globally with age.
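One possible reading of this abstraction measure can be sketched as follows. The representation is an assumption made purely for illustration: each productive unit is a list of slots, lexical items are lowercase words, and variable slots are uppercase category labels. The function name `abstraction_ratio` and the example constructions are hypothetical and not taken from the paper.

```python
from typing import List


def abstraction_ratio(constructions: List[List[str]]) -> float:
    """Fraction of slots across all constructions that are variable slots.

    Assumed convention: a slot written in uppercase (e.g. "NP") is a variable
    slot; anything else is a concrete lexical item.
    """
    total_slots = sum(len(c) for c in constructions)
    if total_slots == 0:
        return 0.0
    variable_slots = sum(1 for c in constructions for slot in c if slot.isupper())
    return variable_slots / total_slots


# Invented toy grammars for a younger and an older child.
early = [["more", "juice"], ["want", "NP"]]
later = [["want", "NP"], ["the", "N"], ["NP", "VP"]]

print(abstraction_ratio(early))  # 1 variable slot out of 4
print(abstraction_ratio(later))  # 4 variable slots out of 6
```

Under this reading, a rise in the ratio with age corresponds to productive units becoming less lexically specific.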
This paper explores a parsimonious approach to Data-Oriented Parsing. While allowing, in principle, all possible subtrees of trees in the treebank to be productive elements, our approach aims at finding a manageable subset of these trees that can accurately describe empirical distributions over phrase-structure trees. The proposed algorithm leads to computationally much more tractable parsers, as well as linguistically more informative grammars. The parser is evaluated on the OVIS and WSJ corpora, and shows improvements in efficiency, parse accuracy and test-set likelihood.
Language acquisition is a special kind of learning problem because the outcome of learning in one generation is the input for the next. That makes it possible for languages to adapt to the particularities of the learner. In this paper, I show that this type of language change has important consequences for models of the evolution and acquisition of syntax.
Books by Willem Zuidema
Papers by Willem Zuidema