Patterns bit by bit. An Entropy Model for Linguistic Generalizations
Summary
When confronted with the challenge of learning their native language, children are impressively fast at inferring generalized rules from a limited set of examples, and at applying those rules to strings of words they have never heard before. This paper addresses the puzzle of what triggers, and what limits, the inductive leap from memorizing specific linguistic items to extracting general rules. An innovative entropy model for linguistic generalization is proposed, designed to bridge previous findings on the factors that modulate the process of making linguistic generalizations and to unify them under one consistent account, based on an information-theoretic approach to rule induction. The model predicts that generalization is a cognitive mechanism resulting from the interaction between input complexity (entropy) and the processing limitations of the human brain, expressed as limited channel capacity. In a pilot experiment with adults, a miniature artificial grammar was designed to probe the effect of input complexity on generalization. The number and frequency of linguistic items were manipulated to obtain different degrees of input complexity. Entropy was used as the measure of input complexity, given that entropy varies as a function of the number of items in the input and their probabilities of occurrence. Results showed that the more complex the linguistic environment, the stronger the tendency to form generalized rules in response to that complexity.
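As a concrete illustration of how entropy tracks the two manipulated factors (number of items and their frequencies), the sketch below computes the standard Shannon entropy, H = -Σ p_i log2 p_i, over item distributions. The sample inputs are hypothetical stand-ins, not the stimuli from the pilot experiment; they only show that adding item types or flattening their frequencies raises entropy, i.e. input complexity.

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy (in bits) of the item distribution in an input sample."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical miniature-grammar inputs (illustrative only):
# fewer item types with skewed frequencies -> lower entropy;
# more item types, uniformly distributed -> higher entropy.
low_complexity  = ["ba", "ba", "ba", "po"]                       # 2 types, skewed
high_complexity = ["ba", "po", "ki", "du", "ba", "po", "ki", "du"]  # 4 types, uniform

print(shannon_entropy(low_complexity))   # ~0.81 bits
print(shannon_entropy(high_complexity))  # 2.00 bits
```

On the model's account, the higher-entropy input would be the one more likely to push learners past item-specific memorization toward generalized rules.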