Some thoughts on economy within linguistics

Algumas observações sobre economia dentro da lingüística

Abstracts

One of the cornerstones of Chomsky's Minimalist Program is the role played by economy. This paper discusses different ways in which Chomsky's notion of economy in linguistics can be understood, given current views on dynamic systems and, in particular, on evolution in biological systems.

Economy in Linguistics; Minimalism; Exaptation; Dynamic Systems


Um dos pontos principais do Programa Minimalista de Chomsky é o papel desempenhado pela noção de economia. Este trabalho discute várias maneiras como essa noção de economia em lingüística pode ser entendida em face de recentes concepções sobre sistemas dinâmicos e, em particular, sobre evolução nos sistemas biológicos.

Economia em Lingüística; Minimalismo; Exaptação; Sistemas Dinâmicos


Some Thoughts on Economy Within Linguistics*

* The research behind this note was partly funded by NSF grant SBR9601559. I am indebted to Elena Herburger, Jairo Nunes, Carlos Otero, and Phil Resnik for specific comments on an earlier draft.

(Algumas Observações sobre Economia dentro da Lingüística)

Juan URIAGEREKA

(University of Maryland at College Park)

1. Three (more or less) reasonable takes on Minimalism

I'll start with a quote by Stephen Jay Gould, whom I take to have understood the significance of generative grammar when accounting for faculty psychology within the confines of evolution. "The traits," he writes in (1991: 59), "that Chomsky (1986) attributes to language - universality of the generative grammar, lack of ontogeny, . . . highly peculiar and decidedly non-optimal structure, formal analogy to other attributes, including our unique numerical faculty with its concept of discrete infinity - fit far more easily with an exaptive, rather than an adaptive, explanation." By an exaptation Gould means an individual feature that did not emerge adaptively for its current purpose, but was co-opted by the individual; for example, for Gould brain size is not the consequence of "intelligence" related to language; rather, the brain got big for whatever reason (e.g. circulatory benefits), which somehow caused linguistic competence.

Another thinker who eloquently presents a view like Gould's is David Berlinski, who writes in (1986: 130): "mathematicians thought they might explain the constraints of grammar on such grounds as effectiveness or economy. . . In fact, the rules of English grammar appear to owe nothing to any principles of economy in design." This line is well-known from the work of, in particular, Jerry Fodor, who has systematically built arguments against an unstructured mind, constructivism, connectionism, gradualism, or behaviorism, on the basis of clever, yet sub-optimal quirks of language that linguists have found.

Such comments emphasizing the "decidedly non-optimal structure" of language or "grammar ... ow[ing] nothing to ... economy in design" seem at odds with Chomsky's Minimalist program. I can only see four possible routes one can take in light of that. The most reasonable one is that the Minimalist Program is just too good to be true; maybe we, linguists, have planted the elegance that we now harvest, unaware of our acts. Even Chomsky constantly admits that the program is partly a bold speculation, rather than a specific hypothesis. So it could just be all wrong.

A second possibility is that, right though the program may be, it is not meant to square its basic tenets with the Gouldian rhetoric behind them. Suppose that the computational system of Human language is in some interesting sense optimal; why this should be is the obvious question. Schoemaker (1991) analyzes optimality as a possible organizing principle of nature, instantiated in terms of least action in physics, entropy in chemistry, survival of the fittest in biology, and utility maximization in economics, among the more or less "natural" sciences. If we play in this key, one could try to argue that economy in linguistics should reduce to survival of the fittest.

Such a view is reasonably defended by important scientists, perhaps most popularly by Pinker in his 1994 best-seller:

Selection could have ratcheted up language abilities by favoring the speakers in each generation that the hearers could best decode, and the hearers who could best decode the speakers. . . Grammars of intermediate complexity. . . could have symbols with a narrower range, rules that are less reliably applied, modules with fewer rules. . . I suspect that evolving humans lived in a world in which language was woven into the intrigues of politics, economics, technology, family, sex, and friendship that played key roles in individual reproductive success. (pp. 365-9)

Of course, this adaptation story still raises non-trivial questions about what is selective about, say, having thousands of languages, or a parser which doesn't decipher some simple grammatical sentences like "the mouse the cat the dog bit chased left", or why parsable and sound sentences (like *"who do you think that left") are ungrammatical to begin with. Nonetheless, the view is very reasonable, and would take (generic) Minimalism as significant evidence: if you claim language has evolved adaptively, you expect it to wear its success on its sleeve.

A third possibility (for why Minimalism) could be that all this linguistic elegance is nothing but mathematical clout. In one sense, this is trivially acceptable: if interpreted in purely methodological terms. One can be interested in language for its value as a tool to better a logical system, an algebraic apparatus, a computer program, or some such thing. There is meaning to the notion of (often various ways of) optimizing a function, and it could just be that linguistic functions (whatever those turn out to be) happen to be an interesting sub-case of that. Notice, this isn't saying much about language as a natural object, but one doesn't have to. However, many linguists and philosophers do not take any of this as just methodological (cf. Richard Montague's famous (1974: 188) rejection of "the contention that an important theoretical difference exists between formal and natural languages"). That poses very different questions.

The most immediate issue is what underlies formal and natural languages, which from the evolutionary perspective we are now considering should be what has evolved, and furthermore optimally, by hypothesis. As Partee (1996: 26) notes, "[o]ne can easily understand Chomsky's negativity towards Montague's remark that he failed to see any interest in syntax other than as preliminary to semantics." But Montague's perspective is reasonable, particularly if, as he thought, syntax should be homomorphic with semantics; what would be the naturalistic (evolutionary) point in having both syntax and semantics evolve as separate systems that (presumably, then) get connected? It is sounder to expect one of these systems to piggy-back on the other, as the aerodynamic structure of a wing allegedly piggy-backs on its flying function. To the extent that there is a function to language, it surely must have to do with such things as communicating truths or denoting referents: semantic stuff. Clearly, from this perspective, the third view reduces to the second: syntax is just the epiphenomenal form that semantic function has met in the course of evolution.

Less reasonably, I suppose, view number three stays distinct if one goes metaphysical, and claims that English as a formal language actually exists out there in some primitive sense. I don't understand anything about metaphysics, so I won't venture much beyond this point; but I guess this view is also entertained, and I imagine once one is in that world of logic, mathematics, or ideas, to find that they should be elegant a priori (Platonic, after all) is perhaps not that amazing.

Judging from his writings, none of these alternatives is what Chomsky is entertaining, which moves me to the fourth possibility I said I see: the most unreasonable of all.

2. A different take

Chomsky (1995) declares it "of considerable importance that we can at least formulate [minimalist] questions today, and even approach them in some areas with a degree of success" (p. 9). He furthermore takes the matter to be far reaching, daring to claim that if on track "a rich and exciting future lies ahead for the study of language and related disciplines." This isn't just rhetoric; later on (p. 169) Chomsky admits that "[s]ome basic properties of language are unusual among biological systems, notably the property of discrete infinity, . . . that the language faculty is nonredundant, in that particular phenomena are not 'overdetermined' by principles of language, . . . [and] the role of 'principles of economy'." To the remark about biological oddity, he appends that the basic linguistic properties are "more like what one expects to find (for unexplained reasons) in the study of the inorganic world." None of these ideas should be taken lightly. "There is," Chomsky thinks, "good reason to believe that [such considerations] are fundamental to the design of language, if properly understood."

The design of language? What could that possibly mean if not any of the things just discussed? This is the main question that will concern me here.

Of course, the emergence of language is not the only difficulty for standard optimistic stories in the theory of evolution. Also to be explained are incredible convergences in organ structures without any shared functions (e.g. Fibonacci patterns in chordate skins, mollusc shells, jellyfish tentacle arrangements, plant phyllotaxis, microtubules within cytoskeletons in every eukaryotic cell), the very concept of speciation (how do you go from a mutant to an array of them that stay close enough to matter for reproduction?), or individual (what makes a prokaryotic cell turn into a eukaryotic one, in the process subsuming nucleus, mitochondria, organelles, presumably through symbiosis with other micro-organisms, and then start aggregating to what we now see?), not to speak about the competence underlying the observable behavior of animals, including, in some, altruism.

To all those questions, the standard answer is the Neo-Darwinian synthesis (of Darwinism and neo-Mendelian genetics) which in its highlights speaks of selfish genes using individuals as their mere vehicle for survival and reproduction (Dawkins 1987). How one goes from a couple of (our) selfish genes to our exchanging thoughts this very minute, nobody knows.

These matters have a long history within biology, and have now been taken up again by researchers dissatisfied with the party line. This is, for instance, what Goodwin (1994: xiii) has to say about the pioneering On Growth and Form, D'Arcy Thompson's 1917 classic:

[H]e single-handedly defines the problem of biological form in mathematical terms and re-establishes the organism as the dynamic vehicle of biological emergence. Once this is included in an extended view of the living process, the focus shifts from inheritance and natural selection to creative emergence as the central quality of the evolutionary process. And, since organisms are primary loci of this distinctive quality of life, they become again the fundamental units of life, as they were for Darwin. Inheritance and natural selection . . . become parts of a more comprehensive dynamical theory of life which is focused on the dynamics of emergent processes.

Of course, the devil is in the details, and one wants to know what is meant by the dynamics of emergent processes. I'll spare the reader the specifics, but I would like to give at least some sketch of the general picture from Stuart Kauffman's work (1995: 18):

[M]uch of the order seen in development arises almost without regard for how the networks of interacting genes are strung together. Such order is robust and emergent, a kind of collective crystallization of spontaneous structure. ... Here is spontaneous order that selection then goes on to mold. ... Examples that we shall explore include the origin of life as a collective emergent property of complex systems of chemicals, the development of the fertilized egg into the adult as an emergent property of complex networks of genes controlling one another's activities, and the behavior of coevolving species in ecosystems that generate small and large avalanches of extinction and speciation. ... [T]he order that emerges depends on robust and typical properties of the systems, not on the details of structure and function.

Kauffman aptly sums up this view: "Under a vast range of different conditions, the order can barely help but express itself."

What I have reported may sound like prestidigitation, but Kauffman's book seeks to show that it is not; I cannot go into that here, although see the fractal example below. My point is this: If any of this is independently argued for, or at least plausible, the question of language design doesn't have to reduce, in the course of evolution, to "the natural response of an organism looking for the most efficient way in which to transmit its thoughts", as Berlinski jokes in (1986: 130). It could be that linguistic order can barely help but express itself, in whatever sense other kinds of biological order do. If that is the case, we do expect it to "involve a bewildering pattern without much by way of obvious purpose," to again borrow from Berlinski's prose, and pace those who find a grand purpose to syntactic principles.

3. Arguments against the unreasonable view

Chomsky often appeals to the metaphor that Kauffman uses: crystallization. He seems to think that grammar could have emerged in roughly the way a crystal does, only at a more complex and arcane level of physics for, as he puts it, "[w]e have no idea, at present, how physical laws apply when 10^10 neurons are placed in an object the size of a basketball, under the special conditions that arose during human evolution." This passage (see also Chomsky 1993 and 1994) is cited by Pinker, who then adds (p. 363) "the possibility that there is an undiscovered corollary of the laws of physics that causes brains of human size and shape to develop the circuitry for Universal Grammar seems unlikely for many reasons." He gives two. To start with, "what sets of physical laws could cause a surface molecule guiding an axon. . . to cooperate with millions of other such molecules to solder together just the kinds of circuits that would compute. . . grammatical language?"

The presuppositions of that question are curious. On the one hand, that neurons regulate mind functions (and not something more basic, as suggested for instance by Penrose (1994), or something else entirely) is just a hypothesis, even if this isn't always remembered or even admitted. On the other hand, if there ever is an answer to the rhetorical question Pinker poses, it will most likely arise not from a reduction of mind to whatever the known laws of physics happen to be when the answer is sought, but rather from a serious unification between the two (or more) empirical sciences involved. It is perfectly possible that physics will have to widen or deepen or strengthen (or whatever) its laws as understood at a given time, precisely in order to accommodate the phenomenon of mind, just as they had to be modified to accommodate the phenomenon of chemistry, and to some extent are being stretched when contemplating the phenomenon of life in an entropic universe. I realize I'm appealing to caution in the presence of ignorance, but that has shown better results than letting ignorance dictate.

In relation to Pinker's question to Chomsky, an intriguing instance that comes to mind and seems significant is the discovery by Barbara Shipman that von Frisch's arcane observations regarding bee-dances can be best described in terms of mapping objects existing in six-dimensional flag manifolds to a two-dimensional expression. This already interesting formal fact becomes fascinating when Shipman, a physicist and mathematician studying quarks (which also happen to be aptly described in terms of six-dimensional flag manifolds), speculates that bees might be sensitive to quantum fields. At some level, that they are is already known: given their orientation system (which is not object-driven, like ours, but geodesically grounded), bees apparently can be "fooled" by placing them in the presence of heavy magnetic fields. But what Shipman is exploring is in a sense extraordinary: that a creature may be able to use sensitivity to quantum fields as a system to communicate information. Now imagine we had posed Pinker's question to von Frisch instead of Chomsky, at a time when quantum physics was either not developed or not even very well known, in light of the bizarre behavior of bees. What physical laws could possibly cause molecules in bee neurons to cooperate with millions of other such molecules to compute bee dances? Well, who knew; of course, who knows now as well, but at the very least the Shipman take on these matters puts things in perspective: perhaps the little creatures are sensing something on the basis of the laws of physics, crucially as presently understood.

Pinker's second problem (with Chomsky's crystallization metaphor, or more generally Gould's argument that large brains predate speaking humans) insists on a commonplace: that large brains are, per se, maladaptive, and hence they could have emerged only as a result of some good associated function.

Suppose we grant that initial premise (in terms of metabolic cost, for instance). Still, the general reasoning misses the point I'm trying to establish, which Kauffman so poetically expressed: some order can barely help but express itself, maladaptive or not. Of course, if some expressed order turns out to be so maladaptive that you won't transmit your genes, then your kind dies out. That is tough to prove, though; play in animals, for instance, is somewhat maladaptive: you waste time and energy, you get injured, you expose yourself to predators... Does play kill you? Obviously not, or scores of species would have vanished; does that mean there is a tremendous alternative, systematic benefit in play, that so many species have it? If there is, it isn't obvious. And incidentally, in the case of language, if what one is seeking is a function as a way out of the puzzle of maladaptive brains, just about any function does the job (e.g. Gould's circulatory gains). You certainly don't need the whole "benefit" of language for that, which simply highlights the general problem with doing "reverse engineering": my reason is as good as yours, so nobody wins.

Those are the "many reasons" the "circuitry for Universal Grammar" should be blamed on Darwinian adaptationism. Perhaps, but the force of the argument is nowhere to be seen. Yet Pinker's reasoning does certainly go with the mainstream in biology, which seems ready to presuppose answers to these fascinating questions on form, on the basis of the dogma of function. Witness in this respect the critique that Givnish (1994) gives of the extremely interesting work by Roger Jean (1994), who attempts an analysis and explanation of Fibonacci patterns in plant phyllotaxis. After asserting that Jean's explanation for plants displaying geometrical patterns is not compelling, Givnish writes (p. 1591):

He raises no adaptive explanation for phyllotactic patterns . . . and fails to cite relevant papers on the adaptive value of specific leaf arrangements. Worse, the author espouses Lima-de-Faria's bizarre concept of autoevolution, arguing that phyllotaxis is nonadaptive and reflects a pattern of self-assembly based on prebiotic evolution of chemical and physical matter. . . recapitulating the natural philosophy of D'Arcy Thompson that led many biologists to abandon phyllotaxis as a subject of study. . . [N]othing in biology makes sense except in the light of evolution.

The last sentence is a famous prayer by Dobzhansky, which exhorts the listener to follow the (here useless) party line. But in this instance (perhaps even more so than in the case of language) no imaginable adaptive story could serve to explain the (mathematically) exact same form that arises all over the natural world (cf. for instance a viral coating vs. the feather display in a peacock's tail, both arrangements of the sort seen in plants).

4. Basic minimalist properties and how unusual they are

I have argued elsewhere (e.g. (1995)) that Fibonacci patterns present all three of Chomsky's basic unusual properties among biological systems: discrete infinity, underdetermination, and economy. To demonstrate that this is not an isolated instance, I'd like to present another case, which as it turns out also exhibits the property of self-similarity, or fractality. Fractals are recursive structures (hence discretely infinite) of extreme elegance (the economy bit, which in fractals is easily expressible); as for their underdetermination, it usually shows up in the system not coding some of its overt properties, such as handedness or various details about systemic implementation.

The example I have in mind comes from a piece by West, Brown, and Enquist (1997), who analyzed the vertebrate cardiovascular system as a fractal, space-filling network of branching tubes, under the economy assumption that the energy dissipated by this transportation system is minimized, and supposing the size of terminal tubes (reaching sub-tissue levels) does not significantly vary across species. In so doing, they deduce scaling laws (among vertebrates) that have been known to exist for quite some time, but hadn't yet been accounted for.

What was known already is that biological diversity, from metabolism to population dynamics, correlates with body size (itself varying over twenty-one orders of magnitude). Allometric scaling laws typically relate some biological variable to body mass M, by elevating M to some exponent b, and multiplying that by a constant characteristic of a given organism. This leads one to think that b should be a multiple of 1/3, so that the cubic root of an organism's mass relates to some of its internal functions in the way that a tank with 1000 cubic feet of water has multiples of 10 as a natural scale to play tricks with. Instead, what researchers have found is that b involves not cubic roots, but rather quarter roots, unexpectedly, at least if one is dealing with standard geometric constraints on volume. For example, the embryonic growth of an organism scales as M^(1/4), or the quarter root of its mass (the larger the mass of the organism, the slower its embryonic growth, but as mass increases, embryonic growth differences decrease). These quarter-power scalings are apparently present all throughout the living kingdoms.
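The allometric relation just described can be written out compactly. In the sketch below, Y (the biological variable) and Y_0 (the organism-specific constant) are labels chosen here for illustration, since the text names only M and b; nothing beyond the relation stated in the paragraph is assumed.

```latex
% Y and Y_0 are illustrative labels (an assumption); M and b come from the text.
\[
  Y = Y_0 \, M^{b}
  \qquad\Longleftrightarrow\qquad
  \log Y = \log Y_0 + b \,\log M
\]
% Expectation from standard volume geometry versus the reported finding,
% with k standing for some integer:
\[
  b_{\text{expected}} = \tfrac{k}{3},
  \qquad
  b_{\text{observed}} = \tfrac{k}{4}
\]
```

On a log-log plot, then, b is simply the slope of a straight line. With b = 1/4, for instance, a ten-thousand-fold difference in mass corresponds to only a ten-fold difference in the scaled variable, since (10^4)^(1/4) = 10.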

I'll spare the reader most of the geometrical details of why a fractal network does involve quarter powers as the scaling factor. The gist can be seen by entertaining the exercise of systematically producing holes in a cylindrical, solid Manchego cheese. Suppose you isolate an outer layer from an inner core, with the intention of producing holes, first, in the outside. Call C the volume of the entire cheese and L the volume of the outer layer; obviously, the relation between C and L is cubic, corresponding to the three dimensions of length, width, and height. But now consider a further dimension: that by which we systematically produce holes in the outer layer. Call H the volume of L minus the holes. To express the relation between C and H, we need a more complex exponential function than the cubic one: we must add the contribution of the fourth dimension. More generally, if we continue producing layers inside the cheese, theoretically ad infinitum (if the holes get smaller and smaller, up to some limit), the basic dimensions won't have to change. That will create a fractal structure of holes in the cheese: a Swiss cheese.

The fractal model was cleverly used to describe the inner "guts" of an organism, where tubules of various sorts play the role of the holes, and of course the entire organism is the cheese. The model predicts facts with an incredible degree of accuracy, where the P[redicted] and O[bserved] numbers express the scaling exponent (in most cases a multiple of 1/4): aorta radius P=3/8=.375, O=.36; circulation time P=1/4=.25, O=.25; cardiac frequency P=-1/4=-.25, O=-.25; metabolic rate P=3/4=.75, O=.75. The list goes on. West, Brown, and Enquist observe that "the predicted scaling properties do not depend on most details of system design, including the exact branching pattern, provided it has a fractal structure" (p. 126).
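To make the comparison concrete, here is a minimal Python sketch that tabulates the four predicted and observed exponents quoted above and measures how far apart they are. The function names are hypothetical, and the observed cardiac-frequency exponent is entered as -0.25 on the assumption that its sign matches the predicted one.

```python
from fractions import Fraction

# Predicted (P) and observed (O) scaling exponents as quoted in the text from
# West, Brown, and Enquist (1997). Assumption: the observed cardiac-frequency
# exponent carries the same negative sign as the predicted one.
EXPONENTS = {
    "aorta radius": (Fraction(3, 8), 0.36),
    "circulation time": (Fraction(1, 4), 0.25),
    "cardiac frequency": (Fraction(-1, 4), -0.25),
    "metabolic rate": (Fraction(3, 4), 0.75),
}


def deviations(table):
    """Return the absolute gap |P - O| for each variable in the table."""
    return {name: abs(float(p) - o) for name, (p, o) in table.items()}


def scale_factor(exponent, mass_ratio):
    """Factor by which a variable changes when mass changes by mass_ratio."""
    return mass_ratio ** float(exponent)


if __name__ == "__main__":
    for name, gap in deviations(EXPONENTS).items():
        predicted, observed = EXPONENTS[name]
        print(f"{name}: P={predicted} O={observed} |P-O|={gap:.3f}")
```

On these figures the largest gap is the aorta radius, at about 0.015; the other three agree to the two decimals reported.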

That last sentence, apart from directly illustrating systemic underspecification, resonates directly with Kauffman's contention that some kinds of order arise without regard for how underlying gene networks are put together, which is just as well, considering that we may be dealing with species that have few genes in common, particularly when extending these observations to other kingdoms.

In sum, a kind of biology is beginning to gain momentum; it is focused on systemic properties that arise via principles of reality which are more elementary than adaptations. Needless to say, the emergence of one of these core systems may well have been adaptive to an organism, but crucially not (at least not necessarily) for whatever it is eventually put to use.

I think this is relevant to Chomsky's recent (or old) ideas in two respects. First, it directly shows, at least to my mind, that if Chomsky has gone supernova, he has done so together with a very exciting branch of biology. One may have biases against whatever is biological and non-adaptive or touches on weird physics, but there is no crisis here for standard linguistics as we know and love it, even if one is as skeptical about adaptive explanations of language as Gould, Berlinski, and Fodor all strung together. One doesn't then have to turn to metaphysics, mathematics, functionalism, or deny the facts. Or to put it bluntly: Chomsky isn't doing now what he hasn't done before.

Second, if properly understood, the fact that fractals appear so central to organic nature may give us another argument for the autonomy of syntax. At first, this doesn't seem so. After all, am I not saying that the language faculty is, in the relevant respects (those Chomsky calls "basic"), like the scaling system? It is, I think, in those basic properties (of discrete infinitude via recursivity, underspecified or gene-independent plasticity, and structural economy with no direct functional correlate); but it's plainly the case that language has properties that, for example, Fibonacci patterns do not. For instance, some phyllotactic patterns, underspecified for handedness, branch rightwards or leftwards depending on whether the previous branch was in some definable sense heavy or light (branching opposite to a heavy branch's direction; see Jean 1994: chapter 3); phrasal linguistic structures are also underspecified for handedness, and if Kayne is right in his (1994) proposal, they branch in the direction that codes command, or the history of their merging process, in Epstein's (1995) interpretation. As I now proceed to show, the effects of this minor difference are drastic.

5. From a minor change to a major consequence

Imagine the last sentence in the previous paragraph had branched according to the Fibonacci display in vegetable trees, instead of Kayne's. That is, rather than (1), we would have (2), assuming the "heavy" branch is the one with more letter symbols and that the first heavy branch goes right:

This is an adequate way to linearize plants (indeed a more "balanced" way than the one seen in (1): here the right branches have a total of 22 symbols, for the 23 of the left branches, whereas in (1) the right branches summed 17 symbols, against the 28 of the left branches). But (2) creates a hopeless instability for linguistic objects. Thus, imagine substituting "drastic and utterly incomprehensible" for "drastic" in these structures. Now that material is heavier than the rest (including 26 new symbols), and hence would seek its place in the linearized structure to the right of "the effects of this very minor difference".

In other words, depending on how large a predicate is, it may be pronounced before or after the subject. As a system of communication of the sort we have, a general procedure of that sort would be insane. Then again, Kayne's linearization applied to natural trees would yield heavily inclined trunks, which is probably also insane for adequate photosynthesis, the chlorophyllic function, pollination, and what not. Differently put, nobody can seriously deny the role of use and other factors (e.g., learnability in the case of language) in certain structural decisions, and nobody does, to my knowledge. But to borrow Kauffman's expression, here is where selection goes on to mold spontaneous order.

I should clarify that. I'm not saying that communicative reasons directly yield Kayne's procedure, as opposed to the plant one. It's hard to imagine how at the stage where evolving hominids went from not having the linearization procedure to finding it (assuming that is actually what happened) anything other than the crudest proto-language could have been in place. If the picture Chomsky paints in his (1995) book is remotely close to accurate, even word-formation à la Hale and Keyser (1993), or any such variant, is dependent on a transformational process that has to involve Kayne's linearization procedure. This is easy to show.

A transformation involves a phrase-marker K and a target symbol T from inside K that is added to the root of K. As a consequence, a dependency or chain is formed between two pairs: the moved T and its (after movement) immediate syntactic context K, and T's copy, call it (T), and its immediate syntactic context X: {{T, K}, {(T), X}} (see Chomsky (1995: 252)). The minute movement takes place, you involve at least four symbols, all appropriately arrayed into some phrase-marker. Remember, the phrase-marker per se codes no linear order, but mere hierarchical arrangements that the linearization procedure lays out in an appropriate phonetic row. Now, if you just have one symbol, you only have one linearization possible; with two symbols, you obviously have two, in principle; with three symbols, six possible linearizations ensue; more generally, with n symbols you have n! linearizations possible (these are the possible permutations of those symbols). For example, the sentence "Tarzan loves Jane" can be linearized in six ways, without counting any possible movements. If you add those (at least a couple of A-movements, a couple of head movements, and perhaps others), the linearizations jump to over five thousand; even if you took a mere millisecond to consider each of those orderings, it would take you five seconds to parse the sentence unequivocally. As far as anybody knows, that's just unworkable as a communication system; it is as unstable as a Calder mobile on a windy day, with the complicating factor that we actually see a mobile, but we only hear words one at a time...
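The arithmetic in this paragraph is easy to check. The Python sketch below enumerates the six orderings of the three-word sentence and counts the orderings for a larger set of symbols. The function names are hypothetical, and the figure of seven symbols is an assumption: it supposes that each of four movements (two A-movements and two head movements) adds one further symbol to be ordered, which is one way to arrive at "over five thousand".

```python
from itertools import permutations
from math import factorial


def linearizations(symbols):
    """Enumerate every linear order of an unordered collection of symbols."""
    return [" ".join(order) for order in permutations(symbols)]


def count_linearizations(n):
    """Number of possible linear orders of n distinct symbols, i.e. n!."""
    return factorial(n)


def seconds_to_scan(n, ms_per_order=1.0):
    """Time in seconds to consider every order at ms_per_order milliseconds each."""
    return count_linearizations(n) * ms_per_order / 1000.0


if __name__ == "__main__":
    for order in linearizations(["Tarzan", "loves", "Jane"]):
        print(order)
    # Assumption: three words plus four moved copies give seven symbols.
    print(count_linearizations(7), "orders,", seconds_to_scan(7), "seconds")
```

Under that assumption the count is 7! = 5040, and at one millisecond per ordering the scan takes just over five seconds, in line with the estimate in the text.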

Plainly put, no linearization equals no overt movement (hence nothing remotely close to human language), and furthermore not even (at any rate, appropriately complex) words (formed by movement). Maybe that lets you get by with Me-Tarzan stuff, but you can hardly speak of any transitive actions, for instance, which presuppose movement in Chomsky's system. Nonetheless, Me-Tarzan stuff certainly worked much better in Saturday afternoon classics than Cheetah's 'chimpanzeese', and it's not trivial to go from Tarzan to us right here just by the alleged selective pressure that lack of communication with Jane-like figures would arguably impose. It is more likely, it seems to me, that Tarzan's child just stumbled onto something like Kayne's linearization, the way one probably stumbled onto the linearizing device that translates hierarchical musical structures to a whistling tune. Once an accident like that took place within the evolving human brain, the benefits of linearization could all be harvested, perhaps from whistling to word-formation. God only knows.

Admittedly, that was a just-so story of my own, but Chomsky's elegant system invites just this sort of speculation, for anyone who cares to look at the details. The Minimalist syntax is so subtle and far reaching that a minor change in one of its components can carry you from something like language to something like a plant. This might seem like autonomous syntax by a hair's breadth, but isn't everything concerning form out there, and by even smaller margins? The difference between a "sparsely connected" network and one less or more connected appears to be one, so Kauffman shows in 1995, between nothing at all, utter chaos, and complex order; curiously, binarity seems to do the trick among possible relations: less leads to nothing, more to chaos; two does it. Of course, we don't need to go into the arcane issues I'm talking about here to illustrate this point about subtlety in this universe. Change the one in one trillion imbalance between protons and antiprotons and the known universe vanishes, matter annihilating antimatter (thus no familiar forms: stars, atoms, we or anything).

Why is this relevant to the general point of this exercise? I cannot put it better than Fodor (1998: 12), in his recent critical review of Pinker's new book (1998):

[W]hat matters with regard to . . . whether the mind is an adaptation is not how complex our behaviour is, but how much change you would have to make in an ape's brain to produce the cognitive structure of a human mind. And about this, exactly nothing is known. That's because nothing is known about the way the structure of our minds depends on the structure of our brains. . . Unlike our minds, our brains are, by any gross measure, very like those of apes. So it looks as though relatively small alterations of brain structure must have produced very large behavioural discontinuities in the transition from the ancestral apes to us. If that's right, then you don't have to assume that cognitive complexity is shaped by the gradual action of Darwinian selection on pre-human behavioural phenotypes. . . [M]ake an ape's brain just a little bigger (or denser, or more folded, or, who knows, greyer) and it's anybody's guess what happens to the creature's behavioural repertoire.

The subtly dynamic system that the Minimalist Program implies illustrates Fodor's point from a different angle. When all is said and done (?!) about the ultimate physical support of the syntax of natural language, you may well find something as deeply surprising as the honey-bees story I reported above.

6. Autonomous syntax redux

Even if autonomous syntax comes from a remote corner of structuring, albeit one whose consequence is the possibility of forming words (hence a lexicon, hence anything socially useful about linguistic structuring), we should really welcome it. This is not just a matter of turf. The only reasonable alternative is functionalist, and it reduces to some variant of Montague's skepticism noted before. Personally, I'm willing to wait and see what Montague grammarians have to say about why there should be local movement, expletive replacement, agreement, and all the rest. Unfortunately, the answer so far has been nothing much.

In Chomsky's (1995) view, particularly in chapter 4, the stuff that transformations manipulate is features, understood as properties of lexical units (the equivalent of charge or spin in a sub-atomic particle). The movement of T to K that I sketched above is broken down into smaller parts, with a feature F of T being the trigger. It's as if K tries to attract F (Chomsky's actual term) and the item that contains F is forced to move as a result, much as a box full of nails moves when the nails are pulled by a magnet. From this perspective, we expect that a feature F' which is closer to K than F should interfere with K's relation to F, just as a magnet cannot "ignore" a paper clip, say, to attract a nail that is further away. This "dumbness" of the linguistic system is not even surprising if matters are, in some appropriate sense, the way I have been presenting them.

In an interesting paper, Fukui (1996) extends these technical points to an equally technical point about physics. He emphasizes an analogy between Chomsky's economy of derivations and Maupertuis's principle of Least Action. One of Chomsky's main ideas is that alternative derivations (in some precise sense that I describe immediately) compete in grammaticality. That recalls various scenarios in mechanics and optics where, of several alternative paths that an object or a beam of light may follow, only the optimal one is chosen.

It is curious to note, as Schoemaker (1991) observes regarding these matters, that optimality principles in physics raised, virtually from the time they were proposed, the same sorts of questions that Chomsky's idea has, in the recent critical literature. Perhaps nobody expresses this so well as Feynman in his lectures, which Schoemaker appropriately cites (p. 209):

The principle of least time is a completely different philosophical principle about the way nature works. Instead of saying it is a causal thing, . . . it says this: we set up the situation, and light decides which is the shortest time, or the extreme one, and chooses the path. But what does it do, how does it find out? Does it smell the nearby paths, and check them against each other? The answer is, yes.

Needless to say, Feynman's little joke at the end comes from the fact that he has a quantum-mechanical explanation.

I only wish I had such a quantum-mechanical explanation about Chomsky's optimal derivation smelling the nearby alternatives; unfortunately I don't. I wouldn't be surprised, however, if there is one, even outside the reach of present-day science.

In fact, Chomsky's derivations are behaving somewhat like beams of light in an even more obvious way. In his treatment of the impossible *there seems someone to be in the room, as opposed to there seems to be someone in the room (see his (1995) pp. 344 and ff. and 366 and ff.), Chomsky wants the derivation leading to (3b') below to outrank the derivation leading to (3a'). But how can that be, if the sentences involve exactly the same words and exactly the same numbers of mergers and movements?

(3) [to [be [someone [in [the room]]]]] {there, seems}
    a. [someone [to [be [(someone) [in [the room]]]]]] {there, seems}
    b. [there [to [be [someone [in [the room]]]]]] {seems}
    a'. [there [seems [someone [to [be [(someone) [in [the room]]]]]]]]
    b'. [there [seems [(there) [to [be [someone [in [the room]]]]]]]]

Topmost is the chunk of structure that both derivations share, with the remaining words to be used in a lexical array (which Chomsky calls a numeration). In (3a) we see how someone moves, leaving a parenthesized copy or trace, while in (3b) there is inserted instead. Assuming (non-trivially) that movement is more expensive than merging, then it is clear that (3a) is outranked by (3b). But now consider (3a'), the continuation of (3a); here, there is merged, while in (3b'), the continuation of (3b), there moves leaving a trace behind. So now it seems that, after all, both derivations are equally costly: one takes an extra step early on; the other takes it later, but both take the extra step...

It doesn't matter. Chomsky invites us to think of derivations as unfolding in successive cascades of structural dependency, narrowing down the "derivational horizon", as it were, as further decisions are made. Intuitively, the horizon is completely open when no words are arranged into a phrase-marker, and it shrinks down as some words are attached (e.g. as in the top-most structure in (3)). Only derivations with the same derivational horizon compete, like (3a) and (3b). By the time we're asking (3a') and (3b') to compete they are already part of two entirely different derivational histories, like those science fiction characters that get killed in a parallel universe but still make it in this one.
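The contrast between counting costs globally and comparing them only within a shared derivational horizon can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the two derivations are reduced to their two distinguishing steps, merging is given cost 0 and movement cost 1, and the names `COST`, `global_cost`, and `survivors_by_local_economy` are hypothetical.

```python
# Assumed step costs: movement is more expensive than merging.
COST = {"merge": 0, "move": 1}

# The two steps that follow the shared structure at the top of (3).
derivations = {
    "a": ["move", "merge"],   # (3a): 'someone' moves; (3a'): 'there' merges
    "b": ["merge", "move"],   # (3b): 'there' merges; (3b'): 'there' moves
}


def global_cost(steps):
    """Total cost of a derivation, summed over all of its steps."""
    return sum(COST[s] for s in steps)


def survivors_by_local_economy(candidates):
    """Step-by-step competition: at step i, only derivations that share
    the same history up to i compete, and the cheapest next step wins."""
    alive = dict(candidates)
    n_steps = max(len(steps) for steps in alive.values())
    for i in range(n_steps):
        # Group the surviving derivations by their shared history so far.
        groups = {}
        for name, steps in alive.items():
            groups.setdefault(tuple(steps[:i]), []).append(name)
        for names in groups.values():
            active = [n for n in names if len(alive[n]) > i]
            if not active:
                continue
            best = min(COST[alive[n][i]] for n in active)
            for n in active:
                if COST[alive[n][i]] > best:
                    del alive[n]  # outranked at this horizon
    return sorted(alive)


print({name: global_cost(s) for name, s in derivations.items()})
print(survivors_by_local_economy(derivations))
```

On a global count the two derivations tie, each paying for one movement. On the local count the derivation through (3a) is discarded at the first step, so (3a') never gets to compete with (3b').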

To see that light behaves in similar ways, consider an illustration of Feynman's that is meant to show why often the path of least action is not the shortest. You're lying on the beach and suddenly somebody starts drowning two hundred yards to your left. What do you do, run in a straight line? Not so, because swimming is harder than running; you run along the shore until a critical point, and then you swim. Light acts somewhat similarly when going from air to water, maximizing the "easy" path vis-a-vis the "difficult" one (across a denser material), even if the combined path is not the shortest. But now imagine a more complicated scenario. You're still on the beach, but suddenly you see somebody trapped inside a building on fire; you could run directly to the building, or actually take a small detour to the water and then go inside the building. Here, obviously, you first get wet, and then run to save the person on fire. Light doesn't have such a "look ahead". You can construct scenarios where it would have to traverse three media, say air, oil, and water, in such a way that you could optimize the total path by doing this, that, or the other. But what light does instead is optimize the transition from air to oil, and as its traveling horizon narrows (that is, whatever the result of that first transition is), a new optimization takes place for the transition from oil to water. The trick is as dumb as the one played by the syntactic derivation because neither is really smelling anything; they are just a bunch of photons or words going about their business, bumping against other stuff.

I make much of this syntactic dumbness (as opposed to the interpretive smartness of semantics, say), as an argument not just for the autonomy of syntax, but in fact for its primacy as well. Many, if not all of Chomsky's (1995) core principles could be seen in this light. His Inclusiveness Condition, that "any structure formed by the computation. . . is constituted of elements already present in the lexical items selected for [the numeration]; no new objects are added in the course of computation" (p. 228), coupled with the Recoverability Condition ensuring "that no information be lost by [an] operation" (p. 44), immediately recalls Conservation Laws in physics and chemistry (except those deal with quantities, and syntax deals with qualities). His Last Resort condition, "that computational operations must be driven by some condition on representations, as a 'last resort' to overcome failure to meet such a condition" (p. 28), resembles Haken's Slaving Principle in synergetics: stable modes of the old states of a system are dominated by unstable modes (see Mainzer (1994)). The Condition on Chain Uniformity, that "the chain C is uniform with respect to P. . . if each αi [a link of C] has property P" (p. 91) is best treated as a condition on the stability of an object constructed by the derivation, thus relating to the stability of wave functions in quantum mechanics; collapsing (and hence interpreting) a chain, in the sense of Martin's (1996) developments of Chomsky's ideas, could then be akin to collapsing a quantum wave: to my mind a fascinating prospect, suggesting that interpretation amounts to observation of a quantum state.

Is all of this metaphoric reminiscence or daydreaming, or folly? Perhaps. Then again, the alternatives (denying the facts, blaming adaptations, going metaphysical) don't seem all that promising. What does it all mean, though? Well, I don't know; how could anyone? I do know, however, what it does not mean. In this respect, I think it is rather interesting that, if anything makes the Minimalist Program different from the Principles and Parameters model, which it springs from, it is the new reliance on economy, rather than the modules of the predecessor. Where one found Theta, Case, Binding, and similar modules, one now seeks just economy in different guises (or pushes the phenomenon out of narrow syntax). This is rather crucial.

Again, Fodor puts it well in his 1998 piece:

A module is a more or less autonomous, special-purpose, computational system. It's built to solve a very restricted class of problems and the information it can use to solve them is proprietary. . . If the mind is massively modular, then maybe the notion of computation that Turing gave us is, after all, the only one that cognitive science needs. It would be nice to believe that. . . But, really, one can't. For, eventually, the mind has to integrate the results of all those modular computations and I don't see how there could be a module for doing that.

It is not surprising that Fodor was never too happy with the modular property of the Principles and Parameters system, since it basically postulated modules (Theta, Case, Binding modules) within a module, the Language Faculty. The modules ruled some particular local interactions, and such notions as government were thought to determine non-local, interactive relations among modules, a sort of "central system" in Fodor's terminology. This is, clearly, not the sort of architecture that Fodor initially (and plausibly) sought. Quite simply, if you allow modules within modules, you may then have modules within modules within modules (e.g. Conditions A and B and C, which were also modularly defined), and it's then modules all the way down. Which is a form of connectionism. The new architecture is much more in consonance with Fodor's view: there is a syntax, its own module, which interfaces with other modules, whatever those are. Period. Now, this has a consequence.

The modular view of mind lends itself nicely to the adaptationist view of linguistic evolution (putting aside the problem of our ignorance about the brain support). The more modules you have that connect in reasonable ways, the more you expect the connection to be adaptive. Even Fodor would accept that, I think, so long as the module itself is left untouched. Note, in fact, that his argument immediately above takes no issue with the Turing interpretation of the module; what he finds implausible is a Turing interpretation of the central system. By parity of reasoning, the Principles and Parameters model could have been interpreted (not necessarily, but somewhat plausibly) in similar evolutionary terms: you have Theta and Case modules, for instance, that evolved for whatever reason (even a crazy reason), but the way they got connected, through government, let's say, is not implausibly adaptive. No Case/Theta connection, no visibility of arguments, no interpretable structures. But all of that is now gone, and with it goes another possible adaptationist argument for language in its glorious complexity. You're left with structural economy of the sort we've seen, and good luck connecting that to any direct function of the usual sort.

7. By way of a conclusion

In his Fall 1997 class lectures, and again in unpublished work, Chomsky has pushed some of the ideas discussed above even further, going in what I like to think of as a more "dynamically derivational" direction, particularly when seriously exploring the possibility of multiple applications of Spell-Out, or various consequences of accessing the initial numeration cyclically. All this talk of dynamic systems, of course, is very much intended in the sense of Goodwin's "dynamics of emergent processes", mentioned above. As far as I'm concerned, the more research goes in this direction (and there is a long way to go), the closer we are to speaking in terms that complexity theorists can relate to, thus moving the syntax project in a new direction.

There, I should say, lie two presently serious problems. One is that (although there is no "complexity theory") many of the "complexity" pioneers come from very different assumptions from the ones linguists usually make, and in particular from the connectionist arena that is alien to Chomskyan concerns, particularly if interpreted in the modular ways that Fodor has naturally advocated. A second problem is that, up to now at least, these people are usually profoundly ignorant of linguistic facts, and even when the best among them try in good faith to discuss language, the result is often gibberish (see e.g. the deep misunderstandings of the otherwise intriguing book by Cohen and Stewart (1994), particularly around p. 247). I don't think either of these is a fundamental problem, but they should be kept in mind.

At any rate, mine has been a mildly ontological take on the Minimalist Program. I say "mild" because I'll be the last one to want to fall into "hard" ontological commitments; my argument hasn't been that at all. Rather, the issue is simple: stuff out there, in the natural world of physics, chemistry, or if one looks, organisms, has the core properties that Chomsky thinks language exhibits. Whatever that "optimal" form is, it is far away from a simple consequence of some (unclear) function.

  • BERLINSKI, D. (1986) Black Mischief. The Mechanics of Modern Science New York: William Morrow and Co.
  • CHOMSKY, N. (1981) Lectures on Government and Binding Dordrecht: Foris.
  • ______ (1986) Knowledge of Language New York: Praeger.
  • ______ (1993) Language and thought Wakefield: Moyer Bell.
  • ______ (1994) Language and nature. Mind 104: 1-61.
  • ______ (1995) The Minimalist Program Cambridge: MIT Press.
  • COHEN, J. & I. STEWART (1994) The collapse of chaos London: Penguin.
  • DAWKINS, R. (1987) The blind watchmaker Harlow: Longmans.
  • EPSTEIN, S. D. (1999) Un-principled syntax: the derivation of syntactic relations. In: Working minimalism, (ed.) Samuel D. Epstein and Norbert Hornstein, 317-345. MIT Press, Cambridge, Mass.
  • FODOR, J. (1998) The trouble with psychological darwinism. London Review of Books, January 22: 11-13.
  • FUKUI, N. (1996) On the Nature of Economy in Language. Cognitive Studies 3: 51-71.
  • GIVNISH, T. J. (1994) The golden bough. A review of Jean 1994. Science 266: 1590-1591.
  • GOODWIN, B. (1994) How the leopard changed its spots London: Weidenfeld & Nicolson.
  • GOULD, S. J. (1991) Exaptation: A crucial tool for evolutionary psychology. Journal of Social Issues 47: 43-65.
  • HALE, K. & S. J. KEYSER. (1993) On argument structure and the lexical expression of syntactic relations. In: Hale and Keyser (eds.) The View from Building Twenty Cambridge: MIT Press.
  • JEAN, R. V. (1994) Phyllotaxis: A systematic study in plant morphogenesis Cambridge: Cambridge University Press.
  • KAUFFMAN, S. (1995) At home in the universe: The search for laws of self-organization and complexity New York: Oxford University Press.
  • KAYNE, R. (1994) The antisymmetry of syntax Cambridge, Mass.: MIT Press.
  • MAINZER, K. (1994) Thinking in complexity Berlin: Springer Verlag.
  • MARTIN, R. (1996) A Minimalist Theory of PRO and control. Doctoral dissertation, University of Connecticut, Storrs.
  • MONTAGUE, R. (1974) Formal Philosophy New Haven: Yale University Press.
  • PARTEE, B. (1996) The Development of Formal Semantics in Linguistic Theory. In: S. Lappin (ed.) Contemporary Semantic Theory Oxford: Blackwell.
  • PENROSE, R. (1994) Shadows of the Mind Oxford: Oxford University Press.
  • PINKER, S. (1994) The language instinct New York: Morrow.
  • ______ (1998) How The Mind Works New York: W.W. Norton Co.
  • SCHOEMAKER, P. (1991) The Quest for Optimality: A Positive Heuristic of Science? Behavioral and Brain Sciences 14: 205-245.
  • THOMPSON, D. W. (1917) On growth and form Cambridge: Cambridge University Press.
  • URIAGEREKA, J. (1998) Rhyme and reason: an introduction to minimalist syntax MIT Press, Cambridge, Mass.
  • WEST, G., J. BROWN & B. ENQUIST (1997) A general model for the origin of allometric scaling laws in biology. Science 280: 122-125.


Publication Dates

  • Publication in this collection
    11 Dec 2001
  • Date of issue
    2000
Pontifícia Universidade Católica de São Paulo - PUC-SP PUC-SP - LAEL, Rua Monte Alegre 984, 4B-02, São Paulo, SP 05014-001, Brasil, Tel.: +55 11 3670-8374 - São Paulo - SP - Brazil
E-mail: delta@pucsp.br