ARTICLES

Is Greenberg's "Macro-Carib" viable?

É viável a hipótese "Macro-Carib" de Greenberg?

Spike GildeaI; Doris PayneII

IUniversity of Oregon. Department of Linguistics. Eugene, Oregon, USA (spike@uoregon.edu)

IIUniversity of Oregon. Department of Linguistics. Eugene, Oregon, USA and SIL International (dlpayne@uoregon.edu)

ABSTRACT

In his landmark work Language in the Americas, Greenberg (1987) proposed that Macro-Carib was one of the major low-level stocks of South America, which together with Macro-Panoan and Macro-Ge-Bororo were claimed to comprise the putative Ge-Pano-Carib Phylum. His Macro-Carib includes the isolates Andoke and Kukura, and the Witotoan, Peba-Yaguan, and Cariban families. Greenberg's primary evidence came from person-marking paradigms in individual languages, plus scattered words from individual languages collected into 79 Macro-Carib 'etymologies' and another 64 Amerind 'etymologies'. The goal of this paper is to re-evaluate Greenberg's Macro-Carib claim in the light of the much more extensive and reliable language data that has become available largely since 1987. Based on full person-marking paradigms for Proto-Cariban, Yagua, Bora and Andoke, we conclude that Greenberg's morphological claims are unfounded. For our lexical comparison, we created lexical lists for Proto-Cariban, Proto-Witotoan, Yagua and Andoke, for both Greenberg's 143 putative etymologies and for the Swadesh 100 list. From both lists, a total of 23 potential cognates were found, but no consonantal correspondences were repeated even once. We conclude that our greatly expanded and improved database does not provide sufficient evidence to convince the skeptic that the Macro-Carib hypothesis is viable.

Keywords: Greenberg. Long-range linguistics relationship. Cariban. Bora-Witotoan. Andoke. Peba-Yaguan.

RESUMO

No seu trabalho referencial Language in the Americas, Greenberg (1987) propôs que Macro-Carib seria um dos troncos lingüísticos importantes da América do Sul, o qual combinaria com os troncos Macro-Pano e Macro-Ge-Bororo para formar o suposto filo Ge-Pano-Carib. Seu Macro-Carib inclui as línguas isoladas Andoke e Kukura e as famílias Witoto, Peba-Yagua e Caribe. A principal evidência de Greenberg veio de paradigmas de marcação de pessoa em línguas individuais e também de palavras de línguas individuais coletadas em 79 'etimologias' de Macro-Carib, e mais 64 'etimologias' de Ameríndio. A meta deste artigo é reavaliar a proposta Macro-Carib de Greenberg à luz dos dados lingüísticos mais extensivos e confiáveis que se tornaram disponíveis em grande parte a partir de 1987. Baseados em paradigmas pessoais completos de Proto-Caribe, Yagua, Bora e Andoke, concluímos que as afirmações de Greenberg sobre morfologia não podem ser sustentadas. Para nossa comparação lexical, criamos listas lexicais para Proto-Caribe, Proto-Witoto, Yagua e Andoke, tanto para as 143 etimologias criadas por Greenberg quanto para a lista de 100 palavras de Swadesh. No total, encontramos 23 cognatos potenciais das duas listas. Contudo, nenhuma correspondência consonantal foi repetida. Concluímos que nosso banco de dados expandido e aperfeiçoado não fornece evidência suficiente para convencer o cético de que a hipótese Macro-Carib seja viável.

Palavras-chave: Greenberg. Parentesco lingüístico remoto. Família Karib. Bora-Witotoan. Andoke. Peba-Yagua.

INTRODUCTION

In his landmark work Language in the Americas, Greenberg (1987) proposed that Macro-Carib was one of the major low-level stocks of South America, which together with Macro-Panoan and Macro-Ge-Bororo were claimed to comprise the putative Ge-Pano-Carib Phylum.[1] His Macro-Carib includes the isolates Andoke and Kukura, and the Witotoan, Peba-Yaguan, and Cariban families. The conclusions in Greenberg (1987) have been roundly criticized, by some for perceived methodological shortcomings (Campbell, 1988; Goddard, 1990; Matisoff, 1990; Rankin, 1992; Dixon, 1997), by others for the poor quality of the data on which he based his claims (to Campbell, Goddard and Rankin, add Chafe, 1987; Adelaar, 1989; Berman, 1992; Kimball, 1992). Greenberg's defenses have been largely methodological (Greenberg, 1989) and theoretical (Greenberg, 2000).

[1] While it is not the purpose of this paper to evaluate more than the Macro-Carib hypothesis, we wish to clarify that we do not subscribe to Greenberg's 'Ge-Pano-Carib' hypothesis. To date, no solid research suggests that Panoan (or the Tacanan family, which does reconstruct with Panoan into Proto-Pano-Tacanan) has an affiliation with Ge, Cariban, Peba-Yaguan, Andoke, or Witotoan languages. Indeed, the hypothesis that Cariban, Tupian, and Jê languages might form a phylum is more promising (Rodrigues, 1985; Salzano et al., 2005).

The goal of this paper is to re-evaluate Greenberg's Macro-Carib claim in the light of fresh data that has become available largely since 1987. We take seriously Rankin's (1992, p. 329-30) suggestion that the search for putative grammatical and lexical correspondences should be conducted by individuals with specialized knowledge in at least one of the languages (or families) in question. Payne has substantial field experience with both Yagua (Peba-Yaguan) and Panare (Cariban), as well as some familiarity with the recent Witotoan primary literature; Gildea is a specialist in the Cariban language family, having conducted field work with 15 Cariban languages and having published comparative work since 1993. We agree completely with Greenberg (1987, p. X): "All sources are imperfect. It is simply that some are more imperfect than others." In this paper, we offer sources that are orders of magnitude less imperfect than those that were available to Greenberg when he formed his hypotheses. In the next section, we turn to a brief review of the materials that allow us to claim this empirical advantage.

Our database

In re-evaluating his hypothesis, we review not only the forms Greenberg presents in his 1987 work, but also the sources indicated below. In two instances we rely on data from single languages (Andoke, Yagua), but for terminological ease we will frequently refer to four families (Andoke, Peba-Yaguan, Witotoan, Cariban). We do not consider Kukura at all, thanks to Nimuendajú's (1932) revelation that this is not a real language. Nimuendajú found no indication of the (current or prior) existence of such a group in his trips to the area, and argues that the explorer who collected the original 'Kukura' list was deceived: the word list actually contained 15 items of badly pronounced ('ausgesprochenem') Guarani and one item that was the Ofayé word for 'black', and the rest were made up ('phantasierte').

Witotoan: There are six known Witotoan languages, located primarily in southern Colombia and northern Peru between the Putumayo and Caquetá Rivers, though some small family groups have apparently recently migrated into Brazil. Aschmann (1993) reconstructs Proto-Witotoan (PW), observing that the two sub-branches Bora-Muinane (BM) and Huitoto-Ocaina (HO) each reconstruct conclusively, and that while less robust, Proto-Witotoan reconstructs "to the author's satisfaction in spite of the lower demonstrated cognate count" (Aschmann, 1993, p. 2). We have also consulted Thiesen and Weber (to appear), Thiesen and Thiesen (1998), Burtch (1983), and Leach (1969).

Andoke: Andoke is located on the Aduche River, a tributary of the Caquetá in Colombia. In 1908 there were some 10,000 Andoke (Landaburu, 1979), but the group currently numbers around 600 with only 50 monolinguals. Despite some earlier claims that Andoke was Witotoan, Aschmann (1993) considers it an isolate. Data for this study come from Landaburu (1979, 1992), Landaburu and Pineda (1984), and personal communication with Jon Landaburu.

Peba-Yaguan: Yagua is the only Peba-Yaguan language which has received substantive documentation. Today there are some 3,000 to 4,000 Yagua speakers scattered in small settlements mostly along affluents of the lower Napo, upper Amazon, and Yavarí Rivers. Rivet (1911a, b) found short word lists for Peba (now extinct) and Yagua; and for Yameo (also extinct) some short religious texts dating from the 1600s. From these materials he compiled a Peba-Yagua word list and some grammatical morphemes clearly linking the three languages, but the data are highly incomplete, and a serious attempt at reconstructing Proto-Peba-Yaguan has not yet been undertaken.[2] For this study we rely on Yagua data from Powlison (1995), Payne and Payne (1990), and Thomas Payne and Doris Payne's field notes.

[2] Peba and Yameo materials are extremely limited, impeding serious comparative work.

Cariban: Geographically, Cariban languages are found mostly north of the Amazon River, from just west of Maracaibo to the eastern Guianas. Additionally, a few Cariban groups are found well south of the Amazon in the Xingu region, and one cluster of dialects was once spoken in southeastern Colombia on the Upper Vaupés, Yarí, and lower Caquetá Rivers; moribund Karihona is still found in this region. However, migration of Cariban groups to southeastern Colombia must have occurred within the last 500 years, as Karihona appears to be very closely related to Tiriyo (Meira, 2000a, p. 20). Data on Cariban are far more extensive than on the other language families: in addition to more articles and grammars than we can list here, we have the benefit of a rich history of comparative work, especially Girard (1971); Mattéi Muller and Henley (1990); Mattéi Muller (1994, 2002); Gildea (1998, 2003); Meira (2000a-b, 2002); Meira and Franchetto (2005); Meira, Gildea and Hoff (to appear). Additionally, we were able to utilize an exciting new tool, Sérgio Meira's computerized Cariban Toolbox databases (Meira, 2006), which enabled us to generate a number of new Proto-Cariban reconstructions to use as comparanda.

In contrast to such sources, Greenberg appears to have relied primarily on older, less-reliable wordlists that were the basis for the deeply flawed Cariban reconstructions in de Goeje (1946)[3] and Durbin and Seijas (1973),[4] or fragmentary materials such as those gathered by Rivet (1911a, b).

[3] Cf. Girard (1971) for a scathing review of all earlier comparative work on Cariban, and of many of the older sources; we endorse the intellectual content of these criticisms.

[4] The body of Durbin's comparative Cariban work is hopelessly flawed, cf. Gildea (1998, p. 7), Meira (2000a, p. 10), and the sources cited therein. Durbin and Seijas (1973) ostensibly reconstruct phonology and some lexical items to "Proto-Hianocoto", but as Meira (2000a, p. 15) points out, the various "language" names that are compared are attested as Karihona clan names. The apparent divergence of these "languages" indicates the weakness of the transcriptions by the sources, and the 1973 "reconstructions" should be set aside in favor of the more reliable synthesis of modern and older Karihona data offered in Meira (2000a).

The phonological systems

Tables 1 and 2 present the distinctive consonant and vowel segments in the four groups. Cariban and Witotoan segments are based on reconstructed proto-languages. Peba-Yaguan is based on a single modern representative of the family, Yagua, and the isolate Andoke is also based on modern data.

On first inspection, the systems are not obviously similar in degree of elaboration. Consonants vary from 8 distinctive consonants in Proto-Cariban (PC) to 16 in Proto-Witotoan (PW). Vowels and syllable features are dramatically more differentiated, with only 7 vowels in PC, 5 in PW but with distinctive tone and nasality, 9 oral and 5 nasal vowels in Andoke but with distinctive tone and glottalization, and 6 in Yagua but with distinctive tone, nasality, and length. Note also the parallel allophonic alternation of nasals with voiced or orally-released nasal stops in Andoke and Yagua (both due to denasalization before oral vowels), rampant palatalization in Yagua and Bora-Witotoan (and in individual Cariban languages, undone in the reconstructions we consider here), and the possible areal features in Witotoan, Yagua and Andoke of both tone and nasalization in vowels.[5]

[5] Dixon (1997, p. 16) reports that tones are among the easiest phenomena to diffuse through contact; Dixon and Aikhenvald (1999, p. 8) consider the presence of / / and nasality in vowels to be phonological features of an Amazonian linguistic area. See Payne (2001) for comments on the preliminary nature of these claims.

Until such time as regular correspondences can be established, the recognition of superficial resemblances (potential cognates) will have to allow similar segments rather than requiring segmental identity (but see Rankin (1992) for an especially cogent discussion of the risks of this procedure). If in fact these four groups all descend from a single common ancestor, we will need to understand the specific mechanisms by which so many segmental and suprasegmental distinctions have been created or lost. Along with an understanding of these mechanisms will come the ability to better recognize cognates whose similarity has potentially been obscured by the many sound changes.
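The distinction between a superficial resemblance and a recurring correspondence can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the coarse consonant classes, the function names, and above all the toy word pairs, which are invented placeholders and not forms from any of the languages discussed here. It is not the procedure used in this paper, only one way the idea might be operationalized.

```python
from collections import Counter

# Hypothetical, deliberately coarse consonant classes (an assumption for
# illustration only; not the criteria applied in this paper).
SOUND_CLASSES = {
    "p": "P", "b": "P", "m": "M",
    "t": "T", "d": "T", "n": "N",
    "k": "K", "g": "K",
    "s": "S", "h": "H", "r": "R", "w": "W", "j": "J",
}


def first_consonant(form):
    """Return the first character of `form` listed in SOUND_CLASSES, or None."""
    for ch in form:
        if ch in SOUND_CLASSES:
            return ch
    return None


def is_lookalike(form_a, form_b):
    """True if the first consonants fall in the same class (similarity, not identity)."""
    a, b = first_consonant(form_a), first_consonant(form_b)
    return a is not None and b is not None and SOUND_CLASSES[a] == SOUND_CLASSES[b]


def recurring_correspondences(pairs):
    """Count first-consonant pairings and keep only those attested more than once."""
    counts = Counter((first_consonant(a), first_consonant(b)) for a, b in pairs)
    return {pairing: n for pairing, n in counts.items() if n > 1}


# Invented placeholder forms, NOT data from Cariban, Witotoan, Yagua or Andoke.
toy_pairs = [("pata", "bada"), ("kuru", "gura"), ("pino", "bimo"), ("tama", "sapa")]

lookalikes = [pair for pair in toy_pairs if is_lookalike(*pair)]
repeated = recurring_correspondences(toy_pairs)
```

With these toy pairs, three of the four count as lookalikes under the similarity criterion, but only the p : b pairing recurs. A set of lookalikes in which no pairing recurs offers no basis for positing regular sound correspondences.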

In the remainder of this paper, we compare modern grammatical and lexical information, based on our new compilations of data, including modern additions to Greenberg's etymologies and the Swadesh 100 basic word list. We hope to reduce the possibility of encountering chance resemblances in the lexicon by eliminating comparison of individual languages in the Cariban and Witotoan families, where we utilize reconstructed forms as the comparanda. We similarly restrict ourselves to reconstructed Proto-Cariban grammar; since there is no comparable reconstruction for Witotoan, we utilize Thiesen and Weber's (to appear) comprehensive grammar of modern Bora. At each stage, we sift through the possible cognates and correspondences and offer our assessment of the likelihood that any of them provide a good match. We pay special attention to the grammatical correspondence that Greenberg (1987) and Rankin (1992) both consider among the most important for demonstrating the genetic unity of the putative Ge-Pano-Carib grouping - the prefix paradigm of i- '1', a- '2', and i- '3'. Of course, an absence of positive evidence will not 'disprove' Greenberg's hypothesis (cf. Greenberg, 2000), any more than Greenberg's discovery of a few resemblances has 'proven' it. Regardless, we hope that other comparativists will be able to put the data we compile to profitable future use.

We have to confess that despite our continued skepticism of the Macro-Carib hypothesis, and our tepid view of the strength of any cognates or series of correspondences surfacing in our new data, the question does not seem entirely closed. That is, there appear to be some (tenuous) hints of some repeated patterns in the data, at least some of which may merit additional investigation.

THE GRAMMATICAL EVIDENCE

Most studies that discuss the kind of evidence necessary to demonstrate genetic relationship focus on the importance of idiosyncratic grammatical behavior of the sort that is unlikely to be borrowed, and that hence can be taken as evidence for shared retention from a shared proto-language. The importance of such evidence is one point on which Greenberg and his detractors agree, cf. Greenberg (1957, p. 37-8; 1987, p. 30; 1989, p. 108-9); Campbell (1988, 2003, p. 268-70); Rankin (1992); Dixon (1997, p. 22). After a quick initial survey of typological features in the grammar of all four groups, we review the status in Macro-Carib of the idiosyncratic personal prefix sets utilized by Greenberg (1987, p. 45ff) to argue for (among other units) the existence of the Ge-Pano-Carib macro-phylum, and then we examine some other domains in which idiosyncratic grammar might be used to link the four groups together. In all these domains, we find no evidence in favor of the Macro-Carib construct.

All four groups present several features frequent in the Amazonian region: they are polysynthetic, mostly head-marking, and the same pronominal forms that mark the possessor on nouns are also found as one of the bound or cliticized pronominal paradigms marking core arguments of clauses. They also have the order Genitive-Noun, and all have postpositions as well. All four groups use nominalization as a major strategy for creating subordinate clauses, and in at least Bora of the Witotoan family, this appears to be essentially the only strategy (Thiesen and Weber, to appear). Aside from these potential areal tendencies, we see fewer matches: main clause alignment systems are nominative-accusative (Andoke, Witotoan, Yagua) and Inverse (Proto-Cariban), with minor patterns of split intransitivity in Cariban and Yagua (T. Payne, 1987). Though all four groups use nominalization for subordination, they do not show the same patterns of argument marking in nominalized subordinate clauses: Cariban treats the S and O as possessor of the nominalized verb (hence, an absolutive patterning); but Andoke, Bora and Yagua remain thoroughly nominative-accusative in their treatment of arguments within nominalized subordinate clauses; this is because their possessive forms are identical to those of the nominative pronominal forms in main clauses (Tables 3 and 4).

Leaving out the Cariban family, the other three groups show well-developed grammatical noun classification systems, each with a large set of classifying morphemes that almost certainly derive from noun roots; these systems are discussed for Yagua in Payne (1986, 1987), and in Seifart (2005) for Miraña (Witotoan). Payne (1987) also makes a cogent argument for such classifier systems as a feature of a northwest Amazon linguistic area (also Aikhenvald, 2001, p. 168; and Seifart and Payne, 2007). There is a large, productive class of lexical numbers in Yagua and Witotoan, whereas Cariban has only a few.

Person-marking on nouns and verbs

Dixon (1997, p. 22) suggests that complete personal paradigms (free pronominal or affixal) can provide among the "surest indicators of a genetic relationship." We begin with the exposition and consideration of our data, followed by a discussion of Greenberg's claims. The full array of person markers in all four groups is laid out in Tables 3, 4 and 5: Table 3 displays the possessive forms, Table 4 the intransitive subject forms (which are also the transitive subject forms in Yagua, Andoke and Bora), and Table 5 shows the unique person-markers reconstructed for transitive verbs in the Cariban family. We include certain forms in Andoke and Bora that double as classifiers and as (parts of) argument markers.

In considering the forms in Table 3, there are almost no candidates for even superficial resemblance, except that PC *u=j- '1' and Andoke o- '1' both involve back rounded vowels; the PC *k- '1+2' and the Andoke ka- '1+2' both involve a voiceless velar stop; the PC *i- '3', Bora i( )- 'Coref', and Yagua hiy- 'Coref' all involve the high front vowel.

In Tables 3 and 4, the Andoke and Bora third person forms also have classifying functions. In Bora certain classifiers are used on verbs for topical (continuous) third person subjects (Thiesen and Weber, to appear, p. 200): pε '3SgM', -ts;̀ '3SgF', -tshi 'M', -ph 'F', - ̀mε 'AnPl', -tsh 'child', -n 'Inanimate general'. Andoke classificatory argument markers are - 'Inan.nature', ó- 'Inan.goods', o-/ya- 'Masc', ni-/õ- 'Fem'. Interestingly, -m-tshi/-ḿ-ph show up as part of 1DL free pronouns, but also as suffix complexes on verbs for 3DL continuous/topical subjects.

Considered comparatively, the forms in Table 4 are largely the same as in Table 3, except that Bora has a different set of first and second person markers, and both Proto-Cariban and Yagua present extra forms due to the split in intransitive clause marking. Here we see four first-person markers with surface similarities across the groups: PC *u=j- '1SO' and Andoke o- '1S' are joined by PC *w- '1SA' and Bora o- '1S'; PC *k- '1+2SO' and Andoke ka- '1PLS/A' remain as in Table 3.

The additional PC form added in Table 5 does not yield any additional superficial similarities. Superficial inspection of these Tables leads to the clear conclusion that they provide no evidence for any particular relationship amongst these language groups. The specific Ge-Pano-Carib correspondence proposed by Greenberg (1987, p. 45ff, 274, 277, 279) - i- '1', a- '2' and i- '3' (with this latter form showing evidence of a consonant between the prefix and the root) - is simply not attested in our data.

Greenberg's evidence for reconstructing the putative Macro-Carib first-person *i- is limited to the palatalizing effect of the first-person forms ta- (Muinane, BM) and ra- (Yagua - the form is raj-, see Tables 3 and 4), plus the undocumented assertion (p. 274) that "In the Carib languages generally, i- is the first person singular possessive, the subject of intransitive verbs, and the object of verb forms in which a third person acts on the first person." The Yagua form explicitly does end with j- in Payne and Payne's and Powlison's Yagua data, but for Cariban Greenberg has apparently mistaken the *j- 'relational' prefix that follows the *u- first person prefix (and which occurs in a number of other locations, cf. below) as part of the general Cariban first-person morpheme. While Thiesen and Weber (to appear, p. 209) give ta as the 1st person singular possessive pronoun for Bora (BM), their data show that it has a palatalizing effect on at least some subsequent bound morphemes. For the closely-related dialect Miraña, Seifart (2005, p. 52) gives taj as the 1st singular possessive form.[6]

[6] Thiesen and Weber (to appear) claim that Bora ta( ) palatalizes the following consonants only in some morphemes; consonants are also often palatalized after /i/. Thus, it should be noted that Bora 2nd person tiʔ and 3rd person iʔ have palatalizing effects on certain following morphemes; examples (5) and (6).

Greenberg (1987, p. 278) already acknowledges that the second person a- is limited to the Cariban family, and in fact, a reasonably large number of modern Cariban languages do present second person forms in a- (enough that Gildea (1998) actually reconstructs *a- '2' to Proto-Cariban). However, Meira and Franchetto (2005) demonstrate the need to reconstruct a vowel *ô- (most likely representing an unrounded mid vowel, either [ə] or []), which Gildea (1998) did not have available in the phonemic inventory he inherited from Girard (1971). Meira, Gildea and Hoff (to appear) point out that the correspondence pattern for the second-person prefix is more consistent with correspondences adduced by Meira and Franchetto in support of this new vowel, and thus they change the reconstruction to *ô- '2', with a- '2' being a later development in a small subset of the languages.

While Greenberg (1987, p. 46) explicitly asserts the existence of "the i-/a-/i- pattern" in Macro-Carib, he provides evidence of the third-person forms in only one language, Pemón: i-paruči 'his sister', i-t-enna 'his hand'. He goes on to speculate that the origin of the form *it- '3', which precedes vowels, is as a remnant of a much older alternation of third person *i- before consonants and *t- before vowels, with a later reanalysis of the t- as part of the stems, to be followed by analogical extension of the i- to the t- forms. This is certainly a coherent story, and it appears to gather strength when a parallel pattern is asserted for languages in Bolivia, Central America and North America; however, it is not consistent with the comparative evidence internal to the Cariban family. First, an ablaut pattern inherited across the family indicates that the third person form *i- (with no intervening element) was already in place preceding vowels by the time of pre-Proto-Carib (Meira, Gildea and Hoff, to appear). The ablaut pattern generated by *i- '3' on vowel-initial stems occurs in nouns, verbs and postpositions, in a geographically and genetically diverse representation of the family. In contrast, the *it- prefix is found only on vowel-initial possessed nouns, and this only in languages within the immediate environs of Venezuela. Gildea (2003) takes the limited geographical and morphological distribution of this form as evidence for its innovative status, listing this as an innovation common to his proposed Venezuelan Branch of the family.

The reconstruction of the i- '1', a- '2' and i- '3' person paradigm was so important to Greenberg's classification that he presented it in a privileged position in Chapter 2, and reprised it in Chapter 5; and in his otherwise uniformly negative review of Greenberg's grammatical evidence, Rankin (1992, p. 340-1) suggested that this correspondence deserved further investigation. But as we have just seen, it turns out that modern data not only fail to reveal similarities between the four groups of the putative Macro-Carib stock, but they also fail to sustain what once appeared to be among the strongest evidence for the putative Ge-Pano-Carib grouping.

Other idiosyncratic grammatical features

Given the generally accepted importance of idiosyncrasies to demonstrating genetic relationship, we now consider some typologically idiosyncratic grammatical features from Cariban, Peba-Yaguan, and Witotoan that might be useful heuristics for probing distant genetic relationship. As it turns out, none of these features is common to the four groups of the putative Macro-Carib stock, but we include the information in order to facilitate future comparative research for anyone who seeks to demonstrate genetic connections with these languages.

The first is the Cariban 'Relational' prefix, which occurs on vowel-initial roots. It is seen in the Proto-Cariban prefix sets as the *j- following the first and second person possessive, SO and O prefixes. The Relational prefix also occurs between two words in certain tight two-word constituents, marking a vowel-initial second element (which is also the head) in possessor-possessed, object-verb, and object-postposition sequences (Gildea, 1998, p. 85, 113, for data from many Cariban languages). Etymologically, the first and second-person markers in Cariban come from free pronouns,[7] hence the relational prefix occurs only with these two persons and with free NP dependents, but not with the other person-markers. In essence, the relational prefix apparently only shows that the preceding noun phrase is a governed complement. Thus, for example, if a NP preceding the possessed noun is not the possessor, the relational prefix does not occur, but instead, the personal possessive prefix does. As illustration, consider the following possessive examples from Hixkaryana (1a-c; Gildea's field notes) and Panare (2a-c; Doris and Thomas Payne's field notes). For Hixkaryana, the noun ohte 'medicine' is possessed by the second person proclitic (1a), the third person free pronoun noro (1b), and the third person prefix (1c). Note the presence of the relational prefix in 1a-b, but not in 1c. For Panare, the noun əwa 'nose' is possessed by the second person proclitic (2a), the third person distal free pronoun kən (2b), and the third person prefix (2c). Again, the presence of the relational prefix is clear in (2a-b); the third person prefix, tj- (2c) appears as though it might contain the relational prefix as well, but this form is, in fact, a reflex of the Proto-Venezuelan *it- '3' (reconstructed by Gildea, 2003, as discussed in 'Person-marking on nouns and verbs' - the same metathesis is found in several Venezuelan languages).

[7] For the most recent reconstructions, see Meira, Gildea and Hoff (to appear).

(1) a. o=j-ohte
       2-REL-medicine
       'your medicine'
    b. noro j-ohte
       3 REL-medicine
       'his/her medicine'
    c. Ø-ehte
       3-medicine
       'his/her medicine'

(2) a. a=j-əwa-n
       2-REL-nose-POSD
       'your nose'
    b. kən jwa-n
       3 REL-nose-POSD
       'his/her nose'
    c. tj-əwa-n
       3-nose-POSD
       'someone's nose'
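The distribution of the Relational prefix described above can be summarized as a simple decision rule. The Python sketch below is only a rough model under stated assumptions: the dependent labels, the vowel set, and the function name are hypothetical conveniences, and the rule ignores details such as the vowel change seen in (1c) and the requirement that a preceding NP actually be the governed dependent.

```python
# Assumed vowel set for illustration only; real inventories differ by language.
VOWELS = set("aeiouəɨ")

# Hypothetical labels for the kinds of dependent that, per the description
# above, co-occur with the relational prefix on a vowel-initial head.
REL_TRIGGERS = {"first_person", "second_person", "free_np"}


def takes_relational_prefix(root, dependent):
    """True if a vowel-initial root with this kind of dependent would show REL."""
    return root[0] in VOWELS and dependent in REL_TRIGGERS


# Roots and dependents taken from examples (1) and (2).
examples = [
    ("ohte", "second_person"),  # (1a) o=j-ohte
    ("ohte", "free_np"),        # (1b) noro j-ohte
    ("ohte", "third_prefix"),   # (1c) Ø-ehte
    ("əwa", "second_person"),   # (2a) a=j-əwa-n
    ("əwa", "third_prefix"),    # (2c) tj-əwa-n
]
results = [takes_relational_prefix(root, dep) for root, dep in examples]
```

Under these assumptions the rule returns True for (1a), (1b) and (2a) and False for the third person prefix forms (1c) and (2c), matching the pattern in the examples.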

There is nothing similar to the Relational prefix in the Witotoan, Andoke, or Peba-Yaguan languages. Compare the Yagua forms in (3) and (4), noting that no extra prefix occurs on 'cloth' in third person forms, whether or not an NP possessor immediately precedes it.

Yagua does not allow vowel-initial word structures,[8] but initial syllables beginning with h are 'weak' in the sense that the vowel of the syllable may be elided in fast speech (resulting in hC initial clusters), and an h onset to an initial short syllable is invariably deleted when a short-syllable clitic or prefix is added. Here we chose the closest structural examples possible to those in (1) to show that Yagua lacks a relator-type element between a possessor and following governing head.

[8] In Powlison's (1995) 600-page Yagua-Spanish dictionary, the only vowel-initial Yagua word is the interjection h .

(3) a. rahjej
       raj-haj
       1SG-cloth
       'my cloth'
    b. hihjej
       hij-haj
       2SG-cloth
       'your cloth'
    c. Tomasa haj
       Tomasa haj
       Tom cloth
       'Tom's cloth'
    d. sahaj
       sa-haj
       3SG.ANIM-cloth
       'his/her cloth'

(4) a. sa-haj Tomasa
       3SG.ANIM-cloth Tom
       'Tom's cloth'

Like Cariban, Bora shows tight constituency between an NP possessor and its following possessed head, demonstrated in the case of Bora by palatalization, resyllabification, vowel harmony, and tone processes. But there is nothing resembling the relational prefix intervening between a possessor and the possessed noun specifically for first or second person (5a-b) and (6a) as opposed to bound third persons (5c); or between free NP possessors and their following heads (6b-c). [Note 9: Thus, though a palatalization effect is evident for first person in Yagua (3a), there is no evidence of a morphologically distinct palatalizing element for only first and second person in Bora (5a) or with certain other consonant-initial nouns. (See note 6 above regarding palatalization idiosyncrasies in Bora).]

(5) a. thahja
       tha=hà
       my=house
       'my house'
    b. tihja
       ti=hà
       2-house
       'your house'
    c. ìhja
       ì=à
       3=house
       'his/their/self's house'

(6) a. thà=
       1SG=dog-SgM
       'my dog'
    b. á:n
       his.PROX dog-SgM
       'his (proximate possessor) dog'
    c.
       his dog-SgM
       'his dog'

The Relational prefix is typologically unusual, so any two languages with a potentially cognate relational prefix pattern would, more likely than not, be genetically related; but the pattern of using the relational prefix only for first and second person bound prefixal forms would make an even stronger case for genetic relationship. Rodrigues (1994) claims to have found such a pattern in the Cariban, Macro-Jê and Tupían families, and for Cariban languages, this exact pattern is confirmed with a reconstruction at the level of Proto-Cariban (see also Meira, Gildea and Hoff, to appear). [Note 10: Rodrigues uses the term 'Relational prefix' also for morphemes that we consider to be third person markers; our claims regarding typological uniqueness are limited to the forms described in this section.] But in the other putative Macro-Carib groups, there is no sign of anything like the Cariban Relational prefix. It does not escape our attention that in Yagua all but one of the 1st and 2nd person forms end in /j/ (Tables 3 and 4). But an argument against analyzing this /j/ as a frozen remnant of a relational prefix is the fact that the pronominal enclitic forms (which double as free pronouns just with addition of stress) retain this same /j/ element, even when there is no following governing head.

Three additional idiosyncrasies present themselves from Cariban. First, there is a group of 10 to 15 reconstructible transitive verbs that require an idiosyncratic t- '3' prefix in the imperative and when nominalized. We illustrate a few of the surprising cases of t- '3' from Tiriyó in example (7), and in Table 6, we give a list of all the potentially reconstructible verbs that present this unusual prefixation (in any Cariban language).

(7) Surprising cases of t- '3' for Tiriyó [t]ənə 'eat meat' (Meira, 1999, p. 220).

    t-ənə-i_mə       's/he would eat/have eaten (meat)'
    t-ənə-kə!        'eat it (meat)!'
    ootənə-kə!       'eat your meat food!'
    t-ənə-se_wae     'I want to eat it (meat)'
    j-otənə-se_wae   'I want to eat my meat food'

Second, a prefix w- occurs on selected nonfinite forms of a small subset of reconstructable intransitive roots and all detransitivized verb stems. We illustrate with the verb əturu 'talk' in Tiriyó: the finite forms in (8a) present no w-, whereas the nonfinite forms in (8b) present the w-. Like the Relational Prefix, this w- prefix remains robust in some modern languages, but is lost altogether in others (although sometimes it is preserved in allomorphy in personal prefix paradigms). The range of nonfinite forms in which it occurs differs from language to language, but it is widespread enough in the family that it will certainly reconstruct with some important relationship to at least the verbs and prefixes given in Table 7.

(8) The presence and absence of w- in Tiriyó (Meira, 1999, p. 323, 292, 246-7).

    a. Inflections without w-
       əturu-kə        'talk!'
       t-əturu         'I talked'
       m-əturu         'you talked'
       k-əturu         'we (incl) talked'
       n-əturu         's/he talked'

    b. Inflections with w-
       ji-w-əturu-Ø    'my talking'
       ji-w-əturu-to   '(for) my talking'
       w-əturu-nə      'talking'
       t-w-əturə-en    'who will talk'
       t-w-əturə-e     'talked'

Third, an object-nominalizing prefix *n- occurs between the transitive verb root and the possessive person-marking prefixes, which, in this case, uniquely indicates A of the verb. We illustrate with Panare examples: in (9a), the second person prefix indicates the O of the nominalized verb; in (9b), the same prefix precedes n-, now referring to the A of the verb. This prefix is idiosyncratic in two ways: (i) all other nominalizers in the family are suffixes, and (ii) only in this nominalization does the notional A possess a derived noun. Gildea (1998, p. 128ff) reconstructs this form and function to Proto-Cariban; Gildea (1994) compares this idiosyncratic grammatical behavior with a parallel (and potentially cognate) form in the Tupi-Guaranian family.

(9) a. a-j-ktə-hpə
       2-Rel-cut-Perfect.Inferential
       'your having been cut'
    b. a-n-ktə-hpə
       2-O.Nzr-cut-Perfect.Inferential
       'the one that you have cut'

Again, none of these potential sources of shared idiosyncrasy yields matches in the rest of the putative Macro-Carib grouping.

The final arguably idiosyncratic piece of grammar we discuss here is the morphological complexity found in numerals in Witotoan and in Yagua: both present a vigesimal-decimal numeral system, in which noun classifiers are required to occur as affixes. In certain lower numerals, the classifiers look almost 'infix-like' because they are followed by a required number suffix. Further, both languages likely have (had) singulative, dual, and plural suffixes occurring as the final element of numbers for 'one', 'two', and 'three' (as well as in higher numerals, though the patterns of the number suffixes vary between the languages for the higher numerals; see Payne, 2007). In Yagua (11a-c), the probable singular suffix is not productive and is attested only with the number 'one', particularly making the classifier look infix-like. Bora (10a-c) has a masculine-feminine distinction in animate classifier and number suffixes, which Yagua lacks.

(10) Bora numerals 'one' through 'three' with selected classifiers (Thiesen and Weber, to appear).

     a. tshà-n                           tshaphi-s
        one-CL.GENERAL                   one-CL.AN.SG-FEM.SG
        'one item' (class unspecified)   'one feminine animate item'

     b. mì-ηέkhw                         hέ-ph
        two-CL.GENERAL-DL.INAN           two-CL.AN.PL-DL.FEM
        'two items'                      'two feminine animate items'

     c. phá-phì,tshn-βa                  phá-phì,tshm-βà
        all-piled.up-CL.GEN-PL.QNT       all-piled.up-CL.AN-PL.QNT
        'three items'                    'three feminine animate items'

(11) Yagua numerals 'one' through 'three' with selected classifiers.

     a. ta-ra-k                          t--kii
        one-CL.GEN-SG(?)                 one-CL.ANIM-SG(?)
        'one item' (class unspecified)   'one animate item'

     b. n-ra-hy                          n-n-hy/ndá-n-hy
        two-CL.GEN-DL                    two-CL.AN.SG-DL
        'two items'                      'two animate items'

     c. mm-rá-my                         m-way
        three-CL.GEN-PL                  three-CL.AN.PL
        'three items'                    'three animate items'

Though this pattern is interesting and perhaps somewhat unusual in giving rise to the appearance of infixes, as discussed in Payne (2007), the shared pattern of apparent infixation across Yagua and Bora does not show any potentially cognate morphemes. As such, the pattern is possibly due to borrowing or calquing of structure; contact between these groups was almost certain, at least during the time of the rubber boom (though the contact was probably not always friendly; cf. Chaumeil (1983) for discussion of the geography of Yagua migrations).

We were unable to find examples of numbers in our Andoke materials, and the much simpler numeral system in Cariban is completely unrelated, both in forms and structure. Indeed, Cariban completely lacks classifiers of the sort found in the Western Amazon region, and at best only has genitive classifiers (Carlson and Payne, 1989) which are not used in numeral expressions.

This concludes our brief discussion of the lack of idiosyncratic grammatical evidence for the putative connection between the four Macro-Carib groups. We acknowledge that the absence of evidence for a connection may be somewhat unsatisfying because, as pointed out by Greenberg (1957, p. 38), even when languages are related, "Such indications of historical connection founded on morphological irregularities of form and combinability may not always be found." Thus, our failure to encounter shared idiosyncrasies in these four or five phenomena does not in itself disprove a hypothesis of relationship. We turn now to the primary evidence given in Greenberg's method, that of the lexicon.

LEXICAL COGNATES AND CORRESPONDENCES

Methodological concerns

Much ink has been spilled about the dangers of Greenberg's method, one of which we reprise here: when one simply compares individual lexical items from individual languages, the danger of finding similar forms and meanings increases simply due to chance. It is for this reason that the comparative method consolidates all early hypotheses by proceeding from the bottom up, reconstructing branches and then comparing the reconstructed forms for each branch to arrive at the reconstruction for the next level. Ultimately (as recommended by Greenberg, 1957, p. 40-1), the goal should be to reconstruct the proto-forms for any families that one wants to compare, and then to consider those proto-forms to be the primary comparanda. While Greenberg (1987) does not use this method, he implicitly endorses it again in his (1989) response to Campbell's (1988) negative review. Campbell demonstrated that one can readily encounter examples in Finnish that are as consistently similar as any language in Greenberg's sample in their match to Greenberg's Penutian etymologies (both lexical and grammatical). In doing this, Campbell illustrates an inherent risk in relying on superficial similarity in both form and meaning as a tool for determining genetic relationship. The conclusion Campbell draws from the Finnish example is that Greenberg's method is not sufficiently rigorous, as had Greenberg encountered these Finnish data in a wordlist from a language of the Americas, his method would have categorized it with Penutian.

Greenberg's (1989, p. 111-12) response is to argue that Campbell made a fundamental error by considering Finnish data in isolation, and that had the Finno-Ugric family been included in the entire database, Greenberg's method would have clearly identified the entire family as separate from Amerind. After reprising the relationship between Finnish, Estonian, and Hungarian, he concludes: "It is a group at this level that should be compared with Amerind, and once more the distinction is obvious. The large majority of Campbell's forms do not even make it to Hungarian."

This argument is striking in that it suggests the importance of prior knowledge about the linguistic families that are being compared; had Greenberg not been familiar with Finno-Ugric, this type of response would not have been possible. Yet Greenberg entered into his comparison of the languages of the Americas without specialist knowledge of any of the families. As a result, his response does not increase the credibility of his overall conclusions. Even more striking is the question that even a specialist in our situation must ask: what if Finnish were an isolate, like Andoke, or the sole well-documented member of a tiny family where the other members are extinct, like Yagua within Peba-Yaguan? If knowledge of (and comparable data from) the rest of the Finno-Ugric family is a prerequisite for the exclusion of Finnish from Penutian, how are we to seek out arguments for the exclusion of Andoke and Peba-Yaguan from Macro-Carib? Andoke, in particular, appears in many fewer of Greenberg's Macro-Carib etymologies than Campbell's Finnish forms do in the Penutian etymologies.

We are forced to the conclusion that in order to test claims of long-range relationship with lexical evidence, prior reconstruction of the lexicon in each family is imperative. As such, our comparanda for the Cariban and Witotoan families are only reconstructed forms. We cannot offer parallel reconstructed forms for Peba-Yaguan because the necessary comparative work has not been carried out, and our preliminary efforts lead us to fear that the older sources in which Peba and Yameo are attested are both too limited and too phonetically unreliable to ever allow a firm reconstruction. To the extent that untimely extinction of the rest of the family leaves us with extensive, reliable data from only one member, we must raise the standards of evidence in order to reduce the likelihood of encountering chance resemblances.

As such, we follow the more conservative requirements for similarity: "similarity of form must be complete similarity. Put rather brutally, if the front halves of two forms are similar, but the back halves aren't, then the forms are not similar." (Harrison, 2003, p. 219, emphasis in original). We also eliminate potentially cognate short forms, as recommended by Campbell (2003, p. 274):

Monosyllabic CV or VC forms may be true cognates, but they are so short that their similarity to forms in other languages could also easily be due to chance. Likewise, if only one or two segments of longer forms are matched, then chance remains a strong candidate for the explanation of the similarity. Such forms will not be persuasive; the whole word must be accounted for. (See Ringe, 1992, for mathematical proof).

We therefore limit our claims of potential cognacy to forms that share multiple points of articulation and have similar syllable structure; we do consider some cases where the form in one language presents one syllable more or less than the forms in the other languages, but in the absence of a morphological explanation for this difference, we construct nil correspondences for these 'extra' syllables.

Even having taken such precautions, we acknowledge the danger inherent in seeking superficial similarities among modern lexical items from Andoke and Yagua as compared to reconstructed Witotoan and Cariban forms; we are comparing words which, even if they are cognate, are separated by thousands of years of independent language change. The more distant an actual genetic relationship may be, the more divergent we must be prepared for the phonological correspondences to be. For instance, such actual cognates as English wheel and Sanskrit cakra 'wheel' would never be recognized as similar by a non-specialist (Campbell, 2003, p. 267), whereas such non-cognates as Spanish que [ke] 'what' and Nepali ke 'what', or English want and Yagua [] 'want, desire' present surface similarities that one might confidently (and falsely) identify as evidence of relationship. Hence, we take the additional step required by the comparative method (Harrison, 2003, p. 219) of trying to establish repeated regular sound correspondences across our pool of potential cognates. To the extent that a given correspondence recurs, we will take this as strengthening the case for relatedness (Campbell, 2003 p. 266); to the extent that many correspondences recur, we will start to get excited about potential genetic relatedness.

New 'Macro-Carib' lexical comparisons

In order to test the Macro-Carib hypothesis, we compiled an extended Swadesh word list (Appendix 1, available for download at <http://hdl.handle.net/1794/5560>), and we sought to provide more complete and modern forms for all of Greenberg's proposed Macro-Carib etyma, both the 79 that he lists as unique to Macro-Carib and the additional 64 items in his "Proto-Amerind Dictionary" for which a Macro-Carib etymon was proposed (Appendix 2, available for download at <http://hdl.handle.net/1794/5560>). [Note 11: Sometimes we were not able to find in our own sources anything like the forms he offered; other times we found similar forms but with differing transcriptions/translations; and yet other times the matches were good between our fresh data and the forms Greenberg presents.]

In examining the lexical data that we have gathered together, we identified some 41 possible cognate sets. Of these, we discarded 18 without listing possible correspondences, 13 because there was only one possible consonantal correspondence (most of these were monosyllabic morphemes) and 5 because the semantic distance between the forms was too great to reconcile with the exploratory nature of a work like this. [Note 12: In this decision, we explicitly differentiate our methodology from that of Greenberg, who accepts more semantic latitude in his potential cognates. While it is true that attested change does produce cognates with ranges of meanings comparable to the rejected forms, we are uncomfortable with the necessity to assume such meaning change in order to relate specific forms, and then to base our potential case for relationship on such assumptions.] These forms are listed in Table 8.

The remaining 23 candidates were treated like true cognates: all consonantal correspondences were extracted, as seen in Table 9. Anytime one form has more syllables than another, this necessarily creates a correspondence between a segment (in the longer form) and nil (represented as Ø in the correspondence from the shorter form).
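The extraction step just described can be sketched in Python. This is only an illustrative sketch, not the procedure used to build Table 9: the function name `extract_correspondences`, the assumption that the consonants of each form have already been aligned position by position, and the toy segments in the usage note are all hypothetical.

```python
from itertools import zip_longest

NIL = "Ø"  # stands for 'no segment' in the shorter form


def extract_correspondences(aligned_forms):
    """Return (families, correspondences) for one candidate cognate set.

    aligned_forms maps a family name to the list of consonants of its
    candidate form, already aligned by position (an assumption of this
    sketch). Shorter forms are padded with NIL at the end.
    """
    families = sorted(aligned_forms)
    columns = zip_longest(
        *(aligned_forms[family] for family in families), fillvalue=NIL
    )
    return families, [tuple(column) for column in columns]
```

For two hypothetical forms with consonants `["p", "t"]` in family `"A"` and `["p", "t", "k"]` in family `"B"`, the function yields the correspondences `("p", "p")`, `("t", "t")` and `("Ø", "k")`. Padding at the end is a simplification; in practice the analyst decides which syllable of the longer form is the 'extra' one.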

The correspondences were then tabulated, so that we could see if the same correspondences came up in multiple possible cognates. When we compared all 4 families, we encountered 48 consonantal correspondences, 17 labial, 5 velar, and 26 coronal. However, not a single correspondence occurs even once - much less multiple times - in all 4 languages. We do have some attested correspondences with gaps that might be collapsible into a single correspondence, such as the bilabial nasals for 'hand', 'child', and 'shoulder' in (12):

It is only by collapsing correspondences with gaps that we would be able to construct even one correspondence set attested in all 4 families. As such, we conclude that our much more extensive lexical evidence provides no support for Greenberg's Macro-Carib hypothesis.
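The tabulation and the notion of 'collapsing correspondences with gaps' can be made concrete with a small Python sketch. It is offered under stated assumptions rather than as our actual working procedure: a gap (a family with no candidate form in a given set) is represented by `None`, the helper names are hypothetical, and the toy tuples below are invented for illustration and do not reproduce the data in (12).

```python
from collections import Counter

GAP = None  # family has no candidate form in this set (hypothetical convention)


def recurring(correspondences, minimum=2):
    """Correspondences attested in at least `minimum` candidate sets."""
    counts = Counter(correspondences)
    return {c: n for c, n in counts.items() if n >= minimum}


def is_complete(correspondence):
    """True when every family contributes a segment (no gaps)."""
    return all(segment is not GAP for segment in correspondence)


def collapsible(a, b):
    """True when two correspondences agree wherever both have a segment."""
    return all(x == y for x, y in zip(a, b) if x is not GAP and y is not GAP)


# Invented toy data: three gapped correspondences over four families.
toy = [("m", "m", GAP, "m"), ("m", GAP, "m", "m"), ("m", "m", "m", GAP)]
print(recurring(toy))                      # no identical tuple is repeated
print(any(is_complete(c) for c in toy))    # none covers all four families
print(collapsible(toy[0], toy[1]))         # but they are mutually compatible
```

In the toy data no correspondence recurs and none is attested in all four families, yet the tuples are pairwise compatible, which is the situation in which one might be tempted to collapse them into a single set.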

However, it is also possible to seek correspondences between subsets of the four families, in search of stronger possibilities of trilateral or bilateral relationship. Andoke appears the least in the 23 potential cognates, so we excluded Andoke and compared the remaining three families. We encountered 46 consonantal correspondences, 17 labial, 5 velar, and 24 coronal. This time, 8 of the correspondences have a form in all 3 families, but once again, no full correspondence is repeated multiple times unless we collapse correspondences, and only the three /m/ correspondences seen in (12), removing Andoke, the leftmost column, seem clear candidates for such a collapse. We conclude that the lexical evidence does not support even the weaker trilateral hypothesis.

Finally, we made three bilateral comparisons: Witotoan with Yagua, Witotoan with Cariban, and Cariban with Yagua. In considering the first two bilateral comparisons, there is little to make the heart beat faster: for Witotoan with Yagua, we find 22 correspondences, only 4 attested in 2 or more potential cognates; for Witotoan with Cariban, we find 18 correspondences, with only 3 attested in 2 or more potential cognates. However, when we consider the Cariban and Yagua correspondences, the possibility of a closer relationship between Carib and Yagua cannot be discarded out of hand: of the 21 correspondences, 7 (fully one third) have multiple attestations.
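The proportions behind this comparison can be checked with a few lines of Python. The counts are those reported in the paragraph above; the variable and function names are merely illustrative.

```python
from fractions import Fraction

# (total correspondences, correspondences attested in 2 or more potential
# cognates), as reported in the text for each bilateral comparison
bilateral = {
    ("Witotoan", "Yagua"): (22, 4),
    ("Witotoan", "Cariban"): (18, 3),
    ("Cariban", "Yagua"): (21, 7),
}


def recurring_share(total, repeated):
    """Share of correspondences that recur, as an exact fraction."""
    return Fraction(repeated, total)


shares = {pair: recurring_share(*counts) for pair, counts in bilateral.items()}
for pair, share in shares.items():
    print(pair, share, f"{float(share):.2f}")
```

The shares come out as 2/11 for Witotoan with Yagua, 1/6 for Witotoan with Cariban, and 1/3 for Cariban with Yagua. A raw proportion of this kind is only a descriptive summary; it is not a test of whether the recurrences exceed what chance would produce.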

Regarding the word Witoto

One lexical item of note in our data is the PC reconstruction *wɨtoto 'person', which is of course striking given the name 'Witoto' for some of the (non-Cariban) Witotoan peoples (often spelled 'Huitoto' or 'Uitoto'). However, there is no evidence that this is a viable cognate across Proto-Witotoan and Proto-Cariban (much less Proto-Peba-Yaguan) groups. It is not an autonym within the Witotoan languages and Aschmann (1993) does not reconstruct such a word for PW or for either branch of the family. Rather, he reconstructs PW for 'people'. In 1921 Preuss wrote that "Uitoto" was not a word of the Uitoto language, but rather was the name used by Cariban speakers with reference to their enemies (Preuss, 1921, p. 160-161). Instead of suggesting genetic relatedness, the fact that this name has been applied to Colombian peoples suggests either probable contact between Cariban speakers and those who are known as Witotoan peoples, or that the first explorers who 'named' the Witotos were travelling with Cariban (for example, Karihona) guides; that is, 'Witoto', 'Uitoto', or 'Huitoto' were terms used by outsiders. Though of course *wɨtoto could have spread via an intermediary group that had contact with Cariban speakers, the fact that it is not found as a term for other non-Cariban groups (Tupian, Saliban, etc.) perhaps suggests more direct contact between Cariban speakers and Witotoan groups.

CONCLUSION

The goal we set out with leads only to frustration: one cannot prove a negative, but in seeking to test Greenberg's Macro-Carib hypothesis with more reliable modern data, we have been unable to turn up sufficient data to convince the skeptic that the hypothesis is correct. The pronominal data adduced by Greenberg did not hold up when considering modern sources, and the search for shared grammatical idiosyncrasies yielded no results. The lexical data were similarly unkind to the hypothesis: Andoke and Witotoan did not share many potential cognates with Cariban or Yagua, leaving only Cariban and Yagua with any prospects at all of further relatedness. With this in mind, one potentially fruitful direction for future research would be to return to the original Yameo and Peba materials from which Rivet extracted his wordlists, in search of more and better materials to reconstruct Peba-Yagua lexicon and grammar. With the expanded time-depth that would be represented by reconstructed forms, we would be able to better ascertain whether the Proto-Peba-Yagua forms would retain their surprising degree of correspondence to Proto-Cariban, or whether (like the Finnish forms in the dispute between Campbell and Greenberg) the apparent similarity seen in forms from a single language would fade as sister languages come more clearly into focus.

As we have seen, the four families considered in this paper are geographically localized in the northern Amazon region, from west of the Putumayo-Caquetá river region, to the eastern Guianas. Three groups, Andoke, Peba-Yaguan and Witotoan, are found primarily in and to the west of the Putumayo-Caquetá region; and a small handful of Cariban languages are known to have been in this region. The modern geographical proximity itself leaves open the possibility that some of the arguable lexical similarities could be the result of contact and borrowing, if not just chance similarities.

In working towards our first goal, which is more limited in scope, we have also worked towards a second goal with less limited aspirations: we have compiled a database of state-of-the-art reconstructions (created for Cariban, collected from Aschmann, 1993, for Witotoan) and reliable modern forms (for Yagua and Andoke), and we have contributed to what we hope will be a continuing discussion about the kinds of evidence that might one day allow a more convincing proposal of genetic relationships between these languages and other language(s) of the Americas.

ACKNOWLEDGMENTS

This work would not have been possible without the support of NWO grant Nº 235-70-002, Leiden University, Radboud Universiteit Nijmegen, University of Oregon, Universidade Federal do Pará, Museu Paraense Emílio Goeldi, CNPq, and SECTAM-PA, which jointly financed the meetings in 2003, 2004 and 2005 that stimulated us to pursue this research topic; thanks also to the Research Centre for Linguistic Typology, La Trobe University, where Gildea was a fellow during much of his contribution to this paper, and to Alexandra Aikhenvald for comments and stimulating discussion. Additionally, we thank David Young for data entry, and Jon Landaburu who provided us with unpublished lexical information from Andoke. We owe an enormous debt of gratitude to Sérgio Meira for furnishing us with his Toolbox databases, which included individual lexical collections from 20 Cariban languages plus a comparative database with preliminary collections of cognates. Without these databases, we could not have even proposed most of the Proto-Cariban reconstructions we offer here. Some errors in data and/or analysis will certainly be found by future researchers - we are solely responsible for the errors.

Received: 10/10/2006

Approved: 01/06/2007

APPENDIX 1. The Swadesh 100 list for Proto-Cariban, Andoke, Yagua, and Proto-Witotoan.

Appendix 1 presents Swadesh 100 word lists for Proto-Cariban, Andoke, Yagua, Proto-Witotoan and its two sub-branches Proto-Huitoto-Ocaina and Proto-Bora-Muinane. Here we are interested in exploring the maximum number of possible cognates, both for our immediate purposes and for future comparative work that we may not envision at the moment. We are explicitly not trying to compile a table that might be useful for glottochronological or lexicostatistical calculations. As such, this table is not restricted to only the most commonly used forms (which would be difficult to discover in any case for proto-languages), but rather represents a table of forms within which cognates may be sought. Throughout the Appendix, hyphens indicate morphemic divisions.

All Proto-Cariban forms in this Appendix are reconstructed by Spike Gildea for this paper, except the pronouns (reconstructed by Meira, 2002). Where a Proto-Cariban form is not readily reconstructible, we sometimes offer reconstructions to lower-level groups or branches, again (with the exception of Proto-Taranoan) by Spike Gildea: Proto-Taranoan (PT: Akuriyó, Karihona, Tiriyó; Meira, 2000), Proto-Parukotoan (PPar: Hixkaryana, Waiwai and Katxúyana), Proto-Pekodian (PPek: Bakairi, Arara, Ikpéng), Proto-Pemón (PPem: Kapón [Akawaio, Patamuna, Ingarikó], Pemón [Arekuna, Kamarakoto, Taurepán] and Makushi), and Proto-Venezuelan (PV, a tentative subgrouping given in Gildea (2003), modified by Mattéi Muller (2002): Kapón, Pemón, Makushi, Panare, †Tamanaku, Mapoyo/Yabarana, and possibly Ye'kwana, †Kumanagoto, and †Chaima). Note that the Venezuelan Branch is a hypothesis that has not been well-substantiated, and as such, even though we expect that PV reconstructions should represent valid historical forms, the label PV may not stand the test of time. Our Proto-Carib forms are indicated by a single asterisk with no further marking. These reconstructions are somewhat tentative, as they are based on incomplete cognate sets, with few reconstructions for intermediate groupings; however, they do follow the most recent developments in our collective understanding of Proto-Carib phonology (especially Meira and Franchetto, 2005) and morphology (especially Meira, Gildea and Hoff, to appear). The data that inform our reconstructions come from cognate sets published in Meira and Franchetto (2005), plus additional forms found in Meira's unpublished comparative Cariban Toolbox database (Meira, 2006). We recognize that all these reconstructions will remain suspect until publication of the cognate sets upon which they are based, and also that they will doubtless need (we hope, minor) adjustments as the body of rigorous comparative work on the family increases.

Our Andoke data for any of Greenberg's proposed Macro-Carib etymons (both in Appendix 1 and Appendix 2) were provided by Jon Landaburu in personal communication, from his field notes. All other Andoke forms are from Landaburu (1979, 1992) and Landaburu and Pineda (1984).

Unless otherwise specified, our Peba-Yaguan is all Yagua, compiled by Doris Payne. Forms are usually taken from Powlison's (1995) dictionary, but re-written in a generally phonemic form with a few exceptions noted below. In some cases, where Powlison's data differ from Doris and Thomas Payne's data (particularly for vowel length), the Paynes' data are used. Conventions for writing the Yagua data in this table follow the IPA with the following qualifications (cf. D. Payne (1985), Payne and Payne (1990), T. Payne (1993), for further information on Yagua phonology):

• /j/ at the end of a word has a very lenis pronunciation. /j/ after a consonant reflects either full metathesis of a syllable-final /j/ with that consonant, or palatalization of the consonant.

• Acute accent over a vowel indicates high pitch (best analyzed as high tone). For Yagua, a tilde under a vowel indicates phonemic nasalization (for other languages we have followed the IPA convention of writing nasalization over vowels, but for typographical clarity, we have opted for the under-vowel tilde for Yagua).

• /n/ has allophones [n] before phonemic nasal vowels, and [nd] before phonemic oral vowels. /m/ has allophones [m] before phonemic nasal vowels, and [mb] before phonemic oral vowels. The allophones are written explicitly in this table.

• Payne and Payne (1990) analyze Yagua as having six phonemic vowels:

Powlison analyzes Yagua as having only 5 vowels, so his dictionary does not distinguish between /i/ and /ɨ/ (/ɨ/ is always fronted to /i/ in the vicinity of /j/ and [ɨ] under any analysis is not as frequent as [i]). As a result, the following Yagua list generally has /i/ and /ɨ/ collapsed; we have written /ɨ/ only in those cases where D. Payne is certain of the distinction. Additionally, tone sandhi is common in Yagua, such that the same root might bear multiple tone patterns, depending on the surrounding morphemes. Where possible, in this table we use the tone patterns found on the root in isolation.

A Yagua inalienably possessed noun [inalien] always has a nominal or pronominal possessor (the system of prefixes is the same as for alienably possessed nouns). If a possessed noun begins with /hV/, the vowel can vary in nasalization and quality depending on the form of the possessive prefix. Wa- (variant ha- ?) is a Yagua derivational prefix for abstract qualities. It is broken off in Appendix 1, even though sometimes it might be a synchronically well-frozen part of a word (certain other Yagua entries are also morphologically complex).

Our Proto-Witotoan is Aschmann's (1993) reconstruction; a double asterisk ** indicates Aschmann's reconstruction for Proto-Witotoan, while BM plus * indicates his reconstruction for the Bora-Miraña branch and HO plus * indicates his reconstruction for the Witoto-Ocaina branch. Witotoan elements in parentheses indicate that Aschmann found the material in more than one language of the (sub-)family but that it was not sufficiently attested across all languages of the (sub-)family for him to reconstruct it. These data were entered by David Young. For Witotoan data, /ï/ is high back unrounded, in contrast with /ɨ/, high central unrounded.

Numbers in Column 1 of the following table correspond to Swadesh numbers. For cross-referencing purposes, following the English gloss in column 2, we include in parentheses the numbers from Aschmann's (1993) Witoto etyma.

[Appendix 1 table]

APPENDIX 2. Greenberg's Macro-Carib etymologies compared with modern data.

This Appendix lists all Greenberg's (1987) proposed Macro-Carib etymologies, divided into two parts. Part I contains just those putative Macro-Carib forms that he did not propose also extended to Proto-Amerind. Part II contains putative Macro-Carib forms that he believed did pertain to the Proto-Amerind etymologies. For each proposed etymon, Greenberg's data are given in the first row in bold italics. Our 'modern' data are given in the second row. Numbers in the first column refer to Greenberg's Macro-Carib etyma numbers. We continue to be interested in offering the maximum number of possible cognates, so again, we have sought to put as many forms as possible in each cell of the table. We are not concerned by repetition of forms in the table, as these rows do not represent putative cognates, but only a selection of forms within which cognates may be sought.

Appendix 2.1

Appendix 2.2

  • ADELAAR, Willem. Review of Language in the Americas, Joseph Greenberg, 1987. Lingua, n. 78, p. 249-255, 1989.
  • AIKHENVALD, Alexandra. Areal diffusion, genetic inheritance, and problems of subgrouping: a North Arawak case study. In: AIKHENVALD, Alexandra; DIXON, R. M. W. (Eds.). Areal Diffusion and Genetic Inheritance. Cambridge: Cambridge University Press, 2001. p. 167-194.
  • ASCHMANN, Richard. Proto Witotoan. Dallas: Summer Institute of Linguistics and University of Texas at Arlington, 1993.
  • BERMAN, Howard. A comment on the Yurok and Kalapuya data in Greenberg's Language in the Americas. International Journal of American Linguistics, Chicago, v. 58, p. 230-233, 1992.
  • BURTCH, Shirley (compiler). Diccionario huitoto murui. Serie Lingüística Peruana, 20. Yarinacocha: Ministerio de Educación and Instituto Lingüístico de Verano, 1983. 2 v. 262, 166 p.
  • CAMPBELL, Lyle. Review of Language in the Americas, Joseph Greenberg, 1987. Language, n. 64, p. 591-615, 1988.
  • CAMPBELL, Lyle. How to show languages are related: Methods for distant genetic relationship. In: JOSEPH, Brian; JANDA, Richard (Eds.). The Handbook of Historical Linguistics. Oxford: Blackwell Publishing, 2003. p. 262-282.
  • CARLSON, Robert; PAYNE, Doris L. Genitive classifiers. In: ANNUAL PACIFIC LINGUISTICS CONFERENCE, 4, 1989, Eugene. Proceedings... Eugene: University of Oregon, 1989. p. 87-119.
  • CHAFE, Wallace. Review of Language in the Americas, Joseph Greenberg, 1987. Current Anthropology, Chicago, n. 28, p. 652-653, 1987.
  • CHAUMEIL, J-P. Historia y migraciones de los yagua de finales del Siglo XVII hasta nuestros días. Lima: Centro Amazónico de Antropología y Aplicación Práctica, 1983.
  • DIXON, R. M. W. The Rise and fall of languages. Cambridge: Cambridge University Press, 1997.
  • DIXON, R. M. W.; AIKHENVALD, Alexandra Y. Introduction. In: DIXON, R. M. W.; AIKHENVALD, Alexandra Y. (Eds.). The Amazonian Languages. Cambridge: Cambridge University Press, 1999. p. 1-21.
  • DURBIN, Marshall; SEIJAS, Haydée. Proto-Hianacoto: Guaque-Carijona-Hianacoto Umaua. International Journal of American Linguistics, v. 39, n.1, p. 22-31, 1973.
  • GILDEA, Spike. The Cariban and Tupí-Guaraní object nominalizing prefix. Lingüística Tupí-Guaraní/Caribe. Revista Latinoamericana de Estudios Etnolingüísticos, Lima, v. 8, p. 163-177, 1994.
  • GILDEA, Spike. On Reconstructing Grammar: comparative Cariban Morphosyntax. Oxford: Oxford University Press, 1998.
  • GILDEA, Spike. The Venezuelan branch of the Cariban language family. Amérindia, n. 28, p. 7-32, 2003.
  • GIRARD, Victor. Proto-Carib phonology. 1971. Thesis (Ph.D.) - University of California, Berkeley, 1971.
  • GODDARD, Ives. Review of Language in the Americas, Joseph Greenberg, 1987. Linguistics, n. 28, p. 556-558, 1990.
  • GOEJE, Claudius Henricus de. Études linguistiques Caraïbes. Verhandelingen van de Koninklijke Akademie van Wetenschappen, afdeeling letterkunde, nieuwe reeks 49 (2), deel X, n. 3, Amsterdam: Johannes Müller, 1946.
  • GREENBERG, Joseph. Essays in linguistics. Chicago: University of Chicago Press, 1957.
  • GREENBERG, Joseph. Language in the Americas. Stanford: Stanford University Press, 1987.
  • GREENBERG, Joseph. Classification of American Indian languages: A reply to Campbell. Language, n. 65, p. 107-114, 1989.
  • GREENBERG, Joseph. The concept of proof in genetic linguistics. In: GILDEA, Spike (Ed.). Reconstructing Grammar: comparative linguistics and grammaticalization. Amsterdam: John Benjamins, 2000. p. 161-175.
  • HARRISON, S. P. On the limits of the comparative method. In: JOSEPH, Brian; JANDA, Richard (Eds.). The Handbook of historical linguistics. Oxford: Blackwell Publishing, 2003. p. 213-43.
  • KIMBALL, Geoffrey. A critique of Muskogean, "Gulf", and Yukian material in Language in the Americas. International Journal of American Linguistics, Chicago, v. 58, n. 4, p. 447-501, 1992.
  • LANDABURU, Jon. La langue des Andoke (Amazonie colombienne). Paris: SELAF, 1979.
  • LANDABURU, Jon. La predicación en la lengua andoque y parámetros de utilidad para una tipología de la predicación. Bogotá: CCELA; Universidad de los Andes, 1992. p. 155-171. (Memorias 2, Lenguas Aborígenes de Colombia).
  • LANDABURU, Jon; PINEDA, Roberto. Tradiciones de la Gente del Hacha: mitología de los indios Andoques del Amazonas. Bogotá: Instituto Caro y Cuervo, 1984.
  • LEACH, Ilo. Vocabulario Ocaina. Yarinacocha, Peru: ILV, 1969. (Serie Lingüística Peruana 4).
  • MATISOFF, James. On megalocomparison. Language, n. 66, p. 109-120, 1990.
  • MATTÉI MULLER, Marie-Claude. Diccionario ilustrado Panare-Español, indice Español-Panare. Caracas: Comisión Quinto-Centenário, 1994.
  • MATTÉI MULLER, Marie-Claude. Pemono: eslabón perdido entre mapoyo y yawarana, lenguas caribes ergativas de la Guayana noroccidental de Venezuela. Amérindia, Paris, n. 28, p. 33-54, 2002.
  • MATTÉI MULLER, Marie-Claude; HENLEY, Paul. Los Tamanaku: su lengua, su vida. San Cristobal: Universidad Católica del Táchira, 1990.
  • MEIRA, Sérgio. A Grammar of Tiriyó. 1999. Thesis (Ph.D. Linguistics) - Rice University, 1999.
  • MEIRA, Sérgio. A Reconstruction of Proto-Taranoan: phonology and morphology. Munich: LINCOM Europa, 2000a.
  • MEIRA, Sérgio. The accidental intransitive split in the Cariban family. In: GILDEA, Spike (Ed.). Reconstructing grammar: comparative linguistics and grammaticalization theory. Amsterdam: John Benjamins, 2000b. p. 201-230.
  • MEIRA, Sérgio. A first comparison of pronominal and demonstrative systems in the Cariban language family. In: CREVELS, Milly; KERKE, Simon van der; VOORT, Hein van der; MEIRA, Sérgio (Eds.). Current studies in South American languages. Leiden: Research School of Asian, African, and Amerindian Studies (CNWS), 2002. p. 255-275. (Indigenous Languages of Latin America (ILLA) series, v. 3).
  • MEIRA, Sérgio. Comparative Cariban Database. 19 Toolbox databases, with lexical and grammatical data on 18 Cariban languages, plus preliminary cognate sets for 442 lexical and grammatical items, 2006. (unpublished personal database).
  • MEIRA, Sérgio; FRANCHETTO, Bruna. The southern Cariban languages and the Cariban family. International Journal of American Linguistics, Chicago, v. 71, n. 2, p. 127-192, 2005.
  • MEIRA, Sérgio; GILDEA, Spike; HOFF, B. J. On the origin of ablaut in the Cariban family. International Journal of American Linguistics. (to appear).
  • NIMUENDAJÚ, Curt. A propos des indiens Kukura du Rio Verde (Brésil). Journal de la Société des Américanistes, n. 24, p. 187-9, 1932.
  • PAYNE, Doris L. Aspects of the grammar of Yagua: a typological perspective. 1985. Thesis (Ph.D.) - UCLA, 1985.
  • PAYNE, Doris L. Noun classification in Yagua. In: CRAIG, Colette (Ed.). Noun classes and categorization. Amsterdam: John Benjamins, 1986. p. 113-131.
  • PAYNE, Doris L. Noun classification in the Western Amazon. Language Sciences, v. 9, n. 1, p. 21-44, 1987.
  • PAYNE, Doris L. Review of Amazonian Languages. Language, n. 77, p. 594-598, 2001.
  • PAYNE, Doris L. Source of the Yagua nominal classification system. In: SEIFART, Frank; PAYNE, Doris (Eds.). International Journal of American Linguistics, Chicago, v. 73, n. 4, p. 447-473, October 2007.
  • PAYNE, Doris L.; PAYNE, Thomas E. Yagua. Handbook of Amazonian Languages, Berlin: Mouton de Gruyter, n. 2, p. 249-474, 1990.
  • PAYNE, Thomas. Pronouns in Yagua discourse. International Journal of American Linguistics, Chicago, v. 53, n. 1, p. 1-21, 1987.
  • PAYNE, Thomas. The Twins stories: participant coding in Yagua narrative. Berkeley: University of California Press, 1993. (Univ. of California Publications in Linguistics, 120).
  • POWLISON, Paul. Nijyami Niquejadamusiy - May Niquejadamuju, May Niquejadamusiy - Nijyami Niquejadamuju. (Diccionario Yagua - Castellano, Castellano-Yagua). Lima: Ministerio de Educación, ILV, 1995. (Serie Lingüística Peruana 35).
  • PREUSS, Konrad Theodor. Religion und Mythologie der Uitoto. Göttingen: Vandenhoeck & Ruprecht, 1921.
  • RANKIN, Robert. Review of Greenberg, 1987. International Journal of American Linguistics, Chicago, v. 58, p. 324-51, 1992.
  • RINGE, Donald. On calculating the factor of chance in language comparison. Transactions of the American Philosophical Society, n. 82, p. 1-110, 1992.
  • RIVET, Paul. Affinités du Miránya. Journal de la Société des Américanistes, Paris, n. 8, p. 117-131, 1911a.
  • RIVET, Paul. La Famille linguistique Peba. Journal de la Société des Américanistes, Paris, n. 8, p. 172-206, 1911b.
  • RODRIGUES, Aryon Dall'Igna. Evidence for Tupí-Carib relationships. In: KLEIN, Harriet E.; STARK, Louisa R. (Eds.). South American Indian languages: retrospect and prospect. Austin: University of Texas Press, 1985. p. 371-404.
  • RODRIGUES, Aryon Dall'Igna. Grammatical affinities among Tupí, Carib and Macro-Jê: Unpublished manuscript. Brasília: Universidade de Brasília, 1994.
  • SALZANO, Francisco M.; HUTZ, Mara H.; SALAMONI, Sabrina P.; ROHR, Paula; CALLEGARI-JACQUES, Sidia M. Genetic support for proposed patterns of relationship among lowland South American languages. Current Anthropology, Chicago, v. 46, n. 1, p. 121-129, 2005. Suppl., December 2005.
  • SEIFART, Frank. The Structure and use of shape-based noun classes in Miraña (North West Amazon). [S. l.]: Max Planck Institute, 2005. (Series in Psycholinguistics, 32).
  • SEIFART, Frank; PAYNE, Doris (Eds.). Nominal Classification in the North West Amazon: Issues in Areal Diffusion and Typological Characterization. International Journal of American Linguistics, Chicago, v. 73, n. 4, p. 381-7, October 2007.
  • THIESEN, Wesley; THIESEN, Eva. Diccionario Bora-Castellano; Castellano-Bora. Yarinacocha: ILV, 1998. (Serie Lingüística Peruana 46).
  • THIESEN, Wesley; WEBER, David. Bora Grammar. Dallas: SIL Publications. (to appear).

APPENDIX 2.  Greenberg's Macro-Carib etymologies compared with modern data.

pode Huitoto, Hn = Mɨnɨca Huitoto, Hr = Murui Huitoto. Additional sources consulted in such cases are Burtch (1983) for Murui Huitoto (Hr), Leach (1969) for Ocaina (O), and Thiesen and Thiesen (1998) for Bora (B).


  • 1
    While it is not the purpose of this paper to evaluate more than the Macro-Carib hypothesis, we wish to clarify that we do not subscribe to Greenberg's 'Ge-Pano-Carib' hypothesis. To date, no solid research suggests that Panoan (or the Tacanan family, which does reconstruct with Panoan into Proto-Pano-Tacanan) has an affiliation with Ge, Cariban, Peba-Yaguan, Andoke, or Witotoan languages. Indeed, the hypothesis that Cariban, Tupian, and Jê languages might form a phylum is more promising (Rodrigues, 1985; Salzano et al., 2005).
  • 2
    Peba and Yameo materials are extremely limited, impeding serious comparative work.
  • 3
    Cf. Girard (1971) for a scathing review of all earlier comparative work on Cariban, and of many of the older sources; we endorse the intellectual content of these criticisms.
  • 4
    The body of Durbin's comparative Cariban work is hopelessly flawed, cf. Gildea (1998, p. 7), Meira (2000a, p. 10), and the sources cited therein. Durbin and Seijas (1973) ostensibly reconstruct phonology and some lexical items to "Proto-Hianacoto", but as Meira (2000a, p. 15) points out, the various "language" names that are compared are attested as Karihona clan names. The apparent divergence of these "languages" indicates the weakness of the transcriptions by the sources, and the 1973 "reconstructions" should be set aside in favor of the more reliable synthesis of modern and older Karihona data offered in Meira (2000a).
  • 5
    Dixon (1997, p. 16) reports that tones are among the easiest phenomena to diffuse through contact; Dixon and Aikhenvald (1999, p. 8) consider the presence of /ɨ/ and nasality in vowels to be phonological features of an Amazonian linguistic area. See Payne (2001) for comments on the preliminary nature of these claims.
  • 6
    Thiesen and Weber (to appear) claim that Bora ta( ) palatalizes the following consonants only in some morphemes; consonants are also often palatalized after /i/. Thus, it should be noted that Bora 2nd person tiʔ and 3rd person have palatalizing effects on certain following morphemes; see examples (5) and (6).
  • 7
    For the most recent reconstructions, see Meira, Gildea and Hoff (to appear).
  • 8
    In Powlison's (1995) 600-page Yagua-Spanish dictionary, the only vowel-initial Yagua word is the interjection h .
  • 9
    Thus, though a palatalization effect is evident for first person in Yagua (3a), there is no evidence of a morphologically distinct palatalizing element for only first and second person in Bora (5a) or with certain other consonant-initial nouns. (See note 6 above regarding palatalization idiosyncrasies in Bora).
  • 10
    Rodrigues uses the term 'Relational prefix' also for morphemes that we consider to be third person markers; our claims regarding typological uniqueness are limited to the forms described in this section.
  • 11
    Sometimes we were not able to find forms in our own sources anything like the forms Greenberg offered; other times we found similar forms but with differing transcriptions or translations; and yet other times the matches were good between our fresh data and the forms Greenberg presents.
  • 12
    In this decision, we explicitly differentiate our methodology from that of Greenberg, who accepts more semantic latitude in his potential cognates. While it is true that attested change does produce cognates with ranges of meanings comparable to the rejected forms, we are uncomfortable with the necessity to assume such meaning change in order to relate specific forms, and then to base our potential case for relationship on such assumptions.
  • Publication Dates

    • Publication in this collection
      16 Nov 2010
    • Date of issue
      Aug 2007

    History

    • Received
      10 Oct 2006
    • Accepted
      01 June 2007
    MCTI/Museu Paraense Emílio Goeldi Coordenação de Pesquisa e Pós-Graduação, Av. Perimetral. 1901 - Terra Firme, 66077-830 - Belém - PA, Tel.: (55 91) 3075-6186 - Belém - PA - Brazil
    E-mail: boletim.humanas@museu-goeldi.br