A comparison of verbal person marking across Tupian languages

This paper explores the diachrony of the verbal person marking system across the large and structurally diverse Tupian language family. I argue that the historical development of these different patterns is best informed by analyzing their synchronic distributions with regard to the current evolutionary hypotheses on the family. I apply a parsimony reconstruction model across the topology of two different classifications and compare the results with what is known from traditional historical linguistic work. This study provides support for previous claims about the family and also generates a number of additional hypotheses about the intermediate stages of development of these patterns.


INTRODUCTION
Just as phonological systems and lexical inventories evolve over time in ways that can be indicative of the history that related languages share, so do grammatical structures. Historical morphosyntax examines both the development of a particular form in a language and the changes in its function. The phonological form of a morpheme can be reconstructed using recurring sound correspondences that hold systematically across the languages under comparison. The function of a morpheme can change over time through processes of extension and reanalysis (Harris; Campbell, 1995; Gildea, 1998) or through contact-induced structural change (Heath, 1984). With this in mind, the reconstruction of the function of a morpheme should be based on the distribution of structural patterns across a language family and informed by the presence of cognate forms in the various functions.
The development of divergent patterns in the ways that different languages within a linguistic family express grammatical relations is one area that can especially benefit from a comparative approach to morphosyntactic diachrony. This paper explores the development of the verbal person markers across the Tupian language family from a typological and historical perspective in order to provide insights into how the different forms and functions of these grammatical systems arose over time.
The Tupian language family shows a high degree of internal variation regarding the ways in which the obligatory participants of a clause, i.e. the arguments, are expressed on the predicate. Many of the Tupian languages spoken in Rondônia index only the absolutive (S/O) argument on independent main clause predicates. However, the branches primarily spoken outside of Rondônia show a number of divergent patterns, such as the hierarchical marking pattern, where only a single argument in transitive clauses is indexed depending on whether it is a speech act participant (SAP; first and second persons) or not. Some of these languages also display different classes of intransitive verbs that index their sole argument using different marker sets, and/or have developed portmanteau morphemes that index certain situations where SAPs act on one another. Furthermore, other languages show marking patterns such as the indexation of only accusative arguments, as in Juruna, or the indexation of only nominative arguments, as in Nheengatú.
This paper first provides an overview of the Tupian languages by presenting two classificatory proposals for the family and then examining the typological diversity that the languages display in their verbal person marking systems. This is followed by a brief outline of previous proposals on the historical development of different person marking patterns in the family. The changes that took place in earlier stages of development of the Tupian family are first analyzed using a parsimony reconstruction model applied to the two classificatory proposals for the family. This preliminary analysis is then reconciled with additional information coming from the presence of cognate forms in the various verbal person marker sets. It is argued that considerable insight into the development of both the form and function of the verbal person markers can be gained by examining them with respect to hypotheses on the diversification of the family tree into its respective branches.

THE TUPIAN FAMILY: AN OVERVIEW

ON THE CLASSIFICATION OF THE TUPIAN FAMILY
Seven distinct branches of the Tupian family were first identified in Rodrigues (1958), whose classification was later expanded to ten branches by considering Mundurukú, Awetí and Mawé as separate clades distinct from Tupí-Guaraní (Rodrigues, 1984, 1985). More recent work has focused on joining the Tupí-Guaraní branch into a single subgroup with Mawé and Awetí, which is referred to in this paper as Mawetí-Guaraní (Rodrigues; Dietrich, 1997; Corrêa-da-Silva, 2010; Drude; Meira, 2015). Based on his reconstruction of the Proto-Tupí phonological system, Rodrigues (2007) proposes a binary division of the family into an Eastern branch and a Western branch. While the author admits that work is still needed on the reconstruction of the intermediate stages of the family, this proposal provides a fully resolved hypothesis on how the branches developed with relation to one another.¹ This classification is given in Figure 1, where the tip labels of each branch of the tree correspond to the name of the respective branch of the family rather than a specific language.
Since this classification draws on the specialized knowledge of Tupian experts who have worked on the family for over half a century, it will be referred to in this paper as the 'expert classification'. While a number of systematic sound correspondences are presented in Rodrigues (2007), the specific motivation for the ordering of the splits is not discussed in detail. As such, it should only be considered a working hypothesis until the details of the classification can be more fully developed. Cabral (2002) notes that the distinction between the languages that show a primarily absolutive indexation pattern and those with more divergent patterns was used "as a basis for a first division of the Tupian stock into two principal branches". In the absence of more explicit motivation for the ordering of splits within the family, it could prove problematic to use a tree whose topology was influenced by argument marking patterns to discuss the development of these patterns, since the two are not independent. Argument marking patterns are also problematic to use as evidence for the Western branch of the family, since the presence of absolutive indexation could be a shared retention rather than a shared innovation. For these reasons, an additional classification is also used for the parsimony analysis.
Another classification of the Tupian family was published in Walker et al. (2012). This classification, referred to here as the ASJP classification, is based on a distance matrix computed using normalized Levenshtein distances (edit distances) between pairs of corresponding lexical entries that have been transcribed with a simplified orthography (Holman et al., 2008). The topology of the tree is then calculated by applying the Neighbor Joining algorithm to this distance matrix (Saitou; Nei, 1987). The tree was rooted using Proto-Carib as an outgroup, since the two protolanguages have been postulated as descending from a common ancestor (Rodrigues, 1985). A cladogram of the ASJP tree is given in Figure 2. Note that the tip labels of the tree correspond to the names of the languages from which the data were gathered, rather than the names of the branches (subfamilies) as in Figure 1. For ease of visualization, only the languages used in this study are included.² Unlike the expert classification of the Tupian family presented in Figure 1, the ASJP classification has the benefit of being fully transparent and replicable, with all data and sources freely available online for inspection.³ It is notable that the ASJP classification correctly identifies all of the ten branches that are traditionally distinguished in Tupian studies, as well as a higher order relationship between Mawé, Awetí and Tupí-Guaraní. But unlike the East-West division in the Rodrigues (2007) classification, the ASJP classification shows a more ladder-like model of diversification for the family, without the Rondônian groups forming a distinct clade. Both the expert classification and the ASJP classification include Mundurukú and Juruna as the nearest phylogenetic relatives to Mawetí-Guaraní, although ASJP first groups the two former branches into an intermediate clade before forming a clade containing all of the languages in Rodrigues' Eastern branch. Since both classifications agree on a higher order relationship between Mawetí-Guaraní, Mundurukú and Juruna, and since these families are primarily located outside of Rondônia, which is often considered the homeland for the Tupian peoples (Noelli, 1996), they are referred to in this paper as the 'expansionist' subgroup of the family.
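As a rough illustration of the distance measure described above, the following Python sketch computes a length-normalized Levenshtein distance for a pair of transcribed words and averages it over two aligned word lists. The function names, the normalization by the length of the longer word, and the toy strings are assumptions made for illustration only; the sketch does not reproduce the ASJP pipeline or the Neighbor Joining step.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        current = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer word (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(list_a, list_b) -> float:
    """Average normalized distance over aligned lists (same meaning at the same index)."""
    pairs = [(x, y) for x, y in zip(list_a, list_b) if x and y]  # skip missing entries
    if not pairs:
        raise ValueError("no comparable word pairs")
    return sum(normalized_levenshtein(x, y) for x, y in pairs) / len(pairs)


# Toy strings, not Tupian data.
print(levenshtein("kitten", "sitting"))
print(mean_distance(["abc", "abd"], ["abc", "abx"]))
```

A matrix of such pairwise language distances is the kind of input that a tree-building algorithm such as Neighbor Joining operates on.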
² The Emerillon language was not included in the original ASJP tree in Walker et al. (2012). Its position in the tree is based on its closest phylogenetic relative in their analysis, Wayampi. Both languages belong to subgroup 8 of Tupí-Guaraní (Jensen, 1999).
³ http://email.eva.mpg.de/~wichmann/languages.htm
Figure 2. ASJP classification of the Tupian family, adapted from Walker et al. (2012).

VERBAL PERSON MARKING PATTERNS
As mentioned above, the Tupian languages show a wide array of different verbal person marking patterns. In this paper, the Tupian languages are approached from a typological perspective that aims to explore the ways in which the different languages express arguments on the predicate within independent main clauses. The following terms are adopted to represent three important comparative concepts that are used in the coding of different structural properties, following standard use in modern typology (Comrie, 1989; Dixon, 1994, among many others): S represents the sole argument of an intransitive predicate; A represents the Agent-like argument of the predicate in a prototypical transitive construction; and O represents the Patient-like argument of the predicate in a prototypical transitive construction. Transitivity is used here as a semantic concept to identify a specific morphosyntactic construction within a language for comparison across languages, following Lazard (2002).
For the sake of comparison of different alignment patterns in the sets of verbal argument markers, no terminological distinction is made between pronominal agreement, cross-reference and bound pronouns, in the sense that the former two can co-occur with a nominal or a free pronoun while the latter cannot (Haspelmath, 2013). Imposing such a distinction on the data would obscure any diachronic discussion of alignment. For example, some of the Tupian languages with an absolutive marking pattern do not allow the verbal marker to co-occur with a realized O nominal but only with an S nominal, such that these languages would be described as having an unusual marked-S system (see Galucio, 2001, p. 77-79, for a discussion of the person markers in Mekens). These different characterizations of verbal argument marking are included under the term 'indexation' in this paper. As in many Tupian studies, the term 'person markers' is also used here to include both pronominal clitics and affixes that are bound to the verb.
An absolutive marking pattern, where S and O are indexed with the same morphological paradigm, is observed in main clause verbal inflection in languages of the Tuparí, Ramarama, Mondé and Arikém subfamilies.⁴ An example of a language with ergative alignment through the indexation of the absolutive argument (S/O) is Wayoró in (1):
(1) Wayoró (Tuparí; Nogueira 2011:69-70)⁵
a. ʧi-pi:to-kar-a-t ʧire
   1PL.INCL-rest-VBLZ-TV-PST 1PL.INCL
   'We (incl.) rested'
b. ʧi-po-kw-a-t agopkap
   1PL.INCL-burn-VBLZ-TV-PST fire
   'The fire burned us (incl.)'
While many of the Tupian languages that reside in Rondônia show an absolutive indexation pattern like Wayoró, others in the family use distinct marker sets to index A and O arguments in the clause. In most languages, these marker sets are in complementary distribution with each other and occur in the same morphological slot, generally as a prefix or proclitic.⁶ In languages like Sateré-Mawé, Mundurukú and some members of the Tupí-Guaraní branch, the basic main clause construction shows multiple alignment patterns where S of one class of intransitive verbs is indexed with the same set of markers as O, showing ergative alignment, while S of a different class of intransitive verbs is indexed with the same set of markers as A, showing accusative alignment.⁷ Furthermore, the indexation of A and O in transitive clauses is conditioned by which argument is ranked higher on a person hierarchy that privileges speech act participants (1st and 2nd persons) over non-speech act participants (3rd persons), resulting in a person hierarchy that can be roughly characterized as SAP > 3. As such, these languages show two different factors that condition indexation: the scenario between co-arguments in transitive clauses and the lexical class of intransitive predicates.⁸
First, let us consider the hierarchical indexation pattern in transitive clauses. Take for example the use of different sets of markers to index A and O in Mundurukú, as shown in example (2):
⁷ Many Tupian languages, particularly of the expansionist group, have two distinct classes of intransitive verbs that index their subjects with different marker sets. The class of verbs whose members display semantic properties associated with the Agent argument, namely volition and control (cf. Foley, 2005), is referred to as the major class of intransitive verbs, and its argument is abbreviated as Smaj. As discussed below, Tupian specialists have employed a number of different labels to describe this class of verbs, such as 'active', 'agentive', 'procedural' and so forth. The class of verbs whose S is indexed differently from the major class intransitive verbs will be referred to as the minor class, whose argument is abbreviated as Smin. In Tupian languages, the verbs of the minor class often display semantic properties typically associated with the Patient argument, namely affectedness and lack of control. However, semantic differences alone are not sufficient to posit a split intransitive system unless these semantic differences are reflected in the way that the S argument is indexed on the predicate.
⁸ See Zúñiga (2006) for the use of the term 'scenario' in reference to the typology of hierarchical marking. In this paper, the term local scenario is used when SAP A acts on SAP O, direct scenario when SAP A acts on 3rd person O, inverse scenario when 3rd person A acts on SAP O, and non-local scenario when 3rd person A acts on 3rd person O.
⁹ In non-local scenarios A is indexed on the verb. Unlike many Tupí-Guaraní languages, which use a special portmanteau morpheme to index both A and O in local scenarios (see ex. 5 below and Rose, this volume), Mundurukú always indexes O when it is a SAP (Gomes, 2006, p. 48). Furthermore, as can be seen in Table 2, there are a number of homophonies across the paradigms such that 1st and 2nd persons singular and 1st person plural exclusive arguments are indexed using the same form whether Smaj, Smin, A or O.
¹⁰ The major class of verbs, what Gomes (2006) calls 'procedural' intransitive verbs, are those that denote a dynamic event involving an agent that exerts control over the event. The minor class of intransitive verbs, the 'stative' verbs, denote a state or quality of the subject (Gomes, 2006, p. 63). Picanço (pers. comm. 2011) notes that certain verbs denoting events that involve a participant that does not exert control over the event, such as 'drown' or 'cry', are marked in the same way as the stative verbs.
(2) Mundurukú transitive indexation (Gomes, 2006, p. 48-49, 52, 74)
Notice how in example (2a), the A argument is indexed with the Set I proclitic o=, while in (2b), the O argument is indexed with the Set II proclitic wuj=. This is because the A argument in (2a) outranks the O co-argument on the person hierarchy, while the O argument in (2b) outranks the A co-argument, since the indexed arguments are speech act participants.⁹ However, verbs that are marked for imperfective aspect with the suffix -m never index the A argument (2c), while SAP O is indexed with a Set II proclitic (2d).
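The choice of indexed argument just described for perfective transitive verbs can be summarized in a small Python sketch. The scenario labels follow the usage adopted in this paper; the function names and the encoding of persons as the integers 1, 2 and 3 are hypothetical. The rule encoded here (a SAP O is always indexed, otherwise A is indexed) is only a simplified reading of the Mundurukú description cited above and does not cover the imperfective forms.

```python
SAP = {1, 2}  # speech act participants: 1st and 2nd persons


def scenario(a_person: int, o_person: int) -> str:
    """Label the configuration of A and O in a transitive clause."""
    a_is_sap, o_is_sap = a_person in SAP, o_person in SAP
    if a_is_sap and o_is_sap:
        return "local"
    if a_is_sap:
        return "direct"
    if o_is_sap:
        return "inverse"
    return "non-local"


def indexed_argument(a_person: int, o_person: int) -> str:
    """Assumed rule for perfective transitives: O is indexed whenever it is a SAP,
    otherwise A is indexed."""
    return "O" if o_person in SAP else "A"


for a, o in [(1, 3), (3, 1), (1, 2), (3, 3)]:
    print(a, o, scenario(a, o), indexed_argument(a, o))
```

Under these assumptions, the direct and non-local scenarios index A, while the inverse and local scenarios index O.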
Additionally, Mundurukú has two classes of intransitive verbs that each take a different set of markers to index the sole argument of the predicate.¹⁰ For the major class of verbs, the subject is indexed with the Set I markers, while the subject of a minor class verb is indexed with the Set II markers, as seen in example (3):
Given the examples shown in (2) and (3), Mundurukú major class intransitive verbs show accusative alignment (S=A), while the minor class verbs show ergative alignment (S=O). Indexation patterns with splits in intransitive marking like that shown for Mundurukú in (3a-b) have often led to these languages being described as 'split ergative'. This split only occurs with verbs in the unmarked perfective aspect in Mundurukú, while verbs in the marked imperfective aspect show an entirely different verbal marking pattern: accusative alignment through the indexation of O (3c).
(3) Mundurukú intransitive indexation (Gomes, 2006, p. 50-51, 141)
The Awetí language shows a split in the indexation of intransitive subjects that diverges somewhat from the pattern seen for perfective Mundurukú verbs. Awetí has three sets of argument markers: Set I is used to mark A, Set II is used to mark Smin and O, and Set III is used to mark Smaj.¹¹
(4) Awetí (Borella, 2000, p. 136, 140, 148, 155)
¹¹ Borella (2000, p. 131) labels the Smin intransitive verbs as 'descriptive verbs' and the Smaj class as 'active intransitive verbs'.
¹² The indexation pattern in Awetí is problematic for the often-repeated assumption that "those S which are semantically similar to A (exerting control over the activity) will be Sa, marked like A…" (Dixon, 1994). This is one of the primary reasons that the Smaj/Smin notation is adopted here, so that a distinction can be made between the alignment of the intransitive class of predicates and the semantics of the lexical class.
¹³ While no similar portmanteau forms occur in Awetí, Sateré-Mawé uses the prefix moro- to index local scenarios where 1st person A acts on a 2nd person O co-argument. Corrêa-da-Silva (2010, p. 248) suggests that the form was probably borrowed from a Tupí-Guaraní language.
Notice how in (4a) the 3rd person A is indexed by a prefix wejt-, while the 3rd person intransitive subject (Smaj) in (4b) is indexed by the prefix o-. Furthermore, the 1st person singular Smin in (4c) is indexed with the same prefix i(t)- as the 1st person singular O in (4d). While there are certain homophonies within the prefix paradigm not presented above, these examples show that the class of active intransitive verbs in Awetí is not indexed using the same prefix paradigm as A, while still showing the same pattern of hierarchical marking with split intransitivity found elsewhere in the Mawetí-Guaraní and Mundurukú subgroups.¹²
Many Tupí-Guaraní languages show hierarchical indexation patterns similar to that seen for Mundurukú perfective verbs in (2) and (3). But unlike Mundurukú, these languages often have an additional marker set of portmanteau prefixes (Jensen's Set 4) that index a 1st person A with a 2nd person O co-argument in transitive clauses, as in (5) for Kamayurá.¹³ The Set II markers in Kamayurá show different attachment to the verb depending on the person of the marker. The SAP forms are analyzed as proclitics that attach after the inclusion of the 'relational prefix' (5b), a phonological class marker that attaches to certain nouns and verbs depending on their inflection (Meira; Drude, 2013). The 3rd person form i- is analyzed differently in various Tupí-Guaraní languages, but Seki (2000, p. 66) analyzes it as a 3rd person form of the relational prefix. Since the referential hierarchy in Kamayurá does not allow for the indexation of 3rd person O in transitive clauses, the 3rd person Set II form is best observed in the person inflection of minor class intransitive verbs, as seen in (6).¹⁴
¹⁴ The use of i- as a 3rd person O marker is well attested in the now extinct Tupí-Guaraní language Tupinambá, such as in the example a-i-kutúk 1SG.I-3.II-pierce 'I pierced it' (Jensen, 1990, p. 121). This pattern of marking 3rd person O together with the Set I A marker in transitive clauses is reconstructed for PTG in Jensen (1998). This helps to explain the difference between the Smaj and A marker sets in Awetí for 1st person plural exclusive, where the A form oʐoj- arose from the S form oʐo- attached to i- in the transitive. See also Monserrat (1976).
¹⁵ In some Tupí-Guaraní languages, the same set of prefixes is used to indicate coreference between the subject of a transitive clause and the possessor of O, such as in Tapirapé ã-ma-pen we-pa 1SG.I-CAUS-break 1SG.COREF-hand 'I broke my hand' (Leite 1989 apud Jensen 1998, p. 504). In most other instances, the Smin/O markers are also used to express possession.
Jensen (1998) reconstructs an additional marker set for Proto-Tupí-Guaraní that was used to index the absolutive argument of dependent verbs when it is coreferential with the subject (S/A) of the matrix independent clause.¹⁵ This can be seen in the Kamayurá 'gerund' constructions shown in (7).
Not all Tupí-Guaraní languages show the same pattern of hierarchical indexation as Kamayurá. Nheengatú appears to have lost the hierarchical marking pattern found across much of the rest of the branch while still utilizing two distinct marker sets to index the subjects of different classes of intransitive verbs. Nheengatú additionally lost the inclusive-exclusive distinction in its person markers and does not have portmanteau forms for 1st person A acting on 2nd person O (Cruz, 2011, p. 132-3).
A final interesting pattern can be observed in the Juruna language. Juruna was initially described in Fargetti (2001) as indexing only O, similar to its sister language Xipaya as described in Rodrigues (1995). However, a more recent analysis of Juruna verb structure in Lima (2008) shows that there is a class of intransitive verbs, which she calls 'unaccusatives', that indeed allows for S (8b-c) to be indexed with the same marker set as that used for O (8a), while the other class of intransitive verbs ('unergatives') does not allow for S to be indexed (8d).
(8) Juruna (Fargetti, 2001, p. 178, 191; Lima, 2008, p. 176, 180)
While absolutive marking and the hierarchical/split intransitive marking are the two dominant patterns observed across the Tupian family, the divergent patterns in languages such as Juruna, Awetí and Mundurukú help to inform an analysis of how the dominant patterns arose.

ON THE DEVELOPMENT OF THE VERBAL PERSON MARKERS
Now that the different marking patterns across the Tupian languages have been introduced, the discussion turns to the historical development of these patterns. While a reconstruction of the marker sets in Proto-Tupí is well beyond the scope of this article, it is possible to identify cognate forms and compare their grammatical function across the family based on previous work. These differences are explored with relation to two hypotheses on the classification of the Tupian language family, and the changes discussed are modeled over these trees using a parsimony model for ancestral state reconstruction. Since considerable work already exists on the development of verbal person marking within the Tupí-Guaraní branch, the following sections focus on the development of the markers across the family with regard to the changes that must have occurred to produce the patterns reconstructed for Proto-Tupí-Guaraní (PTG) as well as those observed within the other branches of the family.

PREVIOUS CLAIMS
The two most explicit proposals for the development of verbal argument marking across the Tupian family are Jensen (1998) and Gildea (2002). Based on her experience with Tupí-Guaraní languages, Jensen (1998, p. 565-573) proposes a five-stage process for the development of the hierarchical indexation in PTG from a putative Pre-PTG, as shown in (9):
(9) The development of PTG indexation per Jensen (1998)
a. Pre-PTG indexed only absolutive arguments on independent verbs. Ergative arguments were expressed with free pronouns or nominals.
b. Agentive intransitive verbs developed a new set of prefixes.
d. A distinction between SAPs develops in the person hierarchy such that 1>2>3.
e. Portmanteau prefixes *oro- and *opo- are developed that index 1st person A acting on a 2nd person O co-argument.
Thus in Jensen's analysis, the Smin/O marker set is considered inherited from an earlier stage in the family, and an additional marker set for indexing agentive intransitive verbs was innovated from an unidentified source. The person hierarchy only developed in transitive clauses after the extension of the innovated marker set from marking only the subject of major class (agentive) intransitive verbs to the subject of transitive clauses. Both of these key aspects of this model are further discussed below.
Drawing from basic tenets of grammaticalization theory and his reconstructive work with Cariban languages, Gildea (2002) proposes a completely different diachronic pathway for the development of PTG person marking. His argumentation is based on principles of historical syntax that hold that older morphemes tend to be phonetically smaller, semantically more opaque, closer to the stem and show more morpho-phonemic irregularity than relatively newer morphemes (following Givón, 2000, p. 120-121). Using this logic, Gildea argues that it is unlikely that pre-PTG indexed the absolutive argument on the predicate, since the Smin/O marker set is phonetically larger than the Smaj/A set, less bound to the predicate, and shows a higher degree of similarity with the free pronoun set. He proposes that the *i-/*c- 3rd person Smin/O prefix is the oldest and that the system that developed for PTG arose from a predominantly nominative-accusative indexation system, as shown in (10).
(10)
a. Predicates index 3rd person O with the *i-/*c- prefix. It is possible that a class of intransitive verbs also used these prefixes for subject indexation.
b. Smaj/A prefix set develops from a set of free pronouns that were lost prior to the development of PTG. These prefixes attach further away from the stem than the i-/c- markers, resulting in a nominative-accusative pattern when O is 3rd person.
c. The 1st person acting on 2nd person portmanteau prefixes develop.
d. The Smin/O marker set developed from the free pronoun set that replaced the earlier pronoun set that formed the S/A marker set.
Central to Gildea's model is that the 3rd person Smin/O prefix *i-/*c- was inherited from an earlier stage of the family and that the proclitic forms that index the SAP arguments within that same marker set developed after the inclusion of an additional marker set that indexes Smaj/A.
Both Jensen (1998) and Gildea (2002) contribute many important insights into the development of PTG argument indexation. However, at the time that these proposals were being written, only a few studies had been published on the grammar of the non-Tupí-Guaraní languages. Since the turn of the century there has been a marked increase in the quantity and quality of the descriptive materials available on these languages. Furthermore, greater attention has since been paid to the classification and reconstruction of the various branches of the family. Both of these factors can now help to refine our understanding of the development of this system.

QUANTITATIVE ANALYSIS OF VERBAL PERSON MARKING PATTERNS
Since Jensen (1998) and Gildea (2002) propose quite different indexation systems for the early stages of the Tupian family before the development of PTG, a logical starting place for a discussion on the development of this system is to examine the observed distribution of verbal argument marking patterns in the modern languages. One straightforward technique for this is the application of the parsimony principle to discrete typological data in order to reconstruct the ancestral state of these features over a given phylogenetic tree. A parsimony-based comparative analysis considers that the best model for evolutionary development is the one that requires the fewest changes in order to fully account for the observed data (Fitch, 1971). While languages surely develop in ways that are not always the most parsimonious, the simplest possible model is a reasonable place to begin the discussion. From there, any developments within the family that diverge from the most parsimonious scenario should be supported by additional evidence and argumentation (Cysouw, 2009).
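To make the parsimony principle concrete, the following Python sketch implements the bottom-up pass of Fitch-style parsimony for a single unordered character on a binary tree. It is only an illustration under stated assumptions: the function name `fitch`, the nested-tuple tree encoding, the tip labels L1 to L4 and the state labels are all hypothetical, and the sketch is not the Mesquite implementation used in this study.

```python
def fitch(tree, tip_states):
    """Return (candidate states at the root, minimum number of changes).

    tree: a tip name (str) or a pair (left_subtree, right_subtree).
    tip_states: dict mapping each tip name to its observed state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):            # a tip: its observed state
            return {tip_states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                           # children agree: no change needed here
            return shared
        changes += 1                         # children disagree: one change is implied
        return left | right

    root_states = visit(tree)
    return root_states, changes


# Hypothetical four-language example, not the coded Tupian data.
states = {"L1": "absolutive", "L2": "absolutive",
          "L3": "hierarchical", "L4": "hierarchical"}
ladder = ("L1", ("L2", ("L3", "L4")))
balanced = (("L1", "L2"), ("L3", "L4"))

print(fitch(ladder, states))
print(fitch(balanced, states))
```

In this toy case both topologies require a single change, but the ladder-like tree resolves a single state at the root while the balanced tree leaves the root ambiguous between the two states, which shows how tree topology alone can affect what is reconstructed.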
A comparative phylogenetic analysis of the development of verbal person markers was carried out over the two discussed classificatory hypotheses using a parsimony reconstruction model. The languages shown in Table 1 were sampled for this analysis.¹⁶
¹⁶ Three important factors were considered when constructing the language sample: genealogical diversity, geographic spread and the availability of materials. Since the target sample is composed entirely of members from a single language family, representatives of as many different clades of that family tree as possible were selected. Given the ten branches of the Tupian family presented in Rodrigues (1999), at least one member of each branch is included in the sample with the exception of Puruborá, for which little grammatical data is available (Galucio, 2005; Monserrat, 2005). Since the development of the person markers within the Tupí-Guaraní branch of the family has been thoroughly studied and discussed, this branch has not been densely sampled, but rather, only a few representative languages from different geographically widespread subgroups of the branch have been included.
The analysis was implemented using the Mesquite software package (Maddison; Maddison, 2011). Each language was given a general typological feature value based on the arguments that are indexed in independent transitive clauses and the alignment of the marker sets used with the major class of intransitive verbs. For the analysis using the expert classification shown in Figure 1, the values for each branch were given based on the majority consensus of the languages from that branch in the sample. A side by side comparison of the analysis applied to both classifications can be seen in Figure 3, including the reconstructed values for intermediate nodes of the tree.
The analysis applied to the ASJP classification (Figure 3, right) clearly reconstructs an absolutive indexation pattern for Proto-Tupí. However, the topology of the expert classification (Figure 3, left) makes it impossible to resolve a single pattern using purely independent clause data. This may be partly due to the way in which the data are coded, since each language is given only a single typological characterization. This method of coding is often used in typological databases, but a number of details that may be historically informative are lost so that each language can fit into a pre-established category. The analysis also differs between classifications regarding the point in the family history at which the hierarchical marking pattern developed, with the ASJP classification positing its development in the early stages of the expansionist group before the diversification of the different branches, with a subsequent loss in the Juruna branch. The analysis applied to the expert classification posits the development of the hierarchical marking pattern after the separation of the Juruna branch from the other expansionist languages (Monserrat; Soares, 1983).
An alternative approach to coding is to allow each language to be coded for the alignment of the different marker sets used in independent clauses. Under this coding scheme, a language like Mundurukú that has one set of markers that indexes A and Smaj and another set that indexes O and Smin is coded as having two observed states, an S/A marker set and an S/O marker set, rather than a single state 'Hierarchical'. The Mundurukú imperfective construction that only indexes the O argument is not considered in this analysis since it uses the same marker set as the unmarked perfective construction. This alternative analysis is shown in Figure 4.
Using the alternative coding scheme based on marker sets, the ASJP classification again reconstructs an absolutive marker set for Proto-Tupí, while the analysis of the expert classification does not allow for a single state to be reconstructed for Proto-Tupí due to the tree topology. The analysis applied to both classifications reconstructs an ancestral marker set for the expansionist group that indexes O with the same markers as the minor class of intransitive verbs. As discussed in the following section, traditional comparative method evidence suggests that these two sets are related to each other rather than forming two completely independent sets, albeit having undergone a number of historical developments that obscure their true cognacy. Some other aspects of the parsimony analysis are inconsistent with what we know from classic comparative method reconstruction, and as such, the analysis should only be considered a preliminary hypothesis on which to base more detailed work. For instance, the marker set-based analysis does not reconstruct multiple marker sets for Proto-Mawetí-Guaraní, largely due to the three-marker-set system found in Awetí. The parsimony analyses propose different diachronic pathways for the development of the hierarchical marking pattern when the data are treated with the different coding schemes, with Figure 3 proposing a single development that was maintained in Proto-Mawetí-Guaraní, whereas Figure 4 proposes a development of the system in Mundurukú independent of the development of hierarchical marking found elsewhere. This can be interpreted as showing that the hierarchical pattern of marking developed early on in the expansionist group of languages, but that the marker sets went through a number of different configurations between the ancestor of the expansionist languages and the formation of PTG, with the presence of a single Smin/O marker set being maintained throughout this history.

Figure 3. A reconstruction of typological patterns in Tupian argument marking modeled over the expert classification (left) and ASJP classification (right) using maximum parsimony.

ON THE COGNACY OF FORMS WITHIN MARKER SETS
The different marker sets that occur in the languages of the sample are presented in Table 2. The arguments indexed by each marker set are indicated. Certain allomorphies are simplified for ease of comparison, and reconstructed forms from Jensen (1998) are given for Proto-Tupí-Guaraní, as are those for Proto-Tupí from Rodrigues and Cabral (2012). The reconstruction of PTG marker sets in Schleicher (1998) is largely congruent with that in Jensen (1998). The 1>2 portmanteau morphemes present in many Mawetí-Guaraní languages are not included for the sake of space. The markers that are likely cognates with the reconstructed Proto-Tupí forms are indicated in bold, based on the forms identified in Rodrigues and Cabral (2012, p. 543). Additional provisional cognate judgments are made by the author following systematic sound correspondences identified in Rodrigues (2007) when those languages were not included in the former study.
A number of important observations can be made about the retention and innovation of forms within the marker sets. First, of the expansionist subgroup of languages, Mundurukú is the only language outside of Mawetí-Guaraní to have distinct forms that index A in transitive clauses: oɁ= 3rd person, a= 1st person plural inclusive and epe= 2nd person plural. None of these forms appear cognate with the reconstructed Proto-Tupí forms in Table 2, suggesting that they are innovations. Likely cognates of these forms are maintained in the Smaj/A marker set reconstructed for PTG: o-, corresponding to oɁ=, is retained in the 3rd person, and pe-, corresponding to epe=, is retained in the 2nd person plural. This suggests a continuous diachronic development of a distinction between A and O retained from a common ancestor. Interestingly, forms similar to epe= occur as 2nd person plural free pronouns in many Mawetí-Guaraní languages but not in Mundurukú, where the free pronoun is ejdʒu. Other forms in the Mundurukú Set I markers were extended from the older Set II markers, which are cognate with the Proto-Tupí forms. The reanalysis of these retained forms for 1st and 2nd persons singular and 1st person plural exclusive as Smaj/A markers suggests that the distinction between Set I and Set II forms developed gradually over multiple stages in the family history. An alternative hypothesis, in which all the innovative forms seen in Proto-Mawetí-Guaraní Smaj and A markers were also innovated in pre-Mundurukú and then partially replaced by reanalysis of the Set II forms, does not appear to be supported by the available evidence.
The lack of clear cognacy between the Proto-Tupí S/O proclitics as reconstructed in Rodrigues and Cabral (2012) and the PTG Smin/O proclitics as reconstructed in Jensen (1998) can be attributed to the development of the free pronouns in PTG. The S/O markers in dependent clauses in PTG (Jensen's Set 3) are true reflexes of the reconstructed Proto-Tupí absolutive marker set, at least in the singular (Rodrigues, 1985, p. 380; Jensen, 1998, p. 574). In main clauses, the indexing function of the Proto-Mawetí-Guaraní O set was replaced in PTG by attaching the independent pronouns to the verb, resulting in phonologically reduced clitic forms. Drude (pers. comm. 2013) suggests that these clitic forms are not directly cognate with the earlier pronouns because they include an additional morpheme, a stress-bearing "formative element" *e. It is possible that after the addition of the *e formative to the free pronouns in pre-PTG, the new stress pattern resulted in the loss of unstressed phonological material. A clear example of this proposed pathway can be seen for the 2nd person singular pronoun: Proto-Mawetí-Guaraní en became pre-PTG ené, resulting in the PTG Smin/O marker né=. The other non-3rd person pronouns developed along similar lines.
The fact that the Smin/O marker set in PTG can be traced to a development after the separation of Mawé, Awetí and Tupí-Guaraní into their respective branches provides support for the claim in Gildea (2002) that these proclitics developed later than the 3rd person prefix i-/c- of the same set. When examining the marker sets across the whole family, the 3rd person Smin/O form *i- in PTG does indeed appear to be a retention from an earlier stage of development in the family. For example, 3rd person i- is found in the Tuparí language Mekens, and a similar proclitic form i= is used to index 3rd person arguments in certain focus constructions in Karo (Gabas, 1999, p. 122-125). This form is also found in Juruna.

DISCUSSION
The analysis presented above allows for a refinement of our understanding of the developments that took place in the person marking system of Tupian verbs, particularly among the expansionist group of languages. The first step in the development of the marking patterns in the expansionist languages was a restriction of the use of the originally absolutive marker set for indexing the subject of intransitive verbs. This restriction is retained in the Juruna branch of the family, where Xipaya does not index any intransitive subjects and Juruna only indexes those of a minor class of intransitive verbs. This restriction can also be seen in the Mundurukú imperfective construction.
Due to differing classifications regarding the relation of the Juruna and Mundurukú branches, it is difficult to ascertain whether this restriction on intransitive indexation predated the development of the person hierarchy. However, no forms in Juruna have been identified as cognate to the forms that distinguish the marker sets in Mundurukú, suggesting that the expert classification may more accurately portray the relationship of these groups to one another than the ASJP classification. Based solely on the analysis of the expert classification, it is clear that the restriction of intransitive indexation arose before the development of the person hierarchy as a condition on transitive indexation.
As mentioned above, some of the innovative forms in Mundurukú have cognate forms in the marker set that indexed Smaj/A in PTG (2nd person plural, 3rd person). It is still unclear whether there existed cognates in PTG of the innovative a= in Mundurukú 1st person plural inclusive Smaj/A marking (with the 1st person marker *a- being a possible candidate). As yet, there is no strong hypothesis for how these innovative forms arose. However, notice that the plural person markers reconstructed for Proto-Tupí by Rodrigues and Cabral (2012) shown in Table 2 all include the glide /j/, either as the onset of the last syllable, as in orʲe= '1pl.excl', or as the coda of the monosyllabic forms, as in Vj= '1pl.incl' and ej= '2pl'. Also note that for the first person exclusive form and the second person plural form, the initial vowel of the marker corresponds to the singular form, o= and e=, respectively. This could suggest that a four-term person marking system may have existed at some point in the history of the Tupian languages, much like that found in many Cariban, northern Jê, Matacoan and Aymaran languages. It is possible that the glide element results from a putative plural marker that fused with the person marker to express the plural form. A similar argument is made for the development of the person markers and free pronouns in the Tuparí branch by Galucio and Nogueira (2012), where they reconstruct the plural marker *-jat, resulting in synchronic forms such as ejat- in Mekens for second person plural. If this is indeed the case for the family in general, it is possible that the a= marker found in Mundurukú is a reflex of the heretofore unidentified 1st person plural inclusive person marker before the accretion of the additional plural element. Such a diachronic pathway can help to explain the difficulty in identifying clear reflexes of this reconstructed form in the daughter languages (Rodrigues; Cabral, 2012, p. 544). While clearly speculative at this point, such a hypothesis deserves further examination.
Gildea (2002) suggests that innovative forms found in PTG arose through the cliticization of a set of free pronouns that were later lost before the formation of PTG. If at some point in the history of the family there did exist a set of free pronouns that developed into the Smaj/A prefix set in PTG, it would be expected that reflexes of this set could be found in the free pronouns of the languages that did not develop a set of Smaj/A marking prefixes, as in the Arikém, Tuparí, Mondé, Ramarama and Juruna branches. Free pronouns of selected members of these branches are given in Table 3.
As can be seen by comparing the free pronouns shown in Table 3 with the prefix/proclitic sets in Table 2 for the same languages, the bound sets appear to be phonologically reduced forms deriving from the free pronoun sets rather than representing a separate diachronic development, with 3rd person forms being notable exceptions. The free pronouns and proclitic set in Proto-Tupí as reconstructed in Rodrigues and Cabral (2012) also maintain the same distinction, in the sense that the latter are phonologically reduced forms of the former. At present, there still does not seem to be a clearly identifiable source for the innovated forms that index Smaj and A in the expansionist languages that mark such arguments.

CONCLUSIONS
The diversity of verbal argument marking patterns across the Tupian language family provides an interesting case study for the application of both computational and traditional techniques to make inferences about the morphosyntactic diachrony of a language family. The parsimony analyses tend to support the claim that Proto-Tupí originally had a system of indexation that marked absolutive arguments on the predicate. From there, the system began to change in the languages that spread outwards from Rondônia, while the branches that remained in Rondônia maintained the absolutive pattern. Early on in the history of the expansionist group of languages there was a restriction on the indexation of intransitive subjects with the ancestral marker set, resulting in the indexation of only the O argument in Xipaya and Mundurukú imperfective constructions and the indexation of both O and Smin in Juruna. The hierarchical pattern found in the Mundurukú and Mawetí-Guaraní branches is the result of a single gradual development, as shown by the retention of cognate forms across these groups that are not found elsewhere in the family. The PTG Smin/O markers then developed from a reanalysis of the free pronouns into verbal proclitics resulting from the addition of the formative element *e.
While much work remains to be done on the reconstruction of Tupian phonology and grammar, this paper has confirmed some previous proposals on the earlier stages of development of the verbal argument marking system while putting forth some novel ideas on when and how these changes developed. These proposals highlight the importance of including both classificatory hypotheses and data from a wide range of languages in future work on morphosyntactic reconstruction within the Tupian language family.

ACKNOWLEDGEMENTS
I would like to acknowledge the support of the Languages in Contact research group in Nijmegen funded by grants from the European Research Council and Dutch Royal Academy of Sciences (KNAW) awarded to Pieter Muysken. Thanks go to the Tupian experts who enthusiastically discussed their languages of study with me, notably Aline da Cruz, Sebastian Drude, Vilacy Galucio, Hebe González, Suzi Lima, Denny Moore, Gessiane Picanço, Françoise Rose and Luciana Storto. The paper was substantially improved thanks to comments from the editors and one anonymous reviewer, as well as Mily Crevels, Michael Cysouw, Rik van Gijn, Harald Hammarström, Sérgio Meira and Pieter Muysken. However, those mentioned do not necessarily share the views adopted in this paper, and all errors are my own.

REFERENCES
Figure 1. Expert classification of the Tupian family in Rodrigues (2007).
Figure 4. A reconstruction of marker sets used in Tupian argument marking modeled over the expert classification (left) and ASJP classification (right) using maximum parsimony.
A summary of Tupian verbal person marker sets.
prefix set was extended to also index A of transitive verbs when O is 3rd person. In these direct scenarios, both A and O are indexed. In inverse and local scenarios, where O is a SAP, only O is indexed.