Word Order Indecision

I’ve spent the last couple of weeks stuck on the main clause word order principles for Ch’ubmin (my current conlang). Ch’ubmin is a synthetic language, mostly verb-initial, with polypersonal agreement and incorporation, and somewhat inspired by Mayan and some other native North American languages. 

The thing I’m most certain of is the function and structure of contrastive fronting. Like many verb-initial languages, including some Mayan languages, Ch’ubmin has both contrastive topic fronting and contrastive argument focus fronting. The difference between these is partly a matter of word order, partly intonational, and partly morphological.

Fronted Topics (Left-dislocated)

  1. The topic can be offset from the clause by an intonation break
  2. The topic is normally marked by a deictic particle / demonstrative that otherwise occurs clause finally
  3. Topic fronting has no impact on the form of the following verb / clause, which behaves like any other clause. The topic can control zero anaphora or a resumptive pronoun

Fronted Focus

  1. The focus is part of the same intonational contour as the clause
  2. Extraction of the focus requires the verb to be in its conjunct form, which is the same form used for relative and adverbial clauses (i.e. this is a kind of cleft)
  3. Resumptive pronoun highly marked / forbidden, but the extracted focus can control verb agreement and possessor agreement on ordinary nouns and relational nouns / prepositions

When both are combined, the order can only be topic – focus – verb. Examples:

Fronted topic only

in tacha' ai, rena'
DET man TOP, 3-thither-be
`As for the man, he went'

Fronted focus only

in tacha' nuyoha' o' 
in tacha' nu-yo-ha' o' 
`The man was (the one) who came'
`The man came'

Fronted Topic + Focus

in tacha' ai, il hemen nulít o'
DET man TOP, DET woman CONJ.PFV-see DEIC
`As for the man, he saw the woman'
`As for the man, the woman saw him'

The last example raises the unanswered question of role marking. In the specific case where the focus is on a single argument and the rest of the clause is backgrounded or presupposed, I think context keeps the risk of ambiguity low. That’s certainly not true of the unmarked post-verbal clause structure and word order, which I haven’t described yet.

The first thing to say here is that, if you have a highly synthetic language with polypersonal agreement, it’s likely to allow free dropping of arguments, and Ch’ubmin does. That means that if you have a transitive verb plus a single overt argument, in principle that argument could be either the agent or the patient. And, unlike some head-marking languages, Ch’ubmin does not have a direct/inverse system or pragmatically driven voice, either of which would indicate whether the topical/dropped argument is the actor or the patient.

I guess there are three major options worth considering:

  1. Rigid post-verbal word order, no case marking
  2. Flexible post-verbal order, no case marking
  3. Flexible post-verbal order, case marking

The Original Idea for the Post-Verb Domain in Ch’ubmin

I had originally planned (3). There is an ablative-instrumental preposition fe’ which contracts with the determiner to give the forms bel~ben~ber, much like Spanish de+el = del or French de+le = du. I had planned for this preposition to serve as an ergative/marked nominative marker, combined with free word order. And there are languages with optional ergative or nominative markers, which generally surface under conditions of either focus / non-default information structure, or ambiguity. See:

Optional ergativity and information structure in Beria

The table above, taken from the article named in the caption, shows that in Beria the marker =gu mostly occurs when A is focused or low in topicality, or when it is needed for disambiguation.

Similarly for Tibeto-Burman languages with optional ergative/nominative markers:

“What we find, across Tibetan and in the majority of languages across the family, is that in elicited data we have something approximating a consistent ergative, aspectually split-ergative, or active-stative case marking pattern, while in natural discourse the ergative marking is found only in some clauses, often a minority, usually with some pragmatic sense of emphasis or contrast.”

“Tournadre notes the same interaction with word order, and points out that there is no syntactic environment where ergative is truly obligatory, and that wherever it occurs it indicates contrastive focus (see also Zeisler 2004: 514ff). He shows that the presence or absence of ergative marking often seems to have a pragmatic (or rhetorical) force, such that the presence of ergative marking serves to emphasize the agentivity of the A argument, or to place it in discourse-pragmatic focus.”

“Not all TB languages have been described as ergative; a significant set, including many Lolo-Burmese and Bodo-Garo languages, appear at first glance to have a more nominative-accusative cast. But for Burmese, the best-known example, the case is by no means so simple. As in Tibetan and elsewhere, subject marking is not syntactically obligatory, but is strongly determined by pragmatic factors; this has been a long-standing problem in Burmese linguistics.”

Finding examples of languages with flexible VSO / VOS word order is also not hard, although finding a detailed description of what motivates the word order differences is harder. A number of factors seem to play into such word orders. Ojibwe seems to prefer verb-initial clauses with the obviative argument closer to the verb, which generally produces VOS orders but permits VSO, especially when the verb is in the inverse voice. Only some Mayan languages have flexible word order, and those that do tend to be VOS with VSO permitted due to factors like animacy, definiteness, and argument weight (heavy shift).

In both these examples, topic fronting and zero anaphora are common, so it has to be remembered that VOS really means a choice between three realisations of S:

  1. Highly topical S not overtly present in clause
  2. Established afterthought topic S in VOS order
  3. New topic or contrastive topic fronted to clause initial position

The VOS default order effectively keeps the predicate together and focal material early in the clause, with the normally topical S on the periphery (either last or fronted into topic position). S only gets sandwiched next to the verb if it is a defective subject in some way. The alternative, VSO as the default and VOS as a marked alternative when S is focal or non-topical, also seems to occur. 

An example is Alto Perené, an Arawak language for which there happens to be a good grammar. Elena Mihas’ grammar says:

“Definiteness of noun referents does not affect constituent order. Both definite NPs, typically active topics, and indefinite NPs, typically newsworthy information, tend to occur post-verbally in particular slots, as shown in (13.109)-(13.112). The post-verb position at the right periphery is occupied by a non-contrastive focus constituent, whereas topical NPs are located immediately after the verb.”

Thus the post-verbal order in Alto Perené is verb – topic – focus. There is no role marking of post-verbal NPs, so identification of actor and patient in transitive clauses relies quite heavily on context. 

The interesting thing about this word order pattern is that under predicate (VO) focus, which is generally the pragmatically unmarked case, the focus domain is split. The verb is fixed in initial position and may or may not be in focus, then comes some partly backgrounded material, and then whatever argument is most foregrounded / focal comes at the end.

The language also, like many verb-initial languages, allows fronting to the immediately pre-verbal position for contrastive argument focus as part of a cleft-like construction, and it allows left-dislocation of topical arguments. You might recognise this general pattern from both Mayan and from my Ch’ubmin proposal above. 

So my idea was to combine two basic principles:

  1. Approximate default word order of V topic focus, i.e. VSO by default, but invertible to VOS under marked information structures like subject focus or thetic (entire clause) focus.
  2. Case marking of the subject by fe’ or its fused determiner forms bel~ben~ber if and only if the S is in focus: i.e. in the inverted VOS transitive word order, or when a transitive patient is omitted or extracted but the agent is overt, or when an intransitive S is markedly focal (although maybe with a higher barrier in the intransitive case?)

This could approximately be thought of as a kind of passive expressed through word order or dependent marking: subjects that are deficient in information-structure terms (i.e. not topical) are marked by an oblique case marker even though verb agreement remains unchanged.
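
To make the interaction of the two principles concrete, here is a minimal Python sketch of the post-verbal rule for transitive clauses only. Everything about it is illustrative rather than part of the language design: the names `Focus`, `mark_subject` and `linearize` are invented for this sketch, only the fused form ben (from fe’ + in) is taken from the examples, the unfused fe’ fallback is an assumption, and the clause-final deictic o' is simply copied from the example sentences.

```python
from enum import Enum


class Focus(Enum):
    PREDICATE = "predicate"  # default: verb + object in focus
    SUBJECT = "subject"      # moderate subject focus
    THETIC = "thetic"        # whole clause in focus


# Only the pairing in -> ben is attested in the examples; the conditions
# for bel and ber are not described, so they are not guessed here.
FUSED_NOMINATIVE = {"in": "ben"}


def mark_subject(noun_phrase: str) -> str:
    """Return the noun phrase with the fe'-based subject marker."""
    determiner, _, noun = noun_phrase.partition(" ")
    if determiner in FUSED_NOMINATIVE:
        return f"{FUSED_NOMINATIVE[determiner]} {noun}"
    # Assumption: fall back to unfused fe' when no fused form is known.
    return f"fe' {noun_phrase}"


def linearize(verb, agent=None, patient=None,
              focus=Focus.PREDICATE, deictic="o'"):
    """Order the post-verbal domain of a transitive clause."""
    words = [verb]
    if agent and patient:
        if focus is Focus.PREDICATE:
            words += [agent, patient]                # V topic focus = VSO
        else:
            words += [patient, mark_subject(agent)]  # inverted VOS, marked S
    elif agent:
        words.append(mark_subject(agent))            # patient dropped: mark A
    elif patient:
        words.append(patient)                        # agent dropped: bare P
    words.append(deictic)
    return " ".join(words)


print(linearize("išaq'", "in tacha'", "in ch'om"))
print(linearize("išaq'", "in tacha'", "in ch'om", focus=Focus.SUBJECT))
```

The first call yields the default VSO string išaq' in tacha' in ch'om o', and the second yields the inverted išaq' in ch'om ben tacha' o'. Intransitive clauses are left out on purpose, since the threshold for marking an intransitive S is still an open question.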

For example, with both arguments present:

transitive verb, absolutive subject, absolutive object
predicate (VO) focus

išaq' in tacha' in ch'om o'
3.PFV-grab DET man DET stick DEM
`The man grabbed the stick'

transitive verb, absolutive object, nominative subject 
moderate subject focus or thetic/clause focus 

išaq' in ch'om ben tacha' o'
3.PFV-grab DET stick NOM.DET man DEM
`The man grabbed the stick'

And with one argument omitted:

transitive verb, absolutive object
predicate (VO) focus

išaq' in ch'om o'
3.PFV-grab DET stick DEM
`He grabbed the stick'

transitive verb, nominative subject
moderate subject focus or extracted / topical object 

išaq' ben tacha' o'
3.PFV-grab NOM.DET man DEM
`The man grabbed it'
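
Read strictly, the marking rule means a listener could recover roles from the post-verbal string alone. The following Python sketch spells that inference out; it is a hypothetical illustration, not a claim about how the language must work. The function name `assign_roles` is invented, and treating every fe’-type form as a subject marker is a simplifying assumption, since fe’ is also the ablative-instrumental preposition.

```python
# Assumption: any NP introduced by fe' or a fused form counts as a marked
# subject. Since fe' is also the ablative-instrumental preposition, a real
# parser would have to tell obliques apart; this sketch does not.
SUBJECT_MARKERS = {"fe'", "bel", "ben", "ber"}


def assign_roles(noun_phrases):
    """Guess roles for the post-verbal NPs of a transitive clause."""
    marked = [np.split()[0] in SUBJECT_MARKERS for np in noun_phrases]
    if len(noun_phrases) == 2 and not any(marked):
        # Two bare NPs: default V topic focus order, read as agent then patient.
        agent, patient = noun_phrases
        return [(agent, "agent"), (patient, "patient")]
    # Otherwise the marker alone decides: marked = agent, bare = patient.
    return [(np, "agent" if is_marked else "patient")
            for np, is_marked in zip(noun_phrases, marked)]


print(assign_roles(["in ch'om"]))
print(assign_roles(["ben tacha'"]))
```

Under these assumptions a single bare NP after a transitive verb is always read as the patient and a single marked NP as the agent, which is exactly the ambiguity that options (1) and (2) would have to leave to word order or context.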

None of this would affect the preverbal domain, because I assume that the mechanisms that produce pre-verbal NPs, namely left dislocation and clefts (technically a kind of inverted pseudo-cleft), resist pied piping. Compare the English:

*As for by the man, the stick was grabbed
*By the man was the one who grabbed the stick

And actually this is, again, attested. Languages with some kind of case marking but a "no case before the verb" rule include Semelai (the actor marker la= is used only post-verbally) and a number of African languages with marked nominatives or ergatives, such as Päri, Teso, Shilluk, Dinka, and Baale, whose argument-fronting constructions historically came from clefts. See Case in Africa by König for more details.

Hesitation and Concerns

I guess the main reason I’m not sure about this is that it feels busy. It combines:

  1. Left-dislocation of topics
  2. Argument focus via cleft-like fronting
  3. More moderate information marking and role marking via a combination of post-verbal word order and presence/absence of an ergative / marked nominative preposition

On the other hand, having multiple devices to mark information structure distinctions is maybe not so unusual. As I described above, a number of verb initial languages have the first two (dislocated topics + pre-verbal foci), plus post-verbal word order sensitive to some kind of pragmatic concerns, plus other devices. Alto Perené lacks any kind of core case marking, but it does have special focus pronouns and markers which can be used to mark pragmatic distinctions. 

The Alternatives

The alternatives would be either approximately fixed word order with no case marking, or flexible order with no case marking. The fixed-order option also provides role disambiguation, because you can always add a pronoun instead of using zero anaphora if you need to clarify a transitive clause.

For the flexible, no-marking option, Alto Perené seems to get by relying mostly on context, although verbal marking does distinguish the role of extracted contrastive foci.

For the rigid case, excluding topic/focus fronting, both rigid word order options are easily attested. In Mayan alone, Mam is rigidly VSO, whereas many other members of the family are reported to be either rigid VOS or VOS dominant with some flexibility, although in all cases zero anaphora is widespread.

Anyway, what do you all think?