NPs in head-marking languages #1036
Comments
I suppose by "roles" you do not mean semantic roles? Those would depend on the meaning of the verb anyway. But their grammatical relations are probably Cāniy kī-wīcihēw Mērīwa.
The verb cross-references two 3rd person core arguments, Johnny and Mary. Mary is marked as obviative (by the suffix -wa) and the verb is in direct voice, hence Johnny will be the subject and Mary will be the object. |
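To make the reasoning above explicit, here is a minimal Python sketch of how obviation and voice together pick out subject and object when both core arguments are 3rd person. The function name and the string labels are hypothetical. The direct-voice branch follows the comment above; the inverse-voice branch (obviative acts on proximate) is an added assumption, and whether that actor should be called the subject depends on the analysis.

```python
def assign_relations(proximate: str, obviative: str, voice: str) -> tuple[str, str]:
    """Return (subject, object) for a transitive verb with two 3rd person arguments.

    Hypothetical helper for illustration only, not part of any UD tool.
    """
    if voice == "direct":
        # Direct voice: the proximate argument acts on the obviative one.
        return proximate, obviative
    if voice == "inverse":
        # Assumption not stated in the thread: inverse voice reverses the mapping.
        return obviative, proximate
    raise ValueError(f"unknown voice: {voice!r}")


# Cāniy kī-wīcihēw Mērīwa: Mērīwa carries the obviative suffix -wa, verb is direct.
subject, obj = assign_relations(proximate="Cāniy", obviative="Mērīwa", voice="direct")
print(subject, obj)
```

Under these assumptions the call returns Cāniy as subject and Mērīwa as object, matching the analysis given for the Plains Cree example.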
Thank you! Yes, I did not mean roles as in semantic roles. RRG, for example, has a good explanation for this phenomenon, treating the NPs as pronominal anaphors (see Van Valin's introduction to the Handbook of RRG). This is a good fit for dep, in my opinion. |
Well, since the bound morphemes are just agreement markers and not pronouns (i.e., they have no nodes in UD; they result in layered morphological features of the verb node), then I do not see reasons to annotate the nouns (when they are present) as "outer" subjects or anything else. In the Plains Cree examples above, the verb would have the following features (plus perhaps others like
|
They are not agreement markers, they are the core arguments! |
Or maybe they are just reflection of the core arguments on the verb. It may not be the same in every head-marking language, of course. But they would first have to be words to be nodes and core arguments. As bound morphs, they are either affixes or clitics. If they have a prescribed position close to the verbal root (possibly just with other verbal affixes in between), they are affixes and not clitics. If, on the other hand, their position is not fixed and other words can occur between them and the verb, then they are clitics and can be treated as separate syntactic nodes in UD. |
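The affix/clitic test described above can be sketched as a small decision function. This is only an illustration of the two criteria named in the comment (fixed position, and whether other words can intervene); the function names and the "unclear" fallback for mixed cases are assumptions, since the comment does not say what to do when the criteria disagree.

```python
def classify_bound_morph(fixed_position: bool, words_can_intervene: bool) -> str:
    """Classify a bound person marker using the two criteria from the comment.

    Hypothetical helper; real languages will need more diagnostics than this.
    """
    if fixed_position and not words_can_intervene:
        # Prescribed slot next to the root, at most other affixes in between.
        return "affix"
    if not fixed_position and words_can_intervene:
        # Free placement, and other words may separate it from the verb.
        return "clitic"
    # Assumption: mixed evidence is left undecided rather than forced.
    return "unclear"


def gets_own_node(kind: str) -> bool:
    """Only clitics are candidates for a separate syntactic node in UD."""
    return kind == "clitic"


kind = classify_bound_morph(fixed_position=True, words_can_intervene=False)
print(kind, gets_own_node(kind))
```

An affix stays inside the verb token and surfaces only as morphological features, while a clitic can be split off and then needs its own relation.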
... and if they were clitics, then probably |
`expl` is not a good choice, because expletives occupy the position of core arguments. The NPs I refer to are not in core argument positions. |
I might not understand what you mean here with position: is it an exact, fixed position like "subject is before the verb"? Else It is also quite similar to languages where one does not really speak of head marking, such as in Italian:
Otherwise, I could comment that there is no worse choice than |
In other languages with polypersonal agreement, i.a. Nahuatl ( |
We have this situation very often in Coptic and we just give the pronominal elements their proper label and the nominal ones get dislocated (like we would do English "they're my favorites, bagels") |
But in this case the pronouns are tokenised off and have their own node, right? |
Yes, in Coptic the pronominal elements are always their own tokens (we have nominal object incorporation at times, but pronouns are distinct). It's actually pretty similar to Chadic languages like Hausa, except that in Coptic you can still get a nominal subject inside the TAM position, though it is somewhat rare. For example for the past tense marker "a" with third person nominal subject "p-rōme" or "f" (pronoun):
In languages where the 2nd option doesn't exist (and in Coptic it gets increasingly rare with time), we could argue about whether that's agreement or a pronoun, but since we have option 2 and there are also other good reasons to think of them as pronouns (which they are historically), we analyze them as separate nodes. |
But why dislocation if it is so common? I have the impression this gives a strange picture of the language, topicalising subjects so often... |
Yes, that's true - based on this interpretation, Coptic has a very high proportion of dislocation. The only reason is that we understood the UD guidelines to mean that this is the correct thing to do for a language that behaves that way. Using what I called option 2 above, Coptic can have sentences that look like English ones (with a lexical subject NP and no pronoun), but with time, the pronoun option became increasingly preferred, and a secondary realization of the lexical subject became popular. In terms of word order, the position between auxiliary and verb is the canonical subject position, so it seems right to treat it as nsubj, and if there is another copy of the subject as a lexical NP somewhere, we call that dislocated. Incidentally, like in Hausa, it's possible for the dislocated subject to be a pronoun itself, then using the 'strong' pronoun form (again like in Hausa):
Since Hausa doesn't have what I referred to as 'option 2' with a lexical NP subject between the TAM and verb, it's more arguable whether we want to consider "shi" to be dislocated - it would depend on whether we subtokenize "ya" to contain a pronoun. But for Coptic, since the pronoun is not always there, we have to make it a token, and then they both need deprels. The most common option is just to have the pronoun, and it's also closest to the verb, so IMO it makes sense for it to be nsubj, and the other one is dislocated. |
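The labelling policy described for Coptic can be summarised in a short Python sketch: whatever stands in the canonical slot between the TAM marker and the verb is nsubj, and any further realization of the same subject is dislocated. The function name and the boolean flag are hypothetical; the forms "f", "p-rōme" and "ntof" are the ones mentioned in this thread, used here in isolation rather than as full sentences.

```python
def label_subject_realizations(realizations: list[tuple[str, bool]]) -> list[tuple[str, str]]:
    """Assign a deprel to each realization of a clause's subject.

    Each item is (form, in_canonical_slot), where the canonical slot is the
    position between the TAM marker and the verb. Hypothetical helper.
    """
    labels = []
    for form, in_canonical_slot in realizations:
        deprel = "nsubj" if in_canonical_slot else "dislocated"
        labels.append((form, deprel))
    return labels


# Pronoun in the canonical slot plus a strong pronoun elsewhere in the clause.
print(label_subject_realizations([("ntof", False), ("f", True)]))
# 'Option 2': a lexical NP itself stands in the canonical slot, no pronoun.
print(label_subject_realizations([("p-rōme", True)]))
```

The sketch makes the consequence of the policy visible: the dislocated label is assigned by position, so its frequency in the treebank follows directly from how often a second realization of the subject occurs.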
It might be a redundant affix which can or cannot be there, why does it need to be a pronoun? From your examples it looks like a bound morph, always appearing in that position. If the construction with the "interposed" lexical subject is the rarer one, probably it is this one which needs to have a "deviant" annotation (in terms of dissociated nucleus?). The opposite really looks like a lectio difficilior to me. Only my 2 cents... |
In some languages, it can be very difficult to decide whether NPs coreferring with a subject pronominal index are dislocated or not. In our paper at UDW 2020, we explore the case of French interrogatives (Marie est-elle là? 'Mary is she there?') and above all of Wolof, where most sentences contain a particle focusing one element in the clause. What is particularly complicated in Wolof is that some particles block the realization of a subject in the canonical position, and the question arises whether the NP in the pre-focus position is a dislocated element or a subject. To answer the question (supposing that it is relevant and we need to decide), we need annotated data. That is why I think it is important, in such cases, to annotate the different positions where a "subject" is realized and to have relations such as |
No, that's not the case - in the 'option 2' construction, there is no pronoun, just a lexical NP subject. If it were an inflectional morpheme, it should always be there IMO, even when the subject is lexical. What's more, the same forms appear as object pronouns, prepositional complements, etc. Cases 2 and 3 below correspond 1:1 to their Semitic equivalents, which are regularly regarded as pronouns:
I don't doubt that Coptic was probably on its way to becoming a language like Hausa, where some version of the pronoun has to be there and we could argue whether at some point it becomes inflectional, but it never made it that far before it was overtaken by Arabic as the common language of Egypt. It seems very likely that Hausa must have gone through a similar process, and there are other similarities and cognates between the two languages, but earlier stages are not documented since we don't have Hausa texts before the 15th century.
Agreed, I think that makes a lot of sense, especially if the list of possible dislocation types is limited and they are frequent in the language. |
"Transversal" and flexibly combinable subtypes are indeed something that would have its use. I proposed them in #955, but they keep coming up in similar discussions. "Layers" for deprel, in a sense!
They continue looking very much like bound personal morphs to me (this is also what I recall from my little Arabic). Case 2 reminds me of "inflected" adpositions in languages like Irish, but also Hungarian; case 3 of many possessive affixes. Are there not "full", "strong" forms for pronouns?
This was a point of confusion which I probably did not elaborate on. I do not think that inflectional affixes always need to be mandatory: there might be some redundancy and cases where they might or might not appear. This is common typologically for plural affixes, but it would not surprise me for person indexing. The previous case number 2 looks very interesting and raises many questions (incorporation? nature of TAM element a?). But more generally, I meant that methodological questions are also raised if the commonest construction is annotated as the very marked, and at the same time underdefined, |
No, they are regarded as pronouns in Arabic, and this is handled the same in UD Coptic, Arabic and Hebrew. This example illustrates both the prepositional object (token 26) and the possessive enclitic pronoun (token 28). Notice both are tagged as PRON and treated as the head of the PP and a genitival modifier respectively:
24 عثرت عَثَر VERB VP-A-3FS-- Aspect=Perf|Gender=Fem|Number=Sing|Person=3|Voice=Act 17 advcl 17:advcl:عِندَمَا Vform=عَثَرَت|Gloss=discover,come_across,find|Root=` _t r|Translit=ʿaṯarat|LTranslit=ʿaṯar
25-26 عليه _ _ _ _ _ _ _ _
25 علي عَلَى ADP P--------- AdpType=Prep 26 case 26:case Gloss=on,above|LTranslit=ʿalā|Root=` l w|Translit=ʿalay|Vform=عَلَي
26 ه هُوَ PRON SP---3MS2- Case=Gen|Gender=Masc|Number=Sing|Person=3|PronType=Prs 24 obl:arg 24:obl:arg:عَلَى:gen Gloss=he,she,it|LTranslit=huwa|Translit=hi|Vform=هِ
27-28 شقيقته _ _ _ _ _ _ _ _
27 شقيقة شَقِيقَة NOUN N------S1R Case=Nom|Definite=Cons|Number=Sing 24 nsubj 24:nsubj Gloss=sister|LTranslit=šaqīqat|Root=^s q q|Translit=šaqīqatu|Vform=شَقِيقَةُ
28 ه هُوَ PRON SP---3MS2- Case=Gen|Gender=Masc|Number=Sing|Person=3|PronType=Prs 27 nmod 27:nmod:gen Gloss=he,she,it|LTranslit=huwa|Translit=hu|Vform=هُ
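For readers who want to check this programmatically, here is a minimal Python sketch that reads the tokens above and lists the PRON nodes with their relations. It uses only the standard library. The rows are trimmed copies of the example (XPOS, FEATS, DEPS and MISC are replaced with "_" for brevity), and the helper names are hypothetical.

```python
# (ID, FORM, LEMMA, UPOS, HEAD, DEPREL) copied from the example above.
ROWS = [
    ("24", "عثرت", "عَثَر", "VERB", "17", "advcl"),
    ("25-26", "عليه", "_", "_", "_", "_"),  # multiword token range
    ("25", "علي", "عَلَى", "ADP", "26", "case"),
    ("26", "ه", "هُوَ", "PRON", "24", "obl:arg"),
    ("27-28", "شقيقته", "_", "_", "_", "_"),  # multiword token range
    ("27", "شقيقة", "شَقِيقَة", "NOUN", "24", "nsubj"),
    ("28", "ه", "هُوَ", "PRON", "27", "nmod"),
]


def to_conllu(row):
    """Build a 10-column CoNLL-U line; omitted columns are filled with '_'."""
    id_, form, lemma, upos, head, deprel = row
    return "\t".join([id_, form, lemma, upos, "_", "_", head, deprel, "_", "_"])


def parse(lines):
    """Parse syntactic words, skipping multiword token ranges such as 25-26."""
    tokens = []
    for line in lines:
        cols = line.split("\t")
        if "-" in cols[0]:
            continue  # range lines carry the surface form only, no syntax
        tokens.append({"id": int(cols[0]), "form": cols[1], "upos": cols[3],
                       "head": int(cols[6]), "deprel": cols[7]})
    return tokens


tokens = parse(to_conllu(row) for row in ROWS)
pronouns = [(t["id"], t["deprel"], t["head"]) for t in tokens if t["upos"] == "PRON"]
print(pronouns)
```

The two pronouns come out as independent nodes: token 26 attaches to the verb as obl:arg with the preposition as its case dependent, and token 28 attaches to the noun as nmod.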
I don't know Irish or Hungarian, but in my mind if a preposition inflects, that would mean that it agrees with something or expresses some categories to indicate a choice with semantic meaning in a paradigm. What we have here in the prepositional case is simply an allomorph of the preposition which is triggered in the environment of a pronoun as the object. This is common in Afro-Asiatic languages and works the same in Coptic, Egyptian, Arabic and Hebrew, to name a few. For example:
Notice that the preposition does not change its form based on person or number, or definiteness, or contact with an article, or anything else - it's an automatic allomorphic alternation based solely on whether the object is pronominal or nominal.
Yes, these are the clitic pronouns, and there are also independent ones, like the "ntof" above, which I loosely translated "as for him". It's used in more marked information-structural environments. The same thing happens in Indo-European (e.g. Polish dat. strong mnie/enclitic mi "me", tobie/ci "you"), but we don't say any of those are not pronouns just because some of them have to be post-tonic.
I wouldn't say it's the most common one - that would be just a pronominal subject, with no dislocation. And lexical NP subjects are not exactly rare, perhaps because the UD corpus is focused on classical literature. I just had a look, and for the past tense (which is admittedly only one environment), we get:
So yeah, lexical NPs are conspicuously rare in Coptic, but they're still 12% of the data, and treating all pronouns as inflection just because of that would suddenly mean that Coptic becomes a pro-drop language with 70% subjectless sentences. I'm sure this would surprise a lot of people working on the language, since there is no real discontinuity here with Ancient Egyptian. Those pronouns are standing exactly where the subject was standing in Late Egyptian, where dislocations are much less common. In short, I think of Coptic as exhibiting a language change in progress, which never came to completion, but the results of which would have given us a language like Hausa. |
In head-marking languages, the core arguments are expressed as bound indexes on the predicate. Grammatical theories offer various explanations for the semantic roles of the noun phrases (NPs) related to these indexes. In Dependency Grammar (DG), it seems more appropriate to consider these NPs as dependents of the predicate. What are your thoughts on this?
Here is an example:
[ John Mary (3.SG-3.SG-Verb)]
I am referring to the roles of 'John' and 'Mary.'