Documentation of subtypes and layers #955

Closed
Stormur opened this issue Jul 5, 2023 · 5 comments

Comments

@Stormur
Contributor

Stormur commented Jul 5, 2023

I was wondering whether it would be possible to implement a more efficient way, for both editors and users, to document transversal subtypes, relations with multiple subtypes, and layers.

In particular:

  • some subtypes are transversal and apply to more than one relation, always representing the same information but realized in different contexts. For example, cmp for comparative clauses can be applied to advcl and obl, maybe to acl, and possibly to others. It would be handy to have a single page for :cmp where a general explanation and the specific applications are given together; this would be more practical than copy-pasting the same lines of text;
  • some relations can regularly receive different subtypes, for example dislocation with respect to all possible dislocated arguments: nsubj, obj, csubj, obl, ... Here it would be nice to simply list all these possibilities on one documentation page, keeping a unified description;
  • and a very similar issue arises with layers: for example, [psor] is transversal with respect to Gender, Number, Person, ...
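
To make the three cases above concrete, here is a minimal Python sketch of how labels could be grouped so that a single page per subtype, per base relation, or per layer could list its members. The label lists contain only the examples mentioned in this issue (acl:cmp is tentative), and all function names are hypothetical; this is not how the UD documentation is actually generated.

```python
import re
from collections import defaultdict

# Illustrative labels taken from this issue; not an exhaustive inventory.
DEPRELS = [
    "advcl:cmp", "obl:cmp", "acl:cmp",
    "dislocated:nsubj", "dislocated:obj", "dislocated:csubj", "dislocated:obl",
]
FEATURES = ["Gender[psor]", "Number[psor]", "Person[psor]"]

# Assumed shape of a feature name: "Name" optionally followed by "[layer]".
FEATURE_RE = re.compile(r"([A-Z][A-Za-z0-9]*)(?:\[([a-z0-9]+)\])?")


def split_deprel(label):
    """Split 'advcl:cmp' into ('advcl', 'cmp'); the subtype is None if absent."""
    base, _, subtype = label.partition(":")
    return base, (subtype or None)


def split_feature(name):
    """Split 'Gender[psor]' into ('Gender', 'psor'); the layer is None if absent."""
    match = FEATURE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a feature name: {name!r}")
    return match.group(1), match.group(2)


def group(pairs, key_index):
    """Group (a, b) pairs by one element and collect the other.

    Pairs that contain None (no subtype or no layer) are skipped.
    """
    index = defaultdict(list)
    for pair in pairs:
        if None in pair:
            continue
        index[pair[key_index]].append(pair[1 - key_index])
    return dict(index)


rel_pairs = [split_deprel(label) for label in DEPRELS]
by_subtype = group(rel_pairs, 1)   # one page per subtype, e.g. :cmp
by_base = group(rel_pairs, 0)      # one page per relation, e.g. dislocated
by_layer = group([split_feature(f) for f in FEATURES], 1)  # e.g. [psor]

print(by_subtype["cmp"])
print(by_base["dislocated"])
print(by_layer["psor"])
```

Grouping by subtype corresponds to the first bullet, grouping by base relation to the second, and grouping by layer to the third.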
@nschneid
Contributor

nschneid commented Jul 5, 2023

  • subtypes: The subtyping mechanism exists for flexibility. Most subtypes are idiosyncratic to one or a few languages (with a few exceptions deemed "semi-mandatory"). I see your point that if a language defines *:cmp or dislocated:* (where * ranges across several values), a page giving an overview of all of them would make sense. I would expect such a page to occur within the language-specific documentation.
  • morphological layers: (I've never dealt with these, will leave to someone else to weigh in)

@Stormur
Contributor Author

Stormur commented Sep 20, 2023

Other labels for which these issues apply:

  • outer, as many (if not most) functional elements can be "outside" a given phrase with respect to the internal structure, and the logic is the same for all of them (this has come out in other discussions recently);
  • NumValue: I would like this to be extended by means of a regular expression. As of now, it stops at 4, which serves as a catch-all for any value greater than 3, but I think this might cause confusion. Ideally, I would like to be able to express any numeric value, and there are infinitely many!
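
As a rough illustration of the NumValue request, the following Python sketch accepts any non-negative integer through a pattern instead of an enumerated list. The pattern and the function name are assumptions made for this example; they do not describe how the UD validator currently checks feature values.

```python
import re

# Hypothetical pattern: "NumValue=" followed by a non-negative integer
# without leading zeros. Not part of any existing UD specification.
NUMVALUE_RE = re.compile(r"NumValue=(0|[1-9][0-9]*)")


def is_valid_numvalue(feature):
    """Return True if the whole string matches the hypothetical pattern."""
    return NUMVALUE_RE.fullmatch(feature) is not None


for candidate in ["NumValue=3", "NumValue=1250", "NumValue=03", "NumValue=Many"]:
    print(candidate, is_valid_numvalue(candidate))
```

With a pattern like this, no upper bound such as 4 would be needed, at the cost of the value set no longer being enumerable.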

@dan-zeman
Member

  • NumValue: I would like this to be extended by means of a regular expression. As of now, it stops at 4, which serves as a catch-all for any value greater than 3, but I think this might cause confusion. Ideally, I would like to be able to express any numeric value, and there are infinitely many!

NumValue does not seem useful to me. I just recently realized it was used in the Czech data in a completely useless manner and I removed it. And if you want to express any value of a number, then it is a semantic feature, not morphological, and it should be in MISC.

@Stormur
Contributor Author

Stormur commented Sep 21, 2023

I can understand its usefulness at least for lower numbers, though. Anyway, even if it is shifted to MISC (but could we not argue it is lexical as many others?), it still needs a way to accept possibly infinite values!

@dan-zeman
Member

Anyway, even if it is shifted to MISC (but could we not argue it is lexical as many others?), it still needs a way to accept possibly infinite values!

We could say it is lexical but in FEATS I would expect it to partition the numeral space into a finite set of categories that are in some sense interesting to be annotated. (And for many lexical features it actually means they have some specific grammatical behavior; although, as Nathan just noted in another issue, PronType may be an exception.) If it is merely to signal that three, 3 and 3.0 all have the value of 3, then we indeed have an infinite set of values, but it does not fit in FEATS. On the other hand, in MISC it is quite OK (you do not have to enumerate the values somewhere to persuade the validator that the values are legitimate).
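
The contrast between FEATS and MISC can be sketched in Python as follows. The three token lines are invented for illustration: they keep a finite category (NumType=Card) in FEATS, the sixth CoNLL-U column, and put the open-ended value in MISC, the tenth column, under a hypothetical key NumValue. The helper names are likewise assumptions, not an existing API.

```python
def parse_key_values(column):
    """Parse a 'Key=Value|Key=Value' column; '_' means the column is empty."""
    if column == "_":
        return {}
    result = {}
    for item in column.split("|"):
        key, _, value = item.partition("=")
        result[key] = value
    return result


def parse_token(line):
    """Read FORM, FEATS and MISC from one 10-column CoNLL-U token line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 10:
        raise ValueError(f"expected 10 columns, got {len(columns)}")
    return {
        "form": columns[1],
        "feats": parse_key_values(columns[5]),
        "misc": parse_key_values(columns[9]),
    }


# Invented example lines; 'NumValue' in MISC is a hypothetical attribute.
LINES = [
    "1\tthree\tthree\tNUM\t_\tNumType=Card\t0\troot\t_\tNumValue=3",
    "1\t3\t3\tNUM\t_\tNumType=Card\t0\troot\t_\tNumValue=3",
    "1\t3.0\t3.0\tNUM\t_\tNumType=Card\t0\troot\t_\tNumValue=3",
]

tokens = [parse_token(line) for line in LINES]
for token in tokens:
    print(token["form"], token["feats"], token["misc"])
```

Here three, 3 and 3.0 share the same MISC value while FEATS stays within a closed set of categories.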

3 participants