
RmrsDesign

DanFlickinger edited this page Mar 26, 2008 · 26 revisions

Sets of interests or constituencies that have a stake in the design of (R)MRS representations:

  • parsing/underspecification
  • generation
  • integration with shallow systems
  • shallow inference/IE
  • inference
  • syntax, composition
  • multilingual harmonization
  • transfer
  • anaphora

Agenda items

  • cross-linguistic similarity
  • paraphrase
  • robustness, underspecifiability
    • predicate names
  • stability, documentation (core phenomena)
  • abstract predicates
  • decomposition
  • near-equivalences
  • clean IE
  • sense mapping
  • `real' documentation
  • naming conventions for unknown word predicates
    • proper names

Demise of pronoun_q_rel:

When we remove pronoun_q_rel, we run into the problem that the tree gets smaller, and the labels of the pronoun relation have to get identified with something. This was troubling for quite a while until we convinced ourselves that modified pronouns are always pseudopartitives.

Pronouns don't ever get quantifiers, and share their labels with the verbal projection they combine with ("You who brought laptops probably have the right idea." "You with laptops must leave now."). Modified pronouns ("You who brought laptops") involve a pseudopartitive construction which introduces both a new variable and a quantifier for that variable.

Proper names can go without quantifiers, but might get them when the quantifier is overt ("Every Kim I've met"). "John in Paris" doesn't need a quantifier.

Optionality of pronoun_n_rel

In languages with pro-drop (of various kinds), dropped arguments often fill the role that unstressed pronouns play in a language like English (reference to discourse entities currently in focus). But dropped arguments can also have other uses (e.g., like indefinite null instantiation in English). The question is what would break if we didn't put in any EP for these dropped arguments.

This raises questions about the notion of characteristic variables/EPs: every EP has an ARG0, and every non-u index is the ARG0 of exactly one EP (with special exceptions). The RMRS matching code uses this notion, and drops those variables that can't be grounded in an EP for which they are the characteristic variable. The notion of characteristic variable is also relevant to dependency extraction code (used, e.g., in producing n-gram models against which to rank transfer outputs) and potentially to filtering lexical edges based on information from other lexical edges in the gender.

In terms of the model theory, both sleep(e,u) and sleep(e,x) are fine, but the lack of any characteristic ep affects the complexity of tasks such as deciding the mutual satisfiability of two rmrs.
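The characteristic-variable condition above can be sketched as a small check. This is only an illustration, not the RMRS matching code: the `EP` class, the helper names, and the convention that underspecified variables are written with a leading "u" are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EP:
    """Toy elementary predication (hypothetical): a PRED plus role/variable pairs."""
    pred: str
    args: tuple  # e.g. (("ARG0", "e1"), ("ARG1", "x2"))


def is_underspecified(var):
    # Assumption for this sketch: u-sorted variables are named "u...".
    return var.startswith("u")


def characteristic_map(eps):
    """Map each non-u ARG0 variable to the PREDs of the EPs that introduce it."""
    owners = defaultdict(list)
    for ep in eps:
        arg0 = dict(ep.args).get("ARG0")
        if arg0 is not None and not is_underspecified(arg0):
            owners[arg0].append(ep.pred)
    return owners


def ungrounded(eps):
    """Non-u variables that are not the ARG0 of exactly one EP."""
    owners = characteristic_map(eps)
    seen = {v for ep in eps for _, v in ep.args if not is_underspecified(v)}
    return sorted(v for v in seen if len(owners.get(v, [])) != 1)


sleep_x = EP("_sleep_v_1", (("ARG0", "e1"), ("ARG1", "x2")))
sleep_u = EP("_sleep_v_1", (("ARG0", "e1"), ("ARG1", "u2")))
pronoun = EP("pronoun_n_rel", (("ARG0", "x2"),))

with_pronoun = ungrounded([sleep_x, pronoun])  # x2 has a characteristic EP
dropped_as_x = ungrounded([sleep_x])           # x2 has none
dropped_as_u = ungrounded([sleep_u])           # u2 is exempt from the condition
```

Under these assumptions, leaving out the pronoun EP while keeping an x-sorted argument is exactly the case a matcher of this kind would have to drop or special-case, whereas the u-sorted variant passes the check trivially.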

PRED with internal structure

It may be useful to make explicit the internal structure of (open-class) PRED values, which by common consent are currently strings which have the following structure: "_LEMMA_POS_SENSE"

We explored a proposal to change PRED so its value is no longer a string for open-class lexical entries, but a type with the following structure:

  • [ LEMMA string, POS type, SENSE type ]
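As a rough illustration of the proposal, the string convention and the structured value can be related by a pair of conversion functions. This is a minimal sketch: the `Pred` class and both function names are hypothetical, and it assumes that none of the three fields contains an underscore.

```python
from typing import NamedTuple


class Pred(NamedTuple):
    """Hypothetical structured PRED value: [ LEMMA, POS, SENSE ]."""
    lemma: str
    pos: str
    sense: str


def parse_pred(s):
    """Split a "_LEMMA_POS_SENSE" string into its three fields."""
    if not s.startswith("_"):
        raise ValueError(f"not an open-class PRED string: {s!r}")
    parts = s[1:].split("_")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected _LEMMA_POS_SENSE, got {s!r}")
    return Pred(*parts)


def format_pred(p):
    """Inverse of parse_pred: rebuild the string form."""
    return "_" + "_".join(p)


open_inchoative = parse_pred("_open_v_1")
round_trip = format_pred(open_inchoative)
```

With the fields exposed this way, two predicates can be compared field by field (same LEMMA and POS, different SENSE) instead of by string manipulation.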

There are several possible benefits of this increased transparency:

  • More direct comparison of MRS and RMRS structures
  • Enabling of lexical redundancy rules like causative-inchoative in English, where each of the entries has a single EP, and where the PRED value in one entry is systematically related to that of the other entry. For example, with "open", the ERG has two entries, the intransitive verb with "_open_v_1", and the transitive verb with "_open_v_caus". On this proposal, the redundancy rule would replace just the SENSE value '1' with 'caus'. (Note that the LKB does not yet have TDL notation defined to support a definition of a lexentry1 as lexentry2+lexrule3, but this is not expected to be hard. Note also that such a redundancy rule is a descriptive device, not a unary rule to be applied in parsing or generation, since it does not preserve monotonicity of the MRS. (But it was suggested that this monotonicity could be preserved if the entry for 'open' were underspecified for SENSE, with two rules, one for inchoative and one for causative.))
  • Improving treatment of subregularities for verb-particles, such as with semi-productive "up" as in "wake up". Here it is good to have SENSE as a type, since we could arrange its value here to be something like the following, in order to capture the causative-inchoative alternation for "wake up".
      up   caus
        \ /
      up-caus
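The two ideas in the list above, a redundancy rule that rewrites only SENSE and a SENSE type that inherits from both "up" and "caus", can be sketched as follows. This is not LKB or TDL code: `Pred`, `causative_of`, `SUPERTYPES`, and `unify_sense` are hypothetical names, and the three-type hierarchy is a toy that contains only the types shown in the diagram.

```python
from typing import NamedTuple


class Pred(NamedTuple):
    """Hypothetical structured PRED value: [ LEMMA, POS, SENSE ]."""
    lemma: str
    pos: str
    sense: str


def causative_of(pred):
    """Descriptive redundancy rule: replace SENSE '1' with 'caus', nothing else."""
    if pred.pos != "v" or pred.sense != "1":
        raise ValueError(f"rule does not apply to {pred}")
    return pred._replace(sense="caus")


# Toy SENSE hierarchy: each type maps to its immediate supertypes.
SUPERTYPES = {"up": set(), "caus": set(), "up-caus": {"up", "caus"}}


def subsumes(general, specific):
    """True if `general` is `specific` or one of its (transitive) supertypes."""
    if general == specific:
        return True
    return any(subsumes(general, parent) for parent in SUPERTYPES[specific])


def unify_sense(a, b):
    """Most general type that is a subtype of both a and b, or None."""
    candidates = [t for t in SUPERTYPES if subsumes(a, t) and subsumes(b, t)]
    for c in candidates:
        if all(subsumes(c, other) for other in candidates):
            return c
    return None


open_caus = causative_of(Pred("open", "v", "1"))
wake_up_caus = unify_sense("up", "caus")
```

Because the rule replaces a value rather than adding information, it stays a descriptive device in this sketch too; the unification helper, by contrast, only ever moves to a more specific type.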
