Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DMRX creates <realpred> nodes without required pos field #129

Closed
goodmami opened this issue Dec 20, 2017 · 3 comments
Closed

DMRX creates <realpred> nodes without required pos field #129

goodmami opened this issue Dec 20, 2017 · 3 comments
Milestone

Comments

@goodmami
Copy link
Member

Jacy has a predicate like _te_adjunct_rel, which PyDelphin converts to the following realpred:

<realpred lemma="te" sense="adjunct" />

Then dmrx._decode_pred returns None for elem.get('pos'), and Pred.realpred trips on the non-string value of the pos field.

See: http://lists.delph-in.net/archives/developers/2017/002631.html

@goodmami
Copy link
Member Author

Strict fix:

  • throw error when Pred.stringpred() has no pos
  • throw (more informative) error when Pred.realpred() has no pos
  • (maybe unnecessary) throw error in dmrx._encode_node() if pos is None
  • (maybe unnecessary) throw error in dmrx._decode_pred() if pos is None

Robust fix:

  • create pos="" attribute in dmrx.encode_node() when pos is None
  • create default pos (x? empty string?) in dmrx.decode_pred() when pos is None
  • refactor Pred class to allow, e.g., empty strings for pos

@goodmami goodmami added this to the v0.7.0 milestone Dec 20, 2017
@fcbond
Copy link
Member

fcbond commented Dec 20, 2017 via email

@goodmami
Copy link
Member Author

Oh I thought x was the catch-all pos. But u works, too.

But I want to avoid doing something quietly that would yield a different MRS on round-tripping. E.g., if PyDelphin reads in _te_adjunct_rel, then outputs _te_u_adjunct_rel, it has changed the semantics for the purpose of preventing a PyDelphin exception.

I would rather throw an error (grammar devs need to get with the times, man), make normalization (i.e. inserting the 'u' pos) an explicit operation by the user (e.g. x = mrs.normalize(simplemrs.loads_one('...'))), or make it robust in a way that is equivalent on round-trip (e.g. empty string pos, then make sure PyDelphin writes _te_adjunct_rel and not _te__adjunct_rel)

goodmami added a commit that referenced this issue Apr 18, 2018
The previous fix allowed PyDelphin to deal with missing POS values in
Preds, but it still output DMRX that had no pos attribute on nodes,
which was non-conformant with the DTD. This commit makes pos=""
equivalent to a None value of pos for DMRX and MRX. MRX is still
non-conformant in that it won't output <label> or <var> as TOP and
INDEX when they are None on the Xmrs object, but there is not a good
solution for that case. This commit also allows MRX to deal with
missing TOP and INDEX on input.

Addresses #129
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants