Some things in *MRS are considered case-insensitive, like predicates, morphosemantic property names and values, and variables, but XML is case-sensitive and the dmrx codec is currently outputting property names upper-cased. Python is also case-sensitive, so PyDelphin normalizes the case following the SimpleMRS conventions (variables, predicates, and property values down-cased; property names up-cased).
>>> from delphin.codecs import simplemrs
>>> m = simplemrs.decode('[ TOP: h0 RELS: < [ _RAIN_v_1 LBL: h1 ARG0: E2 [ e tense: PAST ] ] > HCONS: < h0 qeq h1 > ]')
>>> m.rels[0].predicate
'_rain_v_1'
>>> m.properties('e2')
{'TENSE': 'past'}
These conventions persist in the internal DMRS representation upon conversion, which is fine:
>>> from delphin import dmrs
>>> d = dmrs.from_mrs(m)
>>> d.properties(10000)
{'TENSE': 'past'}
But they should not persist in serialization to XML, where it would not follow the DTD:
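One way to picture the fix on the encoding side is to lower-case property names only at the point where the XML attributes are built, leaving the internal representation untouched. The following is a minimal sketch using only the standard library, not PyDelphin's actual `dmrx` code: the helper name `sortinfo_element` is hypothetical, and the `sortinfo` element and `cvarsort` attribute are assumed from the DMRX format.

```python
import xml.etree.ElementTree as ET


def sortinfo_element(cvarsort, properties):
    """Build a <sortinfo> element with lower-cased attribute names.

    Hypothetical helper: `properties` uses the internal convention
    (names up-cased, values down-cased), e.g. {'TENSE': 'past'}.
    """
    attrs = {'cvarsort': cvarsort}
    for name, value in properties.items():
        # Only the attribute name changes case; the value is kept as-is.
        attrs[name.lower()] = value
    return ET.Element('sortinfo', attrs)


elem = sortinfo_element('e', {'TENSE': 'past'})
xml_text = ET.tostring(elem, encoding='unicode')
print(xml_text)
```

Doing the conversion at serialization time means `d.properties(...)` keeps returning `{'TENSE': 'past'}` internally, while the emitted attribute is `tense`.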
Similarly, they are not normalized when decoding, unlike SimpleMRS:
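The decoding side could apply the same SimpleMRS conventions when reading attributes back in. This is again a minimal standard-library sketch rather than PyDelphin's implementation: `decode_sortinfo` is a hypothetical name, and the `sortinfo` and `cvarsort` names are assumed from the DMRX format.

```python
import xml.etree.ElementTree as ET


def decode_sortinfo(xml_text):
    """Parse a <sortinfo> element and normalize the case of its properties.

    Hypothetical helper: returns (cvarsort, properties) with property
    names up-cased and values down-cased, as SimpleMRS decoding does.
    """
    elem = ET.fromstring(xml_text)
    attrs = dict(elem.attrib)
    # cvarsort is the variable type, not a morphosemantic property.
    cvarsort = attrs.pop('cvarsort', None)
    properties = {name.upper(): value.lower() for name, value in attrs.items()}
    return cvarsort, properties


sort, props = decode_sortinfo('<sortinfo cvarsort="e" tense="PAST"/>')
print(sort, props)
```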
This issue is mainly about DMRX as PyDelphin is outputting data that doesn't comply with the DTD, but it also affects other codecs.