2022 XML file has version="2021" #902

Closed
goodmami opened this issue Mar 9, 2023 · 2 comments
Labels
release format This issue refers to the WNDB or RDF export, so no changes will be made to this repository

goodmami commented Mar 9, 2023

Release format
LMF

Describe the bug
The 2022 release assets in the LMF format have the following at the top of the file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LexicalResource SYSTEM "http://globalwordnet.github.io/schemas/WN-LMF-1.1.dtd">
<LexicalResource xmlns:dc="https://globalwordnet.github.io/schemas/dc/">
  <Lexicon id="oewn"
           label="Open English WordNet"
           language="en"
           email="[email protected]"
           license="https://creativecommons.org/licenses/by/4.0/"
           version="2021"
           citation = "John P. McCrae, Alexandre Rademaker, Francis Bond, Ewa Rudnicka and Christiane Fellbaum (2019) English WordNet 2019 – An Open-Source WordNet for English, *Proceedings of the 10th Global WordNet Conference* – GWC 2019"
           url="https://github.com/globalwordnet/english-wordnet">

Note that the version is 2021 and not 2022. This is true for both release files.

I confirmed that the files are in fact different from the 2021 release XML; only the version attribute was not updated.
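
For anyone who wants to check a release file themselves, here is a minimal sketch that reads the `id` and `version` attributes of the first `<Lexicon>` element using only the Python standard library. The function names are hypothetical, and the in-memory sample is a shortened version of the header quoted above, not the real release file.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def first_lexicon_version(stream):
    """Return (id, version) of the first <Lexicon> element, or None if absent."""
    # "start" events fire once the opening tag has been parsed, so the
    # attributes can be read without waiting for the lexicon's contents.
    for _event, elem in ET.iterparse(stream, events=("start",)):
        if elem.tag == "Lexicon":
            return elem.get("id"), elem.get("version")
    return None


def first_lexicon_version_from_path(path):
    """Same check for a plain or gzip-compressed WN-LMF file on disk."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        return first_lexicon_version(f)


# Shortened sample modelled on the header quoted in this issue.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LexicalResource SYSTEM "http://globalwordnet.github.io/schemas/WN-LMF-1.1.dtd">
<LexicalResource xmlns:dc="https://globalwordnet.github.io/schemas/dc/">
  <Lexicon id="oewn" label="Open English WordNet" language="en"
           version="2021" url="https://github.com/globalwordnet/english-wordnet">
  </Lexicon>
</LexicalResource>
"""

print(first_lexicon_version(io.BytesIO(sample)))
```

Returning at the first `<Lexicon>` avoids parsing the rest of a large file; a file with several lexicons would need the loop to collect all of them instead.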

As a side note, the WN-LMF version could be 1.2, but I don't think there are any differences in the WN-LMF format between 1.1 and 1.2 (it seems all the differences were for the other formats).

To Reproduce

I noticed this when trying to import OEWN-2022 into Wn (see goodmami/wn#181):

>>> import wn
>>> wn.download('oewn:2022')
Download [##############################] (12528988/12528988 bytes) Complete
Skipping oewn:2021 (Open English WordNet); already added

PosixPath('/home/goodmami/.wn_data/downloads/3e786e15b7f43627beddafdfa4a95807a5b12cf7')
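
The "Skipping oewn:2021" message is what you would expect if a lexicon is identified by the `id` and `version` declared inside the file rather than by the name it was requested under. The following is a toy model of that behaviour, not Wn's actual code; `add_lexicon` and the `installed` set are hypothetical names used only for illustration.

```python
def add_lexicon(installed, declared_id, declared_version):
    """Register a lexicon under the id:version declared in its XML header."""
    specifier = f"{declared_id}:{declared_version}"
    if specifier in installed:
        # The declared specifier collides with one already present.
        return f"Skipping {specifier}; already added"
    installed.add(specifier)
    return f"Added {specifier}"


# The 2021 release is already installed.
installed = {"oewn:2021"}

# The 2022 asset was requested, but its header declares version="2021".
print(add_lexicon(installed, "oewn", "2021"))

# With a corrected header the same request would register a new lexicon.
print(add_lexicon(installed, "oewn", "2022"))
```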

Expected behavior

It would be great if the release assets (both links) were replaced with new files that use version="2022":

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LexicalResource SYSTEM "http://globalwordnet.github.io/schemas/WN-LMF-1.1.dtd">
<LexicalResource xmlns:dc="https://globalwordnet.github.io/schemas/dc/">
  <Lexicon id="oewn"
           label="Open English WordNet"
           language="en"
           email="[email protected]"
           license="https://creativecommons.org/licenses/by/4.0/"
           version="2022"
           citation = "John P. McCrae, Alexandre Rademaker, Francis Bond, Ewa Rudnicka and Christiane Fellbaum (2019) English WordNet 2019 – An Open-Source WordNet for English, *Proceedings of the 10th Global WordNet Conference* – GWC 2019"
           url="https://github.com/globalwordnet/english-wordnet">

goodmami added the release format label Mar 9, 2023

goodmami commented Apr 8, 2023

@jmccrae just checking in. Do you think you could re-release the 2022 WN-LMF files with this simple fix? I wouldn't mind them being released alongside the originals (e.g., english-wordnet-2022-r2.xml.gz) in case you don't want to break anything.
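
To show how small the change is, here is a rough sketch of patching only the `version` attribute of the first `<Lexicon>` opening tag and leaving every other byte untouched, including the XML declaration and the DOCTYPE. It is not necessarily how the maintainers would produce the files. It assumes the opening tag contains no literal `>` inside an attribute value and that the attribute is double-quoted; the function name and the source file name in the trailing comment are hypothetical.

```python
import gzip
import re

# Opening tag of the first <Lexicon> element (assumes no literal ">" in attributes).
LEXICON_TAG = re.compile(rb"<Lexicon\b[^>]*>")
# A double-quoted version attribute preceded by whitespace.
VERSION_ATTR = re.compile(rb'(\sversion\s*=\s*")[^"]*(")')


def set_lexicon_version(data: bytes, new_version: str) -> bytes:
    """Return data with the first <Lexicon> tag's version attribute replaced."""
    match = LEXICON_TAG.search(data)
    if match is None:
        raise ValueError("no <Lexicon> opening tag found")
    new_bytes = new_version.encode("ascii")
    patched, count = VERSION_ATTR.subn(
        lambda m: m.group(1) + new_bytes + m.group(2), match.group(0), count=1
    )
    if count == 0:
        raise ValueError("<Lexicon> tag has no version attribute")
    # Only the opening tag is rewritten; the rest of the file is copied as-is.
    return data[: match.start()] + patched + data[match.end():]


sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<LexicalResource xmlns:dc="https://globalwordnet.github.io/schemas/dc/">
  <Lexicon id="oewn"
           version="2021"
           url="https://github.com/globalwordnet/english-wordnet">
  </Lexicon>
</LexicalResource>
"""
print(set_lexicon_version(sample, "2022").decode("utf-8"))

# Applied to a release file (the source file name is an assumption):
#   with gzip.open("english-wordnet-2022.xml.gz", "rb") as src:
#       fixed = set_lexicon_version(src.read(), "2022")
#   with gzip.open("english-wordnet-2022-r2.xml.gz", "wb") as dst:
#       dst.write(fixed)
```

Working on bytes rather than re-serializing through an XML library keeps the DOCTYPE line and the original attribute formatting intact, at the cost of relying on the stated assumptions about the header.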

jmccrae closed this as completed in bc07902 May 5, 2023
jmccrae added a commit that referenced this issue May 5, 2023

goodmami commented May 7, 2023

@jmccrae thanks for merging and publishing the new release files! I've now released Wn 0.9.4 with OEWN 2022 indexed.
