Avoid parsing an ontology source file that is identical to the previous version, even if retrieved by the cron job #171

Closed
jonquet opened this issue Jan 5, 2022 · 3 comments
Labels
content Issues related to the content of AgroPortal feature request

Comments


jonquet commented Jan 5, 2022

This issue is a long-term fix for problems such as the ones in #167.

We need to implement a mechanism that explicitly double-checks that the downloaded source file differs from the source file of the previous version (if one exists).
This mechanism must be added to the automatic pull script, which checks for file changes using HTTP headers.

@syphax-bouazzouni syphax-bouazzouni added the content Issues related to the content of AgroPortal label Jan 10, 2022

syphax-bouazzouni commented Jan 11, 2022

After checking the code to see how the ncbo_cron job determines that a new version of an ontology has been released, we found that it does not look at the HTTP headers. Instead, it downloads every ontology each day and hashes the file to compare it with the local one (see the code below; source: https://github.com/ontoportal-lirmm/ncbo_cron/blob/master/lib/ncbo_cron/ontology_pull.rb#L54).

[Image: screenshot of the code referenced above]

In summary, here is the pull-checking process. Every day at 18:00, for each ontology, we:

  1. Check that the URI is accessible and responds.
  2. Download the remote file, hash it, and compare the hash with that of the local file.
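
To make step 2 concrete, here is a minimal Python sketch of the hash comparison. It is not the ncbo_cron code (which is Ruby, see the link above). The function names `file_sha256` and `is_new_version` are hypothetical, and SHA-256 is an assumption chosen for illustration; the actual script may use a different digest.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large ontology files are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_new_version(downloaded: Path, previous: Optional[Path]) -> bool:
    """Return True when the downloaded file should be treated as a new version.

    Hypothetical helper: with no previous file there is nothing to compare
    against, so the download counts as new.
    """
    if previous is None or not previous.exists():
        return True
    return file_sha256(downloaded) != file_sha256(previous)
```

With a check like this, a download whose bytes are identical to the previous source file would be skipped rather than parsed again.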

@syphax-bouazzouni

So @jonquet, the requirements of this issue need to be updated, or the issue should be closed.


jonquet commented Jan 11, 2022

Now that we see how the script works, we can close this issue, as the check is already implemented.
In the future, we might create another issue to enhance the script so that it does not pull the file if the HTTP headers do not indicate a change since the last pull.
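
As a rough illustration of that possible enhancement, the sketch below compares the `ETag` and `Last-Modified` validators from an HTTP `HEAD` response with the ones stored at the last pull. Everything here is an assumption: the function names are hypothetical, nothing like this exists in ncbo_cron according to this thread, and it only helps for servers that actually send these headers. When no validator can be compared, it falls back to reporting a possible change so that the existing download-and-hash check still runs.

```python
import urllib.request
from typing import Dict, Mapping


def fetch_validators(url: str, timeout: float = 30.0) -> Dict[str, str]:
    """Issue a HEAD request and return the cache validators the server sent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        found = {}
        for name in ("ETag", "Last-Modified"):
            value = response.headers.get(name)
            if value:
                found[name.lower()] = value
        return found


def remote_may_have_changed(stored: Mapping[str, str],
                            current: Mapping[str, str]) -> bool:
    """Return False only when a validator present on both sides is unchanged."""
    old = {key.lower(): value for key, value in stored.items()}
    new = {key.lower(): value for key, value in current.items()}
    # Prefer ETag; use Last-Modified only when ETag is missing on either side.
    for name in ("etag", "last-modified"):
        if old.get(name) and new.get(name):
            return old[name] != new[name]
    # No comparable validator: assume a change and let the hash check decide.
    return True
```

A pull script could call `fetch_validators` for each ontology URL, skip the download when `remote_may_have_changed` returns `False`, and store the new validators after each successful pull.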

@jonquet jonquet closed this as completed Jan 11, 2022