Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

platform.short_name and instrument.short_name cannot be empty #1

Closed
akorosov opened this issue Mar 29, 2017 · 7 comments
Closed

platform.short_name and instrument.short_name cannot be empty #1

akorosov opened this issue Mar 29, 2017 · 7 comments
Assignees

Comments

@akorosov
Copy link
Member

platform.short_name and instrument.short_name are used as natural_key in Source and cannot be empty.
But for level-4 data the short_names are empty because higher level GCMD platform and GCMD instrument must be used (e.g. ACTIVE REMOTE SENSING).
So far a hack is introduced in Nansat for GLOBCURRENT data: mapper_opendap_globcurrent sets instrument and platform to Jason-1.

@mortenwh
Copy link
Contributor

mortenwh commented Oct 24, 2018

As mentioned in nansencenter/nansat#389, a dataset can have several sources. Currently, we have one source per Dataset but moving the ForeignKey from the Datasetmodel to the Source model would allow multiple sources which can be defined more specifically using the short_names.

This involves some significant changes in the code, including migrations, so it needs some focus time...

@akorosov akorosov self-assigned this Jan 22, 2019
@mortenwh mortenwh assigned mortenwh and unassigned akorosov Jan 31, 2019
@mortenwh
Copy link
Contributor

mortenwh commented Feb 1, 2019

I suggest to add all "super-fields" of shortname to the natural_key methods and the __str__ methods in the vocabularies to solve the problem with empty fields.

@mortenwh
Copy link
Contributor

mortenwh commented Feb 4, 2019

  • Update Dataset.sources to contain previous value of Dataset.source in catalog/migrations/0007_auto_20190204_1025.py

  • Update Dataset.parameters to point to same parameters as previously pointed to by the intermediate model (DatasetParameter)

  • Test data migration to new Dataset.parameters and Dataset.sources fields:

  • First, forward migration:

$ git checkout master
$ rm -rf project/
$ vagrant provision
$ vagrant ssh
$ source activate py3django
$ cd project
$ ./manage.py flush
$ ./manage.py loaddata /vagrant/geospaas/vocabularies/fixtures/vocabularies.json
$ ./manage.py loaddata /vagrant/geospaas/catalog/fixtures/catalog.json
$ ./manage.py shell
In [1]: from geospaas.catalog.models import Dataset                                                                                                        
In [2]: from geospaas.vocabularies.models import Parameter                                                                                                 
In [3]: from geospaas.catalog.models import DatasetParameter                                                                                               
In [4]: ds = Dataset.objects.get(pk=1)                                                                                                                     
In [5]: pp = Parameter.objects.get(standard_name='sea_surface_temperature')                                                                                
In [6]: DatasetParameter.objects.get_or_create(dataset=ds, parameter=pp)                                                                                   
Out[6]: 
(<DatasetParameter: AQUA/MODIS/2010-01-01T00:00:00+00:00:sea_surface_temperature>,
 True)
In [7]: ds.parameters.all()[0]                                                                                                                             
Out[7]: <Parameter: sea_surface_temperature>
$ exit 
$ git checkout issue1_gcmd_short_names
$ vagrant ssh
$ ./manage.py migrate
$ ./manage.py shell
In [1]: from geospaas.catalog.models import Dataset                                                                                                        
In [2]: ds = Dataset.objects.get(pk=1)                                                                                                                     
In [3]: ds.parameters.all()[0]                                                                                                                             
Out[3]: <Parameter: sea_surface_temperature>

Then reverse migration:

$ ./manage.py migrate catalog 0006_auto_20190130_1442
$ exit
$ git checkout master
$ vagrant ssh
$ source activate py3django
$ cd project
$ ./manage.py shell
In [1]: from geospaas.catalog.models import Dataset                                                                                                        
In [2]: ds = Dataset.objects.get(pk=1)                                                                                                                     
In [3]: ds
Out[3]: <Dataset: AQUA/MODIS/2010-01-01T00:00:00+00:00>
In [4]: ds.source.pk                                                                                                                                       
Out[4]: 1
In [5]: ds.source                                                                                                                                          
Out[5]: <Source: AQUA/MODIS>
In [6]: ds.parameters.all()                                                                                                                                
Out[6]: <QuerySet [<Parameter: sea_surface_temperature>]>
$ exit
$ git checkout issue1_gcmd_short_names
$ vagrant ssh
$ source activate py3django
$ cd project
$ ./manage.py migrate # db is in final state...
  • Update web form in viewer to enable searches for platform, instrument and parameter

  • I have updated the vagrant configuration to be able to run the django web server as ./manage.py runserver 192.168.33.10:9090

  • The search form is unconvenient (finding the right platform and instrument is a nightmare) but works as expected

  • It will be a quite big job to update the search form so I suggest to make a new issue to handle that at a later stage...

  • Change method Dataset.__str__ to return entry_title instead of source platform and instrument

  • Make sure nansat_ingestor.mangers.get_or_create creates meaningful entry_title (I suggest to simply use the filename, as otherwise we will have to update all the nansat mappers)

mortenwh added a commit that referenced this issue Feb 4, 2019
…st version of Aqua platform keywords (was AQUA before)
mortenwh added a commit that referenced this issue Feb 4, 2019
…from ForeignKey source to ManyToManyField sources, changed __str__ method of Source to rely on platform and instrument only (not specs) and removed the field specs.
mortenwh added a commit that referenced this issue Feb 4, 2019
mortenwh added a commit that referenced this issue Feb 4, 2019
…de itself. Assert that the entry_title is now equal to the filename, test new use of 'platform/instrument' as list
mortenwh added a commit that referenced this issue Feb 4, 2019
mortenwh added a commit that referenced this issue Feb 4, 2019
…t and parameter instead of source. This is still ongoing..
mortenwh added a commit that referenced this issue Feb 4, 2019
mortenwh added a commit that referenced this issue Feb 4, 2019
mortenwh added a commit that referenced this issue Feb 4, 2019
… there is extra information linked to the connection between two models. In this case, we don't have that...
mortenwh added a commit that referenced this issue Feb 4, 2019
…new migrations, and updated fixtures. This also applies to #51
mortenwh added a commit that referenced this issue Feb 4, 2019
mortenwh added a commit that referenced this issue Feb 4, 2019
mortenwh added a commit that referenced this issue Feb 4, 2019
@mortenwh
Copy link
Contributor

mortenwh commented Feb 5, 2019

Note that the current search form only allows single entries of parameters, platforms and instruments. If we want to be able to search, e.g., several parameters, we need to change the filtering in the render method viewer.views.IndexView.render

mortenwh added a commit that referenced this issue Feb 5, 2019
mortenwh added a commit that referenced this issue Feb 5, 2019
mortenwh added a commit that referenced this issue Feb 6, 2019
…ged natural_key and get_by_natural_key methods of the Source model
mortenwh added a commit that referenced this issue Feb 6, 2019
mortenwh added a commit that referenced this issue Feb 6, 2019
@akorosov
Copy link
Member Author

geospaas.catalog.models.Source

defines natural_key()
which uses short_name of insterument and platform.
But sometimes short_name can be empty. E.g.
[('Category', 'Earth Remote Sensing Instruments'),
('Class', ''),
('Type', ''),
('Subtype', ''),
('Short_Name', ''),
('Long_Name', '')]
In that case natural key is ambigous and Django fails.

Solution 1

Add all fields to natural key: Category, Class, etc

Solution 2

Remove natural_key()
Question: Do we need this natural key at all?
https://docs.djangoproject.com/en/3.0/topics/serialization/#natural-keys

Solution 3

Replace ForeingKey with ManyToMany relatiuonship from Dataset to Source. That is complicated
as it requires changing Nansat.

opsdep pushed a commit that referenced this issue Mar 26, 2020
@opsdep opsdep assigned akorosov and unassigned akorosov Mar 26, 2020
@opsdep
Copy link
Contributor

opsdep commented Mar 26, 2020

@akorosov
Dear Anton, please review the third test as well. I have modified the tests and return the fixture to their original state based on your comments.

akorosov pushed a commit that referenced this issue Mar 30, 2020
@akorosov
Copy link
Member Author

Closed in #80 by removing natural_key from several models in vocabularies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants