Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GNIP 75 - Enable spatial data store for Django DB #6079

Closed
6 of 11 tasks
mtnorthcott opened this issue May 13, 2020 · 35 comments
Closed
6 of 11 tasks

GNIP 75 - Enable spatial data store for Django DB #6079

mtnorthcott opened this issue May 13, 2020 · 35 comments
Assignees
Labels
gnip A GeoNodeImprovementProcess Issue

Comments

@mtnorthcott
Copy link
Contributor

mtnorthcott commented May 13, 2020

GNIP 75 - Enable spatial data store for Django DB

Overview

This GNIP proposes the transition to spatial data stores such as PostGIS and Spatialite in place of the existing Django database. This will enable more streamlined and easier implementation of features involving geospatial queries such as 'search by extent'. The implication of this transition is the migration of existing data sets. The implementation requires adjustments to all installation sources and fixes to the Travis CI and paver test suites.

Proposed By

@mtnorthcott
@geotob

Assigned to Release

This proposal is for GeoNode 3.x

State

  • Under Discussion
  • In Progress
  • Completed
  • Rejected
  • Deferred

Existing work

This GNIP started as a fix for the 'search by extent' feature, proposed in #6064 which serves as a basis. In this state, the feature works with the Docker, paver and SPCGeoNode installation methods. The focus is now on getting Travis CI tests to pass, documenting and advertising all changes thoroughly.

Pull Requests:

Motivation

  • Easier storage and handling of geospatial data
  • Removal of unnecessary bbox_* decimal fields
  • More simplistic implementation of features like 'search by extent'
  • On-the-fly reprojection of non-EPSG4326 bounding boxes

Originally, an issue (#4346) was raised over the 'search by extent' feature not working properly. The issue was mitigated in a commit, however, the end goal is to transition to spatial databases exclusively. Work towards this goal has begun in effort to enable full functionality to the 'search by extent' feature in #6064.

Proposal

  • Update paver installation method and test accordingly
  • Update Docker installation method and test accordingly
  • Update Django test suite
  • Update Travis CI
  • Update all documentation accordingly, especially the setup parts
  • Advertise all new changes accordingly

Backwards Compatibility

  • We can maintain access to bounding box fields but calculate these upon request via Python @property decorated access methods
  • Tests require adjustment to be compatible with spatial data stores

Future evolution

Feedback

Some initial thoughts have been discussed in the original issue (#4346) and the 'search by extent' pull request (#6064).

Voting

Project Steering Committee:

  • Giovanni Allegi: ✔
  • Alessio Fabiani: ✔
  • Francesco Bartoli: abstained
  • Simone Dalmasso:
  • Toni Schoenbuchner: ✔
  • Florian Hoedt: ✔

Links

@mtnorthcott mtnorthcott changed the title Enable spatial data store for Django DB GNIP 6079 - Enable spatial data store for Django DB May 13, 2020
@mtnorthcott
Copy link
Contributor Author

@t-book If approved, please add this to the wiki. Cheers

@t-book t-book changed the title GNIP 6079 - Enable spatial data store for Django DB GNIP 75 - Enable spatial data store for Django DB May 13, 2020
@t-book
Copy link
Contributor

t-book commented May 13, 2020

@t-book t-book added the gnip A GeoNodeImprovementProcess Issue label May 13, 2020
@t-book
Copy link
Contributor

t-book commented May 13, 2020

@mtnorthcott Thanks a lot for this PR which will definitely improve our codebase. Where I am unsure is the issue with SQLite which will stop working as from then we rely on PostGIS, right? In other words, paver setup which is very nice, for quick testing and development, will diverge in functionality from a setup using PostGIS. In case you or others do not have a better solution, I could live with it in case we're returning a meaningful error message plus having a bold red warning in docs.

@srtonz
Copy link

srtonz commented May 13, 2020

@mtnorthcott Thanks a lot for this PR which will definitely improve our codebase. Where I am unsure is the issue with SQLite which will stop working as from then we rely on PostGIS, right? In other words, paver setup which is very nice, for quick testing and development, will diverge in functionality from a setup using PostGIS. In case you or others do not have a better solution, I could live with it in case we're returning a meaningful error message plus having a bold red warning in docs.

SQLite should continue to work through Spatialite, though @mtnorthcott has run into the issue described on his PR. It works for me, though my machine has a newer libspatialite 5.0.0 installed.

If it really comes down to a version incompatibility between Shapely and Spatialite (as this ticket suggests - shapely/shapely#904) we may have to include a warning in the documentation?

@t-book
Copy link
Contributor

t-book commented May 17, 2020

my +1

@afabiani
Copy link
Member

+1

1 similar comment
@giohappy
Copy link
Contributor

+1

@gannebamm
Copy link
Contributor

+1
nice work @geotob and @mtnorthcott

@francbartoli
Copy link
Member

+0 as I cannot figure out all the possible implications right now. Maybe there aren't :)

@t-book
Copy link
Contributor

t-book commented May 19, 2020

With Francesos answer I would say this PR can be seen as accepted as the majority of PSC members voted +1

@afabiani
Copy link
Member

The GNIP, the points of the proposal should be met first.

@t-book
Copy link
Contributor

t-book commented May 19, 2020

@afabiani I do not fully understand you, do you mean the existing PR should first met the points stated in this GNIP before it can be merged?

@afabiani
Copy link
Member

I would like to see all the PRs related to all the points of the GNIP before start merging, that what I'm saying. Aka, documentation, travis updates and updates to installations methods (this last point probably easier, except that needs testing, because almost all already use postgresql).

Moreover, paver commands using sqlite will be dropped as far as I understood. So I'd expect some checks on the DB added to pavements script too.

@mtnorthcott
Copy link
Contributor Author

mtnorthcott commented May 27, 2020

I now have working installations of GeoNode using paver commands and docker-compose which utilise Spatialite and PostGIS databases, respectively. Although it is possible to run a Postgres server locally and enable PostGIS when using a local paver installation, it is easier to maintain SQLite support for this method. It's works as-is with the workaround suggested in shapely/shapely#904 (included in my PR). However, I can document how to use PostGIS locally, despite not being the default, if necessary.

Lastly, the Selenium tests are currently causing my PR to fail in Travis CI. I believe this could be caused by their dependence on the geonode/postgis Docker image, which requires an update to be compatible with PostGIS (see GeoNode/postgis-docker#2 - though I understand this cannot be merged yet).

Any ideas on the best course of action from here? I would be keen to hear the community's thoughts.

@mtnorthcott
Copy link
Contributor Author

Work on Travis CI is progressing. I have confirmed that the standard Selenium test passes with an updated geonode/postgis image but the SPCGeoNode variant runs into some issues with the celerycam container where I observe the issue described in #6018.

Please see the description for an updated list of PRs pertaining to this GNIP (which I will continue to update)

@mtnorthcott
Copy link
Contributor Author

Most tests now pass under Travis CI. To get SPCGeoNode working, the celerycam container was disabled as a temporary measure. I will wait on community feedback concerning #6018 before finalising this, but given that the package is now deprecated and incompatible with Django 2.2 as per jazzband/django-celery-monitor#107, I find this appropriate until a workaround or replacement is found.

@afabiani Would you be able to review my PR at some stage and let me know your thoughts? I will return on Wednesday next week so there's no rush. Cheers.

@afabiani
Copy link
Member

@mtnorthcott sure, thanks for your effort. Will try to review it in these days.

afabiani pushed a commit to GeoNode/geonode-selenium that referenced this issue Jun 1, 2020
@mtnorthcott
Copy link
Contributor Author

Tests now passing under Travis.

@afabiani
Copy link
Member

afabiani commented Jun 5, 2020

@mtnorthcott cool, good job!

I guess the PR is stable now, I'm going to do final testing and help with the documentation. Then we can finally merge it.

@giohappy
Copy link
Contributor

giohappy commented Jun 5, 2020

great work @mtnorthcott. This is an important step for GeoNode, it will open up new capabilities also for any new REST API which could require spatial filtering / search of data, preview of data locations, etc.

@t-book
Copy link
Contributor

t-book commented Jun 5, 2020

well done @mtnorthcott thanks a lot!

magic

:))

@mtnorthcott
Copy link
Contributor Author

Appreciate the feedback! It's certainly exciting that we're getting close :)

I realise I made an error and put a duplicate link in my PR list. I have amended this and have also added an initial documentation proposal. Feel free to critique.

@afabiani
Copy link
Member

I guess we are very close, I just need to test it with all the current GeoNode installation ways.

I'll put a list here below and I'll check the successful ones while testing:

  • core dev mode
  • core docker
  • core spcgeonode
  • project dev mode
  • project docker

afabiani pushed a commit to GeoNode/postgis-docker that referenced this issue Jun 15, 2020
afabiani pushed a commit to GeoNode/documentation that referenced this issue Jul 21, 2020
[Issue GeoNode/geonode#6079] Reference spatial data stores for Django database
afabiani pushed a commit to GeoNode/documentation that referenced this issue Jul 21, 2020
afabiani pushed a commit to GeoNode/documentation that referenced this issue Jul 21, 2020
Revert "[Issue GeoNode/geonode#6079] Reference spatial data stores for Django database"
afabiani pushed a commit to GeoNode/documentation that referenced this issue Jul 21, 2020
@gannebamm
Copy link
Contributor

What is the status for this GNIP?

@srtonz
Copy link

srtonz commented Aug 17, 2020

Good question - I was under the impression that the remaining issues on #6064 had been fixed and everything was ready to go.

@mtnorthcott will rebase the PR this week and check if everything works. It'd be good to get a second opinion too once that's done.

@mtnorthcott
Copy link
Contributor Author

I have now rebased #6064 and locally verified that the Docker and paver installation methods work as expected and behave as they did before the rebase. I did encounter some errors in the Docker installation where some layers did not upload correctly. However, reverting to the current master branch yielded the same results. A summary of these issues are as follows:

  • Failure to start using the documented docker-compose command (docker-compose -f docker-compose.yml -f docker-compose.override.localhost.yml up) -- returned 'Bad Gateway' on navigation to http://localhost -- the following tests run using docker-compose -f docker-compose.async.yml -f docker-compose.development.yml
  • Uploaded vector layers do not show in the MapStore preview at all, although the bounding box calculation is correct and the expected area of interest is shown
  • Returns 'Bad Request' on attempting to upload a zipped shapefile
  • Returns a Java NoSuchMethodException when uploading a zipped shapefile with additional tabular metadata

From here, the GNIP is nearly complete. I'll be looking into fixing the CI build and then the PR will be ready for another review I expect.

@mtnorthcott
Copy link
Contributor Author

CI build has been fixed.

The exercise involved upgrading the CI to bionic for the newer version of SQLite/Spatialite. The previously used backport ppa has since been discontinued. Using bionic also means an update to GDAL version 2.2.x, or 2.4.x with the official Ubuntu GIS ppa. This revealed an issue with renaming shapefile column names with windows-1258 encoding which I have detailed in #6394. I have implemented a workaround (also detailed in the issue) to restore old CI functionality for the test affected.

That said, the PR is ready for another review @afabiani. The problems outlined in the last review are now be resolved and rebased on master.

@afabiani
Copy link
Member

afabiani commented Sep 1, 2020

@mtnorthcott awesome! Thanks for your big effort. I'll try to review asap. Performing all the tests requires (both automatic and manual) requires a lot of time.

@afabiani
Copy link
Member

afabiani commented Oct 17, 2020

@mtnorthcott @gannebamm @t-book @giohappy @geotob @francbartoli PR merged, it is working nicely on master.

Still to backport on 3.x (aka 3.1) after more extensive tests.

We still need to check again the documentation and merge it. Other than this, we will need to advertise people on the new model change. Especially, updating the existing layers will need to run "updatelayers" management command in order to fix the bounding boxes.

P.S.: thanks @mtnorthcott for the hard work. Sorry this taken so long.

@srtonz
Copy link

srtonz commented Oct 19, 2020

Thanks @afabiani for the time on reviewing this 👍

We'll make a ticket internally to check & update documentation and include a visible breaking change notification about having to run updatelayers. We'll try to submit this before the new year :-)

@gannebamm
Copy link
Contributor

Ok, I think this will be in 3.2 than, which is perfectly fine. What do you guys think @afabiani @t-book @giohappy @francbartoli ?
With Geostories on the horizon and this PR and the rest-api rework 3.2 will have loads of new cool features. I would therefore use 3.1 as more update cenctric relase for 3.x branch and get it done soon?

@gannebamm
Copy link
Contributor

The current geonode-project docker composition fails with:

  Applying base.0044_resourcebase_bbox_polygon...Traceback (most recent call last):
  File "manage.py", line 31, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 83, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/commands/migrate.py", line 232, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 245, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/python3.8/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/python3.8/site-packages/django/db/migrations/operations/fields.py", line 110, in database_forwards
    schema_editor.add_field(
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/schema.py", line 433, in add_field
    definition, params = self.column_sql(model, field, include_default=True)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/schema.py", line 150, in column_sql
    db_params = field.db_parameters(connection=self.connection)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/fields/__init__.py", line 696, in db_parameters
    type_string = self.db_type(connection)
  File "/usr/local/lib/python3.8/site-packages/django/contrib/gis/db/models/fields.py", line 105, in db_type
    return connection.ops.geo_db_type(self)
AttributeError: 'DatabaseOperations' object has no attribute 'geo_db_type'

I already tried to rebuilt the docker-postgis container but it is still not working.

@mtnorthcott
Copy link
Contributor Author

mtnorthcott commented Oct 28, 2020

Thanks for the feedback and for reviewing @afabiani

@gannebamm I have taken a look at the geonode-project repo but could get to the stage of performing migrations due to the servers being down at the time. Regardless, when scoping the GNIP, the geonode-project repo was not considered and does not have a 3.x version release. For the time being, I will leave this up to the community to investigate but would be happy to help if needed.

afabiani pushed a commit to GeoNode/documentation that referenced this issue Nov 1, 2020
@afabiani afabiani closed this as completed Nov 1, 2020
@afabiani
Copy link
Member

afabiani commented Nov 1, 2020

Doc merged here

@frafra
Copy link
Contributor

frafra commented Nov 28, 2021

@gannebamm you should use postgis:// instead of postgresql:// for DATABASE_URL (.env file).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gnip A GeoNodeImprovementProcess Issue
Projects
None yet
Development

No branches or pull requests

8 participants