GNIP-51: GeoNode 4 #3228

pjdufour · 2017-08-23T23:20:32Z

GeoNode 4

At the recent FOSS4G 2017, the community discussed our vision for GeoNode 3. Below you will find a document that describes a few of the principles we identified as important to adopt in the next major release of GeoNode. Take a read!

https://github.com/GeoNode/geonode-vision/blob/master/geonode-vision.md

This document is only a draft and we'd appreciate your input. You can provide feedback in the comments below or edit the document if there's agreement. Thanks!

tomkralidis · 2017-08-23T23:59:28Z

Looks good! Initial comments:

- Can we clarify "fresh CSW"?
- Should GeoNode be Python 3 only?

pjdufour · 2017-08-24T00:09:04Z

In regards to "fresh CSW", if I recall correctly GeoNode at one time supported GeoNetwork and other CSW backends, but right now pyCSW is the only recommended backend. That legacy code has prevented deep integration of permissions, users, etc. into our pyCSW metadata services. IMHO, we should generate CSW directly from the models as an API service, instead of the current flow where we save ISO 19115 XML directly into the model and then translate. I'd be in favor of explicitly promoting pyCSW as a top-level component, but we haven't discussed it yet.

In regards to Python 3, that's a good question! I don't know.

tomkralidis · 2017-08-24T01:07:23Z

Some clarifications:

- We do generate CSW directly from the models. We additionally store a synced static ISO 19139:2007 document for the CSW workflow where a client asks for ISO with a full element set (a specification requirement).
- There's nothing stopping deep integration of permissions or users. pycsw is extensible such that repository plugins can handle this transparently.
- There is room for improvement in the pycsw GeoNode plugin query functionality. Currently, pycsw translates the CSW query into an SQL WHERE clause. This would be better implemented by the plugin interpreting the CSW query (pycsw passes a dict of the CSW query into plugins) into its native query syntax (Django query, ES, Solr, etc.). This would play nicer with users/permissions native to Django.
- What do we mean by top-level component?
- I'd be in favour of Python 3 only, to take advantage of its improvements.
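
The plugin-side translation described above might look roughly like the sketch below. It is an illustration under stated assumptions, not pycsw's actual plugin interface: the shape of the `constraint` list, the `FIELD_MAP` names, and the `%` wildcard are all hypothetical. The only real convention used is Django's field-lookup suffixes (`__icontains`, `__gte`, and so on), so the result could be passed to a queryset's `filter(**lookups)`.

```python
# Hypothetical mapping from CSW queryables to model field names (assumed).
FIELD_MAP = {
    "dc:title": "title",
    "dc:subject": "keywords__name",
    "dct:modified": "date",
}

# OGC filter comparison operators mapped to Django field-lookup suffixes.
OPERATOR_MAP = {
    "PropertyIsEqualTo": "exact",
    "PropertyIsLike": "icontains",
    "PropertyIsGreaterThanOrEqualTo": "gte",
    "PropertyIsLessThanOrEqualTo": "lte",
}


def constraint_to_lookups(constraint):
    """Translate a list of simple predicates (implicitly ANDed) into filter kwargs."""
    lookups = {}
    for predicate in constraint:
        field = FIELD_MAP[predicate["property"]]
        suffix = OPERATOR_MAP[predicate["operator"]]
        value = predicate["value"]
        if suffix == "icontains":
            # Assumes "%" is the wildcard character; strip it at both ends.
            value = value.strip("%")
        lookups[f"{field}__{suffix}"] = value
    return lookups


example = constraint_to_lookups([
    {"property": "dc:title", "operator": "PropertyIsLike", "value": "%roads%"},
    {"property": "dct:modified", "operator": "PropertyIsGreaterThanOrEqualTo",
     "value": "2017-01-01"},
])
print(example)
```

Because the output is ordinary ORM filter arguments, a permissions filter could in principle be chained onto the same queryset rather than being bolted onto a raw SQL clause.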

pjdufour · 2017-08-24T01:13:35Z

By "top-level" I meant we should make pyCSW a required component rather than optional.
My thinking is that with a fresh codebase, we could build much deeper pyCSW integration.
Also, FWIW, caching of the CSW directly in the model has led to issues when the IP address changes, etc. IMHO, it would be better not to cache it at all.
We shouldn't need to rely on updatelayers as much as we do now to fix synchronization issues.

tomkralidis · 2017-08-24T01:21:00Z

we'll also want to address CSW transactions (which would get easier with deeper integration) and CSW 3.0 by default (it's configurable in the pycsw API)
+1 to iron out synchronization

More comments/thoughts:

should we consider MapServer or Mapnik as a data provider option?

afabiani · 2017-08-24T07:51:44Z

Hi all, thanks for putting together this proposal. Overall it looks very good to me.
The points I would like to highlight and possibly discuss further are:

- The core of GeoNode 3 should be oriented more towards geospatial data than the simple concept of a Layer. A geospatial dataset may produce more than one Layer, and Layers can also be produced by analysis and processing.
- GeoNode 3 should be better at implementing and correctly managing workflows. This matters from both the users' and the maintainers' perspective. Users often need to manage the data flow better, establishing who can publish data, who can check its quality, and who can access the results when they are ready. Developers often need to introduce more checks on the data uploaded by users and possibly pre-process it before ingesting it into the system. It would also be very useful for GeoNode 3 to manage and follow remote processing.
- Generally speaking, the overall architecture should move in a direction where the GeoNode core is completely independent of the geospatial backend.
- Multi-tenancy support and clustering. The GeoNode 3 architecture should be ready for cloud deployment and for use by big organizations that need to tweak the portal, both in layout and in available functionality, according to the connected Group / User.
- Integration of DevOps instruments to gain more control over access to the portal. It is very important to be able to understand who is doing what and, if needed, to limit unwanted activity on the portal.
- Pluggable and extensible metadata system. Rely more on the power of instruments like pycsw and GeoNetwork so that GeoNode 3 can support dynamic metadata forms.

pjdufour · 2017-08-24T12:31:46Z

Thanks for the feedback @afabiani! Below are comments/questions:

My suggestion in the past has been the addition of a "collection" or "project" model that would encapsulate a set of layers, documents, and maps. The project could be manually or automatically created. As mentioned, the UI could be quite different depending on the instance. Would that satisfy your requirement?
We should certainly add the necessary hooks to the core for workflow management (database fields and APIs), but I wonder how much of this is more on the UI/UX side.
We can certainly add a principle on multi-tenancy support and clustering. I think we all agree with that.
Could you clarify? An ELK server could satisfy demands for continuous monitoring.
What are dynamic metadata forms?

capooti · 2017-08-24T17:50:24Z

Love all of the ideas behind this GNIP.
I agree with @afabiani on the need for a container of data (it could also be a project model, as @pjdufour defines it).
Workflow management could be handy, but I agree with @afabiani that it should come as a separate block.
Lots of good ideas here; I hope I will be able to find the time to work on it :)

pjdufour · 2017-08-25T00:43:44Z

I put together an image of the architecture we discussed at the code sprint.

https://github.com/GeoNode/geonode-vision/blob/master/GeoNode3_Vision_Architecture.jpg

tomkralidis · 2017-08-25T00:56:59Z

Thanks @pjdufour. How can we edit the diagram? Suggest we add CSW and OWS data services to articulate GeoNode's support for SDI and standards.

pjdufour · 2017-08-25T02:31:29Z

I looked for a Hackpad-like app for diagrams but couldn't find anything good, so I ended up using a Google diagram. I'll add you to it.

francbartoli · 2017-08-31T09:31:51Z

Thanks all for the discussion! A few thoughts:

General:
1. I love the diagram, which gives a quick view of the architecture of the next major version, but I'd prefer to list what GeoNode is and what it isn't, maybe in a meaningful format like Gherkin, which is also useful for BDD
```
Feature: Backend agnostic configuration
    In order to ingest spatial dataset
    As a maintainer
    I want to be able to have abstract configuration for multiple backends

    Scenario:
        Given there is a spatial dataset somewhere
        And multiple backends configuration for storing spatial data
        When somebody ingests a dataset
        And any target backend has not been chosen
        Then I see the dataset ingested in all of them
```
  It's just an example, I'm not quite sure it is all formally correct.
2. The more we open this exercise to everybody, including people who are not developers, the more we open our minds to features that are business-oriented rather than technically oriented. Duplication is not an issue right now; we can triage duplicated features/scenarios later
3. Be as generic as possible, with no reference to user interface interaction at all
Technical:
1. Minimal core with only a RESTful API: anything that can be done from a GUI should also be doable from a machine
2. Geospatial backend providers can be local or remote, should be designed as Django models, and can vary widely: classic geospatial engines (GeoServer, MapServer, QGIS Server, ArcGIS) with all their complexities, big data stores (GeoMesa, GeoWave, etc.), but also simpler ones like plain files (GeoJSON) or DB storage (e.g. GeoDjango)
3. I fully agree with the suggestions from @afabiani, in particular:
  - Introduce workflows: any ingestion should trigger a workflow that can be more or less complicated and is composed of jobs, each of them defined by atomic tasks; preprocessing, data cleaning, validation, and transformation are just a few examples
  - Introduce the concept of Dataset/Resource, where a single resource can lead to multiple layers; different workflows can be applied to a Dataset and to its individual layers
  - A dataset can be the output of any Web processing/API that has a spatial component (any geometry with coordinates and a declared CRS), whether clearly defined or hidden. We can span from WPS (local, remote) to generic Web APIs, passing through all the intermediate file formats supported by GDAL
  - Metadata are crucial: if a Dataset already has its own metadata, they can be part of the workflow in terms of checks, validation, compliance, etc., and any action performed on the data has to be added to the metadata; if it doesn't, at least a minimal set of metadata fields has to be created to track the tasks performed by GeoNode
  - Multitenancy should be supported with a more structured design "organisation-centric" for user and permission management:
    - Account Management ==> {Organisation,Role,Group,User}
    - Permission Management should rely on roles
    - There should be a dedicated django web api app for this purpose and for single sign on control while social login should be supported at the largest possible extent along with all use cases for passwordless, password management, forgot password etc
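
To make the workflow idea above concrete (an ingestion triggers a workflow, composed of jobs, each made of atomic tasks, with every action recorded in the metadata), here is a minimal sketch. All class, function, and field names are hypothetical and chosen only for illustration; nothing here reflects existing GeoNode code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# An atomic task takes a dataset record and returns the (possibly new) record.
Task = Callable[[Dict], Dict]


@dataclass
class Job:
    name: str
    tasks: List[Task]

    def run(self, dataset: Dict) -> Dict:
        for task in self.tasks:
            dataset = task(dataset)
            # Track every action performed on the data in its metadata.
            history = dataset.setdefault("metadata", {}).setdefault("history", [])
            history.append(f"{self.name}:{task.__name__}")
        return dataset


@dataclass
class Workflow:
    jobs: List[Job] = field(default_factory=list)

    def run(self, dataset: Dict) -> Dict:
        for job in self.jobs:
            dataset = job.run(dataset)
        return dataset


def validate_crs(dataset: Dict) -> Dict:
    """Validation task: reject datasets without a declared CRS."""
    if not dataset.get("crs"):
        raise ValueError("dataset has no declared CRS")
    return dataset


def normalise_name(dataset: Dict) -> Dict:
    """Cleaning task: lower-case the name and replace spaces."""
    cleaned = dataset["name"].strip().lower().replace(" ", "_")
    return {**dataset, "name": cleaned}


ingest = Workflow(jobs=[
    Job("validation", [validate_crs]),
    Job("cleaning", [normalise_name]),
])
result = ingest.run({"name": " My Roads ", "crs": "EPSG:4326"})
print(result["name"], result["metadata"]["history"])
```

A real implementation would likely run jobs asynchronously and persist their state; the sketch only shows the task/job/workflow composition and the metadata trail.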

jondoig · 2017-08-31T09:50:48Z

Exciting ideas! We would second the need for data containers or projects, but urge that this be as flexible as possible and look towards a linked open data approach, using RDF to build relationships between datasets and possibly even between features across datasets. @rob-metalinkage may have advice on how to approach this within the proposed architecture.

rob-metalinkage · 2017-09-03T23:20:50Z

Road map looks good IMHO and the discussion directions here are interesting. FYI I'm working on MapStory, a GeoNode project, on its future directions. A couple of comments:

- Strongly agree the "layer" paradigm is limited: it is often data-manager centric rather than user-needs centric. Users generally need to know the relationships between layers. ISO metadata is extremely poor in this regard; I agree with the comments that it should be a view on a richer model.
- Augmentation of metadata with custom elements can be done, as suggested, with static files using the target schema, but this will be fragile if you try to extract anything more interesting from such data. The approach I am using (which does not need to be core but should be considered as an example of a pluggable approach) is to use semantic models and create opportunities for rich reasoning to build the types of relationships and customised metadata views needed (using managed rules as just more types of content).
- A key part of metadata (also extremely poorly handled by both standards and REST API environments) is "what are the valid values in an attribute in a dataset". This is in fact the key to being able to link between features in individual layers. Exploiting this aspect of metadata about layers allows one to build a Linked Data UI for the individual features. (Note this is not Linked Data as in "turn everything to RDF in a bucket" but rather "build me a link to the service API endpoints that have specific data about the object I am looking at".)

This stuff is all nascent - but in active development and hope to have the early backend betas integrated into the MapStory UI so the intent is more visible before the end of the year.

starting point is at https://github.com/rob-metalinkage/django-gazetteer - but feel free to contact me for a more detailed walk through.

Coop56 · 2017-11-28T04:54:40Z

Is there a planned date when work will start on GeoNode 3?

francbartoli · 2017-11-28T07:26:59Z

not yet @Coop56. Do you want to start a thread in the dev mailing list?

gamesbook · 2019-03-08T06:50:18Z

Has this work been continued elsewhere? Should this issue be closed now?

francbartoli · 2019-03-08T08:42:13Z

@gamesbook For now there is just an empty repo, which is meant to hold the OpenAPI v3 model.

ingenieroariel · 2019-03-15T16:09:39Z

Personally, I tried to give this problem a go but was not able to make a dent. Main issue is the discrepancy between describing what is there (not a good idea), vs describing something that is not a pipe dream and is actually achievable in 6-12 months with 3-4 people (I was not able to deal with the complexity).

The direction I took in order to stay productive was to rethink GeoNode around only the core upload / permissions / download problem and, just like we did in 2009, look at existing robust software packages to solve those problems. I settled on the following three tools:

- Minio for raw data storage (private AWS S3)
- ORY Hydra, Keto and Oathkeeper for permissions (private AWS IAM)
- Nginx with small Lua functions (private AWS Lambda)

What I am doing now is writing OpenAPI 3 definitions for an upload / permissions / download process based on how those tools already work. This obviously ignores the even bigger problem of workflows, for example configuring datasets in postgres, geoserver, tegola, etc.

In order to write an API that is implementable, I am creating a minimal geonode with just those tools and expect to have results to share by the next summit. The working code is at:

https://github.com/piensa/puertico

francbartoli · 2019-03-15T18:00:30Z

@ingenieroariel I think the minimal problem for an OpenAPI 3 definition is to model the main entities for this new api. Layers don't convince me and I would prefer the concept of Datasets.

Then I'm wondering what a dataset is:

A combination of vector and raster collections?

/datasets/{cool_dataset_name}/collections/{my_awesome_vector_collection}/items (WFS3 implementation or subset from remote WFS3)
- {my_awesome_vector_collection} has a collection_type=vector property
- example: http://geo.weather.gc.ca/geomet-beta/features/collections/hydrometric-daily-mean/items/
/datasets/{cool_dataset_name}/collections/{my_awesome_raster_collection}/cogs ( COGs bucket server or subset from remote collection. Here a collection can be also described by a STAC specification)
- {my_awesome_raster_collection} has a collection_type=raster property

Just one of the two above with a property for being distinguished?

/datasets/{cool_vector_dataset_name}/collections/{my_awesome_collection}/items
/datasets/{cool_raster_dataset_name}/collections/{my_awesome_collection}/cogs
{cool_vector_dataset_name} has a dataset_type=vector
{cool_raster_dataset_name} has a dataset_type=raster
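
As a rough illustration of the first option above (one dataset mixing vector and raster collections, distinguished by a `collection_type` property), the path layout could be modelled as follows. The class names and the sample dataset are hypothetical; only the path pattern and the `items` / `cogs` endpoints come from the proposal.

```python
from dataclasses import dataclass, field
from typing import List

# Sub-resource exposed for each collection type, following the paths above.
ENDPOINT_BY_TYPE = {"vector": "items", "raster": "cogs"}


@dataclass
class Collection:
    name: str
    collection_type: str  # "vector" or "raster"


@dataclass
class Dataset:
    name: str
    collections: List[Collection] = field(default_factory=list)

    def paths(self) -> List[str]:
        """Return one API path per collection in this dataset."""
        return [
            f"/datasets/{self.name}/collections/{c.name}/"
            f"{ENDPOINT_BY_TYPE[c.collection_type]}"
            for c in self.collections
        ]


# Hypothetical dataset with one vector and one raster collection.
mixed = Dataset("hydrology", [
    Collection("stations", "vector"),
    Collection("dem", "raster"),
])
print(mixed.paths())
```

Under the second option the type property would move from `Collection` to `Dataset`, so every collection in a dataset would share the same endpoint suffix.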

I'm open to different visions/models and happy if this work can be started. Another great tool for collaborating could be spotlight

I'm going to open an issue and discuss this on the geonode-api repo

tomkralidis · 2019-03-16T00:52:49Z

Definitely agree that we should put forth a dataset-centric approach. This dovetails with the resource-oriented architecture we are seeing with emerging OGC standards. Types of datasets can be enumerated as per ISO 19115 (vector, grid, etc.). We can also put forth the notion of virtual datasets (think WPS). We implemented the above in pygeoapi and are working on WCS REST. I would suggest looking at the pygeoapi config and API (WFS3/WPS) for ideas for data access. Of course we would need a management API and so on for endpoints that do not fit into the above.

francbartoli · 2019-03-16T12:32:01Z

> Definitely agree that we should put forth a dataset centric approach. This dovetails with the resource oriented architecture we are seeing with emerging OGC standards. Types of datasets can be enumerated as per ISO 19115 (vector, grid, etc.).

Good idea!

> We can also put forth the notion of virtual datasets (think WPS). We implemented the above in pygeoapi and are working on WCS REST.

pygeoapi is definitely something to have a look at. Do you have some examples of WCS REST? How would that relate to the concept of COG?

> I would suggest looking at the pygeoapi config and API (WFS3/WPS) for ideas for data access. Of course we would need a management API and so on for endpoints that do not fit into the above.

francbartoli · 2019-03-16T12:35:35Z

Opened issue geonode-api:#1 for continuing the discussion

tomkralidis · 2019-03-19T02:30:30Z

Nothing in pygeoapi that is stable yet (working on it). I would imagine a COG would simply be a given raster resource for which HTTP Range requests would be supported. +1 to continue over in GeoNode/geonode-api#1
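
For readers unfamiliar with the mechanism, a client reads part of a remote raster by sending a `Range` header with an ordinary GET. The sketch below only builds such a request with the standard library; the URL is a placeholder and the byte range is arbitrary, neither comes from the discussion.

```python
import urllib.request


def range_request(url, start, end):
    """Build a GET request for bytes start..end (inclusive) of a remote file."""
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes={start}-{end}")
    return request


# Placeholder URL, for illustration only.
req = range_request("https://example.com/data/dem.tif", 0, 16383)
print(req.get_header("Range"))

# Sending it is left commented out so the sketch runs offline. A server that
# supports ranges answers 206 Partial Content with just the requested bytes:
# with urllib.request.urlopen(req) as response:
#     chunk = response.read()
```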

capooti · 2019-04-04T10:53:39Z

I have renamed this GNIP to GeoNode 4, as GeoNode 3 will still be based on the current architecture, using Python 3 and Django 2

gannebamm · 2020-03-30T15:24:28Z

Just my 2 cents:

With all my experience with the current GeoNode codebase and working hard to keep it maintained, I would give my +1 for not being agnostic but instead choosing frameworks and pinning them. Check which options are currently feasible and pick one as the reference implementation. Maintain only this implementation. No more QGIS Server, Leaflet, OpenLayers, GeoServer, React and Angular and maybe a bit of Vue in the codebase. We should be API first, which enables other frontend implementations, but we should not incorporate any of their code into our codebase.

This is a bit harsh, but I think it will help us stay clean for a longer period of time.

gannebamm · 2020-09-03T12:04:27Z

See PSC Meeting notes at: https://github.com/GeoNode/geonode/wiki/GeoNode-PSC-Meeting,-2020-09-03

giohappy · 2022-05-09T10:40:40Z

RC of GeoNode 4 released

pjdufour added the gnip A GeoNodeImprovementProcess Issue label Aug 23, 2017

capooti changed the title ~~GNIP: GeoNode 3~~ GNIP: GeoNode 4 Apr 4, 2019

afabiani closed this as completed Jul 11, 2019

afabiani changed the title ~~GNIP: GeoNode 4~~ GNIP-51: GeoNode 4 Aug 22, 2019

afabiani reopened this Aug 22, 2019

giohappy closed this as completed May 9, 2022
