GNIP-55: Support production-grade deployement using Docker #3707

Closed
olivierdalang opened this issue Apr 4, 2018 · 15 comments
Labels
gnip A GeoNodeImprovementProcess Issue

Comments

@olivierdalang
Member

Overview

The idea of this GNIP is to officially support a production-grade deployment method using Docker. By production-grade, I mean a deployment solution that satisfies basic needs in terms of performance and security.

Rationale

  1. There's currently no clearly supported deployment method for production (there's an entry about that in the doc, but it's incomplete and very hard to follow), which results in a lot of very poorly secured installations
  2. Docker works on all major platforms (Windows, Linux, Mac OS X)
  3. It's very easy to use (basically, start the whole stack in 1 command line, almost identical on any system)
  4. It's very easy to make new releases (basically one git tag)
  5. Similar setup for dev and production
  6. It's possible to include features that are not strictly part of Geonode, but still essential for a production ready deployment (https encryption, backups...)

Points 1 and 6 are a major problem for smaller institutions that don't have advanced sysadmins or security experts, and those institutions represent a lot of Geonode users (just look at the Geonode gallery and at how many of them use https).

Implementation

I suggest an implementation similar to https://github.com/olivierdalang/SPCgeonode (as presented earlier on the ML)

  • Basically a Geonode-project (so that it's easy to customize, you can still get the vanilla geonode if you don't customize anything)
  • Including customized Dockerfiles for related services (Nginx, Geoserver, etc.) in the same repo rather than in external repositories, so that new versions can be released all in one place and it's easier to know what's actually being installed
  • Including automatic SSL encryption and easy-to-configure backups using popular cloud providers
  • Optionally maintain a Rancher catalog entry (which is almost 0 work)

I suggest mentioning that deployment method on the home page, making it clear that the apt-get method isn't production-ready out of the box.

Repository

This could probably live in the Geonode-project repository.

@giohappy
Contributor

giohappy commented Apr 4, 2018

Thanks @olivierdalang. It seems very hard to converge on a common Docker deployment, given the number of pieces that compose the GeoNode stack. We have the "official" proposal by @francbartoli, the Rancher and Docker templates from Kartoza, and yours, and we at GeoSolutions are also working on our own solution.

Although we can agree on some basic general requirements (https, nginx, etc.), more opinionated approaches (like using syncthing, rclone and letsencrypt services) may not fit well for everyone.
IMHO, the choice of having GeoNode and GeoServer share users and roles through PG is also unlikely to find general agreement.

I like the idea of using secrets though. We also were considering it...

@francbartoli
Member

Thanks @olivierdalang for your proposal. I have a few comments:

  1. Although I like the idea of a single repository, I'm not sure it's good to mix things up in geonode-project. We should probably create a new repository that consolidates everything Docker-related: GeoNode images, tagged for official releases of the code and published on Docker Hub (built mostly from Python package artifacts, except for the master branch, which can still use a git+https link). The Geonode repo and geonode-project can then easily maintain docker-compose override files that use those images. My feeling is that geonode-project is for developers who want to customize GeoNode, meaning the code and possibly the deployment. I need to investigate further how your repo is organized.
  2. Definitely +1 to adding SSL encryption and a backup procedure, although the latter should only lay the groundwork for the preferred backup method, which I would leave up to the users. Adding just the simplest example should be enough here.
  3. +1 to adding a Rancher catalog, with input from what the Kartoza team has already done.
  4. -1 to declaring the apt-get method not production-ready. We have put a lot of effort into making it as easy as possible, and it is now a stable artifact, even if the build process for a release is not easy. In general, although I love the Docker ecosystem, I don't share the opinion that dockerizing GeoNode means sysadmins aren't required anymore; there is plenty of network and system knowledge behind a container deployment that cannot be simplified that much, IMHO. However, still +1 to adding Docker/Rancher as an alternative deployment method on the home page.

In addition, as a major issue for a production-grade deployment, I would add the scaling of GeoServer, which is a critical issue right now.

Best Regards,
Francesco

@cezio
Contributor

cezio commented Apr 4, 2018

Thanks @olivierdalang. I have a few comments:

  • In general I like the idea of using geonode-project as a dockerization starting point. It is the recommended way to start customized development of GeoNode-based projects, so having docker support there makes sense.

  • That said, it looks like there's no one 'correct' way to deploy GeoNode with docker; multiple paths exist. I'm not sure if adding additional elements into core infrastructure is a good thing. Maybe those elements should be optional and added in a separate compose file. Of course, users can customize the rendered project in the end, but I think the effort should go into adding extra elements rather than removing them from the defaults.

  • I'm ok with having the nginx/geoserver dockerfiles and support files within the repository. It may look like overhead and repetition, but once geonode-project is rendered into an actual django application, it's a separate entity and should be treated as such. Keeping the supporting image configuration local allows customizing it and keeps it in sync with the rest of the code.

  • ok with a rancher catalog entry, meaning it will also be a template in the geonode-project repo.
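
To make the "optional elements in a separate compose file" idea concrete, here is a minimal sketch of how a launcher could layer optional compose files on top of a base one. It relies only on the fact that `docker-compose` accepts several `-f` files and merges later ones over earlier ones. The file names (`docker-compose.yml`, `docker-compose.ssl.yml`, `docker-compose.backup.yml`) and the helper `compose_command` are hypothetical and are not taken from any of the repositories discussed here.

```python
from typing import List, Sequence


def compose_command(
    base: str,
    optional: Sequence[str] = (),
    action: Sequence[str] = ("up", "-d"),
) -> List[str]:
    """Build a docker-compose invocation from a base file plus optional overrides.

    Later -f files are merged over earlier ones by docker-compose, so the
    base file stays minimal and each optional feature lives in its own file.
    All file names passed in are assumptions made for illustration.
    """
    cmd = ["docker-compose", "-f", base]
    for compose_file in optional:
        cmd += ["-f", compose_file]
    cmd += list(action)
    return cmd


# Vanilla stack: only the base file.
vanilla = compose_command("docker-compose.yml")

# Same stack with hypothetical SSL and backup services opted in.
full = compose_command(
    "docker-compose.yml",
    optional=["docker-compose.ssl.yml", "docker-compose.backup.yml"],
)
```

The resulting list can be passed to `subprocess.run(...)`. With this layout, users opt in to extras by adding a file instead of editing the defaults.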

@olivierdalang
Member Author

Just to keep track, here's Alessio's comment from the ML:

+1 on root password security encryption
-1 on having a role service bound to the local DB. We should really use the standard REST based role service here.

Just to be clear, when I mentioned my previous implementation (https://github.com/olivierdalang/SPCgeonode), I was mainly thinking about those 4 points (Geonode project, including dockerfiles of related services, ssl+backup, rancher catalog entry), not so much about the implementation details (which I'm not very happy with either :-) )

@cezio @giohappy

  • I'm not sure if adding additional elements into core infrastructure is a good thing
  • more opinionated approaches (like using syncthing, rclone and letsencrypt services) may not fit well for everyone

I think we should keep it as unopinionated as possible, but still opinionated enough that it works out of the box (backup and ssl). People who have strong opinions about questions such as certificate authorities or backup providers are generally also capable of customizing those aspects to their liking.

@francbartoli

In addition, as a major issue for a production-grade deployment, I would add the scaling of GeoServer, which is a critical issue right now.

Are you thinking of having multiple GeoServer containers behind a load balancer? Is that easy to achieve? Is there a performance gain when running on a single host, or only on a cluster? I think it's a good goal if achievable without making usage more complex for small deployments.

@francbartoli
Member

@olivierdalang It is not so easy to achieve, but it is valuable for an enterprise deployment with a rancher/kubernetes cluster and multiple hosts. Probably not for small deployments.

@olivierdalang
Member Author

@francbartoli If it results in making usage/maintenance significantly more complex, I'd favor leaving this out of the basic setup, since for large scale deployments, an advanced sysadmin will be involved anyways.

@olivierdalang
Member Author

One thing I didn't mention in the GNIP but may be critical for long term support of deployments is the ability to ship updates to the geoserver data directory. Maybe just keep a file/flag with a version number, so that we can have scripts that can incrementally update the geoserver configuration if needed.
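
As a rough illustration of the version-flag idea, here is a minimal sketch: the data directory carries a small file holding an integer version, and numbered migration steps are applied in order until the directory is up to date. The file name `DATA_DIR_VERSION` and the helper names are hypothetical, and the actual migration steps (editing GeoServer configuration files) are left as placeholders.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical name of the flag file kept at the root of the data directory.
VERSION_FILE = "DATA_DIR_VERSION"


def read_version(data_dir: Path) -> int:
    """Return the stored version, or 0 if the flag file does not exist yet."""
    flag = data_dir / VERSION_FILE
    if not flag.exists():
        return 0
    return int(flag.read_text().strip())


def write_version(data_dir: Path, version: int) -> None:
    """Record the version the data directory has been migrated to."""
    (data_dir / VERSION_FILE).write_text(f"{version}\n")


def migrate(data_dir: Path, migrations: Dict[int, Callable[[Path], None]]) -> int:
    """Apply every migration newer than the stored version, in ascending order.

    The flag is rewritten after each step, so an interrupted run resumes
    from the last step that completed.
    """
    current = read_version(data_dir)
    for target in sorted(migrations):
        if target <= current:
            continue  # already applied on this data directory
        migrations[target](data_dir)
        write_version(data_dir, target)
        current = target
    return current
```

Each release would then ship only the new numbered steps, and the container entrypoint could call `migrate()` on startup before GeoServer is launched.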

@lucernae
Contributor

Hi everyone. Jumping in from Kartoza here.
+1 for dockerized deployment (and development environment if possible).
I don't know about you, but at Kartoza we heavily use docker for the development setup on our laptops (a mix of linux and mac). At first it was really hard to create such a setup, perhaps because the docker environment in geonode was not developer friendly. So we sometimes had to create our own docker setup.
Additionally, we also use docker to set up CI/CD and run unit tests. We could also think long term, so that these 'official' docker images can be used by people to set up migration tests easily.

Also, just a thumbs up @olivierdalang for providing letsencrypt and celery admin panel, we could really benefit from that.

Are you thinking of having multiple GeoServer containers behind a load balancer? Is that easy to achieve? Is there a performance gain when running on a single host, or only on a cluster? I think it's a good goal if achievable without making usage more complex for small deployments.

Even on a single host, there could be a benefit from parallelized GeoServer REST requests, such as map rendering, jobs, or authorization. However, the most troublesome thing I have encountered so far is making GeoServer authentication/authorization work properly when scaled. In QGIS Server it was easy to achieve because there is no authentication at all between GeoNode and QGIS Server. So I'm guessing GeoServer scaling is not only a sysadmin problem; some code has to be added to support it as well.

@olivierdalang
Member Author

olivierdalang commented Apr 15, 2018

Going into a little more detail about what a complete implementation would be, in my mind:

  • Officially supported / documented / maintained
  • Easy to use (as few steps as possible to get it running)
  • Working on Windows, Mac & Linux
  • A Geonode-project (so that it's easy to customise)
  • Including customised Dockerfiles for related services in the same git repo
  • As lightweight as possible (alpine rather than debian)
  • Secure out of the box (besides basic configuration, no further step is required to have a secure install) - if possible pen-tested
  • 100% working out of the box (incl. auth for OGC services, celery tasks, etc.)
  • Automatic SSL encryption
  • Easy to configure backups using popular cloud provider
  • Travis continuous testing (or equivalent) of the installation procedure
  • Including a Rancher catalog entry
  • Updatable via git pull
  • Ready for long term maintenance (e.g. including a version flag in the geoserver data directory so that we can have migration scripts)
  • Including horizontal scaling capabilities only if it doesn't make usage more complex
  • Stable (not pulling any unstable branches/builds)
  • Installable offline (probably requires a preconfigured computer acting as a docker registry)

@giohappy
Contributor

giohappy commented Apr 16, 2018

Thanks @olivierdalang, it will be a nice agenda for the next call :)
Something we're missing is a "restore container".
Backup/restore is handled by GeoNode's backup and restore facilities, but in a Docker scenario it would probably be overly complex to automate it in a reliable way.
Having a "restore container" would close the circle, at least if upgrade paths are not considered.

@timlinux
Contributor

Hi All

Great discussion here. By the way there was an issue with the QGIS backend Geonode catalogue entry at https://github.com/kartoza/kartoza-rancher-catalogue/blob/master/README.md which I just fixed.

I also just received a huge patch to the Kartoza GeoServer docker recipe at https://github.com/kartoza/docker-geoserver (the patch was sent by email and is not in the PR queue yet). The patch is a security patch and takes the container down from about 215 known vulnerabilities to about 10. The vulnerabilities cover stuff like not running as root through to packaged libraries in libstdc etc. I think there would be a huge benefit in at least providing the building blocks (QGIS Server, GeoServer, Core GeoNode config) from an 'official' source so that we can provide the best quality possible base for people building out custom GeoNode docker setups.

We also have some plans in our work with WB to extend our docker setup so that we can have user managed theme customisation (probably delivered via a file sync protocol such as Dropbox). I think we should provide a general / generic solution like this 'officially' through the GeoNode project as it will probably cover many user's needs. Those that need something really customised could unpack the official recipe and tweak it for their own needs.

I also discussed with GeoSolutions a bit the idea of providing a common base recipe that both the QGIS Server based and GeoServer based recipes can build on so that we minimise reinventing the wheel as much as possible.

@giohappy
Contributor

giohappy commented Apr 17, 2018 via email

@afabiani afabiani added gnip A GeoNodeImprovementProcess Issue and removed devops labels Sep 4, 2018
@olivierdalang
Member Author

Just copying what was discussed on the ML here in the Docker Strategy Chat thread (check there for more context) :

I suggest voting during the next PSC on which of the following strategies we prefer:
1/ status quo: no merge of SPC geonode, incremental work continues to be done on the Docker setup towards more stability, with no short term plan to have a production-grade deployment
2/ two-phase merge: we copy over SPC geonode under github.com/Geonode/geonode-docker-deployment (or whatever) and officially document / support it, then we start incremental work to merge the two setups with the goal of continuous support of both production-grade deployment and existing workflows (e.g. live demo)
3/ one-phase merge: we copy over SPC geonode under the main repo, disrupting current workflows

@afabiani
Member

@olivierdalang I recently took a look at SPCGeonode and also had the occasion to rework and improve the Dockerization of GeoNode and GeoNode-Project, which now also supports asynchronous setup.

The two setups are not really different, except that they use fairly different approaches to configure the images and a few different environment variables, but in the end the outcomes are the same.

I guess the best option would be to put some effort into making the two approaches converge. It should not take too much effort, and it gives this GNIP the best chance of being accepted.

Before further discussing it, please take some time to review the recent work on

https://github.com/geosolutions-it/geonode-project/blob/master/README.rst

in particular the section

https://github.com/geosolutions-it/geonode-project/blob/master/README.rst#start-your-server

@francbartoli
Member

@olivierdalang after a quick review and more thought on the topic, I would definitely adopt the status quo, with a CI/CD pipeline implemented through CircleCI as soon as possible.

So my vote will go for 1/.

@afabiani afabiani changed the title [GNIP] Support production-grade deployement using Docker GNIP-55: Support production-grade deployement using Docker Aug 22, 2019
7 participants