Release GN v4.4.0 #107

Merged 24 commits on Oct 25, 2023
0053807
GN v4.4.0
juanluisrp Jul 11, 2023
c4e01cb
Update
fxprunayre Jul 21, 2023
71cad20
Move env to variable for future easier configuration of multiple inst…
fxprunayre Aug 31, 2023
f21fd8c
Jetty / Increase max form keys
fxprunayre Sep 7, 2023
9d1cb4f
Jetty config / Allows larger form.
fxprunayre Sep 11, 2023
8bf8eda
Java options to be able to return metrics in Java 11.
fxprunayre Sep 11, 2023
da562a5
Clustering / Add instruction for testing and add a simple load balancer.
fxprunayre Sep 11, 2023
4f6488d
Nginx / Add body size parameter and fix hard coded scheme.
fxprunayre Sep 11, 2023
8c37971
Move to traefik.
fxprunayre Sep 12, 2023
d64c0f1
Add health check.
fxprunayre Sep 13, 2023
df6a147
Readme update for traefik change.
fxprunayre Sep 13, 2023
fdf9372
Monitoring / Load traefik log using filebeat. Removing Apache and Ngi…
fxprunayre Sep 13, 2023
724036d
Update multiple instances limitations.
fxprunayre Sep 13, 2023
cd6af05
Jetty / Update version and fix sending mail on java 11.
fxprunayre Sep 18, 2023
ad067fd
Add timezone config and use separate schemapublication dir (avoid issue
fxprunayre Sep 21, 2023
12bee86
Update README.md
fxprunayre Sep 25, 2023
f768ddd
Update README.md
fxprunayre Sep 25, 2023
9f06cdd
Update README.md
fxprunayre Sep 25, 2023
b249c6e
Fix typo in variable expansion
juanluisrp Sep 28, 2023
410cd70
Update MD5 for 4.4.0
fxprunayre Oct 19, 2023
0b9ff2f
When building image, set image name for docker compose to find it.
fxprunayre Oct 19, 2023
b13cc6f
Add proxy variables example
fxprunayre Oct 20, 2023
09af312
Update 4.4.0/README.md
fxprunayre Oct 20, 2023
a2e70d7
Fix Markdown warnings
juanluisrp Oct 25, 2023
45 changes: 45 additions & 0 deletions 4.4.0/Dockerfile
@@ -0,0 +1,45 @@
FROM jetty:9-jdk11

ENV DATA_DIR /catalogue-data
ENV WEBAPP_CONTEXT_PATH /geonetwork
ENV GN_CONFIG_PROPERTIES -Dgeonetwork.dir=${DATA_DIR} \
-Dgeonetwork.formatter.dir=${DATA_DIR}/data/formatter \
-Dgeonetwork.schema.dir=/opt/geonetwork/WEB-INF/data/config/schema_plugins \
-Dgeonetwork.indexConfig.dir=/opt/geonetwork/WEB-INF/data/config/index


ENV JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -Djava.awt.headless=true \
-Xms512M -Xss512M -Xmx2G -XX:+UseConcMarkSweepGC

USER root
RUN apt-get -y update && \
apt-get -y install --no-install-recommends \
curl \
unzip && \
rm -rf /var/lib/apt/lists/* && \
mkdir -p ${DATA_DIR} && \
chown -R jetty:jetty ${DATA_DIR} && \
mkdir -p /opt/geonetwork && \
chown -R jetty:jetty /opt/geonetwork

USER jetty
ENV GN_FILE geonetwork.war
ENV GN_VERSION 4.4.0
ENV GN_DOWNLOAD_MD5 36638cfd380942801ff2038792ee54a9

RUN cd /opt/geonetwork/ && \
curl -fSL -o geonetwork.war \
https://sourceforge.net/projects/geonetwork/files/GeoNetwork_opensource/v${GN_VERSION}/${GN_FILE}/download && \
echo "${GN_DOWNLOAD_MD5} *geonetwork.war" | md5sum -c && \
unzip -q geonetwork.war && \
rm geonetwork.war

COPY jetty/geonetwork_context_template.xml /usr/local/share/geonetwork/geonetwork_context_template.xml
COPY ./docker-entrypoint.sh /geonetwork-entrypoint.sh

RUN java -jar /usr/local/jetty/start.jar --create-startd --add-module=http-forwarded

ENTRYPOINT ["/geonetwork-entrypoint.sh"]
CMD ["java","-jar","/usr/local/jetty/start.jar"]

VOLUME [ "${DATA_DIR}" ]
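
The `md5sum -c` step in the download `RUN` instruction above can be tried outside Docker. The following is a minimal sketch, assuming GNU coreutils `md5sum` is available; the temporary file only stands in for the WAR, and its checksum is computed on the spot rather than pinned the way `GN_DOWNLOAD_MD5` is.

```shell
#!/bin/sh
# Create a throwaway file standing in for the downloaded WAR.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
printf 'example payload\n' > geonetwork.war

# Compute the checksum once (in the Dockerfile this is the pinned GN_DOWNLOAD_MD5).
expected_md5="$(md5sum geonetwork.war | cut -d' ' -f1)"

# Same check as in the Dockerfile: "<md5> *<file>" piped to `md5sum -c`.
echo "${expected_md5} *geonetwork.war" | md5sum -c

# Once the file changes, the checksum no longer matches and md5sum -c fails.
printf 'tampered\n' >> geonetwork.war
if echo "${expected_md5} *geonetwork.war" | md5sum -c >/dev/null 2>&1; then
  echo "unexpected: checksum still matches"
else
  echo "checksum mismatch detected"
fi
```

Because the Dockerfile chains the check with `&&`, a mismatch stops the build before the archive is unpacked.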
54 changes: 54 additions & 0 deletions 4.4.0/Dockerfile.local
@@ -0,0 +1,54 @@
FROM jetty:9-jdk11 as base

USER root
RUN apt-get update && apt-get install -y --no-install-recommends unzip \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& mkdir -p /opt/geonetwork \
&& chown -R jetty:jetty /opt/geonetwork

COPY geonetwork.war /tmp

USER jetty
RUN unzip /tmp/geonetwork.war -d /opt/geonetwork



FROM jetty:9-jdk11 as final

ENV GN_FILE geonetwork.war
ENV GN_VERSION 4.4.0

ENV DATA_DIR /catalogue-data
ENV WEBAPP_CONTEXT_PATH /geonetwork


# This variable can be used to define additional config options in the way of Java System properties
# (e.g. "-Des.protocol=http -Des.port=9200 -Des.index.records=geo-records")
ENV GN_CONFIG_PROPERTIES -Dgeonetwork.dir=${DATA_DIR} \
-Dgeonetwork.formatter.dir=${DATA_DIR}/data/formatter \
-Dgeonetwork.schema.dir=/opt/geonetwork/WEB-INF/data/config/schema_plugins \
-Dgeonetwork.indexConfig.dir=/opt/geonetwork/WEB-INF/data/config/index

# JAVA_OPTS can be used to configure JVM specific options, like max memory, debugger port and method...
ENV JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -Djava.awt.headless=true \
-Xms512M -Xss512M -Xmx2G -XX:+UseConcMarkSweepGC

USER root
RUN apt-get update && apt-get install -y --no-install-recommends unzip \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir -p /catalogue-data \
&& chown -R jetty:jetty /catalogue-data

USER jetty

COPY jetty/geonetwork_context_template.xml /usr/local/share/geonetwork/geonetwork_context_template.xml
COPY --from=base /opt/geonetwork /opt/geonetwork

COPY ./docker-entrypoint.sh /geonetwork-entrypoint.sh

RUN java -jar /usr/local/jetty/start.jar --create-startd --add-to-start=http-forwarded

ENTRYPOINT ["/geonetwork-entrypoint.sh"]
CMD ["java","-jar","/usr/local/jetty/start.jar"]
250 changes: 250 additions & 0 deletions 4.4.0/README.md
@@ -0,0 +1,250 @@
# Version 4.4.0

## Running with integrated Elasticsearch

1. Clone this repository

```shell script
git clone https://github.com/geonetwork/docker-geonetwork.git
cd docker-geonetwork/4.4.0
```

2. Run the docker-composition from the current directory:

```shell script
docker-compose up
```

3. Open http://geonetwork.localhost/geonetwork/ in a browser


## Build docker image

If the image is not published, you can build it locally using:

```shell script
docker build . -t geonetwork:4.4.0
```

## Running with custom geonetwork.war


This directory includes two Dockerfiles:
* `Dockerfile` is the canonical one, used to generate the official Docker Hub
  image. It downloads the GeoNetwork 4.4.0-0 WAR file from SourceForge.
* `Dockerfile.local` needs a `geonetwork.war` file next to it to build
  the image.

It also includes two docker-compose configuration files:
* `docker-compose.yml` uses the official GeoNetwork image from Docker Hub.
* `docker-compose.dev.yml` can be applied to override the image used in
  `docker-compose.yml` and build the GeoNetwork image using `Dockerfile.local`.


### Pre-built image

To use the pre-built image you can use the `docker-compose.yml` file provided
in this directory:

```shell script
docker-compose up
```

### Local image

To generate an Elasticsearch-ready Docker image, you will have to:

1. Build your geonetwork.war (https://geonetwork-opensource.org/manuals/trunk/en/maintainer-guide/installing/installing-from-source-code.html#the-quick-way)

2. Clone this repository

```shell script
git clone https://github.com/geonetwork/docker-geonetwork.git
cd docker-geonetwork/4.4.0
```

3. Copy the generated webapp into the current directory and name it `geonetwork.war`

```shell
cp ../../core-geonetwork/web/target/geonetwork.war .
```

4. Run the docker-composition from the current directory:

```shell script
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up --build
```

5. Open http://geonetwork.localhost/geonetwork/ in a browser

## Running with a custom Database

See "Connecting to a postgres database" at https://hub.docker.com/_/geonetwork


```shell script
docker run --name geonetwork -d -p 8080:8080 \
-e GEONETWORK_DB_TYPE=postgres \
-e GEONETWORK_DB_HOST=my-db-host \
-e GEONETWORK_DB_PORT=5434 \
-e GEONETWORK_DB_USERNAME=postgres \
-e GEONETWORK_DB_PASSWORD=mysecretpassword \
-e GEONETWORK_DB_NAME=mydbname \
geonetwork:4.4.0
```
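
To keep the password out of the command line, the same variables can be passed through an env file. This is only a sketch: the file name `geonetwork-db.env` is an arbitrary choice, the values are the placeholders from the example above, and the final command is printed rather than executed.

```shell
#!/bin/sh
# Write the database settings from the example above to an env file.
# The file name is an arbitrary choice for this sketch.
ENV_FILE="geonetwork-db.env"
cat > "$ENV_FILE" <<'EOF'
GEONETWORK_DB_TYPE=postgres
GEONETWORK_DB_HOST=my-db-host
GEONETWORK_DB_PORT=5434
GEONETWORK_DB_USERNAME=postgres
GEONETWORK_DB_PASSWORD=mysecretpassword
GEONETWORK_DB_NAME=mydbname
EOF
chmod 600 "$ENV_FILE"  # the file contains a password

# Dry run: print the command instead of executing it.
printf 'docker run --name geonetwork -d -p 8080:8080 --env-file %s geonetwork:4.4.0\n' "$ENV_FILE"
```

Running the printed command starts the container with the same settings as the `-e` form.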

## Running with remote Elasticsearch

```shell script
docker run --name geonetwork -d -p 8080:8080 \
-e "GN_CONFIG_PROPERTIES=-Des.host=elasticsearch \
-Des.protocol=http \
-Des.port=9200 \
-Des.url=http://elasticsearch:9200 \
-Dgeonetwork.ESFeaturesProxy.targetUri=http://elasticsearch:9200/gn-features/{_} " \
geonetwork:4.4.0
```

If you get an error when connecting to the remote Elasticsearch, check the configuration in `config/elasticsearch.yml`:

```yaml
network.host: my-elasticsearch-host
discovery.seed_hosts: []
```

## Running with custom Elasticsearch index names

Add the following options to `GN_CONFIG_PROPERTIES`:

```
-Des.index.records=geo-records
-Des.index.features=geo-features
-Des.index.searchlogs=geo-searchlogs
-Dgeonetwork.ESFeaturesProxy.targetUri=http://elasticsearch:9200/geo-features/{_}
```
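
As a rough sketch, the four options above can be derived from a single prefix. The `es_index_props` function and the `geo` prefix are illustrative names, not part of GeoNetwork or of this image; the default URL `http://elasticsearch:9200` is taken from the remote Elasticsearch example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the image): print the index-related
# Java system properties for a given index-name prefix.
es_index_props() {
  prefix="$1"
  es_url="${2:-http://elasticsearch:9200}"  # assumed Elasticsearch URL
  printf '%s ' \
    "-Des.index.records=${prefix}-records" \
    "-Des.index.features=${prefix}-features" \
    "-Des.index.searchlogs=${prefix}-searchlogs"
  printf '%s\n' "-Dgeonetwork.ESFeaturesProxy.targetUri=${es_url}/${prefix}-features/{_}"
}

# Print the options that could be added to GN_CONFIG_PROPERTIES.
es_index_props geo
```

Deriving the features index name and the proxy target URI from the same prefix keeps the two in sync.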


## Running with remote Elasticsearch with authentication

Add the `-Des.username=esUserName -Des.password=esPassword` options to `GN_CONFIG_PROPERTIES`.

If using the WFS features harvesting, add the
`-Dgeonetwork.ESFeaturesProxy.username=esReadOnlyUsername -Dgeonetwork.ESFeaturesProxy.password=esPassword` options to `GN_CONFIG_PROPERTIES`.


## Running with remote Kibana

Add the `-Dgeonetwork.HttpDashboardProxy.targetUri=http://kibana:5601` option to `GN_CONFIG_PROPERTIES`.


## Running with remote OGC API Records

Add the `-Dgeonetwork.MicroServicesProxy.targetUri=http://ogc-api-records-service:8080` option to `GN_CONFIG_PROPERTIES`.


## Running with custom security mode

Add the `-Dgeonetwork.security.type=` option to set the authentication mode. See the available security modes in https://github.com/geonetwork/core-geonetwork/blob/main/web/src/main/webapp/WEB-INF/config-security/config-security.xml#L43-L64 and the configuration options in https://github.com/geonetwork/core-geonetwork/blob/main/web/src/main/webapp/WEB-INF/config-security/config-security.properties. See also https://geonetwork-opensource.org/manuals/4.0.x/en/administrator-guide/managing-users-and-groups/authentication-mode.html.


For example, an LDAP configuration:
```
-Dgeonetwork.security.type=ldap
-Dldap.host=ldap
-Dldap.port=389
-Dldap.base=dc=geonetwork-opensource,dc=org
-Dldap.base.dn=dc=geonetwork-opensource,dc=org
-Dldap.security.principal=cn=admin,dc=geonetwork-opensource,dc=org
-Dldap.security.credentials=secret
-Dldap.base.search.base=ou=directory
-Dldap.sync.user.search.base=ou=directory
-Dldap.base.dn.pattern=uid={0},ou=directory
```

For example, a CAS configuration:
```
-Dcas.baseURL=http://localhost:8080/cas
-Dcas.login.url=http://localhost:8080/cas/login
-Dcas.ticket.validator.url=http://cas:8080/cas
-Dgeonetwork.https.url=http://localhost:8080/geonetwork
```


## Running with a custom context path

To run the application in a custom context path, for example at http://geonetwork.localhost/catalogue instead of the default http://geonetwork.localhost/geonetwork, use the `WEBAPP_CONTEXT_PATH` environment variable:
```yaml
environment:
  WEBAPP_CONTEXT_PATH: /catalogue
```

## Configure the default language

To configure the default application language and bypass browser language detection when redirecting from the base URL use:

```
-Dlanguage.default=fre
-Dlanguage.forceDefault=true
```
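
One way to apply these options is to append them to the default `GN_CONFIG_PROPERTIES` value. The sketch below assumes the defaults declared in the `Dockerfile` of this release and a data directory of `/catalogue-data`; whether the defaults have to be repeated when the variable is overridden depends on the entrypoint script, which is not covered here. The command is printed rather than executed.

```shell
#!/bin/sh
# Defaults copied from the Dockerfile of this release (assumption: unchanged).
DATA_DIR="/catalogue-data"
DEFAULT_PROPS="-Dgeonetwork.dir=${DATA_DIR}"
DEFAULT_PROPS="${DEFAULT_PROPS} -Dgeonetwork.formatter.dir=${DATA_DIR}/data/formatter"
DEFAULT_PROPS="${DEFAULT_PROPS} -Dgeonetwork.schema.dir=/opt/geonetwork/WEB-INF/data/config/schema_plugins"
DEFAULT_PROPS="${DEFAULT_PROPS} -Dgeonetwork.indexConfig.dir=/opt/geonetwork/WEB-INF/data/config/index"

# Language options from the section above.
LANG_PROPS="-Dlanguage.default=fre -Dlanguage.forceDefault=true"

GN_CONFIG_PROPERTIES="${DEFAULT_PROPS} ${LANG_PROPS}"

# Dry run: print the resulting command instead of executing it.
printf 'docker run --name geonetwork -d -p 8080:8080 -e "GN_CONFIG_PROPERTIES=%s" geonetwork:4.4.0\n' \
  "$GN_CONFIG_PROPERTIES"
```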

## Running behind a proxy

If the catalogue needs to use a proxy for HTTP calls, set the following Java system properties:

```
-Dhttp.proxyHost=<proxyAddress>
-Dhttp.proxyPort=<proxyPort>
-Dhttps.proxyHost=<proxyAddress>
-Dhttps.proxyPort=<proxyPort>
-Dhttp.nonProxyHosts=<nonProxyHosts>
-Dhttp.proxyUser=<proxyUser>
-Dhttp.proxyPassword=<proxyPassword>
```
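
The options can be assembled so that empty values are skipped, for instance when the proxy needs no authentication. This is a sketch only: the `prop` helper, the variable names, and the host `proxy.example.org` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: emit a -D option only when a value is provided.
prop() {
  # $1 = property name, $2 = value (may be empty)
  if [ -n "$2" ]; then printf ' -D%s=%s' "$1" "$2"; fi
}

# Example values; replace them with your own proxy settings.
PROXY_HOST="proxy.example.org"
PROXY_PORT="3128"
NON_PROXY_HOSTS="localhost"
PROXY_USER=""       # left empty: no authentication options are emitted
PROXY_PASSWORD=""

PROXY_OPTS="$(
  prop http.proxyHost "$PROXY_HOST"
  prop http.proxyPort "$PROXY_PORT"
  prop https.proxyHost "$PROXY_HOST"
  prop https.proxyPort "$PROXY_PORT"
  prop http.nonProxyHosts "$NON_PROXY_HOSTS"
  prop http.proxyUser "$PROXY_USER"
  prop http.proxyPassword "$PROXY_PASSWORD"
)"
PROXY_OPTS="${PROXY_OPTS# }"  # drop the leading space
echo "$PROXY_OPTS"
```

The resulting string can then be added to the Java options passed to the container.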



## Clustering (experimental)

Clustering mode allows running more than one GeoNetwork instance.
To enable it, use the `scaled` profile. In this mode:
* only one node is in charge of the harvester scheduler and processes the scheduled harvesting tasks
* any node can take a harvesting task manually triggered from the harvesting console
* the webserver is configured with sticky sessions (i.e. a user stays on the same node)

First, start the main composition which will start all services (including the main node). Then start new instances with:
Review comment (Contributor): Do you need to start this up in multiple goes? I remember having all instances (regular + replicas) start up fine when triggering the command below.

```shell script
docker-compose --profile scaled up --scale geonetwork-replica=2 -d
```
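
If the startup is scripted, one option is to wait until the catalogue responds before or after scaling up. The sketch below is a generic retry loop in POSIX shell; the `retry` function name, the attempt count, and the delay are illustrative choices, and the commented usage assumes `curl` is installed and that the catalogue is reachable at the URL used earlier in this README.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while :; do
    if "$@"; then
      return 0               # the command succeeded
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1               # out of attempts
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Assumed usage: poll the catalogue up to 30 times, 10 seconds apart.
# retry 30 10 curl -fsS -o /dev/null http://geonetwork.localhost/geonetwork/
```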

Known limitations:
* Harvester / The scheduler needs to be refreshed when the harvester configuration is modified in the database (as a stopgap, the harvesting node refreshes the schedule every 2 minutes)
* Harvester / Replicas can't access the harvester log files of the main node
* Harvester / The running state is not visible on other nodes
* Settings / When application settings are saved, some modules need to be updated:
  * log level configuration,
  * DOI configuration,
  * proxy configuration (use Java system properties instead of the database configuration)
* Thesaurus / Local thesauri modified on one node are not updated on the others.


## Monitoring

A composition is also available for monitoring metrics and logs
for the webserver and the database.

First start the composition without monitoring containers.
In Kibana go to `Manage space` and create a `catalogue-monitor` space.
This space will be populated with default dashboards by metricbeat and filebeat.

Once the space is created, use the following to start metricbeat and filebeat:

```shell script
docker-compose -f docker-compose.yml -f docker-compose.monitoring.yml up --build
```

Metricbeat and filebeat need to authenticate to push into Kibana (GeoNetwork checks access). If needed, adapt the
`setup.kibana.username` and `setup.kibana.password` values in the configuration files.

Once started, sample dashboards analyzing the GeoNetwork API usage are available in `catalogue-log-dashboard.ndjson`.

![Dashboard](catalogue-log-dashboard.png)