The Combine

Frontend Actions Status Frontend Coverage

Backend Actions Status Backend Coverage

GitHub release GitHub version GitHub GitHub contributors

User Interface Semantic Domains User Guide

A rapid word collection tool. See the User Guide for uses and features.

Table of Contents

  1. Getting Started with Development
    1. Install Required Software
    2. Prepare The Environment
    3. Python
      1. Windows Python Installation
      2. Linux Python Installation
      3. macOS Python Installation
      4. Python Packages
    4. Load Semantic Domains
  2. Available Scripts
    1. Running in Development
    2. Using OpenAPI
    3. Running the Automated Tests
    4. Import Semantic Domains
    5. Generate License Reports
    6. Inspect Database
    7. Add or Update Dictionary Files
    8. Cleanup Local Repository
    9. Generate Installer Script for The Combine
    10. Generate Tutorial Video Subtitles
  3. Setup Local Kubernetes Cluster
    1. Install Rancher Desktop
    2. Install Docker Desktop
    3. Install Kubernetes Tools
  4. Setup The Combine
    1. Install Required Charts
    2. Build The Combine Containers
    3. Setup Environment Variables
    4. Install/Update The Combine
    5. Connecting to your Cluster
    6. Rancher Dashboard
  5. Maintenance
    1. Development Environment
    2. Kubernetes Environment
  6. User Guide
  7. Continuous Integration and Continuous Deployment
    1. On Pull Request
    2. On Release
  8. Production
  9. Learn More

Getting Started with Development

Install Required Software

  1. Clone this repo:

    git clone https://github.com/sillsdev/TheCombine.git
  2. Chocolatey (Windows only): a Windows package manager.

  3. Node.js 20 (LTS)

    • On Windows, if using Chocolatey: choco install nodejs-lts
    • On Ubuntu, follow this guide using the appropriate Node.js version.
  4. .NET 8.0 SDK

  5. MongoDB provides instructions on how to install the current release of MongoDB.

    • On Windows, if using Chocolatey: choco install mongodb

    After installation:

    • Add mongo's /bin directory to your PATH environment variable.
    • Disable automatic startup of the mongod service on your development host.
    • If mongosh is not a recognized command, you may have to separately install the MongoDB Shell and add its /bin to your PATH.
    • If mongoimport is not a recognized command, you may have to separately install the MongoDB Database Tools and add its /bin to your PATH.
  6. VS Code.

    • When you open this repo folder in VS Code, it should recommend the extensions used in this project (see .vscode/extensions.json).
  7. Python: The Python section of this document has instructions for installing Python 3 on each of the supported platforms and for setting up your virtual environment.

  8. FFmpeg and add its /bin to your PATH.

    • On Mac:
      • If using homebrew: brew install ffmpeg
      • If manually installing from the FFmpeg website, install both ffmpeg and ffprobe
  9. dotnet-reportgenerator: dotnet tool update --global dotnet-reportgenerator-globaltool --version 5.0.4

  10. nuget-license: dotnet tool update --global nuget-license

  11. Tools for generating the self installer (Linux only):

    • makeself - a tool to make self-extracting archives in Unix
    • pandoc - a tool to convert Markdown documents to PDF.
    • weasyprint a PDF engine for pandoc.

    These can be installed on Debian-based distributions by running:

    sudo apt install -y makeself pandoc weasyprint

Prepare the Environment

  1. (Optional) If you want the email services to work, you will need to set the following environment variables. These COMBINE_SMTP_ values must be kept secret, so ask your email administrator to supply them. Set them in your .profile (Linux or Mac 10.14-), your .zprofile (Mac 10.15+), or the System app (Windows).

    • COMBINE_EMAIL_ENABLED=true
    • COMBINE_SMTP_SERVER
    • COMBINE_SMTP_PORT
    • COMBINE_SMTP_USERNAME
    • COMBINE_SMTP_PASSWORD
    • COMBINE_SMTP_ADDRESS
    • COMBINE_SMTP_FROM
  2. (Optional) To opt in to segment.com analytics to test the analytics during development:

    # For Windows, use `copy`.
    cp .env.local.template .env.local
  3. Run npm start from the project directory to install dependencies and start the project.

  4. Consult our C# and TypeScript style guides for best coding practices in this project.
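The optional email variables from step 1 above can be set in your shell profile like this. All values below are hypothetical placeholders; the real COMBINE_SMTP_ values are secret and must come from your email administrator:

```shell
# Hypothetical placeholder values -- the real SMTP settings are secret.
export COMBINE_EMAIL_ENABLED=true
export COMBINE_SMTP_SERVER="smtp.example.com"
export COMBINE_SMTP_PORT=587
export COMBINE_SMTP_USERNAME="combine"
export COMBINE_SMTP_PASSWORD="s3cr3t"
export COMBINE_SMTP_ADDRESS="no-reply@example.com"
export COMBINE_SMTP_FROM="The Combine"
```

Remember to start a new terminal session (or re-source your profile) so the variables take effect.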

Python

Python (3.12 recommended) is required to run the scripts that are used to initialize and maintain the cluster. Note that the commands for setting up the virtual environment must be run from the top-level directory for The Combine source tree.

Windows Python Installation

  • Navigate to the Python Downloads page.

  • Select the "Download Python" button at the top of the page. This will download the latest appropriate x86-64 executable installer.

  • Once Python is installed, create an isolated Python virtual environment using the py launcher installed globally into the PATH.

    py -m venv venv
    venv\Scripts\activate

Linux Python Installation

The python3 package is included in the Ubuntu distribution. To install the pip and venv modules for Python 3, run the following commands:

sudo apt update
sudo apt install python3-pip python3-venv

Create and activate an isolated Python virtual environment

python3 -m venv venv
# This command is shell-specific, for the common use case of bash:
source venv/bin/activate

macOS Python Installation

Install Homebrew.

Install Python 3 using Homebrew:

brew install python

Create and activate isolated Python virtual environment:

python3 -m venv venv
source venv/bin/activate

Python Packages

Important: All Python commands and scripts should be executed within a terminal using an activated Python virtual environment. This will be denoted with the (venv) prefix on the prompt.

With an active virtual environment, install Python development requirements for this project:

python -m pip install --upgrade pip pip-tools
python -m piptools sync dev-requirements.txt

The following Python scripts can now be run from the virtual environment.

To perform automated code formatting of Python code:

tox -e fmt

To run all Python linting steps:

tox

To upgrade all pinned dependencies:

python -m piptools compile --upgrade dev-requirements.in

To upgrade the pinned dependencies for the Maintenance container:

cd maintenance
python -m piptools compile --upgrade requirements.in

Load Semantic Domains

Data Entry will not work in The Combine unless the semantic domains have been loaded into the database. Follow the instructions in Import Semantic Domains below to import the domains from at least one of the semantic domains XML files (each of which contains domain data in English and one other language).

Available Scripts

Running in Development

In the project directory, you can run:

npm start

Note: To prevent a browser tab from being opened automatically every time the frontend is launched, set the BROWSER=none environment variable.

Installs the necessary packages and runs the app in the development mode.

Open http://localhost:3000 to view it in the browser.
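The BROWSER=none note above can be applied in a Linux/macOS shell like this (a minimal sketch; on Windows, set the variable in the System app instead):

```shell
# Keep the dev server from opening a new browser tab on every launch.
export BROWSER=none
# Then start the project as usual:
# npm start
```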

npm run frontend

Runs only the front end of the app in the development mode.

npm run backend

Runs only the backend.

npm run database

Runs only the mongo database.

npm run build

Builds the app for production to the build folder.

It correctly bundles React in production mode and optimizes the build for the best performance.

The build is minified and the filenames include the hashes.

Your app is ready to be deployed!

See the section about deployment for more information.

npm run analyze

Run after npm run build to analyze the contents of the build bundle chunks.

Using OpenAPI

You need to have run npm start or npm run backend first.

To browse the auto-generated OpenAPI UI, browse to http://localhost:5000/openapi.

Regenerate OpenAPI bindings for frontend

First, you must install the Java Runtime Environment (JRE) 8 or newer as mentioned in the openapi-generator README.

  • For Windows: Install OpenJDK
  • For Ubuntu: sudo apt install default-jre
  • For macOS: brew install adoptopenjdk

After that, run the following script in your Python virtual environment to regenerate the frontend OpenAPI bindings in place:

python scripts/generate_openapi.py

Running the Automated Tests

npm test

Run all backend and frontend tests.

npm run test-backend

Run all backend unit tests.

To run a subset of tests, use the --filter option.

# Note the extra -- needed to separate arguments for npm vs script.
npm run test-backend -- --filter FullyQualifiedName~Backend.Tests.Models.ProjectTests

npm run test-frontend

Launches the test runners in the interactive watch mode. See the section about running tests for more information.

To run a subset of tests, pass in the name of a partial file path to filter:

# Note the extra -- needed to separate arguments for npm vs script.
npm run test-frontend -- DataEntry

npm run test-*:coverage

Launches the test runners to calculate the test coverage of the frontend or backend of the app.

Frontend Code Coverage Report

Run:

npm run test-frontend:coverage

To view the frontend code coverage open coverage/lcov-report/index.html in a browser.

Backend Code Coverage Report

Run:

npm run test-backend:coverage

Generate the HTML coverage report:

npm run gen-backend-coverage-report

Open coverage-backend/index.html in a browser.

npm run test-frontend:debug

Runs Jest tests for debugging, waiting for an IDE to attach.

For VSCode, run the Debug Jest Tests configuration within the Run tab on the left taskbar.

npm run fmt-backend

Automatically format the C# source files in the backend.

npm run lint

Runs ESLint on the codebase to detect code problems that should be fixed.

npm run lint:fix-layout

Run ESLint and apply suggestion and layout fixes automatically. This will sort and group imports.

npm run fmt-frontend

Auto-format frontend code in the src folder.

Import Semantic Domains

To import semantic domains from the XML files in ./deploy/scripts/semantic_domains/xml, run the following steps from within a Python virtual environment.

  1. Generate the files for import into the Mongo database:

    cd ./deploy/scripts
    python sem_dom_import.py <xml_filename> [<xml_filename> ...]

    where <xml_filename> is the name of the file(s) to import. Currently each file contains English and one other language.

  2. Start the database:

    npm run database
  3. Import the files that were created.

    Step 1 created two files for each language: a nodes.json and a tree.json. The nodes.json file contains the detailed data for each node in the semantic domain tree; the tree.json file contains the tree structure of the semantic domains. To import the semantic domain data, run:

    cd ./deploy/scripts/semantic_domains/json
    mongoimport -d CombineDatabase -c SemanticDomains nodes.json --mode=upsert --upsertFields=id,lang,guid
    mongoimport -d CombineDatabase -c SemanticDomainTree tree.json --mode=upsert --upsertFields=id,lang,guid

Generate License Reports

To generate a summary of licenses used in production

npm run license-summary-backend
npm run license-summary-frontend

To generate a full report of the licenses used in production that is included in the user guide:

npm run license-report-backend
npm run license-report-frontend

Note: This should be performed each time production dependencies are changed.

Inspect Database

To browse the database locally during development, open MongoDB Compass Community.

  1. Under New Connection, enter mongodb://localhost:27017
  2. Under Databases, select CombineDatabase

Add or Update Dictionary Files

The dictionary files for spell-check functionality in The Combine are split into parts to allow lazy-loading, for the sake of devices with limited bandwidth. There are scripts for generating these files in src/resources/dictionaries/; files in this directory should not be manually edited.

The bash script scripts/fetch_wordlists.sh is used to fetch dictionary files for a given language (e.g., es) from the LibreOffice dictionaries and convert them to raw wordlists (e.g., src/resources/dictionaries/es.txt). Execute the script with no arguments for its usage details. Any language not currently supported can be manually added as a case in this script.

./scripts/fetch_wordlists.sh

The python script scripts/split_dictionary.py takes a wordlist textfile (e.g., src/resources/dictionaries/es.txt), splits it into multiple TypeScript files (e.g., into src/resources/dictionaries/es/ with index file .../es/index.ts), and updates src/resources/dictionaries/index.ts accordingly. Run the script within a Python virtual environment, with -h/--help to see its usage details.

python scripts/split_dictionary.py --help

For some languages, the wordlist is too large for practical use. Generally try to keep the folder for each language under 2.5 MB, to avoid such errors as FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory in the Kubernetes build. For smaller folder sizes, default maximum word-lengths are automatically imposed for some languages: (ar, es, fr, pt, ru). Use -m/--max to override the defaults, with -m -1 to force no limit.

Adjust the -t/--threshold and -T/--Threshold parameters to split a wordlist into more, smaller files; e.g.:

  • python scripts/split_dictionary.py -l hi -t 1000
  • python scripts/split_dictionary.py -l sw -t 1500

The top of each language's index.ts file states which values of -m, -t, and -T were used for that language.

Cleanup Local Repository

It's sometimes possible for a developer's local temporary state to get out of sync with other developers or CI. This script removes temporary files and packages while leaving database data intact. This can help troubleshoot certain types of development setup errors. Run from within a Python virtual environment.

python scripts/cleanup_local_repo.py

Generate Installer Script for The Combine (Linux only)

To generate the installer script, run the following commands starting in the project top level directory:

cd installer
./make-combine-installer.sh combine-release-number

where combine-release-number is the Combine release to be installed, e.g. v2.1.0.

Options:

  • --net-install - build an installer that will download the required images at installation time. The default is to package the images in the installation script.

To update the PDF copy of the installer README.md file, run the following from the installer directory:

pandoc --pdf-engine=weasyprint README.md -o README.pdf

Generate Tutorial Video Subtitles

Tutorial video transcripts are housed in docs/tutorial_subtitles, together with timestamps aligning transcripts with the corresponding videos and any transcript translations downloaded from Crowdin. To generate subtitle files (and optionally attach them to a video file), run from within a Python virtual environment:

python scripts/subtitle_tutorial_video.py -s <subtitles_subfolder_name> [-i <input_video_path> -o <output_video_path>] [-v]

Setup Local Kubernetes Cluster

This section describes how to create a local Kubernetes cluster using either Rancher Desktop or Docker Desktop.

Advantages of Rancher Desktop:

  1. runs the same Kubernetes engine, k3s, that is used by The Combine when installed on a NUC; and
  2. includes the Rancher User Interface for easy inspection and management of Kubernetes resources.

Advantages of Docker Desktop:

  1. can run with fewer memory resources; and
  2. simpler to navigate to the running application from your web browser.

The steps to install The Combine in a local Kubernetes cluster are:

  1. Install Rancher Desktop OR Install Docker Desktop
  2. Install Kubernetes Tools
  3. Setup The Combine

Install Rancher Desktop

Install Rancher Desktop to create a local Kubernetes cluster to test The Combine when running in containers. (Optional. Only needed for running under Kubernetes.)

When Rancher Desktop is first run, you will be prompted to select a few initial configuration items:


  1. Verify that Enable Kubernetes is checked.
  2. Select the Kubernetes version marked as stable, latest.
  3. Select your container runtime, either containerd or dockerd (moby):
    • containerd matches what is used on the NUC and uses the k3s Kubernetes engine. It requires that you set the CONTAINER_CLI environment variable to nerdctl before running the build.py script.
    • dockerd uses k3d (k3s in Docker).
  4. Select Automatic or Manual path setup.
  5. Click Accept.

The Rancher Desktop main window will be displayed as it loads the Kubernetes environment. While this page is displayed, click the Settings icon (the gear in the upper-right corner). In the settings dialog:

  1. Click Kubernetes in the left-hand pane.
  2. Uncheck the Enable Traefik checkbox.

Install Docker Desktop

Install Docker Desktop from https://docs.docker.com/get-docker/.

Notes for installing Docker Desktop in Linux:

  1. Docker Desktop requires a distribution running the GNOME or KDE Desktop environment.

  2. If you installed docker or docker-compose previously, remove them:

    sudo apt purge docker-ce docker-ce-cli containerd.io
    sudo apt autoremove
    if [ -L /usr/bin/docker-compose ] ; then sudo rm /usr/bin/docker-compose ; fi
    if [ -x /usr/local/bin/docker-compose ] ; then sudo rm /usr/local/bin/docker-compose ; fi

Once Docker Desktop has been installed, start it, and set it up as follows:

  1. Click the gear icon in the upper right to open the settings dialog;
  2. Click on the Resources link on the left-hand side and set the Memory to at least 4 GB (see Note);
  3. Click on the Kubernetes link on the left-hand side;
  4. Select Enable Kubernetes and click Apply & Restart;
  5. Click Install on the dialog that is displayed.

Note:

Normally, there is a slider to adjust the Memory size for the Docker Desktop virtual machine. On Windows systems using the WSL 2 backend, there are instructions for setting the resources outside of the Docker Desktop application.
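On Windows with the WSL 2 backend, the memory limit lives in %UserProfile%\.wslconfig rather than in the Docker Desktop settings dialog; a minimal sketch matching the 4 GB recommendation above:

```ini
; %UserProfile%\.wslconfig -- limits the WSL 2 VM that backs Docker Desktop
[wsl2]
memory=4GB
```

Restart WSL (wsl --shutdown) for the change to take effect.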

Install Kubernetes Tools

If the following tools were not installed with either Rancher Desktop or Docker Desktop, install them from these links:

  1. kubectl
    • On Windows, if using Chocolatey: choco install kubernetes-cli
  2. helm
    • On Windows, if using Chocolatey: choco install kubernetes-helm

Setup The Combine

This section describes how to build and deploy The Combine to your Kubernetes cluster. Unless specified otherwise, all of the commands below are run from The Combine's project directory and are run in an activated Python virtual environment. (See the Python section to create the virtual environment.)

Install Required Charts

Install the required charts by running:

python deploy/scripts/setup_cluster.py --type development

deploy/scripts/setup_cluster.py assumes that the kubectl configuration file is set up to manage the desired Kubernetes cluster. For most development users, there will only be the Rancher Desktop/Docker Desktop cluster to manage, and the installation process will set that up correctly. If there are multiple clusters to manage, the --kubeconfig and --context options let you specify a different cluster.

Run the script with the --help option to see possible options for the script.

Build The Combine Containers

Build The Combine containers by running the build script in an activated Python virtual environment from The Combine's project directory. (See the Python section to create the virtual environment.)

python deploy/scripts/build.py

Notes:

  • If you are using Rancher Desktop with containerd for the container runtime, set the following environment variable in your user profile:

    export CONTAINER_CLI="nerdctl"

    If you are using Rancher Desktop with the dockerd container runtime or Docker Desktop, clear this variable or set its value to docker.

  • Run with the --help option to see all available options.

  • If you see errors like:

    => ERROR [internal] load metadata for docker.io/library/nginx:1.21        0.5s

    pull the image directly and re-run the build. In this case, you would run:

    docker pull nginx:1.21
  • If --tag is not used, the image will be untagged. When running or pulling an image with the tag latest, the newest, untagged image will be pulled.

  • --repo and --tag are not specified under normal development use.

Setup Environment Variables

Before installing The Combine in Kubernetes, you need to set the following environment variables: COMBINE_CAPTCHA_SECRET_KEY, COMBINE_JWT_SECRET_KEY. For development environments, you can use the values defined in Backend/Properties/launchSettings.json. Set them in your .profile (Linux or Mac 10.14-), your .zprofile (Mac 10.15+), or the System app (Windows).
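For example, the two required variables might be exported in your shell profile like this. The values below are hypothetical placeholders; for development, use the real values from Backend/Properties/launchSettings.json:

```shell
# Hypothetical placeholder values -- for development, copy the real values
# from Backend/Properties/launchSettings.json.
export COMBINE_CAPTCHA_SECRET_KEY="dev-captcha-secret-key"
export COMBINE_JWT_SECRET_KEY="dev-jwt-secret-key-0123456789abcdef"
```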

Note: The following is optional for Development Environments.

In addition to the environment variables defined in Prepare the Environment, you may setup the following environment variables:

  • AWS_ACCOUNT
  • AWS_DEFAULT_REGION
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

These variables will allow The Combine to:

  • pull released and QA software images from AWS Elastic Container Registry (ECR);
  • create backups and push them to AWS S3 storage; and
  • restore The Combine's database and backend files from a backup stored in AWS S3 storage.

The Combine application will function in a local cluster without these AWS_ variables set.
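If you do opt in to the AWS integration, the variables can be exported in your shell profile. All values below are hypothetical placeholders; real credentials come from your AWS administrator:

```shell
# Hypothetical placeholder values -- obtain real credentials from your AWS administrator.
export AWS_ACCOUNT="123456789012"
export AWS_DEFAULT_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="AKIAXXXXXXXXXXXXXXXX"
export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```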

Install/Update The Combine

Install the Kubernetes resources to run The Combine by running:

python deploy/scripts/setup_combine.py [--target <target_name>] [--tag <image_tag>]

The default target is localhost; the default tag is latest. For development testing the script will usually be run with no arguments.

If an invalid target is entered, the script will list the available targets and prompt the user to select one. deploy/scripts/setup_combine.py assumes that the kubectl configuration file is set up to manage the desired Kubernetes cluster. For most development users, there will only be the Rancher Desktop/Docker Desktop cluster to manage, and the installation process will set that up correctly. If there are multiple clusters to manage, the --kubeconfig and --context options let you specify a different cluster.

Run the script with the --help option to see possible options for the script.

When the script completes, the resources will be installed on the specified cluster. It may take a few moments before all the containers are up and running. If you are using Rancher Desktop, you can use the Rancher Dashboard to see when the cluster is ready. Otherwise, run kubectl -n thecombine get deployments or kubectl -n thecombine get pods. For example,

$ kubectl -n thecombine get deployments
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
backend       1/1     1            1           10m
database      1/1     1            1           10m
frontend      1/1     1            1           10m
maintenance   1/1     1            1           10m

or

$ kubectl -n thecombine get pods
NAME                           READY   STATUS    RESTARTS   AGE
backend-5657559949-z2flp       1/1     Running   0          10m
database-794b4d956f-zjszm      1/1     Running   0          10m
frontend-7d6d79f8c5-lkhhz      1/1     Running   0          10m
maintenance-7f4b5b89b8-rhgk9   1/1     Running   0          10m

Connecting to Your Cluster

Setup Port Forwarding

Rancher Desktop only!

To connect to The Combine user interface on Rancher Desktop, you need to set up port forwarding.

  1. From the Rancher Desktop main window, click on Port Forwarding on the left-hand side.
  2. Click the Forward button to the left of the https port for ingress-controller-ingress-nginx-controller in the ingress-nginx namespace.
  3. A random port number is displayed. You may change it or accept the value and click the checkmark.

Note that the port forwarding is not persistent; you need to set it up whenever Rancher Desktop is restarted.

Connecting to The Combine

You can connect to The Combine by entering the URL https://thecombine.localhost in the address bar of your web browser. (For Rancher Desktop, use https://thecombine.localhost:<portnumber>, where <portnumber> is the forwarded port.)

Notes:

  1. If you do not specify the https://, your browser may do a web search instead of navigating to The Combine.
  2. By default self-signed certificates are used, so you will need to accept a warning in the browser.

Rancher Dashboard

The Rancher Dashboard shows an overview of your Kubernetes cluster. The left-hand pane allows you to explore the different Kubernetes resources deployed in the cluster, including viewing configuration, current states, and logs.

To open the Rancher Dashboard, right-click on the Rancher Desktop icon in the system tray and select Dashboard from the pop-up menu.

Maintenance

The maintenance scripts enable certain maintenance tasks on your instance of The Combine, which may be running in either a development environment or the production/QA environment.

Development Environment

The following maintenance tasks can be performed in the development environment. To run The Combine in the development environment, run npm start from the project directory. Unless specified otherwise, each of the maintenance commands is to be run from the project directory.

Create a New Admin User (Development)

Task: create a new user who is a site administrator

Commands

  • set/export COMBINE_ADMIN_PASSWORD

  • set/export COMBINE_ADMIN_EMAIL

  • run

    cd Backend
    dotnet run create-admin-username=admin
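Put together, the steps above look like the following sketch. The password and email values are hypothetical examples; choose your own:

```shell
# Hypothetical example credentials for a local development admin account.
export COMBINE_ADMIN_PASSWORD="ChangeMe.123"
export COMBINE_ADMIN_EMAIL="admin@example.com"
# Then create the admin user (run from the Backend directory):
# cd Backend && dotnet run create-admin-username=admin
```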

Drop Database

Task: completely erase the current Mongo database

Run:

npm run drop-database

Grant Admin Rights

Task: grant site admin rights for an existing user

Run:

# Note the '--' before the user name
npm run set-admin-user -- <USERNAME>

Kubernetes Environment

The following maintenance tasks can be performed in the Kubernetes environment. The Kubernetes cluster may be one of the production or QA clusters or the local development cluster. For most of these tasks, the Rancher Dashboard provides a more user-friendly way to maintain and manage the cluster.

For each of the kubectl commands below:

  • you must have a kubectl configuration file that configures the connection to the Kubernetes cluster to be maintained. The configuration file needs to be installed at ${HOME}/.kube/config or specified in the KUBECONFIG environment variable.
  • the kubectl commands can be run from any directory
  • any of the Python scripts (local or remote using kubectl) can be run with the --help option to see more usage options.
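For example, to point kubectl at a configuration file in a non-default location (the path below is a hypothetical example):

```shell
# Use a kubectl configuration file from a non-default location (hypothetical path).
export KUBECONFIG="${HOME}/.kube/config-thecombine"
# Verify which cluster kubectl will talk to:
# kubectl config current-context
```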

Stopping The Combine

To stop The Combine without deleting it, scale its deployments down to 0 replicas:

kubectl -n thecombine scale --replicas=0 deployments frontend backend maintenance database

You can restart the deployments by setting --replicas=1.

Deleting Helm Charts

Deleting a helm chart will delete all Kubernetes resources including any persistent data or any data stored in a container.

In addition to clearing out old data, there may be cases where existing charts need to be deleted and re-installed instead of upgraded, for example, when a configuration change requires changes to an immutable attribute of a resource.

To delete a chart, first list all of the existing charts:

$ helm list -A
NAME                NAMESPACE       REVISION    UPDATED                                 STATUS      CHART                   APP VERSION
cert-manager        cert-manager    3           2022-02-28 11:27:12.141797222 -0500 EST deployed    cert-manager-v1.7.1     v1.7.1
ingress-controller  ingress-nginx   3           2022-02-28 11:27:15.729203306 -0500 EST deployed    ingress-nginx-4.0.17    1.1.1
rancher             cattle-system   1           2022-03-11 12:46:06.962438027 -0500 EST deployed    rancher-2.6.3           v2.6.3
thecombine          thecombine      2           2022-03-11 11:41:38.304404635 -0500 EST deployed    thecombine-0.7.14       2.0.0

Using the chart name and namespace, you can then delete the chart:

helm -n <chart_namespace> delete <chart_name>

where <chart_namespace> and <chart_name> are the NAMESPACE and NAME respectively of the chart you want to delete. These are listed in the output of helm list -A.

Checking The System Status

Once The Combine is installed, it is useful to be able to see the state of the system and to look at the logs. The Combine is set up as four deployments:

  • frontend
  • backend
  • database
  • maintenance

Each deployment definition is used to create a pod that runs the docker image.

To see the state of the deployments, run:

$ kubectl -n thecombine get deployments
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
database      1/1     1            1           3h41m
maintenance   1/1     1            1           3h41m
backend       1/1     1            1           3h41m
frontend      1/1     1            1           3h41m

Similarly, you can view the state of the pods:

$ kubectl -n thecombine get pods
NAME                           READY   STATUS      RESTARTS        AGE
database-794b4d956f-g2n5k      1/1     Running     1 (3h51m ago)   3h58m
ecr-cred-helper--1-w9xxp       0/1     Completed   0               164m
maintenance-85644b9c76-55pz8   1/1     Running     0               130m
backend-69b77c46c5-8dqlv       1/1     Running     0               130m
frontend-c94c5747c-pz6cc       1/1     Running     0               60m

Use the logs command to view the log file of a pod. You can specify either a pod name listed in the output of kubectl -n thecombine get pods or a deployment. For example, to view the logs of the frontend, you would run:

kubectl -n thecombine logs frontend-c94c5747c-pz6cc

or

kubectl -n thecombine logs deployment/frontend

If you want to monitor the logs while the system is running, add the --follow option to the command.

Add a User to a Project

Task: add an existing user to a project

Run:

kubectl exec -it deployment/maintenance -- add_user_to_proj.py --project <PROJECT_NAME> --user <USER>

For additional options, run:

kubectl exec -it deployment/maintenance -- add_user_to_proj.py --help

Backup The Combine

Task: Backup the CombineDatabase and the Backend files to the Amazon Simple Storage Service (S3).

Run:

kubectl exec -it deployment/maintenance -- combine_backup.py [--verbose]

Notes:

  1. The backup command can be run from any directory.
  2. The daily backup job on the server will also clean up old backups for the machine being backed up. This is not part of combine_backup.py; backups made with this script must be managed manually. See the AWS CLI Command Reference (s3) for documentation on how to use the command line to list and manage the backup objects.

Delete a Project

Task: Delete a project

Run:

kubectl exec -it deployment/maintenance -- rm_project.py <PROJECT_NAME>

You may specify more than one <PROJECT_NAME> to delete multiple projects.

Restore The Combine

Task: Restore the CombineDatabase and the Backend files from a backup stored on the Amazon Simple Storage Service (S3).

Run:

kubectl exec -it deployment/maintenance -- combine_restore.py [--verbose] [BACKUP_NAME]

Note:

The restore script takes an optional backup name. This is the name of the backup in the AWS S3 bucket, not a local file. If the backup name is not provided, the restore script will list the available backups and allow you to choose one for the restore operation.

User Guide

The User Guide found at https://sillsdev.github.io/TheCombine is automatically built from the master branch.

To locally build the user guide and serve it dynamically (automatically reloading on change), run the following from your Python virtual environment:

tox -e user-guide-serve

To locally build the user guide statically into docs/user-guide/site:

tox -e user-guide

Continuous Integration and Continuous Deployment

On Pull Request

When a Pull Request (PR) is created, and for each push to the PR branch, a set of CI tests is run. Once all the CI tests pass and the PR changes have been reviewed and approved by a team member, the PR may be merged into the master branch. When the merge is complete, The Combine software is built, pushed to the AWS ECR Private registry, and deployed to the QA server:

sequenceDiagram
   actor Author
   actor Reviewer
   participant github as sillsdev/TheCombine
   participant gh_runner as GitHub Runner
   participant sh_runner as Self-Hosted Runner
   participant reg as AWS Private Registry
   participant server as QA Server
   Author ->> github: create Pull Request(work_branch)
   activate github
   par
      loop for each CI test
        Note over github,gh_runner: CI tests are run concurrently
        github ->> gh_runner: start CI test
         activate gh_runner
            gh_runner ->> gh_runner: checkout work_branch
            gh_runner ->> gh_runner: run test
            gh_runner -->> github: test passed
         deactivate gh_runner
      end
   and
      github ->> Reviewer: request review
      Reviewer -->> github: approved
   end
   github ->> github: merge work_branch to master
   github ->> github: delete work_branch
   github ->> gh_runner: run deploy_qa workflow
   activate gh_runner
   loop component in (frontend, backend, database, maintenance)
      Note right of gh_runner: components are built concurrently
      gh_runner ->> gh_runner: checkout master
      gh_runner ->> gh_runner: build component
      gh_runner ->> reg: push component image(image_tag)
      gh_runner -->> github: build complete(image_tag)
   end
   deactivate gh_runner
   github ->> sh_runner: deploy to QA server(image_tag)
   activate sh_runner
   loop deployment in (frontend, backend, database, maintenance)
      sh_runner -) server: update deployment image(image_tag)
      server ->> reg: pull image(image_tag)
      reg -->> server: updated image(image_tag)
   end
   deactivate sh_runner

On Release

When a team member creates a release on The Combine's GitHub project page, a Release tag is created on the master branch, and the software is built, pushed to the AWS ECR Public registry, and then deployed to the production server.

sequenceDiagram
   actor Developer
   participant github as sillsdev/TheCombine
   participant gh_runner as GitHub Runner
   participant sh_runner as Self-Hosted Runner
   participant reg as AWS Public Registry
   participant server as Production Server
   Developer ->> github: create Release
   github ->> github: create release tag on master branch
   github ->> gh_runner: run deploy_release workflow
   activate gh_runner
   loop component in (frontend, backend, database, maintenance)
      Note right of gh_runner: components are built concurrently
      gh_runner ->> gh_runner: checkout release tag
      gh_runner ->> gh_runner: build component
      gh_runner ->> reg: push component image(image_tag)
      gh_runner -->> github: build complete(image_tag)
   end
   deactivate gh_runner
   github ->> sh_runner: deploy to Production server(image_tag)
   activate sh_runner
   loop deployment in (frontend, backend, database, maintenance)
      sh_runner -) server: update deployment image(image_tag)
      server ->> reg: pull image(image_tag)
      reg -->> server: updated image(image_tag)
   end
   deactivate sh_runner

Production

The process for configuring and deploying TheCombine for production targets is described in docs/deploy/README.md.

Learn More

Development Tools

Database (MongoDB)

Backend (C# + ASP.NET)

Frontend (Typescript + React + Redux)

Kubernetes/Helm