feat: deploy a sharded MongoDB #168

Merged
tschneider-aneo merged 1 commit into main from ts/shard-mongodb on Oct 9, 2024

tschneider-aneo commented Sep 2, 2024

Motivation

Described in AEP 004. This PR addresses a foreseeable bottleneck on ArmoniK's database by enabling a sharded architecture for it.

Description

This PR adds a new module, mongodb-sharded, that calls a Helm chart deploying a sharded MongoDB (by default Bitnami's). It works out of the box and is also configurable: number of shards, number of replicas, node selectors and labels for each MongoDB entity, persistence, etc.

Testing

Two example modules test this new module: one deploys it with a complete redefinition of the configuration, the other with all the default values.

Impact

It is expected to enhance ArmoniK's strong scalability, at the cost of a more complex database configuration: a sharded database is harder to set up and operate than a non-sharded one.

Warning

These developments were made using Bitnami's mongodb-sharded image and Helm chart. They are not guaranteed to work with other images or charts.

As Bitnami's images are not verified with Docker Scout, we advise adapting Bitnami's mongodb-sharded image to make it compliant with your personal or business security requirements.

Additional Information

To better understand how MongoDB sharding works see MongoDB's documentation.

Checklist

  • My code adheres to the coding and style guidelines of the project.
  • I have performed a self-review of my code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • I have thoroughly tested my modifications and added tests when necessary.
  • Tests pass locally and in the CI.
  • I have assessed the performance impact of my modifications.

aneojgurhem added a commit to aneoconsulting/ArmoniK.Core that referenced this pull request Oct 1, 2024
# Motivation

Following aneoconsulting/ArmoniK.Infra#168, this
PR aims to enable the deployment of a sharded MongoDB as ArmoniK's
database.

A more complete motivation is provided in AEP 004
(aneoconsulting/ArmoniK.Community#48). This PR
addresses a foreseeable bottleneck on ArmoniK's database by
enabling a sharded architecture for it.

For now, when a sharded database is specified, the following collections
are sharded:
- Result
- TaskData
- SessionData

# Description

This PR adds:
- A method `ShardCollectionAsync` to the `IMongoDataModelMapping`
interface.
- Two new `MongoOptions`: a string `AuthSource` and a boolean `Sharding`.
- A new class `ShardingExt` containing an extension method
`shardCollection` for the `IClientSessionHandle` interface.

When the `MongoOption` `Sharding` is true, the `MongoCollectionProvider`
calls `ShardCollectionAsync`. The implementation then depends on whether
the collection is meant to be sharded:
- If the collection must be sharded, the `shardCollection` extension
method is called.
- Otherwise, the method directly returns a completed `Task`.
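
The control flow described above can be sketched in Python. This is only an illustration, not the C# code of this PR: `FakeAdmin`, `provide_collections`, the `Unsharded` collection name and the hashed `_id` shard key are all assumptions made for the example. Only the shape of the `shardCollection` command document follows MongoDB's documentation.

```python
import asyncio
from dataclasses import dataclass, field

# Collections that the description above lists as sharded.
SHARDED_COLLECTIONS = {"Result", "TaskData", "SessionData"}


def shard_collection_command(database: str, collection: str) -> dict:
    """Build the document for MongoDB's `shardCollection` admin command.

    The hashed `_id` shard key is an assumption made for this sketch.
    """
    return {
        "shardCollection": f"{database}.{collection}",
        "key": {"_id": "hashed"},
    }


@dataclass
class FakeAdmin:
    """Hypothetical stand-in for an admin connection: it only records commands."""

    sent: list = field(default_factory=list)

    async def run_command(self, command: dict) -> dict:
        self.sent.append(command)
        return {"ok": 1.0}


async def shard_collection(admin, database: str, collection: str) -> None:
    """Issue `shardCollection` for sharded collections, do nothing otherwise."""
    if collection not in SHARDED_COLLECTIONS:
        return  # plays the role of returning a completed Task
    await admin.run_command(shard_collection_command(database, collection))


async def provide_collections(admin, database: str, collections, sharding: bool) -> None:
    """Call `shard_collection` for every collection only when sharding is enabled."""
    if not sharding:
        return
    for name in collections:
        await shard_collection(admin, database, name)


def demo(sharding: bool = True) -> list:
    admin = FakeAdmin()
    names = ["Result", "TaskData", "SessionData", "Unsharded"]
    asyncio.run(provide_collections(admin, "database", names, sharding))
    return admin.sent
```

With sharding enabled, `demo()` records one command per sharded collection and none for `Unsharded`. With `demo(sharding=False)`, no command is sent at all.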

The new `MongoOption` `AuthSource` is required because the `MongoClient`
has to authenticate as an administrator to be able to shard a
collection.
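
As an illustration of the `authSource` connection string option, here is a minimal Python sketch that builds a MongoDB URI with it. The helper `build_mongo_uri`, the host name and the default value `admin` are assumptions made for this example. They are not taken from ArmoniK's code.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(user: str, password: str, host: str, port: int,
                    database: str, auth_source: str = "admin") -> str:
    """Build a MongoDB connection string carrying the `authSource` option.

    `authSource` names the database that holds the user's credentials.
    Defaulting it to `admin` is an assumption made for this sketch.
    """
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    query = urlencode({"authSource": auth_source})
    return f"mongodb://{credentials}@{host}:{port}/{database}?{query}"


uri = build_mongo_uri("admin", "p@ss/word", "mongos.example", 27017, "database")
```

Here the client works against `database` while its credentials are checked against the database named by `authSource`.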

# Testing

It has been tested manually. Since these developments are tightly coupled
with ArmoniK's database, it is difficult to write unit tests. It is
nevertheless possible to have automated integration tests that would
first verify that a deployment of ArmoniK with these changes succeeds,
and then that sharding is indeed enabled on the right collections.
Workflows testing a deployment of ArmoniK with a sharded MongoDB are
currently being studied.

# Impact

In the end, it is expected to enhance ArmoniK's strong scalability.

# Additional Information

To better understand how MongoDB sharding works see [MongoDB's
documentation](https://www.mongodb.com/docs/manual/sharding/).

To understand why the new `AuthSource` option is required, see MongoDB's
documentation on the [shardCollection database
command](https://www.mongodb.com/docs/manual/reference/command/shardCollection/#mongodb-dbcommand-dbcmd.shardCollection)
and on the [connection string's authSource
option](https://www.mongodb.com/docs/manual/reference/connection-string-options/#mongodb-urioption-urioption.authSource).

# Checklist

- [X] My code adheres to the coding and style guidelines of the project.
- [X] I have performed a self-review of my code.
- [X] I have commented my code, particularly in hard-to-understand
areas.
- [ ] I have made corresponding changes to the documentation.
- [X] I have thoroughly tested my modifications and added tests when
necessary.
- [X] Tests pass locally and in the CI.
- [ ] I have assessed the performance impact of my modifications.

Resolved review threads:
- storage/onpremise/mongodb-sharded/configmap.tf (outdated)
- storage/onpremise/mongodb-sharded/configmap.tf (outdated)
- storage/onpremise/mongodb-sharded/locals.tf (outdated)
- storage/onpremise/mongodb-sharded/secrets.tf
- storage/onpremise/mongodb-sharded/versions.tf (outdated)

tschneider-aneo force-pushed the ts/shard-mongodb branch 3 times, most recently from 0063973 to e2aacac, on October 8, 2024 13:05
tschneider-aneo merged commit b57529a into main on Oct 9, 2024
4 checks passed
tschneider-aneo deleted the ts/shard-mongodb branch on October 9, 2024 07:45
2 participants