Port MongoDB Source to Java #3428
Mongo is schemaless, which will be tricky to support. The hardest problem we need to work out for the MVP release is how to support incremental sync. To support incremental sync, we need to discover the schema of each collection (table). I propose that we sample 10,000 records (or some other number) from each collection to discover the schema. We should transform the schema as follows:
This connector must support full refresh and incremental sync.
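The sampling idea above could look roughly like the sketch below. It is only an illustration under stated assumptions: documents are modelled as plain `Map<String, Object>` values rather than driver types, the class name `SchemaSampler` and the type mapping in `jsonType` are hypothetical, and a field seen with several types simply records all of them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Airbyte: infers a per-field type set from sampled documents.
public class SchemaSampler {

    // Assumed mapping from Java values to JSON Schema type names.
    public static String jsonType(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Number) return "number";
        if (value instanceof Map) return "object";
        if (value instanceof List) return "array";
        return "string"; // fallback for strings and anything else (ids, dates, ...)
    }

    // Reads at most sampleSize documents and unions the types observed for each top-level field.
    public static Map<String, Set<String>> discover(Iterable<Map<String, Object>> documents,
                                                    int sampleSize) {
        Map<String, Set<String>> schema = new LinkedHashMap<>();
        int seen = 0;
        for (Map<String, Object> doc : documents) {
            if (seen++ >= sampleSize) break; // stop once the sample is full
            for (Map.Entry<String, Object> field : doc.entrySet()) {
                schema.computeIfAbsent(field.getKey(), k -> new TreeSet<>())
                      .add(jsonType(field.getValue()));
            }
        }
        return schema;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("_id", "1");
        first.put("age", 30);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("_id", "2");
        second.put("age", "unknown");
        docs.add(first);
        docs.add(second);
        // Fields that appear with mixed types end up with more than one entry in their set.
        System.out.println(discover(docs, 10_000));
    }
}
```

A field that only appears in documents outside the sample is missed entirely, so the sample size trades discovery cost against schema completeness.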
Past MVP it seems like we could do some additional Normalization work.
If we could do that recursively, it would be awesome. This would take the schemaless Mongo data and get it into a reasonable normalized schema in a relational model. For reasonably well-formed Mongo documents, this would save a ton of custom DBT transforms.
Many Mongo users use Mongoose to define the schema and communicate with Mongo. It would be cool if users could just drop in the Mongoose schema for each collection and have the source connector convert it to JSON Schema.
Tell us about the new integration you’d like to have
MongoDB is a critical source to support. Our current connector was contributed by a user. However, while the implementation is generally high quality, it is written in Ruby, and the Airbyte Core team's proficiencies are Java & Python. This means that we are much slower to implement features & bugfixes due to a lack of proficiency in Ruby. So we'd like to port the connector over to one of our core languages in order to offer better SLA & support.
Describe the alternative you are considering or using
Continue to use current Ruby-based connector
Implementation:
test container to use:
https://www.testcontainers.org/modules/databases/mongodb/
Todo:
In case of possible use of jdbc
6: Documentation, Prepare pull request, pass all checks.
In case we need to implement with the native Mongo driver :-(
use existing mongo source
Notes
It seems that the JDBC driver provided by unityjdbc is paid, so we have the same situation here as we had for BigQuery. @DoNotPanicUA is currently working on refactoring the db sources and on an implementation that makes the core handle such cases better. So there is no value in starting work on this ticket until #4024 and #1876 are completed. After that, we would also need to support the basics of non-JDBC tests.
As this is a non-JDBC and even non-SQL database, additional work in the core would also be required.