feat(ingest/vertica): performance improvement and bug fixes #8328
Merged
Co-authored-by: Harshal Sheth <[email protected]>
hsheth2
changed the title
Vertica plugin performance improvement and bug fixes
feat(ingest/vertica): performance improvement and bug fixes
Jul 19, 2023
yoonhyejin
pushed a commit
that referenced
this pull request
Aug 24, 2023
Co-authored-by: Harshal Sheth <[email protected]>
spadhi7
added a commit
to spadhi7/datahub
that referenced
this pull request
Aug 29, 2023
* tag 'v0.10.5': (222 commits) fix(test): increase siblings.js test stability (datahub-project#8542) feat(search): Allow aggregating on facets that are not explicitly part of default filter set (datahub-project#8540) fix(ui) Make multiple small updates to new search and browse (datahub-project#8524) feat(presto-on-hive): allow v1 fieldpaths in the presto-on-hive source (datahub-project#8474) feat(cli): Adds ability to upload recipes to DataHub's UI (datahub-project#8317) feat(browseV2): add browseV2 logic to system update (datahub-project#8506) fix(ingest/json-schema): convert non-string enums to strings (datahub-project#8479) feat(ingestion/tableau): support column level lineage for custom sql (datahub-project#8466) test(ingest): test case statements with sql parser (datahub-project#8437) feat(ingest/vertica): performance improvement and bug fixes (datahub-project#8328) ci: reduce git fetch depth (datahub-project#8473) fix(ingest): remove duplication of tags (datahub-project#8532) docs: small update to homepage (datahub-project#8483) fix(ingest): pin boto3-stubs in CI (datahub-project#8527) feat(siblings): hiding non-existant siblings in FE (datahub-project#8528) fix(ingest/build): Fix sagemaker mypy and flake8 issues (datahub-project#8530) feat(metrics): add metrics for aspect write and bytes (datahub-project#8526) feat(elasticsearch): allow bulk delete (datahub-project#8424) fix(ui): use locale lowercase when filtering columns of an entity in the lineage (datahub-project#8213) fix(auth): ignore case when comparing http headers (datahub-project#8356) ...
Labels
community-contribution
PR or Issue raised by member(s) of DataHub Community
ingestion
PR or Issue related to the ingestion of metadata
merge-pending-ci
A PR that has passed review and should be merged once CI is green.
product
PR or Issue related to the DataHub UI/UX
Checklist
Rearchitected query execution to minimize the total number of queries issued during ingestion, because customers with large catalogs were seeing unacceptable performance. The plugin now captures metadata into memory at the schema level and then iterates over that in-memory data to get individual object details. In our testing, the new changes improved performance by more than 90% compared to the existing Vertica plugin.
Removed the OAuth metadata features from the Vertica plugin, as they were exposing security-related information.
Added an integration test for the Vertica plugin that covers all our features, per DataHub's request.
Added form-based UI ingestion for the Vertica plugin.
Upgraded the Vertica dialect from 0.0.1 to 0.0.8, which now supports the latest SQLAlchemy features.
Bug fixes and other small improvements.
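
To illustrate the schema-level capture described in the first checklist item, here is a minimal sketch of the general pattern, not the plugin's actual code. It uses an in-memory SQLite database as a stand-in for Vertica, and the table `demo_columns` and all function names are hypothetical. It compares one query per table against a single query per schema whose rows are grouped in memory.

```python
import sqlite3
from collections import defaultdict


def build_demo_catalog() -> sqlite3.Connection:
    """Create a tiny stand-in catalog (hypothetical; not Vertica's system tables)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_columns ("
        "table_schema TEXT, table_name TEXT, column_name TEXT, data_type TEXT)"
    )
    rows = [
        ("public", "orders", "id", "int"),
        ("public", "orders", "total", "float"),
        ("public", "customers", "id", "int"),
        ("public", "customers", "name", "varchar"),
    ]
    conn.executemany("INSERT INTO demo_columns VALUES (?, ?, ?, ?)", rows)
    return conn


def columns_per_table_query(conn, schema, tables):
    """Per-object pattern: issue one query for every table."""
    result = {}
    queries = 0
    for table in tables:
        cur = conn.execute(
            "SELECT column_name, data_type FROM demo_columns "
            "WHERE table_schema = ? AND table_name = ? ORDER BY column_name",
            (schema, table),
        )
        queries += 1
        result[table] = cur.fetchall()
    return result, queries


def columns_schema_level(conn, schema, tables):
    """Schema-level pattern: one query per schema, then look up tables in memory."""
    cur = conn.execute(
        "SELECT table_name, column_name, data_type FROM demo_columns "
        "WHERE table_schema = ? ORDER BY table_name, column_name",
        (schema,),
    )
    cache = defaultdict(list)
    for table_name, column_name, data_type in cur:
        cache[table_name].append((column_name, data_type))
    # Individual object details now come from the cache, not the database.
    return {t: cache.get(t, []) for t in tables}, 1


if __name__ == "__main__":
    conn = build_demo_catalog()
    tables = ["orders", "customers"]
    per_table, n_old = columns_per_table_query(conn, "public", tables)
    per_schema, n_new = columns_schema_level(conn, "public", tables)
    print(f"per-table queries: {n_old}, schema-level queries: {n_new}")
    print("same result:", per_table == per_schema)
```

In this sketch the query count grows with the number of tables in the first function and stays at one per schema in the second, at the cost of holding a schema's metadata in memory.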