adding indexed created_at column to lineage events table #2299
Codecov Report
@@ Coverage Diff @@
## main #2299 +/- ##
=========================================
Coverage 77.01% 77.01%
Complexity 1166 1166
=========================================
Files 222 222
Lines 5307 5307
Branches 424 424
=========================================
Hits 4087 4087
Misses 747 747
Partials 473 473
Force-pushed from c495a52 to 5d3185c
api/src/main/resources/marquez/db/migration/V52__lineage_events_created_at_indexed.sql
@prachim-collab, having a timestamp for when an OL event was received on the server is going to be very helpful. Mind also opening an issue to link to your PR about the timestamp usage and any changes to relevant APIs?
@wslulciuc I have created a new issue: #2304
Force-pushed from 6c15266 to 4897ff1
@@ -0,0 +1,9 @@
ALTER TABLE lineage_events ADD created_at TIMESTAMP;
TIMESTAMP WITH TIME ZONE. I think it's good practice to always include the time zone in the timestamp.
Thanks. I kept it as TIMESTAMP to follow the other created_at columns in the schema, but I will update this one to TIMESTAMP WITH TIME ZONE.
Additionally, I was previously backfilling old entries with event_time. I have removed that step because it might be a costly operation during migration.
Updated the type to TIMESTAMP WITH TIME ZONE.
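For readers following the thread, here is a minimal sketch of what a migration along these lines might look like in PostgreSQL. Only the ALTER TABLE statement on lineage_events appears in this conversation; the index name lineage_events_created_at_idx and the commented-out backfill are assumptions for illustration, not the contents of the actual V52 script.

```sql
-- Add a server-side timestamp column, using TIMESTAMP WITH TIME ZONE
-- as suggested in the review above.
ALTER TABLE lineage_events ADD created_at TIMESTAMP WITH TIME ZONE;

-- Index the new column so range scans on created_at stay cheap.
-- The index name is hypothetical.
CREATE INDEX lineage_events_created_at_idx ON lineage_events (created_at);

-- Optional backfill from the client-supplied event_time. This step was
-- dropped from the PR because it can be costly on large tables.
-- UPDATE lineage_events SET created_at = event_time WHERE created_at IS NULL;
```

Without the backfill, rows that existed before the migration keep NULL in created_at, so consumers filtering on the column need to account for that.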
Signed-off-by: Prachi Mishra <[email protected]>
Force-pushed from 53b5cf7 to dc70917
Problem
For analytics use cases, we need to incrementally copy and export lineage_events to a data warehouse (DWH). There is currently no good way to identify newly created events in the database: the existing event_time in the lineage_events table is client-generated and can be backdated.
Closes: #2300
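As an illustration of the incremental export this enables, a query along the following lines could pick up only the events received since the previous run, assuming a server-generated, indexed created_at column on lineage_events. The bind parameter :last_exported_at is hypothetical; it stands for the highest created_at value copied by the previous export.

```sql
-- Fetch events received since the last export run.
-- :last_exported_at is a hypothetical bind parameter supplied by the
-- export job (the largest created_at seen in the previous batch).
SELECT *
FROM lineage_events
WHERE created_at > :last_exported_at
ORDER BY created_at;
```

Rows whose created_at is NULL never match this predicate, so any events that predate the column would need a separate one-off export.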
Solution
Checklist
I tested the changes by deploying them to my local database; the migration script ran successfully on top of the existing schema.
CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
.sql database schema migration according to Flyway's naming convention (if relevant)