Update v61 migration to handle duplicate job names before unique constraint #2464
…traint Signed-off-by: Michael Collado <[email protected]>
Force-pushed from a78e9e8 to 9912536
Codecov Report
@@            Coverage Diff            @@
##               main    #2464   +/-   ##
=========================================
  Coverage     83.53%   83.53%
  Complexity     1207     1207
=========================================
  Files           231      231
  Lines          5503     5503
  Branches        267      267
=========================================
  Hits           4597     4597
  Misses          762      762
  Partials        144      144
jobs j
WHERE j.uuid = f.uuid
)
UPDATE jobs SET name=(q.simple_name || '_' || q.row)
This will set a job with a conflict to {job_name}_{row_num}?
Right, but row_number() over (PARTITION BY j.namespace_name, j.name ORDER BY j.created_at) AS row means that all jobs that have the same name will have an incrementing counter suffixed, so we avoid the conflict. E.g., if we had three jobs with the FQN a.b, the names would be
a.b_1
a.b_2
a.b_3
Thus, we can add the uniqueness constraint because they'll all end up with different names.
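For readers who want to try the numbering scheme described above, here is a minimal sketch using Python's bundled sqlite3 module rather than the PostgreSQL database the migration targets. The table layout, the sample rows, the `rn` and `dupes` aliases, the `dupes > 1` filter, and the index name are all assumptions made for illustration; the actual migration selects the jobs to rename differently (it targets symlinked jobs). It needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# In-memory stand-in for the jobs table (hypothetical, heavily reduced schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (uuid TEXT PRIMARY KEY, namespace_name TEXT, "
    "name TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?)",
    [
        ("u1", "ns", "a.b", "2023-01-01"),
        ("u2", "ns", "a.b", "2023-01-02"),
        ("u3", "ns", "a.b", "2023-01-03"),
        ("u4", "ns", "c.d", "2023-01-01"),
    ],
)

# Number the rows inside each (namespace_name, name) group, oldest first,
# and keep only groups that actually contain duplicates (assumed filter).
numbered = conn.execute(
    """
    SELECT uuid, name, rn FROM (
        SELECT uuid, name,
               row_number() OVER (
                   PARTITION BY namespace_name, name ORDER BY created_at
               ) AS rn,
               count(*) OVER (PARTITION BY namespace_name, name) AS dupes
        FROM jobs
    )
    WHERE dupes > 1
    """
).fetchall()

# Suffix each duplicate with its counter, e.g. a.b becomes a.b_1, a.b_2, ...
conn.executemany(
    "UPDATE jobs SET name = ? WHERE uuid = ?",
    [(f"{name}_{rn}", uuid) for uuid, name, rn in numbered],
)

# With the duplicates renamed, the unique index can be created without error.
conn.execute("CREATE UNIQUE INDEX jobs_ns_name ON jobs (namespace_name, name)")

renamed = [row[0] for row in conn.execute("SELECT name FROM jobs ORDER BY uuid")]
print(renamed)
```

Ordering by created_at keeps the suffixes deterministic across runs, so the oldest job always receives _1.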
Sweet, thanks for clarifying and the examples 👍
Your writeup in the PR description was helpful in understanding all that's going on in the migration script. Thanks! And, also, 👍
…traint (MarquezProject#2464) Signed-off-by: Michael Collado <[email protected]> Signed-off-by: Xavier-Cliquennois <[email protected]>
Problem
The v61 migration reapplies a unique constraint on job name and namespace without regard to parent job. Unfortunately, there are some cases where job FQNs are duplicated (mostly due to the issue fixed in #2097, but also including cases where a job previously had no parent and now has a parent, but the same FQN). This update to the migration renames jobs that have been symlinked to point to newer versions of themselves, so that the job FQN doesn't conflict and the unique constraint can be applied.
Any installations that have already applied this migration will not see any new operations on their data, but installations that have duplicates will need this fix for the migration to successfully complete.
I tested this on an installation with ~200,000 job names and was able to complete the migration successfully.
Checklist
CHANGELOG.md (Depending on the change, this may not be necessary).
.sql database schema migration according to Flyway's naming convention (if relevant)