Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update migration query to make it work with existing view #2308

Merged
merged 1 commit into from
Dec 12, 2022

Conversation

fm100
Copy link
Collaborator

@fm100 fm100 commented Dec 9, 2022

Signed-off-by: Minkyu Park [email protected]

Problem

V52 migration fails if the datasets_view is already created by the previous migration, because postgresql does not allow altering the type of the column when it is being used by views.

Closes: #2298

Solution

This PR changes the V52 migration query to drop the view before ALTER. Because repeatable migration runs only when its checksum changes, we have to get the view definition first, and then drop and recreate it as before.

Note: All database schema changes require discussion. Please link the issue for context.

Checklist

  • You've signed-off your work
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've updated the CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
  • You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)
  • You've included a header in any source code files (if relevant)

@boring-cyborg boring-cyborg bot added the api API layer changes label Dec 9, 2022
Comment on lines +8 to +26
SELECT EXISTS (
SELECT * FROM information_schema.views
WHERE table_schema='public' AND table_name='datasets_view'
) INTO datasets_view_exists;

IF datasets_view_exists THEN
-- Altering is not allowed when the column is being used from views. So here,
-- we temporarily drop the view before altering and recreate it.
SELECT view_definition FROM information_schema.views
WHERE table_schema='public' AND table_name='datasets_view'
INTO datasets_view_definition;

DROP VIEW datasets_view;
ALTER TABLE dataset_symlinks ALTER COLUMN name TYPE VARCHAR;
EXECUTE format('CREATE VIEW datasets_view AS %s', datasets_view_definition);
ELSE
ALTER TABLE dataset_symlinks ALTER COLUMN name TYPE VARCHAR;
END IF;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😳

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an overkill, as

DROP VIEW IF EXISTS datasets_view;

at the beginning of the script should be enough as in PR:
https://github.com/MarquezProject/marquez/pull/2313/files

Copy link
Collaborator

@pawel-big-lebowski pawel-big-lebowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the view will recreate on its own.

Comment on lines +8 to +26
SELECT EXISTS (
SELECT * FROM information_schema.views
WHERE table_schema='public' AND table_name='datasets_view'
) INTO datasets_view_exists;

IF datasets_view_exists THEN
-- Altering is not allowed when the column is being used from views. So here,
-- we temporarily drop the view before altering and recreate it.
SELECT view_definition FROM information_schema.views
WHERE table_schema='public' AND table_name='datasets_view'
INTO datasets_view_definition;

DROP VIEW datasets_view;
ALTER TABLE dataset_symlinks ALTER COLUMN name TYPE VARCHAR;
EXECUTE format('CREATE VIEW datasets_view AS %s', datasets_view_definition);
ELSE
ALTER TABLE dataset_symlinks ALTER COLUMN name TYPE VARCHAR;
END IF;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's an overkill, as

DROP VIEW IF EXISTS datasets_view;

at the beginning of the script should be enough as in PR:
https://github.com/MarquezProject/marquez/pull/2313/files

@fm100
Copy link
Collaborator Author

fm100 commented Dec 12, 2022

I think the view will recreate on its own.

No, it won't. As the doc describes here, https://flywaydb.org/documentation/concepts/migrations.html#repeatable-migrations , the repeatable migration only runs when their checksum changes, which means that it runs only when the repeatable migration itself changes. Repeatable migration is repeatable, but it doesn't run every time.

Copy link
Collaborator

@pawel-big-lebowski pawel-big-lebowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fm100 You're right that dropping the view wouldn't be sufficient to go. Thanks for explanation.

Why is this better than modifying R__3_Datasets_view.sql file to get another checksum or making it run each time as described here -> https://flywaydb.org/blog/flyway-timestampsandrepeatables ?

Copy link
Collaborator

@pawel-big-lebowski pawel-big-lebowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 not to block the progress on this.

@fm100
Copy link
Collaborator Author

fm100 commented Dec 12, 2022

@fm100 You're right that dropping the view wouldn't be sufficient to go. Thanks for explanation.

Why is this better than modifying R__3_Datasets_view.sql file to get another checksum or making it run each time as described here -> https://flywaydb.org/blog/flyway-timestampsandrepeatables ?

Interesting. I don't think that this is better than making repeatable migration run. I did try adding a new line but it didn't change the checksum and this is what I ended up doing. It seems like that white spaces don't change the checksum but comments do. Anyway, unblocking this is pretty important and let's merge this.

@fm100 fm100 merged commit e501e36 into MarquezProject:main Dec 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api API layer changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

V52 database migration breaks existing Marquez installations
3 participants