Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate using project directories instead of project instances #654

Merged
merged 21 commits into from
Jan 27, 2022

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Dec 11, 2021

Description

This PR changes the schema migration code to operate in terms of a project's root directory rather than a project class instance. Additionally, loading and checking the schema version of a project is done using a cascade of multiple schema loaders that support different schemas.

Motivation and Context

The current schema migration logic makes some easily invalidated assumptions about what operations are valid in the course of a migration. The two most problematic assumptions are that 1) config.load_config can load schemas of any schema version, and 2) a valid Project can be created from any schema version by simply skipping the schema compatibility check. The former is invalidated by, for instance, a change in the format of the config file like the one planned for schema version 2. The latter would be invalidated by any change to the Project constructor that assumes the existence of certain keys in the config. The migration to schema version 2 does not immediately break this logic since we are removing project ids, not adding them (see #643 and #644), but there is no reason to assume that we will not make changes in the opposite direction in future schema versions. This PR protects us against such changes.

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change1

1The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

@vyasr vyasr added the refactor Code refactoring label Dec 11, 2021
@vyasr vyasr requested a review from csadorf December 11, 2021 19:20
@vyasr vyasr self-assigned this Dec 11, 2021
@vyasr vyasr requested review from a team as code owners December 11, 2021 19:20
@vyasr vyasr requested review from jennyfothergill and removed request for a team December 11, 2021 19:20
@vyasr
Copy link
Contributor Author

vyasr commented Dec 11, 2021

@csadorf @jennyfothergill apologies in advance if the quality isn’t up to my usual standard. It’s been hard to dedicate a significant block of time, so I’ve been working on this in very brief fits and starts and I figured at this point it would be better to just get feedback on what I had rather than wait longer.

@codecov
Copy link

codecov bot commented Dec 11, 2021

Codecov Report

Merging #654 (096daba) into master (44f59ee) will increase coverage by 0.00%.
The diff coverage is 81.03%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #654   +/-   ##
=======================================
  Coverage   78.42%   78.42%           
=======================================
  Files          65       65           
  Lines        7119     7129   +10     
  Branches     1564     1564           
=======================================
+ Hits         5583     5591    +8     
- Misses       1226     1230    +4     
+ Partials      310      308    -2     
Impacted Files Coverage Δ
signac/__main__.py 71.89% <ø> (ø)
signac/contrib/migration/__init__.py 83.05% <80.00%> (+0.59%) ⬆️
signac/contrib/migration/v0_to_v1.py 83.33% <81.81%> (-16.67%) ⬇️
signac/contrib/project.py 85.20% <100.00%> (-0.02%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44f59ee...096daba. Read the comment docs.

@vyasr vyasr changed the title Refactor/migration root dir Migrate using project directories instead of project instances Dec 12, 2021
Copy link
Member

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a handful of comments. I think this looks pretty good overall. It's unfortunate that the logic for migrations is so complex, but I can't envision a simpler design.

signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/v0_to_v1.py Outdated Show resolved Hide resolved
Copy link
Contributor

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codecov is complaining a lot about missing tests. Maybe we can restructure the existing tests to more systematically cover those additional cases?

signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/v0_to_v1.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
@vyasr vyasr requested review from csadorf and bdice January 4, 2022 17:53
@vyasr
Copy link
Contributor Author

vyasr commented Jan 4, 2022

Codecov is complaining a lot about missing tests. Maybe we can restructure the existing tests to more systematically cover those additional cases?

Thanks for pointing this out. I've gone through the untested cases and improved test coverage where appropriate. The remaining untested cases fall into two buckets:

  1. Various lines in _get_config_schema_version that will start getting triggered as soon as we have more than two versions and more than a single migration. I'm comfortable leaving these untested for now with the expectation that additional tests in Remove project id #643 will start covering these.
  2. Two exception case in apply_migrations itself. The FileNotFoundError on the lock file I think is something we don't have to do anything about. We frequently leave those types of cases untested in signac since they're there to case subtle filesystem errors rather than logic errors. The other exception case is there to handle the scenario when a migration fails. This code is also IMO not worth testing until Remove project id #643. If either of you can think of a reasonable scenario where we wouldn't fail in _collect_migrations (i.e. an invalid schema version) but where the v1->v2 migration could fail, I'm happy to add that to Remove project id #643.

In general the lack of coverage just reflects that we've added migration logic before we had a need for it in anticipation of future use cases. I made these changes on master because they would have cluttered the PR actually introducing the v1->v2 migration and made them hard to review, and the lack of testability is the tradeoff. We should see that improve with #643 (and I'm sure we'll find additional bugs then, but at least the core logic in place now seems sound).

There is one other change of note that I introduced. While attempting to improve coverage, I realized that because apply_migrations was defined as a generator it was a no-op unless the result was iterated over, i.e. apply_migrations(root) does nothing unless it's wrapped with something like list(apply_migrations(root)) or for (origin, destination) in apply_migrations(root): .... That API doesn't make any sense to me, and being able to iterate over the intermediate states of migration is IMO only really useful for testing purposes, so I've removed the yield statement in apply_migrations so that it simply performs all migrations at once. Let me know if either of you foresee any issues with that or would prefer a different resolution.

@vyasr vyasr mentioned this pull request Jan 8, 2022
18 tasks
Copy link
Member

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this PR. As noted before, it's a little hard to review these changes because we don't actually have a migration defined yet. The code looks stable and general enough, and there is a significant benefit to having this merged to help unblock other work. Let's find the remaining sharp edges when we introduce a real migration!

signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
@vyasr
Copy link
Contributor Author

vyasr commented Jan 10, 2022

@csadorf let me know if you'd like to re-review this. It would be fine IMO to merge this, then work out any further kinks in the context of #643 since that will give us our first real migration to stress test this with anyway, but if you see other issues that you think are worth addressing in the context of this PR we can deal with them first.

@csadorf
Copy link
Contributor

csadorf commented Jan 12, 2022

@csadorf let me know if you'd like to re-review this. It would be fine IMO to merge this, then work out any further kinks in the context of #643 since that will give us our first real migration to stress test this with anyway, but if you see other issues that you think are worth addressing in the context of this PR we can deal with them first.

Thanks, I will review it right now.

Copy link
Contributor

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good overall, just a few minor questions.

signac/contrib/migration/__init__.py Show resolved Hide resolved
signac/contrib/migration/__init__.py Outdated Show resolved Hide resolved
signac/contrib/migration/__init__.py Show resolved Hide resolved
signac/contrib/migration/v0_to_v1.py Show resolved Hide resolved
@vyasr
Copy link
Contributor Author

vyasr commented Jan 27, 2022

@csadorf I'm treating our discussion at the meeting as approval and I'm going to go ahead and merge this PR now.

@vyasr vyasr merged commit f0307fd into master Jan 27, 2022
@vyasr vyasr deleted the refactor/migration_root_dir branch January 27, 2022 05:52
@b-butler b-butler added this to the v1.8.0 milestone Sep 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Code refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants