Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cases where a revision dependency graph contains a loop or cycle. #757

Closed
lkpdn opened this issue Nov 21, 2020 · 9 comments
Closed

Cases where a revision dependency graph contains a loop or cycle. #757

lkpdn opened this issue Nov 21, 2020 · 9 comments
Labels
bug Something isn't working versioning model

Comments

@lkpdn
Copy link
Contributor

lkpdn commented Nov 21, 2020

Preface

From an admin point of view, (a) looks like just being ignored because iterating through all revisions (e.g., alembic branches) does not start in the first place. This is because "heads" must be empty after _revision_map discards all reachable revisions.
On the other hand, (b) raises revision.RangeNotAncestorError when an admin tries to do the same traversal because in this case "bases" must have been empty while "heads" is kept undiscarded.

Question

IMHO, such different behaviours for (a) and (b) are confusing. It would be better if

  • exceptions are raised for all those cases
  • error message for (a)-1 and (b)-1 should be like: "Revision %s is a self-loop. Set down_revision to a correct value or None"
  • error message for (a)-2 and (b)-2 should be like: "Revision dependency graph is cyclic."

Any thoughts?

@lkpdn lkpdn added the requires triage New issue that requires categorization label Nov 21, 2020
@zzzeek zzzeek added bug Something isn't working and removed requires triage New issue that requires categorization labels Nov 21, 2020
@zzzeek
Copy link
Member

zzzeek commented Nov 21, 2020

Preface

* `script_location` may contain one or more connected dependency graphs but let us focus on just one of them for simplicity's sake.

* arguable situations:
  
  * (a). when it's a **strongly-connected graph**,
    
    * (1). which contains **loop(s)**: Suppose a situation where an app developer mistakenly sets both `revision` and `down_revision` to the same value, which is not `None`. Also let this contain the case where there is just one revision with a self-loop.
    * (2). which contains **cycle(s)** (i.e., cyclic): due to misconfiguration again.
  * (b). when it's a **unilaterally-connected graph**,
    
    * (1). _ditto_
    * (2). _ditto_

From an admin point of view, (a) looks like just being ignored because iterating through all revisions (e.g., alembic branches) does not start in the first place. This is because "heads" must be empty after _revision_map discards all reachable revisions.
On the other hand, (b) raises revision.RangeNotAncestorError when an admin tries to do the same traversal because in this case "bases" must have been empty while "heads" is kept undiscarded.

Question

IMHO, such different behaviours for (a) and (b) are confusing. It would be better if

* exceptions are raised for all those cases

sounds fine to me

* error message for (a)-1 and (b)-1 should be like: "Revision %s is a self-loop. Set down_revision to a correct value or None"

seems fine

* error message for (a)-2 and (b)-2 should be like: "Revision dependency graph is cyclic."

if you can establish this behavior for me I'm all for it. the branch logic is really complicated and simplifying it would be great if you have cycles to spare. test cases for new behaviors / error messages / etc. would be in https://github.com/sqlalchemy/alembic/blob/master/tests/test_revision.py most likely.

Any thoughts?

if you have dev cycles to spare for a working PR, I'd love to review it.

@sqla-tester
Copy link
Collaborator

Koichiro Den has proposed a fix for this issue in the master branch:

Raise an exception if any loop or cycle found in a revision graph constructed. https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/2379

lkpdn pushed a commit to lkpdn/alembic that referenced this issue Nov 22, 2020
@zzzeek
Copy link
Member

zzzeek commented Nov 23, 2020

looked at some of the cycles. the logic that I wrote, which I continue to find pretty hard to work with, is at https://github.com/sqlalchemy/alembic/blob/master/alembic/script/revision.py#L750 . apparently it will iterate revision graphs with cycles without raising anyhting, it just gives you an invalid traversal (this is without your patch that prevents such a tree from existing):

class ThingTest(DownIterateTest):
    def test_foo(self):
        self.map = RevisionMap(
            lambda: [
                Revision("a", ()),
                Revision("b", "a"),
                Revision("c", ("b", "e")),
                Revision("d", "c"),
                Revision("e", "d"),
            ]
        )
        self._assert_iteration("e", "a", ['e', 'd', 'c', 'b', 'a'])
        self._assert_iteration("c", "a", ['c', 'b', 'a', 'e', 'd'])
        self._assert_iteration("d", "a", ['d', 'c', 'b', 'a', 'e'])

I thought at first we'd be raising the cycle errors inside of _iterate_revisions, rather than as a separate sanity check. ultimately it doesnt matter i was just hoping that someone could make that logic followable.

@lkpdn
Copy link
Contributor Author

lkpdn commented Nov 24, 2020

The case above is depicted below and apparently it's unilaterally-connected, but the behaviour is like (a-1).
graph01
The original issue description was incorrect. It's not strongly-connected vs unilaterally-connected but rather empty heads vs non-empty-heads with empty base. I'll correct the description with strike-through lines. Thanks.

I think the most easy-to-grasp way is to run a simple BFS for each revision addition to the map, but the original logic is lightweight and for me it looks not bad so I introduced somewhat indirect detection.
On the other hand, if we want to allow such a graph to exist in the first place and traversals like b -> a to be allowed without noticing the cycle existence, I think lazily running BFS for each _iterate_revision is an option.

@zzzeek
Copy link
Member

zzzeek commented Nov 24, 2020

I think it's fine if we notice the cycle at startup time. alembic is set for version 1.5 where we are making some bigger changes like dropping old SQLAlcehmy version support and some adjustments to the transaction model for SQLAlhcemy 1.4 so if a set of revisions has a cycle, detecting it up front is fine, at worst we could make it just a warning and not an error but I think as an error is OK.

@lkpdn
Copy link
Contributor Author

lkpdn commented Nov 24, 2020

All right, thanks for your quick response!

lkpdn pushed a commit to lkpdn/alembic that referenced this issue Nov 28, 2020
lkpdn pushed a commit to lkpdn/alembic that referenced this issue Dec 1, 2020
@zzzeek
Copy link
Member

zzzeek commented Dec 1, 2020

from start to finish I'm very impressed! please feel free to contribute to SQLAlchemy / Alembic any time. as published elsewhere the developer community hangs out on https://gitter.im/sqlalchemy/devel .

@lkpdn
Copy link
Contributor Author

lkpdn commented Dec 1, 2020

Glad to hear that, thanks for your patience and cooperation!

@Nishimirwe
Copy link

How to effectively then remove the cycle?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working versioning model
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants