Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repro: downstream: reproducing in wrong order #3602

Closed
liamtoran opened this issue Apr 7, 2020 · 3 comments · Fixed by #3689
Closed

repro: downstream: reproducing in wrong order #3602

liamtoran opened this issue Apr 7, 2020 · 3 comments · Fixed by #3689
Assignees
Labels
bug Did we break something? p1-important Important, aka current backlog of things to do

Comments

@liamtoran
Copy link

DVC version : 0.92.0 (PyPI) + MacOS

It seems dvc repro --downstream (could also be the -f option) sometimes executes DAGs in the wrong order.

Steps to reproduction:

dvc run -o 1 -o 2 -f step1.dvc 'touch 1 | touch 2'
dvc run -d 1 -o 3 -f step2.dvc 'touch 3'
dvc run -d 3 -d 2 -o 4 -f step3.dvc 'touch 4'
dvc repro --downstream -f step1.dvc

Last bit does :

Running command:                                                        
	touch 1 | touch 2

Running command:
	touch 4

Running command:
	touch 3

Even though the DAG looks like this :

             +-----------+        
             | step1.dvc |        
             +-----------+        
            ***          ***      
           *                *     
         **                  ***  
+-----------+                   * 
| step2.dvc |                ***  
+-----------+               *     
            ***          ***      
               *        *         
                **    **          
             +-----------+        
             | step3.dvc |        
             +-----------+   

In this case step2 (touch 3) should be executed before step3 (touch 4)

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Apr 7, 2020
@efiop efiop added bug Did we break something? p1-important Important, aka current backlog of things to do labels Apr 13, 2020
@triage-new-issues triage-new-issues bot removed triage Needs to be triaged labels Apr 13, 2020
@efiop efiop added p0-critical Critical issue. Needs to be fixed ASAP. research and removed p1-important Important, aka current backlog of things to do labels Apr 13, 2020
@efiop efiop added p1-important Important, aka current backlog of things to do and removed p0-critical Critical issue. Needs to be fixed ASAP. labels Apr 21, 2020
@pmrowla pmrowla self-assigned this Apr 24, 2020
@pmrowla
Copy link
Contributor

pmrowla commented Apr 28, 2020

Can be reproduced with

def test_repro_downstream(dvc):

    assert (
        main(
            [
                "run",
                "-o",
                "A",
                "-o",
                "B",
                "-f",
                "step1.dvc",
                "echo 'A'>A; echo 'B'>B",
            ]
        )
        == 0
    )
    assert (
        main(["run", "-d", "A", "-o", "C", "-f", "step2.dvc", "echo 'C'>C"])
        == 0
    )
    assert (
        main(
            [
                "run",
                "-d",
                "C",
                "-d",
                "B",
                "-o",
                "D",
                "-f",
                "step3.dvc",
                "echo 'D'>D",
            ]
        )
        == 0
    )
    evaluation = dvc.reproduce("step1.dvc", downstream=True, force=True)
    assert len(evaluation) == 3
    assert evaluation[0].relpath == "step1.dvc"
    assert evaluation[1].relpath == "step2.dvc"
    assert evaluation[2].relpath == "step3.dvc"

@pmrowla
Copy link
Contributor

pmrowla commented Apr 28, 2020

Looks like the issue is that for --downstream, we are reversing the DAG, and then traversing the reversed graph w/DFS pre-ordered search. In the example case, using pre-ordered search provides no guarantee that step3 is only visited after step2.

networkx dfs_preorder_nodes() returns them in some arbitrary order, that is probably just dependent on default python set behavior - in the example test case, if I change the name of the final stage to 3.dvc, the test actually passes in OSX python 3.7.

I think what we should be doing is traversing the reversed graph w/reverse post-ordered search, to ensure that dependencies are run in the correct order.

@pmrowla
Copy link
Contributor

pmrowla commented Apr 28, 2020

Also reproducible using our existing repro --downstream test (

class TestReproDownstream(TestDvc):
)

For the graph

    #       E
    #      / \
    #     D   F
    #    / \   \
    #   B   C   G
    #    \ /
    #     A

downstream A is run as [A, B, D, E, C] even though C should be run before D

@pmrowla pmrowla removed the research label Apr 28, 2020
pmrowla added a commit to pmrowla/dvc that referenced this issue Apr 28, 2020
efiop pushed a commit that referenced this issue Apr 28, 2020
* tests: ensure repro --downstream preserves dependency order

* repo: use reverse post-order DFS in repro --downstream

- Fixes #3602

* make test_repro and test_repro_multistage consistent

* rebase master, update multistage test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Did we break something? p1-important Important, aka current backlog of things to do
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants