shutil.copytree fails in local.py run_command #689

Closed
adammarples opened this issue Nov 17, 2023 · 5 comments
Labels
area:dependencies Related to dependencies, like Python packages, library versions, etc area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc dbt:run Primarily related to dbt run command or functionality execution:local Related to Local execution environment

Comments

@adammarples
Contributor

astronomer-cosmos==0.7.3

I periodically get failures like this when running a simple dbt run --models on a local operator.

2023-11-17T12:29:09.6922654Z Traceback (most recent call last):
2023-11-17T12:29:09.6923817Z   File "/usr/local/lib/python3.11/site-packages/cosmos/providers/dbt/core/operators/local.py", line 323, in execute
2023-11-17T12:29:09.6924959Z     result = self.build_and_run_cmd(context=context)
2023-11-17T12:29:09.6925513Z              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17T12:29:09.6926833Z   File "/usr/local/lib/python3.11/site-packages/cosmos/providers/dbt/core/operators/local.py", line 235, in build_and_run_cmd
2023-11-17T12:29:09.6928139Z     return self.run_command(cmd=dbt_cmd, env=env, context=context)
2023-11-17T12:29:09.6928940Z            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17T12:29:09.6930205Z   File "/usr/local/lib/python3.11/site-packages/cosmos/providers/dbt/core/operators/local.py", line 183, in run_command
2023-11-17T12:29:09.6931313Z     shutil.copytree(
2023-11-17T12:29:09.6931849Z   File "/usr/local/lib/python3.11/shutil.py", line 561, in copytree
2023-11-17T12:29:09.6932703Z     return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
2023-11-17T12:29:09.6933426Z            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17T12:29:09.6934128Z   File "/usr/local/lib/python3.11/shutil.py", line 515, in _copytree
2023-11-17T12:29:09.6934800Z     raise Error(errors)
2023-11-17T12:29:09.6937525Z shutil.Error: [('/usr/local/airflow/dags/dbt/vulcan/tests/generic/qc/.my_test.md.EDIhpJ', '/tmp/tmplbyb8o0q/dbt_project/tests/generic/qc/.my_test.md.EDIhpJ', "[Errno 2] No such file or directory: '/usr/local/airflow/dags/dbt/vulcan/tests/generic/qc/.my_test.md.EDIhpJ'")]
2023-11-17T12:29:09.6942173Z [2023-11-17T12:23:57.364+0000] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=azure_ci_end_to_end, task_id=azure_ci_end_to_end_stage.TABLE.TABLE_run, execution_date=20231117T121825, start_date=20231117T122356, end_date=20231117T122357
2023-11-17T12:29:09.6948352Z [2023-11-17T12:23:57.957+0000] {standard_task_runner.py:104} ERROR - Failed to execute job 73 for task azure_ci_end_to_end_stage.TABLE.TABLE_run ([('/usr/local/airflow/dags/dbt/vulcan/tests/generic/qc/.my_test.md.EDIhpJ', '/tmp/tmplbyb8o0q/dbt_project/tests/generic/qc/.my_test.md.EDIhpJ', "[Errno 2] No such file or directory: '/usr/local/airflow/dags/dbt/vulcan/tests/generic/qc/.my_test.md.EDIhpJ'")]; 66)

In this instance it is failing to copy a test documentation file called my_test.md, but it happens with model files and others too.

The issue is that run_command calls shutil.copytree to copy the contents of the project directory, and the copy fails because a source file no longer exists. This is odd for two reasons.

  1. shutil.copytree shouldn't be attempting to copy files which don't exist, which suggests that the files are being modified while it is running. I think it is related to #226 (Possible race condition with multiple DbtTaskGroups starting at the same time), because I do have two task groups chained together.

  2. The files have dot prefixes and random suffixes like .my_test.md.EDIhpJ which suggest that they are tempfiles or partial files.
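
One possible stopgap for the second point, as a rough sketch only: tell shutil.copytree to skip entries that look like these tempfiles. The pattern `.*.??????` and the function name `copy_project_skipping_tempfiles` are assumptions made for illustration (based on the single filename in the traceback); they are not part of Cosmos.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed pattern: dot prefix, original name, then a dot and six random
# characters, as in ".my_test.md.EDIhpJ". Not taken from Cosmos.
TEMPFILE_PATTERN = ".*.??????"


def copy_project_skipping_tempfiles(src, dst):
    """Copy a project directory, ignoring names that match TEMPFILE_PATTERN."""
    # copytree lists each directory first and then applies `ignore` to the
    # names, so an ignored entry is never opened even if it has vanished.
    return shutil.copytree(
        src, dst, ignore=shutil.ignore_patterns(TEMPFILE_PATTERN)
    )
```

This only avoids failures on files matching the pattern. A regular file deleted mid-copy would still raise shutil.Error, and the pattern could also skip a legitimate dotfile whose last extension happens to be six characters long.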

I suspect that the answer will be to upgrade to the latest version, but I am hesitant to do so because the last time I tried I ran into issue #642.

Would very much appreciate a hint on what is happening and maybe I could fix it.


dosubot bot commented Nov 17, 2023

🤖

Hello @adammarples,

I'm Dosu, a bot here to help you with your issue while we wait for a human maintainer. I can assist with troubleshooting bugs, answering questions, and guiding you on how to contribute.

Please bear with me as I review your issue in detail. I'll get back to you shortly with a more comprehensive response.

@dosubot dosubot bot added area:dependencies Related to dependencies, like Python packages, library versions, etc area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc dbt:run Primarily related to dbt run command or functionality execution:local Related to Local execution environment priority:high High priority issues are blocking or critical issues without a workaround and large impact labels Nov 17, 2023

dosubot bot commented Nov 17, 2023

🤖

Hello @adammarples,

I'm Dosu, a bot here to assist you while we wait for a human maintainer. I can help troubleshoot bugs, answer questions, and guide you on your journey as a contributor. Don't hesitate to ask if you need anything!

Based on the traceback you provided, it seems like the issue is related to a race condition where the shutil.copytree function is trying to copy files that are being modified or deleted concurrently. This is likely due to multiple DbtTaskGroups starting at the same time, as you suspected.

A similar issue was reported in the Cosmos repository (#226) and was resolved by running each operation in a temporary directory. This change was introduced in the commit f443493 on April 14, 2023.

As for the files with dot prefixes and random suffixes like .my_test.md.EDIhpJ, these are indeed temporary files. They are created by dbt when it compiles the project and are meant to be deleted after the operation is completed. However, if the operation is interrupted or fails, these files might not be deleted and could cause issues with subsequent operations.

Regarding your concern about upgrading to the latest version, it's understandable that you're hesitant due to the issue you encountered before (#642). However, it's worth noting that software updates not only introduce new features but also include bug fixes and performance improvements. Therefore, it's generally recommended to keep your software up-to-date.

If you decide to upgrade, make sure to test the new version in a controlled environment before deploying it to production. If you encounter any issues, don't hesitate to report them. The Cosmos community is always here to help.

If you decide not to upgrade, you could try to implement a similar solution to the one used to fix issue #226. This would involve modifying the run_command method in the DbtLocalBaseOperator class to run each operation in a temporary directory. However, this could be a complex task depending on your familiarity with the Cosmos codebase and Python programming.
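
As a rough illustration of that idea, and not the Cosmos implementation: the traceback above shows the project already being copied into a temporary directory, so the part that would need to change is the copy itself. The sketch below passes a tolerant `copy_function` to shutil.copytree so that a file which disappears between listing and copying is skipped. The names `copy_if_present` and `run_in_temp_copy` are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def copy_if_present(src, dst):
    """Copy one file like shutil.copy2, but skip it if it has vanished."""
    try:
        return shutil.copy2(src, dst)
    except FileNotFoundError:
        # The source was removed after the directory was listed; skip it.
        return dst


def run_in_temp_copy(project_dir, cmd):
    """Copy project_dir into a fresh temp directory and run cmd there."""
    with tempfile.TemporaryDirectory() as tmp:
        dst = Path(tmp) / "dbt_project"
        shutil.copytree(project_dir, dst, copy_function=copy_if_present)
        # Return the completed process so the caller can inspect the result.
        return subprocess.run(cmd, cwd=dst, capture_output=True, text=True)
```

A directory that vanishes mid-copy would still be reported as an error by copytree, so this covers only the single-file case seen in the traceback.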

I hope this helps! Let me know if you have any other questions.


@tatiana
Collaborator

tatiana commented Nov 17, 2023

@adammarples is there any particular reason why you're using astronomer-cosmos==0.7.3 and not the latest release (we're currently at astronomer-cosmos==1.2.4)?

I just replied to your comment on #642. I'm sorry I missed it before.

@tatiana tatiana removed the priority:high High priority issues are blocking or critical issues without a workaround and large impact label Nov 18, 2023
@adammarples
Contributor Author

@tatiana the reason I'm not updating from 0.7.3 to the latest version is that the signature for DbtTaskGroup has changed, so it would involve some new coding and testing. In my tests of upgrading to the latest version I have run into issues #642, #693 and #694, which have made it impossible for me to use the select methods that I need in this case.


dosubot bot commented Feb 19, 2024

Hi, @adammarples,

I'm helping the Cosmos team manage their backlog and am marking this issue as stale. From what I understand, you reported an issue involving a failure to copy certain files when the shutil.copytree function is called. The source files that don't exist have dot prefixes and random suffixes, indicating they may be temp or partial files. There were discussions about a potential race condition, insights provided by the bot Dosu, and challenges faced in upgrading to the latest version due to a previous experience.

Could you please confirm if this issue is still relevant to the latest version of the Cosmos repository? If it is, please let the Cosmos team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation.

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 19, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 26, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Feb 26, 2024