Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Storage cleaner] Improve handling of temp directories and files #396

Merged
merged 9 commits into from
Dec 11, 2023

Conversation

2015aroras
Copy link
Collaborator

The current implementation has a few issues with respect to temp directories and files. Below I list the issues and their corresponding fixes in this PR.

Operations do not delete all the temp content (e.g. unarchived directories) added to the user-provided temp directory.
Handle temporary directory creation and completion in perform_operation, which is shared by all operations.

Bad run deletion can take several runs as input, so the number of uncleaned artifacts can easily grow.
Added temp directory cleanup after each run.

The unsharding temp directory is not under the user-provided temp directory.
Changed unsharding temp directory creation to fix this.

The majority of the additions/deletions were caused from moving perform_operation down the file (77c648a).

Copy link
Member

@epwalsh epwalsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one suggestion

@@ -729,6 +733,11 @@ def delete_bad_runs(run_paths: List[str], config: DeleteBadRunsConfig):
log.info("Starting to check if run %s should be deleted", run_path)
_delete_if_bad_run(storage, run_path, config)

# Delete temp dir after each run to avoid memory bloat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Delete temp dir after each run to avoid memory bloat
# Delete temp dir after each run to avoid storage bloat

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 1000 to 1002
if Path(temp_dir).is_dir():
log.info("Deleting temp dir %s", temp_dir)
shutil.rmtree(temp_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably run even if the cleaning op fails, so you might want to add a try: ... finally: ... block.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2015aroras 2015aroras merged commit 5d03d38 into main Dec 11, 2023
10 checks passed
@2015aroras 2015aroras deleted the shanea/storage-cleaner-delete-temp-files branch December 11, 2023 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants