Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add context about failing branches during error in used_cache() #5038

Closed
wants to merge 2 commits into from

Conversation

courentin
Copy link
Contributor

This PR is an implementation of #5037.
It is my first contribution to the dvc project, feel free to challenge it.

@courentin
Copy link
Contributor Author

This needs to be updated

cache.update(used_cache, suffix=suffix)
except DvcException as e:
logger.error(f"Cache for '{branch}' could not be collected")
if ignore_dvc_exceptions:
Copy link
Contributor

@efiop efiop Dec 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like using --force would be intuitive here.

Suggested change
if ignore_dvc_exceptions:
if ignore_dvc_exceptions or force:

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest using force and ignore_dvc_exceptions or just force?
Indeed, --force seems more coherent with options across dvc commands

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@courentin Seems like --force would be enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, --force is already used in the gc command "Force garbage collection - automatically agree to all prompts".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efiop Don't we want to differentiate --force where the behaviour is "Force garbage collection - automatically agree to all prompts" and --skip-failing-collect so that we do one without the other?

WDYT?

I'd like to merge this PR so that our team can clean their workspace :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@courentin Do you have a scenario in which those could be used separately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say if we want to ignore failing commits but not agree to all prompts. But to be honest I'm fine with using only --force, I let you decide what's best for the product ;)

Copy link
Contributor

@efiop efiop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! πŸ™

Ideally, this should skip particular malformed dvcfiles and not the whole branch. We have similar requests for metrics/params/plots parsing too. But this would be a great start too.

@courentin
Copy link
Contributor Author

Yes indeed, not sure how to do it however

@skshetry
Copy link
Member

skshetry commented Dec 9, 2020

Yes indeed, not sure how to do it however

There's not an easy way right now for this. I'd imagine something like the following, as stage collection error can be handled using a callback:

def func(path, exc):
  print(f"Collecting '{path}' failed and was skipped.")

with repo.handle_error("stage_collection", func):
   stages = repo.stages

Something like that should work for the CLI usage, but as an API, it might not as we may have collected the stages before running
metrics.show() for example.

@efiop @pmrowla @pared if you have better suggestions.

@efiop
Copy link
Contributor

efiop commented Dec 9, 2020

@skshetry Maybe just something generic and explicit like a onerror(similar to os.walk) for collect* methods?

@courentin
Copy link
Contributor Author

courentin commented Dec 15, 2020

Do you mind if after finding a good wording for this option, I try to merge this PR as it is and we can discuss a more general approach in an issue or discussion?
(I'm sorry I can't commit myself too much on dvc even tho I love it)

@skshetry
Copy link
Member

skshetry commented Dec 16, 2020

Do you mind if after finding a good wording for this option, I try to merge this PR as it is and we can discuss a more general approach in an issue or discussion?

We don't expect you to handle this right away. This was just us discussing a long-term solution, sorry for the noise here.
The PR looks good, I'm fine with either of the --force or the --skip-failing-collect, as long as they are well documented (just don't like the conditional flag on gc, that's all :) ).

It'd be great if you could respond to #5037 (comment) and #5066, that'd help us prioritize (if those features would be helpful for you). Thanks a lot for the contributions, @courentin.

@efiop
Copy link
Contributor

efiop commented Sep 17, 2021

Closing as outdated. @courentin Thank you for your patience πŸ™ , we've readjusted error handling for our metrics/plots/etc, and brancher is on our todo list for the near future. So please stay tuned.

@efiop efiop closed this Sep 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants