Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics diff with missing old value #3469

Closed
4 tasks done
dmpetrov opened this issue Mar 11, 2020 · 12 comments · Fixed by #3576
Closed
4 tasks done

metrics diff with missing old value #3469

dmpetrov opened this issue Mar 11, 2020 · 12 comments · Fixed by #3576
Assignees
Labels
bug Did we break something? enhancement Enhances DVC p1-important Important, aka current backlog of things to do

Comments

@dmpetrov
Copy link
Member

dmpetrov commented Mar 11, 2020

If metrics file(s) has only no old version (it was just created):

$ dvc metrics diff HEAD^
WARNING: Metrics file 'metrics/eval.json' does not exist at the reference 'HEAD^'.
WARNING: Metrics file 'metrics/train.json' does not exist at the reference 'HEAD^'.
ERROR: unexpected error - 'HEAD^'

Expected the same output with no difference:

$ dvc metrics diff HEAD^
       Path             Metric       Value         Change
metrics/train.json   took            0.004   -
metrics/train.json   num_steps       2500    -
metrics/train.json   learning_rate   0.015   -
metrics/eval.json    accuracy        0.897   -

In JSON - the same output with no old values.

Also, the error message is misleading: ERROR: unexpected error - 'HEAD^'. It should be something like ERROR: the old version of the metric file is not found

Another problem is writing status to STDOUT (it should be in STDIN).

$ dvc metrics diff HEAD
No changes.

Summary:

  • Handle metrics diff based on one version only
  • Make sure JSON supports missing old metrics value
  • Better output error message
  • Make sure only the table is presented in STDOUT. All other messages (like "No changes.") should be in STDERR.
@dmpetrov dmpetrov added bug Did we break something? enhancement Enhances DVC p1-important Important, aka current backlog of things to do labels Mar 11, 2020
@dmpetrov dmpetrov mentioned this issue Mar 25, 2020
efiop added a commit to efiop/dvc that referenced this issue Mar 25, 2020
efiop added a commit that referenced this issue Mar 25, 2020
@efiop efiop self-assigned this Mar 31, 2020
@efiop
Copy link
Contributor

efiop commented Apr 1, 2020

Make sure only the table is presented in STDOUT. All other messages (like "No changes.") should be in STDERR.

Not sure why it should be like that. Table as well as No changes. are for the user, not really meant to be parsable. For parsing we have --show-json. I don't think we do that anywhere else in the dvc.

@dmpetrov
Copy link
Member Author

dmpetrov commented Apr 2, 2020

I'd prefer to have an empty output in this case. No need in a special message. It is a usual approach for many commands including git diff.

@efiop
Copy link
Contributor

efiop commented Apr 2, 2020

@dmpetrov But we've been talking about dvc commands being too silent, hence why we've introduced such summary messages everywhere. No changes is consistent here with the rest of the commands.

@shcheklein
Copy link
Member

Don't have a strong opinion on this (git diff does not show anything, right? but on the other hand it outputs patch by default and DVC outputs some human-readable table - so not a direct comparison for sure).

It's better to be consistent with other DVC "diff" commands though.

@efiop
Copy link
Contributor

efiop commented Apr 2, 2020

@dmpetrov noticed that it makes it not possible to use dvc metrics diff in shell like [ -z "dvc metrics diff ^HEAD" ] which is bad. Moving to stderr, makes total sense to me now.

@efiop
Copy link
Contributor

efiop commented Apr 2, 2020

Also probably time to use print there instead of logger.info...

@shcheklein
Copy link
Member

@efiop do we have the same problem (not sure though why --json could not be used for example in the same bash check) for other commands? again, just to make sure that we are consistent more or less

@dmpetrov
Copy link
Member Author

dmpetrov commented Apr 2, 2020

All the "information" commands need output nothing if there no information. This way you can easily use them from scripts like

$ git diff
$ if [[ -z `git diff`]]; then 
   # do something
fi

Otherwise, we are pushing people to parse the outputs - if [[ 'dvc metrics diff' = "No changes" ]]. Clear anti-pattern.

For "action" commands (not "information") like dvc checkout it is fine (and usually preferable) to output human-readable information.

@shcheklein
Copy link
Member

Would you consider git status an information or action command? :)

@shcheklein
Copy link
Member

I really think that the reason behind git diff is that it's not meant to be consumed by a human even if there are some changes in the first place. git diff with its default options is similar to dvc diff --json.

@dmpetrov
Copy link
Member Author

dmpetrov commented Apr 2, 2020

Would you consider git status an information or action command? :)

Yeah. That's the thought one :) It is kind of informational but does some advanced stuff. It might be better for status to be silent in these cases.

I really think that the reason behind git diff is that it's not meant to be consumed by a human even if there are some changes in the first place.

Hm.. I thought it is often used command. I use it a lot. You are right in the sense that it is optimized a bit for machines. This should be the case for many informational commands.

@shcheklein
Copy link
Member

This should be the case for many informational commands.

don't see any strong reason for this. It's not white and black. And I think people who write scripts are totally fine with a quite option, json or whatever. They have way higher tolerance in my mind than regular data scientists who can be surprised (?)

efiop added a commit to efiop/dvc that referenced this issue Apr 2, 2020
Makes it easier to use in shell scripts. E.g. `-z $(dvc metrics diff)`, which
has the same semantics as `git diff`. Also syncs up with the behaviour
of `dvc diff`.

Fixes iterative#3469
efiop added a commit that referenced this issue Apr 2, 2020
Makes it easier to use in shell scripts. E.g. `-z $(dvc metrics diff)`, which
has the same semantics as `git diff`. Also syncs up with the behaviour
of `dvc diff`.

Fixes #3469
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Did we break something? enhancement Enhances DVC p1-important Important, aka current backlog of things to do
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants