feat: Add tensorboard delete command to CLI #8227
Could you add an end-to-end test for this that verifies the TB files are actually deleted? I think just an e2e CPU test is fine. If we want per-storage-manager tests, we can integration-test it; I would just like at least one e2e feature test for this.
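A minimal sketch of the kind of check such an e2e test could end with. It assumes the TensorBoard files sit on a filesystem the test process can read and that deletion may finish some time after the command returns; the helper name `wait_until_no_files` and its parameters are hypothetical and are not part of the Determined test suite.

```python
import time
from pathlib import Path


def wait_until_no_files(directory: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `directory` until it contains no regular files or `timeout` elapses.

    Returns True once no files remain (a missing directory counts as empty),
    and False if files are still present when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    root = Path(directory)
    while True:
        has_files = root.exists() and any(p.is_file() for p in root.rglob("*"))
        if not has_files:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would run the delete command for the experiment and then assert that `wait_until_no_files(...)` returns True for that experiment's TensorBoard directory.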
17fce1f to 206ceb4
"delete",
delete_tensorboard,
"delete TensorBoard files associated with the provided experiment ID",
@azhou-determined or someone from ml sys should review this change.
Personally I think it could be a bit confusing for a user. Maybe det e delete-tensorboard-files :ID is clearer about what it does, but a pain to type.
"tensorboard" in this context so far means "a running tensorboard application", not "tensorboard files", so "det tensorboard delete ..." makes me think it would do something similar to "det tensorboard kill ...".
Agreed that det tensorboard delete reads like you want to delete the tensorboard instance.
Maybe something like delete-files, or delete-experiment?
I like det tensorboard delete-files --experiment-id $ID or det experiment delete-tensorboard-files $ID, preferring the latter.
From offline discussion with @wes-turner & @stoksc: we went with det e delete-tb-files.
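To illustrate the naming that was settled on, here is a rough sketch of an `experiment` (alias `e`) command group with a `delete-tb-files` subcommand. It uses plain `argparse` as a stand-in, not Determined's own CLI declaration helpers, and the handler `delete_tb_files` is a hypothetical placeholder that only formats a message.

```python
import argparse


def delete_tb_files(args: argparse.Namespace) -> str:
    # Hypothetical placeholder: a real handler would ask the master to delete the files.
    return f"would delete TensorBoard files for experiment {args.experiment_id}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="det")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Register "e" as an alias so `det e ...` and `det experiment ...` are equivalent.
    experiment = subcommands.add_parser("experiment", aliases=["e"])
    experiment_cmds = experiment.add_subparsers(dest="subcommand", required=True)

    delete = experiment_cmds.add_parser(
        "delete-tb-files",
        help="delete TensorBoard files associated with the provided experiment ID",
    )
    delete.add_argument("experiment_id", type=int, help="experiment ID")
    delete.set_defaults(func=delete_tb_files)
    return parser
```

With this layout, `build_parser().parse_args(["e", "delete-tb-files", "42"])` resolves to the same handler as the long form `experiment delete-tb-files 42`, which keeps the command under the experiment namespace instead of next to `det tensorboard kill`.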
Python side 👍
looks great
a0aeea3 to f818046
f818046 to 5578b82
return nil, err
}

exp, err := db.ExperimentByID(context.TODO(), int(req.ExperimentId))
I missed this, sorry, but we should be passing ctx here and on 2991.
Fix: #8332
Description
Creating a CLI command for deleting TensorBoard files for a given experiment ID. We want to reuse some of the existing checkpointGC logic to perform the file deletion.
Usage of the command would be:
det e delete-tb-files <experiment-id>
Test Plan
Create an experiment
Example:
det e create examples/tutorials/mnist_pytorch/const.yaml examples/tutorials/mnist_pytorch/
Verify that Tensorboard files exist:
Delete tensorboard files:
det e delete-tb-files <exp-id>
Check that files are deleted:
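The two verification steps above can be scripted when the checkpoint storage is a locally mounted directory. The sketch below lists the files under an experiment's TensorBoard directory; the directory layout `<tb_root>/experiment/<experiment_id>/` and the function name `list_tb_files` are assumptions made for illustration, so adjust the path to match the actual storage configuration.

```python
from pathlib import Path
from typing import List


def list_tb_files(tb_root: str, experiment_id: int) -> List[Path]:
    """Return all regular files under the experiment's TensorBoard directory.

    Assumed layout (illustrative only): <tb_root>/experiment/<experiment_id>/...
    """
    exp_dir = Path(tb_root) / "experiment" / str(experiment_id)
    if not exp_dir.exists():
        return []
    return sorted(p for p in exp_dir.rglob("*") if p.is_file())
```

Before running the delete command the list should be non-empty; afterwards it should be empty.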
Checklist
docs/release-notes/. See Release Note for details.
Ticket
DET-9865