-
Notifications
You must be signed in to change notification settings - Fork 356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add master logs [DET-3680] #1007
Conversation
4c00290
to
21a1882
Compare
master/internal/api_master.go
Outdated
offset, limit := effectiveOffsetNLimit(int(req.Offset), int(req.Limit), total) | ||
|
||
for { | ||
for _, log := range a.m.logs.Entries(offset, -1, limit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stoksc wouldn't this pin the CPU if there are no logs to send and the user has req.Follow set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, looks like it would to me, nice catch. We could add a tiny timeout if it grabs a slice of entries without any logs (there are other things I can think of doing, but they're an order of magnitude more work)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright that sounds good for now. I rather reserve more work for another PR so we can wrap this up without much change to the original work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch with the for {}
that's going to kill the cpu.
master/internal/api_master.go
Outdated
offset, limit := effectiveOffsetNLimit(int(req.Offset), int(req.Limit), total) | ||
|
||
for { | ||
for _, log := range a.m.logs.Entries(offset, -1, limit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, looks like it would to me, nice catch. We could add a tiny timeout if it grabs a slice of entries without any logs (there are other things I can think of doing, but they're an order of magnitude more work)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 looks good
Co-authored-by: Bradley Laney <[email protected]>
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.
Description
a port of #988
Test Plan
Commentary (optional)