feat: add master logs [DET-3680] #1007

hamidzr · 2020-08-04T17:39:43Z

Description

A port of #988 (feat: add master logs endpoint).

Test Plan

Commentary (optional)

master/internal/api_master.go

master/internal/api_trials.go

master/internal/api_master.go

hamidzr · 2020-08-04T18:11:45Z

master/internal/api_master.go

+	offset, limit := effectiveOffsetNLimit(int(req.Offset), int(req.Limit), total)
+
+	for {
+		for _, log := range a.m.logs.Entries(offset, -1, limit) {


@stoksc wouldn't this pin the CPU if there are no logs to send and the user has req.Follow set?

Yep, looks like it would to me, nice catch. We could add a tiny timeout if it grabs a slice of entries without any logs (there are other things I can think of doing, but they're an order of magnitude more work)?

Alright, that sounds good for now. I'd rather reserve the larger changes for another PR so we can wrap this up without much change to the original work.

stoksc

Nice catch with the for {}, that's going to kill the CPU.

master/internal/api_master.go

stoksc · 2020-08-04T18:47:07Z

master/internal/api_master.go

+	offset, limit := effectiveOffsetNLimit(int(req.Offset), int(req.Limit), total)
+
+	for {
+		for _, log := range a.m.logs.Entries(offset, -1, limit) {


Yep, looks like it would to me, nice catch. We could add a tiny timeout if it grabs a slice of entries without any logs (there are other things I can think of doing, but they're an order of magnitude more work)?

stoksc

👍 looks good
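The hunk above calls effectiveOffsetNLimit, but its body is not shown in this thread. The sketch below is one plausible shape for such a helper, written with the switch style mentioned in the commit list. The semantics are assumptions, not the merged code: a negative offset is taken to count back from the end of the log, and a limit of zero or less is taken to mean "no limit".

```go
package main

import "fmt"

// effectiveOffset maps a requested offset onto an index in [0, total].
// Assumption: a negative offset counts back from the end of the log.
func effectiveOffset(reqOffset, total int) int {
	switch {
	case reqOffset < -total:
		return 0
	case reqOffset < 0:
		return total + reqOffset
	case reqOffset > total:
		return total
	default:
		return reqOffset
	}
}

// effectiveLimit returns how many entries can be read starting at offset.
// Assumption: a requested limit <= 0 means "no limit".
func effectiveLimit(reqLimit, offset, total int) int {
	remaining := total - offset
	switch {
	case reqLimit <= 0:
		return remaining
	case reqLimit > remaining:
		return remaining
	default:
		return reqLimit
	}
}

// effectiveOffsetNLimit combines the two helpers, mirroring the call shape
// used in the hunk: (requested offset, requested limit, total entries).
func effectiveOffsetNLimit(reqOffset, reqLimit, total int) (offset, limit int) {
	offset = effectiveOffset(reqOffset, total)
	limit = effectiveLimit(reqLimit, offset, total)
	return offset, limit
}

func main() {
	fmt.Println(effectiveOffsetNLimit(-3, 0, 10))  // last three entries: 7 3
	fmt.Println(effectiveOffsetNLimit(4, 100, 10)) // limit clamped: 4 6
}
```

Clamping both values against total keeps the slice bounds valid even when the client asks for more entries than exist.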

master/internal/api_master.go

Co-authored-by: Bradley Laney <[email protected]>

cla-bot bot added the cla-signed label Aug 4, 2020

hamidzr mentioned this pull request Aug 4, 2020

feat: add master logs endpoint #988

Closed

hamidzr commented Aug 4, 2020

master/internal/api_master.go (outdated)

hamidzr commented Aug 4, 2020

master/internal/api_master.go (outdated)

hamidzr commented Aug 4, 2020

master/internal/api_trials.go

hamidzr requested a review from stoksc August 4, 2020 17:43

Jonathan Ben-tzur and others added 2 commits August 4, 2020 10:46

feat: add master logs endpoint

03ed05f

define helpers for effective offset and limit calculation

21a1882

hamidzr force-pushed the 3680-master-logs branch from 4c00290 to 21a1882 Compare August 4, 2020 17:46

hamidzr self-assigned this Aug 4, 2020

hamidzr commented Aug 4, 2020

master/internal/api_master.go

hamidzr commented Aug 4, 2020

stoksc reviewed Aug 4, 2020

hamidzr added 3 commits August 5, 2020 13:54

simplify offset calculation

b97a297

add a small delay for rechecking log availability

550ee8e

reset limit on each check to avoid overflow

25a1e09

hamidzr requested a review from stoksc August 5, 2020 21:13

hamidzr assigned stoksc and unassigned hamidzr Aug 5, 2020

hamidzr marked this pull request as ready for review August 5, 2020 21:13

stoksc approved these changes Aug 6, 2020

master/internal/api_master.go (outdated)

stoksc assigned hamidzr and unassigned stoksc Aug 6, 2020

hamidzr and others added 2 commits August 6, 2020 10:13

use switch statement to simplify effectiveLimit

d00c8cc

Co-authored-by: Bradley Laney <[email protected]>

update variable names and effectiveLimit comment

259a52a

hamidzr merged commit 7767524 into determined-ai:master Aug 6, 2020

hamidzr deleted the 3680-master-logs branch August 6, 2020 18:03

rb-determined-ai pushed a commit that referenced this pull request Oct 5, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

a3a7412

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

rb-determined-ai pushed a commit that referenced this pull request Oct 10, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

5a26fbb

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

stoksc pushed a commit that referenced this pull request Oct 17, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

4c30cc9

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

rb-determined-ai pushed a commit that referenced this pull request Oct 27, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

c9baed4

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

rb-determined-ai pushed a commit that referenced this pull request Oct 31, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

a6ab16b

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

rb-determined-ai pushed a commit that referenced this pull request Nov 2, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

de78c15

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

rb-determined-ai pushed a commit that referenced this pull request Nov 2, 2023

test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)

760a738

New test of HuggingFace examples/hf_trainer_api/hf_language_modeling on pytorch2. Enabled for slurm_gpu + distributed marks.

dannysauer added this to the 0.13.0 milestone Feb 6, 2024
