fix: fix missing task stats for queued tasks #9745

Merged
merged 12 commits into from
Jul 30, 2024
2 changes: 2 additions & 0 deletions master/internal/rm/db.go
@@ -32,12 +32,14 @@ func FetchAvgQueuedTime(pool string) (
})
}
today := float32(0)
nilDate := "0001-01-01" // treat task stats with missing start time. bb7020a404b
subq := db.Bun().NewSelect().TableExpr("allocations").Column("allocation_id").
Where("resource_pool = ?", pool).
Where("start_time >= CURRENT_DATE")
err = db.Bun().NewSelect().TableExpr("task_stats").ColumnExpr(
"avg(extract(epoch FROM end_time - start_time))",
).Where("event_type = ?", "QUEUED").
Where("start_time IS NOT NULL AND start_time != ?", nilDate).
Where("end_time >= CURRENT_DATE AND allocation_id IN (?) ", subq).
Scan(context.TODO(), &today)
if err != nil {
6 changes: 5 additions & 1 deletion master/internal/task/allocation.go
@@ -626,10 +626,14 @@ func (a *allocation) resourcesAllocated(msg *sproto.ResourcesAllocated) error {
}

now := time.Now().UTC()
taskStatStartTime := msg.JobSubmissionTime
if msg.JobSubmissionTime.IsZero() && a.req.Restore {
Review comment from @hamidzr (Member, Author), Jul 29, 2024:

Even if we had JobSubmissionTime, in the case of restores I think we'd want to use the request time? That would be the master start time when we're going through the restore path.


taskStatStartTime = a.req.RequestTime
}
err = db.RecordTaskStats(context.TODO(), &model.TaskStats{
AllocationID: msg.ID,
EventType: "QUEUED",
- StartTime: &msg.JobSubmissionTime,
+ StartTime: &taskStatStartTime,
EndTime: &now,
})
if err != nil {
3 changes: 3 additions & 0 deletions master/static/srv/update_aggregated_queued_time.sql
@@ -41,6 +41,9 @@ total_agg AS (
end_time >= const.target_date
AND end_time < (const.target_date + interval '1 day')
AND event_type = 'QUEUED'
-- Exclude the rows with NULL start_time. When Bun sees StartTime is nil,
-- it saves it as 0001-01-01 00:00:00+00.
AND task_stats.start_time != '0001-01-01 00:00:00+00:00'::TIMESTAMPTZ
),

all_aggs AS (