[Feature]: Better logging when a run terminates due to max_duration #1701

Open
james-boydell opened this issue Sep 17, 2024 · 1 comment

@james-boydell

Problem

The server console logs do not make it clear that a run was terminated due to max_duration. The attached image shows when a run was started and the message logged 6 hours later (the default max_duration).

image

Solution

No response

Workaround

No response

Would you like to help us implement this feature by sending a PR?

No

@r4victor
Collaborator

@james-boydell, I agree we should improve the run failure reason in that case. Still, it's recommended to check the run diagnostic logs on failures. They are available to users who don't have access to server logs and may contain more information than the server logs:

Run dstack logs --diagnose run_name and you'll see:

...
time=2024-09-19T04:32:24.589936-04:00 level=error msg=Max duration exceeded max_duration=180
time=2024-09-19T04:32:24.590001-04:00 level=info msg=Job state changed new=terminated
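
For anyone who wants to check for this condition automatically, here is a minimal sketch that scans diagnostic log lines for the message shown above. The function name find_max_duration_limit and the regular expression are assumptions based only on the two sample lines in this comment; the actual log format may differ between dstack versions.

```python
import re
from typing import Iterable, Optional

# Pattern derived from the sample line quoted above (an assumption, not a
# documented dstack log format).
MAX_DURATION_RE = re.compile(r"msg=Max duration exceeded max_duration=(\d+)")


def find_max_duration_limit(lines: Iterable[str]) -> Optional[int]:
    """Return the max_duration value from the first matching line, or None."""
    for line in lines:
        match = MAX_DURATION_RE.search(line)
        if match:
            return int(match.group(1))
    return None


# The two lines from the diagnostic output above.
sample = [
    "time=2024-09-19T04:32:24.589936-04:00 level=error msg=Max duration exceeded max_duration=180",
    "time=2024-09-19T04:32:24.590001-04:00 level=info msg=Job state changed new=terminated",
]

limit = find_max_duration_limit(sample)
if limit is not None:
    print(f"Run terminated: max_duration={limit} exceeded")
```

The same function could be fed the saved output of dstack logs --diagnose run_name line by line, for example by reading it from a file or from standard input.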
