Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: wart removal #8147

Merged
merged 1 commit into from
Oct 26, 2023
Merged

ci: wart removal #8147

merged 1 commit into from
Oct 26, 2023

Conversation

rb-determined-ai
Copy link
Member

We run some CLI list and describe commands after every single experiment, as a "to sanity check that basic CLI commands don't raise errors". That has made e2e test logs unreadable since the day it was added. For now, we can keep the tests, but don't dump stdout and stderr into the test logs. Some day, we should figure out what codepaths are being tested passively, and write proper tests for them.

Also, the test_task_logs has been failing intermittently with a timeout but no error message for weeks. Instead of using pytest.mark.timeout(), which can't be caught, implement our own timeout logic, and only dump stdout and stderr if the cli crashes.

@netlify
Copy link

netlify bot commented Oct 13, 2023

Deploy Preview for determined-ui ready!

Name Link
🔨 Latest commit e8de877
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/653a95468cf9fa0008ecd30a
😎 Deploy Preview https://deploy-preview-8147--determined-ui.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Contributor

@tayritenour tayritenour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment and possibly I'm wrong about it. Did we try forcing a test to fail and confirming that the logs come through?

thread.join(timeout=5 * 60)
if thread.is_alive():
# The thread did not exit
raise ValueError("do_check_logs thread did not exit")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it didn't exit, doesn't that mean that we timed out? Shouldn't we say that instead if that's the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be value to printing stdout in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MikhailKardash what do you mean? Isn't that what the except Exception clause does, is print the stdout/stderr from the task?

p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()
ret = p.wait()
if ret:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not ret != 0 ?

)
except socket.timeout:
raise TimeoutError(f"timed out waiting for {task_type} with id {task_id}")
result: Any = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since the type of result is Union[bool|Exception], could we make it more explicit and call it

exception: Optional[Exception] = None

A nit, but one I think makes the flow a tiny bit easier to read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was trying to provide positive confirmation that the thread actually set the value... but I don't check for it anywhere, and I only read it after joining the thread... So yeah, good idea, I like it that way.


thread = threading.Thread(target=do_check_logs, daemon=True)
thread.start()
thread.join(timeout=5 * 60)
Copy link
Contributor

@wes-turner wes-turner Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit that this stuffs configuration into the implementation.

Options I see, none of which I love (so indeed, maybe it's better as-is):

  • set a custom mark and in the test pull it out of request.node.keywords
  • abuse pytest.parameterize to set a fixture

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly I'm inclined to leave it as-is. request.node.keywords is something I've never heard of, which makes me think it would be a lot less readable.

A strategy with 'abuse' in the bullet point sounds undesirable.

My take is that putting config into the implementation is the least evil.

@rb-determined-ai
Copy link
Member Author

Minor comment and possibly I'm wrong about it. Did we try forcing a test to fail and confirming that the logs come through?

No, I hadn't, which was sloppy of me. Tests were in fact not coming through, because pytest.fail subclasses BaseException and not Exception. It's fixed now and tested now.

Thank you @tayritenour.

We run some CLI list and describe commands after every single
experiment, as a "to sanity check that basic CLI commands don't raise
errors".  That has made e2e test logs unreadable since the day it
was added.  For now, we can keep the tests, but don't dump stdout and
stderr into the test logs.  Some day, we should figure out what
codepaths are being tested passively, and write proper tests for them.

Also, the test_task_logs has been failing intermittently with a timeout
but no error message for weeks.  Instead of using pytest.mark.timeout(),
which can't be caught, implement our own timeout logic, and only dump
stdout and stderr if the cli crashes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Copy link
Contributor

@amandavialva01 amandavialva01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rb-determined-ai rb-determined-ai merged commit c09c529 into main Oct 26, 2023
71 of 81 checks passed
@rb-determined-ai rb-determined-ai deleted the rb/ci-warts branch October 26, 2023 19:17
@dannysauer dannysauer added this to the 0.26.3 milestone Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants