fix: tensorboard sync for profiler data, Core API v2 managed mode [MLG-1063] #8163

Merged: 1 commit, Oct 17, 2023

@ioga ioga commented Oct 16, 2023

Description

  • When the tensorboard manager is in auto mode, automatically run a sync when it is closed. This picks up files written after the last metrics payload, e.g. native PyTorch profiler traces.
  • Fix Core API v2, which did not properly write tensorboard files while in managed mode.
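
The first bullet can be illustrated with a minimal sketch. It assumes a hypothetical manager class (`AutoSyncTensorboardSketch`, with made-up `report_metrics`, `sync`, and `close` methods) that copies files from a local directory to a stand-in "remote" directory; none of these names come from the Determined codebase. The point is only that a sync triggered by metrics payloads misses files written afterwards, so auto mode also needs a final sync on close.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple


class AutoSyncTensorboardSketch:
    """Hypothetical stand-in for a tensorboard manager (not Determined's API)."""

    def __init__(self, local_dir: Path, remote_dir: Path, auto: bool = True) -> None:
        self.local_dir = Path(local_dir)
        self.remote_dir = Path(remote_dir)
        self.auto = auto
        # Relative path -> (mtime_ns, size) recorded at the last upload.
        self._synced: Dict[Path, Tuple[int, int]] = {}

    def sync(self) -> List[Path]:
        """Copy files that are new or changed since the previous sync."""
        uploaded = []
        for path in sorted(self.local_dir.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(self.local_dir)
            stat = path.stat()
            fingerprint = (stat.st_mtime_ns, stat.st_size)
            if self._synced.get(rel) == fingerprint:
                continue  # Unchanged since the last upload.
            dest = self.remote_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)
            self._synced[rel] = fingerprint
            uploaded.append(rel)
        return uploaded

    def report_metrics(self, metrics: Dict[str, float]) -> None:
        # Assumption: in auto mode, every metrics payload triggers a sync.
        if self.auto:
            self.sync()

    def close(self) -> None:
        # The behavior described above: a final sync on closure, so files
        # written after the last metrics payload are not left behind.
        if self.auto:
            self.sync()

    def __enter__(self) -> "AutoSyncTensorboardSketch":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()


def demo(auto: bool = True) -> List[str]:
    """Write an event file, report metrics, then write a late trace file."""
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "local"
        remote = Path(tmp) / "remote"
        local.mkdir()
        remote.mkdir()
        with AutoSyncTensorboardSketch(local, remote, auto=auto) as tb:
            (local / "events.out.tfevents.0").write_text("metrics")
            tb.report_metrics({"loss": 0.5})
            # Written after the last metrics payload, like a profiler trace.
            (local / "worker0.pt.trace.json").write_text("{}")
        return sorted(p.name for p in remote.iterdir())


if __name__ == "__main__":
    print(demo())
```

In this sketch the trace file reaches the remote directory only because `close()` performs one more sync; with `auto=False` nothing is copied unless the caller invokes `sync()` explicitly.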

Test Plan

  • Run a managed Core API experiment that uses PyTorch profiling for the entire duration of the experiment, and make sure the trace is saved and can be viewed in tensorboard. See the tests for an example.
  • Run a Core API v2 experiment in managed mode (i.e. as an experiment), and make sure tensorboard captures the metrics.

Commentary (optional)

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

@cla-bot cla-bot bot added the cla-signed label Oct 16, 2023
netlify bot commented Oct 16, 2023

Deploy Preview for determined-ui ready!

Name Link
🔨 Latest commit c0c0176
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/652ef60ee7be50000835fe0f
😎 Deploy Preview https://deploy-preview-8163--determined-ui.netlify.app

@determined-ci determined-ci requested a review from a team October 16, 2023 22:45
@determined-ci determined-ci added the documentation Improvements or additions to documentation label Oct 16, 2023
@ioga ioga marked this pull request as ready for review October 17, 2023 17:09
@ioga ioga requested a review from a team as a code owner October 17, 2023 17:09
@ioga ioga requested a review from caehd10 October 17, 2023 17:09
@MikhailKardash MikhailKardash self-assigned this Oct 17, 2023
@determined-ci determined-ci requested a review from a team October 17, 2023 19:22
@MikhailKardash MikhailKardash left a comment

LGTM

@determined-ci determined-ci requested a review from a team October 17, 2023 21:01
@ioga ioga merged commit c29d086 into main Oct 17, 2023
69 of 82 checks passed
@ioga ioga deleted the pytorch-profiler-sync branch October 17, 2023 22:06
@dannysauer dannysauer added this to the 0.26.2 milestone Feb 6, 2024
Labels
cla-signed documentation Improvements or additions to documentation
4 participants