Tune tbxlogger add images #37822
Signed-off-by: Simon Zehnder <[email protected]>
I applied a similar fix with the TensorboardX logger. In my case, the media metrics are being compiled into a list within the |
I also tried to apply a similar modification to TBXLoggerCallback. When the image in episode.media gets processed by the JSONLoggerCallback, the JSON logger introduces significant delays (minutes!). This is because the JSON logger needs to rewrite the log file at each logging call (see #21416). I worked around this by disabling other Logger Callbacks by setting
Also, I think it needs to be decided how to handle the case when images are logged by multiple episodes at one time in
    def on_train_result(
        self,
        *,
        algorithm: "Algorithm",
        result: dict,
        **kwargs,
    ) -> None:
        """Called at the end of Algorithm.train().
        Args:
            algorithm: Current Algorithm instance.
            result: Dict of results returned from Algorithm.train() call.
                You can mutate this object to add additional metrics.
            kwargs: Forward compatibility placeholder.
        """
        if 'trajectory' in result['episode_media'].keys():
            result['episode_media']['myimage'] = result['episode_media']['myimage'][0]
My modification of the TBXLoggerCallback.log_trial_result() function looks like this. I decided to make it stricter with the numpy array, so that not every 3D array is automatically interpreted as an image:
    ...
    elif (isinstance(value, list) and len(value) > 0) or (
        isinstance(value, np.ndarray) and value.size > 0
    ):
        valid_result[full_attr] = value
        # Check for a list of images:
        if all(
            isinstance(v, np.ndarray) and v.ndim == 3 and v.shape[0] in [1, 3]
            for v in value
        ):
            if len(value) == 1:
                # Only one image
                self._trial_writer[trial].add_image(
                    full_attr, value[0], global_step=step
                )
            else:
                # Multiple images - stack them as TensorBoard requires
                imgs = np.stack(value)
                self._trial_writer[trial].add_images(
                    full_attr, imgs, global_step=step
                )
            continue
        # Check for a list of videos:
        if all(
            isinstance(v, np.ndarray) and v.ndim == 5 and v.shape[2] in [1, 3]
            for v in value
        ):
            video = np.concatenate(value, axis=1)
            self._trial_writer[trial].add_video(
                full_attr, video, global_step=step, fps=20
            )
            continue
        # Cover either a single video or a single image
        if isinstance(value, np.ndarray) and value.size > 0:
            # Video - must have 5 dimensions in NTCHW format.
            # C must be either 1 for grayscale or 3 for RGB.
            if value.ndim == 5 and value.shape[2] in [1, 3]:
                self._trial_writer[trial].add_video(
                    full_attr, value, global_step=step, fps=20
                )
                continue
            # Image - must have 3 dimensions in CHW format.
            # C must be either 1 for grayscale or 3 for RGB.
            if value.ndim == 3 and value.shape[0] in [1, 3]:
                self._trial_writer[trial].add_image(
                    full_attr, value, global_step=step
                )
                continue
        try:
            ...
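The dispatch rule used in the snippet above (separating media by number of dimensions and by the position of the channel axis) can be sketched in isolation. This is only an illustration under stated assumptions: `classify_media` is a hypothetical helper that is not part of Ray or tensorboardX, it works on plain shape tuples (for a NumPy array one would pass `value.shape`), and the 4-d `NCHW` case is taken from the PR description rather than from the snippet.

```python
from typing import Sequence


def classify_media(shape: Sequence[int]) -> str:
    """Classify an array shape as 'image', 'images', 'video', or 'other'.

    Hypothetical helper for illustration only. The channel axis must be
    1 (grayscale) or 3 (RGB) for a shape to count as media.
    """
    ndim = len(shape)
    # Empty arrays are never treated as media.
    if any(dim <= 0 for dim in shape):
        return "other"
    if ndim == 3 and shape[0] in (1, 3):
        return "image"   # CHW
    if ndim == 4 and shape[1] in (1, 3):
        return "images"  # NCHW (assumed from the PR description)
    if ndim == 5 and shape[2] in (1, 3):
        return "video"   # NTCHW
    return "other"
```

Under this rule a channels-last array such as `(64, 64, 3)` or a 4-d video shaped `(T, W, H, C)` is classified as `'other'`, so it would not be logged as media and would fall through to whatever handling follows.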
)
continue

# Must be multi-image
Dumb question: What if this is a single video (t, w, h, c)?
Following the definition of add_video(), only 5-dimensional inputs are accepted for this function.
Do we anywhere pass data in a lower dimensional array into this function?
@sven1977 Do you see any cases where this setup of separating arrays by dimension could fall on our feet?
Signed-off-by: Sven Mika <[email protected]>
…ray into tune-tbxlogger-add-images Signed-off-by: Simon Zehnder <[email protected]>
LGTM!
Why are these changes needed?
This PR enables users to also provide image arrays in the result dictionaries, which can then be displayed in TensorBoard. Images can be provided either as a single np.ndarray with dimensions (3, H, W) or as a 4-d np.ndarray with dimensions (N, 3, H, W) (in this case the images get concatenated horizontally).
Related issue number
#21954
As this issue is labeled P1 (issue that should be fixed within a few weeks) and rllib (RLlib related issues), the corresponding solution involves storing the images in episode.media, as this attribute of the episode is not summarized or appended in the metrics.collect_episodes() function.
Checks
git commit -s) in this PR.
scripts/format.sh to lint the changes in this PR.
method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
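To make concrete what "concatenated horizontally" means for a batch of images with dimensions (N, 3, H, W), here is a minimal plain-Python sketch. It is an assumption-laden illustration, not the code path used by this PR or by tensorboardX: `hconcat_chw` is a hypothetical helper, and images are represented as nested lists indexed as [channel][row][column] instead of NumPy arrays.

```python
from typing import List

# One image as nested lists indexed [channel][row][column] (CHW layout).
Image = List[List[List[int]]]


def hconcat_chw(images: List[Image]) -> Image:
    """Join CHW images side by side along the width axis.

    Hypothetical helper for illustration only. All images must share the
    same number of channels and the same height; widths may differ.
    """
    if not images:
        raise ValueError("at least one image is required")
    channels = len(images[0])
    height = len(images[0][0])
    for img in images:
        if len(img) != channels or any(len(plane) != height for plane in img):
            raise ValueError("all images must share channel count and height")
    # For each channel and row, append the rows of all images left to right.
    return [
        [[px for img in images for px in img[c][r]] for r in range(height)]
        for c in range(channels)
    ]
```

With NumPy arrays of equal shape, the same effect could be obtained by concatenating along the last axis.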