-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazily create summary writer for TF2 logger. #5631
Conversation
Thanks, I rebased it! Can you have a look? |
python/ray/tune/logger.py
Outdated
@@ -181,10 +180,12 @@ def on_result(self, result): | |||
self._file_writer.flush() | |||
|
|||
def flush(self): | |||
self._file_writer.flush() | |||
if hasattr(self, "_file_writer"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we not use hasattr
and instead set a class variable _file_writer = None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks.
What happens if you just do this in
|
I'm not sure what you mean. The logger is created in |
Sorry, I meant |
Actually, the operations in the function The problem is that any tf-related operations will lead to initialize gpu devices as follows:
After initialized, any gpu-related configurations are not allowed. For example,
will raise |
that's terrible... OK; do you know if this happens with tensorboardx too? (will merge when tests pass) |
as always, @llan-ml thank you very much for the contribution! |
I'm not familiar with tensorboardx. But the issue is caused by the design of tf-2.0 summary writer, and also tf-2.0 does not expose some low-level apis like |
Test FAILed. |
Test PASSed. |
Test FAILed. |
Why are these changes needed?
For tensorflow 2.0 with gpu support, the creation of summary writer will initialize visible gpu devices and other gpu-related settings, and then we cannot modify them anymore (see this link).
When sub-classing
Trainable
, we may need to declare gpu-related settings inself._setup
. However, the settings will raise errors sinceTrainable
invokesself._setup
after logger creation.Related issue number
Checks
scripts/format.sh
to lint the changes in this PR.