Log allocation failures #988
@@ -164,7 +164,8 @@ inline std::vector<event> parse_csv(std::string const& filename)

   for (std::size_t i = 0; i < actions.size(); ++i) {
     auto const& action = actions[i];
-    RMM_EXPECTS((action == "allocate") or (action == "free"), "Invalid action string.");
+    RMM_EXPECTS((action == "allocate") or (action == "allocate failure") or (action == "free"),
This will break the parsing, as it assumes an event can only be an `action::ALLOCATE` or `action::FREE`. But adding an `allocate failure` effectively adds a third kind of event.
Fixed.
Adding a third event isn't quite what I meant. This parser is also used by the replay benchmark, so this change will still break the replay benchmark.
Never mind, I see the replay was already updated.
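The concern in this thread is that the parser maps each action string to one of two event kinds, so a third string needs a third kind. A minimal sketch of that mapping follows. The enumerator `ALLOCATE_FAILURE` and the helper `parse_action` are hypothetical names chosen for illustration, and a standard exception stands in for `RMM_EXPECTS`; this is not the code from the PR.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical three-valued event kind. ALLOCATE and FREE come from the review
// comment; ALLOCATE_FAILURE is an illustrative name, not confirmed by the PR.
enum class action { ALLOCATE, FREE, ALLOCATE_FAILURE };

// Map a logged action string to an event kind and reject anything unknown.
inline action parse_action(std::string const& text)
{
  if (text == "allocate") { return action::ALLOCATE; }
  if (text == "free") { return action::FREE; }
  if (text == "allocate failure") { return action::ALLOCATE_FAILURE; }
  throw std::invalid_argument("Invalid action string: " + text);
}
```

Giving the failure its own enumerator keeps it from being mistaken for a successful allocation by any consumer of the parsed events.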
    logger_->info("allocate,{},{},{}", ptr, bytes, fmt::ptr(stream.value()));
    return ptr;
  } catch (...) {
    logger_->info("allocate failure,{},{},{}", nullptr, bytes, fmt::ptr(stream.value()));
The documentation needs to be updated to say that a different message is logged for failed allocations.
Done.
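For readers who want to try the log-on-failure pattern from the diff above without RMM or spdlog, here is a self-contained sketch. `line_log`, `limited_upstream`, and `logged_allocate` are all hypothetical stand-ins, the stream field is left out, and the `throw;` is an assumption, since the diff excerpt ends before the close of the `catch` block.

```cpp
#include <cstddef>
#include <new>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a logger: collects "action,pointer,bytes" lines.
struct line_log {
  std::vector<std::string> lines;
  void info(std::string const& action, void const* ptr, std::size_t bytes)
  {
    std::ostringstream out;
    out << action << ',' << ptr << ',' << bytes;
    lines.push_back(out.str());
  }
};

// Hypothetical upstream allocator that fails for requests above a fixed limit.
struct limited_upstream {
  std::size_t limit;
  void* allocate(std::size_t bytes)
  {
    if (bytes > limit) { throw std::bad_alloc{}; }
    return ::operator new(bytes);
  }
  void deallocate(void* ptr) { ::operator delete(ptr); }
};

// Log a successful request as "allocate" and a failed one as "allocate failure".
inline void* logged_allocate(limited_upstream& upstream, line_log& log, std::size_t bytes)
{
  try {
    void* ptr = upstream.allocate(bytes);
    log.info("allocate", ptr, bytes);
    return ptr;
  } catch (...) {
    log.info("allocate failure", nullptr, bytes);
    throw;  // assumption: the caller still sees the original exception
  }
}
```

Logging inside the `catch` block means the request that triggered the out-of-memory condition appears as the last line of the log, with a null pointer and the requested size.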
@gpucibot merge
Right now when we run out of memory, the last allocation that causes the OOM is not being logged. Adding a new `allocate failure` line to the log to help with debugging. For now these lines are ignored by the replay utility.
@abellina
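To illustrate what ignoring `allocate failure` lines can look like for a log consumer, here is a simplified sketch. It is not the replay utility itself: `peak_bytes` is a hypothetical helper that reads `action,pointer,size,stream` lines (the field order used in the log message above) and tracks the peak number of outstanding bytes, skipping failed allocations.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Walk "action,pointer,size,stream" lines and return the peak number of bytes
// outstanding. "allocate failure" lines are skipped because nothing was
// actually allocated for them.
inline std::size_t peak_bytes(std::vector<std::string> const& lines)
{
  std::size_t current = 0;
  std::size_t peak    = 0;
  for (auto const& line : lines) {
    std::istringstream fields(line);
    std::string action;
    std::string pointer;
    std::string size;
    std::getline(fields, action, ',');
    std::getline(fields, pointer, ',');
    std::getline(fields, size, ',');
    if (action == "allocate failure") { continue; }
    std::size_t const bytes = std::stoull(size);
    if (action == "allocate") {
      current += bytes;
      peak = std::max(peak, current);
    } else if (action == "free") {
      current -= std::min(current, bytes);  // clamp so a stray free cannot underflow
    }
  }
  return peak;
}
```

Skipping the failure lines leaves the replayed workload identical to what it was before the new log line existed.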