Add Statistics Resource Adaptor and cython bindings to tracking_resource_adaptor and statistics_resource_adaptor #626

Merged
Changes from 22 commits
28 commits
c5e91d9
Adding additional tracking info and cython bindings to tracking_resou…
mdemoret-nv Nov 10, 2020
b31bb2f
Adding PR to CHANGELOG
mdemoret-nv Nov 10, 2020
533fe2a
Style cleanup
mdemoret-nv Nov 10, 2020
cffe9bf
Fixing incorrect Cython class name
mdemoret-nv Nov 10, 2020
e1218fa
Apply suggestions from code review
mdemoret-nv Nov 10, 2020
25d5da3
Merge remote-tracking branch 'upstream/branch-0.17' into enh-extend-t…
mdemoret-nv Nov 19, 2020
976dabb
Removed the reset() method, added ability to push/pull
mdemoret-nv Nov 25, 2020
39d5e22
Merge remote-tracking branch 'upstream/branch-0.17' into enh-extend-t…
mdemoret-nv Nov 25, 2020
ddd296a
Style cleanup
mdemoret-nv Nov 25, 2020
df0054e
Adding a reference to the MR used for allocation in device_buffer
mdemoret-nv Dec 4, 2020
cbf2772
Merge remote-tracking branch 'upstream/branch-0.18' into enh-extend-t…
mdemoret-nv Dec 15, 2020
e3e586a
Removing reset() and push/pop from the tracking manager
mdemoret-nv Dec 15, 2020
da25869
Apply suggestions from code review
mdemoret-nv Dec 15, 2020
312ca50
Changing ssize_t to int64_t per review from harrism
mdemoret-nv Dec 15, 2020
4c677f5
Merge branch 'branch-0.20' into enh-extend-tracking-resource-adaptor
mdemoret-nv Apr 5, 2021
1d64c6d
Incorporating feedback from code review. Simplifying counter struct.
mdemoret-nv Apr 9, 2021
97753ee
Style cleanup.
mdemoret-nv Apr 9, 2021
b812bc0
Style cleanup for `black` which was missed in the logs
mdemoret-nv Apr 9, 2021
9d74e5d
Cleaning up code to reduce number of changes with 0.20
mdemoret-nv Apr 9, 2021
9e69c67
Update python/rmm/tests/test_rmm.py
mdemoret-nv May 13, 2021
8bfd27b
Merge branch 'branch-0.20' into enh-extend-tracking-resource-adaptor
mdemoret-nv May 13, 2021
7cf3123
Moving the dl library link to the `rmm::rmm` main interface.
mdemoret-nv May 13, 2021
b5bbab4
Merge branch 'branch-21.08' into enh-extend-tracking-resource-adaptor
mdemoret-nv Jun 2, 2021
c7395e4
Separated the statistics portion from the tracking_resource_adaptor i…
mdemoret-nv Jun 2, 2021
1872ee2
Getting tests to pass
mdemoret-nv Jun 2, 2021
6b20fb8
Style cleanup
mdemoret-nv Jun 2, 2021
b281f9e
Improving method comment.
mdemoret-nv Jun 3, 2021
cab217d
Style cleanup.
mdemoret-nv Jun 3, 2021
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -81,6 +81,7 @@ endif(CUDA_STATIC_RUNTIME)

target_link_libraries(rmm INTERFACE rmm::Thrust)
target_link_libraries(rmm INTERFACE spdlog::spdlog_header_only)
target_link_libraries(rmm INTERFACE dl)
Member

Is this needed? Quick google shows that dladdr now lives in libc rather than libdl?

Contributor Author

With stack traces enabled, this was needed to compile the tests (original comment). Keith and I briefly discussed this here: #626 (comment).

Can you send me the link where you saw that dladdr has moved? All I am seeing from this link is:

Link with -ldl.

Member

Never mind, the docs I found were not for Linux -- Solaris and something called illumos. As I said, it was a quick google.

Member

Have you tested that all is well in libcudf, cuML, etc. when this library is linked here? Note that the other target_link_libraries for RMM are all header-only, which is why this one has me worried (RMM is a header-only library).

Member

@mdemoret reports cuML builds and tests fine against this PR.
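
For context on what this link line supports: the stack-trace change in this PR resolves each frame with `dladdr` (hence `dl`) and then demangles the symbol name with `abi::__cxa_demangle`, falling back to the raw name when demangling fails. The demangle-with-fallback step alone can be sketched roughly as below. This is a minimal illustration, not the code from the PR: `demangle_or_raw` is a hypothetical helper name, and the `dladdr` lookup is left out so the snippet has no link-time dependency.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper (not part of RMM): demangle `symbol`, or return it
// unchanged when it is not a valid mangled C++ name.
inline std::string demangle_or_raw(char const* symbol)
{
  // dladdr may leave the symbol name null, so guard against that first.
  if (symbol == nullptr) { return "<unknown>"; }

  int status = -1;
  // __cxa_demangle returns a malloc'ed buffer, so it is released with free().
  std::unique_ptr<char, decltype(&std::free)> demangled(
    abi::__cxa_demangle(symbol, nullptr, nullptr, &status), &std::free);

  // status == 0 means success; anything else falls back to the raw name.
  return (status == 0 && demangled) ? std::string(demangled.get()) : std::string(symbol);
}
```

For example, `demangle_or_raw("_Z3foov")` yields `foo()`, while a string that is not a mangled name, such as `"[not mangled]"`, comes back unchanged.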

target_compile_features(rmm INTERFACE cxx_std_17 $<BUILD_INTERFACE:cuda_std_17>)

# Set logging level. Must go before including gtests and benchmarks.
26 changes: 24 additions & 2 deletions include/rmm/detail/stack_trace.hpp
@@ -25,6 +25,8 @@
#include <sstream>

#if defined(RMM_ENABLE_STACK_TRACES)
#include <cxxabi.h>
#include <dlfcn.h>
#include <execinfo.h>
#include <memory>
#include <vector>
@@ -60,12 +62,32 @@ class stack_trace {
#if defined(RMM_ENABLE_STACK_TRACES)
std::unique_ptr<char*, decltype(&::free)> strings(
backtrace_symbols(st.stack_ptrs.data(), st.stack_ptrs.size()), &::free);

if (strings.get() == nullptr) {
os << "But no stack trace could be found!" << std::endl;
} else {
///@todo: support for demangling of C++ symbol names
// Iterate over the stack pointers converting to a string
for (std::size_t i = 0; i < st.stack_ptrs.size(); ++i) {
os << "#" << i << " in " << strings.get()[i] << std::endl;
// Leading index
os << "#" << i << " in ";

auto const str = [&] {
Dl_info info;
if (dladdr(st.stack_ptrs[i], &info)) {
int status = -1; // Demangle the name. This can occasionally fail

std::unique_ptr<char, decltype(&::free)> demangled(
abi::__cxa_demangle(info.dli_sname, nullptr, 0, &status), &::free);
// If demangling fails, fall back to the mangled dli_sname.
if (status == 0 or info.dli_sname) {
auto name = status == 0 ? demangled.get() : info.dli_sname;
return name + std::string(" from ") + info.dli_fname;
}
}
return std::string(strings.get()[i]);
}();

os << str << std::endl;
}
}
#else
112 changes: 99 additions & 13 deletions include/rmm/mr/device/tracking_resource_adaptor.hpp
@@ -65,6 +65,29 @@ class tracking_resource_adaptor final : public device_memory_resource {
allocation_size{size} {};
};

/**
* @brief Utility struct for counting the current, peak, and total value of a number
*/
struct counter {
int64_t value{0}; // Current value
int64_t peak{0}; // Max value of `value`
int64_t total{0}; // Sum of all added values

counter& operator+=(int64_t x)
{
value += x;
total += x;
peak = std::max(value, peak);
return *this;
}

counter& operator-=(int64_t x)
{
value -= x;
return *this;
}
};

/**
* @brief Construct a new tracking resource adaptor using `upstream` to satisfy
* allocation requests.
@@ -75,13 +98,13 @@ class tracking_resource_adaptor final : public device_memory_resource {
* @param capture_stacks If true, capture stacks for allocation calls
*/
tracking_resource_adaptor(Upstream* upstream, bool capture_stacks = false)
: capture_stacks_{capture_stacks}, allocated_bytes_{0}, upstream_{upstream}
: capture_stacks_{capture_stacks}, upstream_{upstream}
{
RMM_EXPECTS(nullptr != upstream, "Unexpected null upstream resource pointer.");
}

tracking_resource_adaptor() = delete;
~tracking_resource_adaptor() = default;
virtual ~tracking_resource_adaptor() = default;
tracking_resource_adaptor(tracking_resource_adaptor const&) = delete;
tracking_resource_adaptor(tracking_resource_adaptor&&) = default;
tracking_resource_adaptor& operator=(tracking_resource_adaptor const&) = delete;
@@ -133,27 +156,56 @@ class tracking_resource_adaptor final : public device_memory_resource {
* @return std::size_t number of bytes that have been allocated through this
* allocator.
*/
std::size_t get_allocated_bytes() const noexcept { return allocated_bytes_; }
std::size_t get_allocated_bytes() const noexcept { return allocation_bytes_.value; }

/**
* @brief Log any outstanding allocations via RMM_LOG_DEBUG
* @brief Returns a `counter` struct containing the current, peak, and total
* number of allocated bytes or allocation counts recorded by this adaptor
* since it was created.
*
* @param return_bytes true to return the bytes counter, false to return the
* allocation count counter
* @return counter
*/
void log_outstanding_allocations() const
counter get_counter(bool return_bytes = true) const noexcept
{
#if SPDLOG_ACTIVE_LEVEL <= SPDLOG_LEVEL_DEBUG
read_lock_t lock(mtx_);
if (not allocations_.empty()) {
std::ostringstream oss;

return return_bytes ? allocation_bytes_ : allocation_count_;
}

/**
* @brief Gets a string containing the outstanding allocation pointers, their
* size, and optionally the stack trace for when each pointer was allocated.
*
* @return std::string Containing the outstanding allocation pointers.
*/
std::string get_outstanding_allocations_str() const
{
read_lock_t lock(mtx_);

std::ostringstream oss;

if (!allocations_.empty()) {
for (auto const& al : allocations_) {
oss << al.first << ": " << al.second.allocation_size << " B";
if (al.second.strace != nullptr) {
oss << " : callstack:" << std::endl << *al.second.strace;
}
oss << std::endl;
}
RMM_LOG_DEBUG("Outstanding Allocations: {}", oss.str());
}

return oss.str();
}

/**
* @brief Log any outstanding allocations via RMM_LOG_DEBUG
*
*/
void log_outstanding_allocations() const
{
#if SPDLOG_ACTIVE_LEVEL <= SPDLOG_LEVEL_DEBUG
RMM_LOG_DEBUG("Outstanding Allocations: {}", get_outstanding_allocations_str());
#endif // SPDLOG_ACTIVE_LEVEL <= SPDLOG_LEVEL_DEBUG
}

@@ -179,8 +231,11 @@ class tracking_resource_adaptor final : public device_memory_resource {
{
write_lock_t lock(mtx_);
allocations_.emplace(p, allocation_info{bytes, capture_stacks_});

// Update the byte and allocation counters while we hold the lock
allocation_bytes_ += bytes;
allocation_count_ += 1;
}
allocated_bytes_ += bytes;

return p;
}
@@ -197,11 +252,41 @@ class tracking_resource_adaptor final : public device_memory_resource {
void do_deallocate(void* p, std::size_t bytes, cuda_stream_view stream) override
{
upstream_->deallocate(p, bytes, stream);

{
write_lock_t lock(mtx_);
allocations_.erase(p);

const auto found = allocations_.find(p);

// Ensure the allocation is found and the number of bytes match
if (found == allocations_.end()) {
// Don't throw but log an error. Throwing in a destructor (or any noexcept) will call
// std::terminate
RMM_LOG_ERROR(
"Deallocating a pointer that was not tracked. Ptr: {:p} [{}B], Current Num. Allocations: "
"{}",
fmt::ptr(p),
bytes,
this->allocations_.size());
} else {
// Read the recorded size before erasing; erase() invalidates `found`
auto allocated_bytes = found->second.allocation_size;

allocations_.erase(found);

if (allocated_bytes != bytes) {
// Don't throw but log an error. Throwing in a descructor (or any noexcept) will call
Member

Suggested change
// Don't throw but log an error. Throwing in a descructor (or any noexcept) will call
// Don't throw but log an error. Throwing in a destructor (or any noexcept) will call

// std::terminate
RMM_LOG_ERROR(
"Alloc bytes ({}) and Dealloc bytes ({}) do not match", allocated_bytes, bytes);

bytes = allocated_bytes;
}
}

// Decrement the current allocated counts.
allocation_bytes_ -= bytes;
allocation_count_ -= 1;
}
allocated_bytes_ -= bytes;
}

/**
@@ -239,7 +324,8 @@ class tracking_resource_adaptor final : public device_memory_resource {

bool capture_stacks_; // whether or not to capture call stacks
std::map<void*, allocation_info> allocations_; // map of active allocations
std::atomic<std::size_t> allocated_bytes_; // number of bytes currently allocated
counter allocation_bytes_; // peak, current and total allocated bytes
counter allocation_count_; // peak, current and total allocation count
std::shared_timed_mutex mutable mtx_; // mutex for thread safe access to allocations_
Upstream* upstream_; // the upstream resource used for satisfying allocation requests
};
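
To make the bookkeeping in this file concrete, here is a host-only sketch of how the `counter` pair behaves across allocations and deallocations. It is an illustration under stated assumptions, not the adaptor itself: `allocation_tracker`, `record_allocate`, and `record_deallocate` are hypothetical names, there is no upstream resource or stack capture, and the locking the real class does with `mtx_` is omitted.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>

// Mirrors the `counter` struct added in the diff above.
struct counter {
  std::int64_t value{0};  // current value
  std::int64_t peak{0};   // largest `value` seen so far
  std::int64_t total{0};  // sum of everything ever added

  counter& operator+=(std::int64_t x)
  {
    value += x;
    total += x;
    peak = std::max(value, peak);
    return *this;
  }

  counter& operator-=(std::int64_t x)
  {
    value -= x;  // peak and total are deliberately left untouched
    return *this;
  }
};

// Hypothetical, single-threaded stand-in for the adaptor's bookkeeping.
class allocation_tracker {
 public:
  // Assumes `p` is not already tracked.
  void record_allocate(void const* p, std::size_t bytes)
  {
    allocations_.emplace(p, bytes);
    bytes_ += static_cast<std::int64_t>(bytes);
    count_ += 1;
  }

  // Returns false (and changes nothing) if `p` was never recorded.
  bool record_deallocate(void const* p)
  {
    auto const found = allocations_.find(p);
    if (found == allocations_.end()) { return false; }

    // Read the size before erasing; erase() invalidates `found`.
    auto const bytes = static_cast<std::int64_t>(found->second);
    allocations_.erase(found);

    bytes_ -= bytes;
    count_ -= 1;
    return true;
  }

  // Same selection rule as get_counter() in the diff.
  counter get_counter(bool return_bytes = true) const
  {
    return return_bytes ? bytes_ : count_;
  }

 private:
  std::map<void const*, std::size_t> allocations_;  // active allocations
  counter bytes_;                                   // byte statistics
  counter count_;                                   // allocation-count statistics
};
```

With this sketch, recording allocations of 100 B and 50 B and then releasing the 100 B one leaves the bytes counter at value 50, peak 150, total 150, and the count counter at value 1, peak 2, total 2.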
3 changes: 3 additions & 0 deletions python/rmm/_lib/memory_resource.pxd
@@ -50,4 +50,7 @@ cdef class LoggingResourceAdaptor(UpstreamResourceAdaptor):
cpdef get_file_name(self)
cpdef flush(self)

cdef class TrackingResourceAdaptor(UpstreamResourceAdaptor):
pass

cpdef DeviceMemoryResource get_current_device_resource()
91 changes: 90 additions & 1 deletion python/rmm/_lib/memory_resource.pyx
@@ -4,7 +4,7 @@ import warnings
from collections import defaultdict

from cython.operator cimport dereference as deref
from libc.stdint cimport int8_t
from libc.stdint cimport int8_t, int64_t
from libcpp cimport bool
from libcpp.cast cimport dynamic_cast
from libcpp.memory cimport make_shared, make_unique, shared_ptr, unique_ptr
@@ -83,6 +83,25 @@ cdef extern from "rmm/mr/device/logging_resource_adaptor.hpp" \

void flush() except +

cdef extern from "rmm/mr/device/tracking_resource_adaptor.hpp" \
namespace "rmm::mr" nogil:
cdef cppclass tracking_resource_adaptor[Upstream](device_memory_resource):
struct counter:
counter()

int64_t value
int64_t peak
int64_t total

tracking_resource_adaptor(
Upstream* upstream_mr,
bool capture_stacks) except +

counter get_counter(bool return_bytes) except +

string get_outstanding_allocations_str() except +
void log_outstanding_allocations() except +

cdef extern from "rmm/mr/device/per_device_resource.hpp" namespace "rmm" nogil:

cdef cppclass cuda_device_id:
@@ -457,6 +476,76 @@ cdef class LoggingResourceAdaptor(UpstreamResourceAdaptor):
self.c_obj.reset()


cdef class TrackingResourceAdaptor(UpstreamResourceAdaptor):

def __cinit__(
self,
DeviceMemoryResource upstream_mr,
bool capture_stacks=False
):
self.c_obj.reset(
new tracking_resource_adaptor[device_memory_resource](
upstream_mr.get_mr(),
capture_stacks
)
)

def __init__(
self,
DeviceMemoryResource upstream_mr,
bool capture_stacks=False
):
"""
Memory resource that tracks allocations/deallocations
performed by an upstream memory resource. Includes the ability to
query all outstanding allocations with the stack trace, if desired.

Parameters
----------
upstream_mr : DeviceMemoryResource
The upstream memory resource.
capture_stacks : bool
Whether or not to capture the stack trace with each allocation.
"""
pass

@property
def allocation_counts(self) -> dict:
counts = (<tracking_resource_adaptor[device_memory_resource]*>(
self.c_obj.get()))[0].get_counter(False)
byte_counts = (<tracking_resource_adaptor[device_memory_resource]*>(
self.c_obj.get()))[0].get_counter(True)

return {
"current_bytes": byte_counts.value,
"current_count": counts.value,
"peak_bytes": byte_counts.peak,
"peak_count": counts.peak,
"total_bytes": byte_counts.total,
"total_count": counts.total,
}

def get_outstanding_allocations_str(self) -> str:
"""
Returns a string containing information about the current outstanding
allocations. For each allocation, the address, size and optional
stack trace are shown.
"""

return (<tracking_resource_adaptor[device_memory_resource]*>(
self.c_obj.get())
)[0].get_outstanding_allocations_str().decode('UTF-8')

def log_outstanding_allocations(self):
"""
Logs the output of `get_outstanding_allocations_str` to the current
RMM log file if enabled.
"""

(<tracking_resource_adaptor[device_memory_resource]*>(
self.c_obj.get()))[0].log_outstanding_allocations()


# Global per-device memory resources; dict of int:DeviceMemoryResource
cdef _per_device_mrs = defaultdict(CudaMemoryResource)

2 changes: 2 additions & 0 deletions python/rmm/mr.py
@@ -8,6 +8,7 @@
LoggingResourceAdaptor,
ManagedMemoryResource,
PoolMemoryResource,
TrackingResourceAdaptor,
_flush_logs,
_initialize,
disable_logging,
@@ -31,6 +32,7 @@
"LoggingResourceAdaptor",
"ManagedMemoryResource",
"PoolMemoryResource",
"TrackingResourceAdaptor",
"_flush_logs",
"_initialize",
"set_per_device_resource",