
[FEA] Support per-device default memory resource #409

Closed
jrhemstad opened this issue Jun 16, 2020 · 8 comments
Assignees: jrhemstad
Labels: feature request (New feature or request)

Comments

@jrhemstad (Contributor)

Is your feature request related to a problem? Please describe.

I would like to be able to specify a "default" resource per GPU device available to my process. Today, RMM only has a concept of a single default resource.

Describe the solution you'd like

Create a mechanism to specify/manage a set of default resources, one per device.

Additional context
Thrust implements a per_device_resource concept by managing a std::map<device_id, memory_resource>.

@jrhemstad jrhemstad added the feature request New feature or request label Jun 16, 2020
@jrhemstad jrhemstad self-assigned this Jun 16, 2020
@harrism (Member) commented Jul 27, 2020

@jrhemstad The difference with Thrust is that they lazily create the per-device MR. This assumes that all MRs are default constructed, which doesn't really work for us.

For RMM, should we instead extend get/set_default_resource to apply to the current device?

@jrhemstad (Contributor, Author)

> The difference with Thrust is that they lazily create the per-device MR. This assumes that all MRs are default constructed, which doesn't really work for us.

I don't think it would be too difficult to make all resources default constructible. We can just pick sane defaults for block/pool sizes and use default_resource() for any upstreams.

> For RMM, should we instead extend get/set_default_resource to apply to the current device?

I'd still start with adding the equivalent of a per_device_resource that maintains a std::unordered_map<device_id, device_memory_resource*>. I think storing a pointer gets around needing the resource to be default constructible.

Furthermore, I notice that in Thrust's per-device resource machinery, the resource is typed and stored as an object rather than a pointer, which precludes any dynamic polymorphism:

```cpp
template <typename MR, typename DerivedPolicy>
__host__
MR* get_per_device_resource(execution_policy<DerivedPolicy>&)
{
    static std::mutex map_lock;
    static std::unordered_map<int, MR> device_id_to_resource;

    int device_id;
    thrust::cuda_cub::throw_on_error(cudaGetDevice(&device_id));

    std::lock_guard<std::mutex> lock{map_lock};
    return &device_id_to_resource[device_id];
}
```



@hcho3 (Contributor) commented Jul 29, 2020

FYI, this feature will be quite useful for integrating RMM into XGBoost. For the initial implementation (dmlc/xgboost#5873), I am using the CNMeM pool, but I'd like to use the pool allocator instead.

@jakirkham (Member)

One point worth noting: we may benefit from being able to reuse the same allocator in RAPIDS and XGBoost. I mention this in connection with the memory pressure that users have periodically reported when trying to use these libraries together.

@hcho3 (Contributor) commented Jul 29, 2020

@jakirkham In dmlc/xgboost#5873, XGBoost will use the default allocator setting (rmm::mr::get_default_resource()), so we should be able to share a common GPU allocator between RAPIDS packages and XGBoost. The caveat is that the default resource returned by rmm::mr::get_default_resource() is currently a process-wide singleton, not per device. Since XGBoost supports multiple GPUs via the multi-node multi-GPU (MNMG) paradigm, this restriction means we can only choose between two RMM allocators:

  • CUDA allocator (rmm::mr::cuda_memory_resource)
  • CNMeM allocator (rmm::mr::cnmem_memory_resource): This one is used by the Google Test suite of XGBoost (testxgboost).

When per-device default resource is implemented, we will be able to use the pool allocator instead (rmm::mr::pool_memory_resource).

@harrism (Member) commented Jul 29, 2020

> I don't think it would be too difficult to make all resources default constructible. Can just pick sane defaults for block/pool sizes, and use default_resource() for any upstreams.

I don't like that constraint. We would lose the flexibility that we've designed into MRs. It would be very difficult to create a binning pool resource, for example, unless the pool were the default_resource.

> I'd still start with adding the equivalent of a per_device_resource that maintains a std::unordered_map<device_id, device_memory_resource*>. I think storing a pointer gets around needing the resource to be default constructible.

@jrhemstad Thrust doesn't have a way to set the per_device_resource. I think we need a way to set it...

@jrhemstad (Contributor, Author)

> Thrust doesn't have a way to set the per_device_resource. I think we need a way to set it...

I fully intended to have a way to set a per-device resource.

@harrism (Member) commented Sep 20, 2020

Fixed since 0.15.

@harrism harrism closed this as completed Sep 20, 2020
4 participants