
[FEA] Support per-device default memory resource #409

Closed
jrhemstad opened this issue Jun 16, 2020 · 8 comments
Assignees: jrhemstad
Labels: feature request (New feature or request)

Comments

@jrhemstad (Contributor)

Is your feature request related to a problem? Please describe.

I would like to be able to specify a "default" resource per GPU device available to my process. Today, RMM only has a concept of a single default resource.

Describe the solution you'd like

Create a mechanism to specify/manage a set of default resources, one per device.

Additional context
Thrust implements a per_device_resource concept by managing a std::map<device_id, memory_resource>.

@jrhemstad jrhemstad added the feature request New feature or request label Jun 16, 2020
@jrhemstad jrhemstad self-assigned this Jun 16, 2020
@harrism (Member) commented Jul 27, 2020

@jrhemstad The difference with Thrust is that they lazily create the per-device MR. This assumes that all MRs are default constructed, which doesn't really work for us.

For RMM, should we instead extend get/set_default_resource to apply to the current device?

@jrhemstad (Contributor, Author)

> The difference with Thrust is that they lazily create the per-device MR. This assumes that all MRs are default constructed, which doesn't really work for us.

I don't think it would be too difficult to make all resources default constructible. We can just pick sane defaults for block/pool sizes and use default_resource() for any upstreams.

> For RMM, should we instead extend get/set_default_resource to apply to the current device?

I'd still start with adding the equivalent of a per_device_resource that maintains a std::unordered_map<device_id, device_memory_resource*>. I think storing a pointer gets around needing the resource to be default constructible.

Furthermore, I notice that in Thrust's per-device resource machinery, the resource is typed and stored as an object rather than a pointer, which precludes any dynamic polymorphism:

```cpp
template <typename MR, typename DerivedPolicy>
__host__
MR* get_per_device_resource(execution_policy<DerivedPolicy>&)
{
    static std::mutex map_lock;
    static std::unordered_map<int, MR> device_id_to_resource;

    int device_id;
    thrust::cuda_cub::throw_on_error(cudaGetDevice(&device_id));

    std::lock_guard<std::mutex> lock{map_lock};
    return &device_id_to_resource[device_id];
}
```



@hcho3 (Contributor) commented Jul 29, 2020

FYI, this feature will be quite useful for integrating RMM into XGBoost. For the initial implementation (dmlc/xgboost#5873), I am using the CNMeM pool, but I'd like to use the pool allocator instead.

@jakirkham (Member)

One point worth noting: we may benefit from being able to reuse the same allocator in RAPIDS and XGBoost. I mention this in connection with the memory pressure that users have periodically reported when trying to use these libraries together.

@hcho3 (Contributor) commented Jul 29, 2020

@jakirkham In dmlc/xgboost#5873, XGBoost will use the default allocator setting (rmm::mr::get_default_resource()), so we should be able to share a common GPU allocator between RAPIDS packages and XGBoost. The caveat is that the default resource returned by rmm::mr::get_default_resource() is currently a process-wide singleton, not per device. Since XGBoost supports multiple GPUs via the multi-node multi-GPU (MNMG) paradigm, this restriction means we can only choose between two RMM allocators:

  • CUDA allocator (rmm::mr::cuda_memory_resource)
  • CNMeM allocator (rmm::mr::cnmem_memory_resource): This one is used by the Google Test suite of XGBoost (testxgboost).

When per-device default resource is implemented, we will be able to use the pool allocator instead (rmm::mr::pool_memory_resource).

@harrism (Member) commented Jul 29, 2020

> I don't think it would be too difficult to make all resources default constructible. Can just pick sane defaults for block/pool sizes, and use default_resource() for any upstreams.

I don't like that constraint. We would lose the flexibility that we've designed into MRs. It would be very difficult to create a binning pool resource, for example, unless the pool were the default_resource.

> I'd still start with adding the equivalent of a per_device_resource that maintains a std::unordered_map<device_id, device_memory_resource*>. I think storing a pointer gets around needing the resource to be default constructible.

@jrhemstad Thrust doesn't have a way to set the per_device_resource. I think we need a way to set it...

@jrhemstad (Contributor, Author)

> Thrust doesn't have a way to set the per_device_resource. I think we need a way to set it...

I fully intended to have a way to set a per-device resource.

@harrism (Member) commented Sep 20, 2020

Fixed since 0.15.

@harrism harrism closed this as completed Sep 20, 2020
4 participants