Dask tests #51

Open
4 tasks
rsignell-usgs opened this issue Jan 15, 2018 · 4 comments

@rsignell-usgs
Contributor

As requested here by @mrocklin: pangeo-data/pangeo#75 (comment)

  • Try XArray + Dask locally on the HSDS data to verify that it can be accessed concurrently from multiple threads
  • Try XArray + Dask.distributed locally on the HSDS data to verify that the h5pyd objects can survive being serialized
  • Try everything on a distributed cluster using KubeCluster and then look at the performance of scalable computing
  • Try this all again on a cluster on S3, where presumably we would expect 100-200MB/s network access from each node.
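
The first two checks can be rehearsed without HSDS at all. Below is a minimal sketch using only the Python standard library; it assumes that Dask.distributed moves objects between processes with pickle-style serialization, and that a threaded Dask read amounts to many block reads issued from a thread pool. `can_pickle`, `read_concurrently`, and `fake_read_block` are hypothetical helper names for illustration, not part of h5pyd, XArray, or Dask.

```python
import pickle
import threading
from concurrent.futures import ThreadPoolExecutor


def can_pickle(obj):
    """Return True if obj survives a pickle round trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        return False


def read_concurrently(read_block, n_blocks, block_size, n_threads=4):
    """Call read_block(start, stop) for each block on a thread pool.

    Results come back in block order, so they can be compared with a
    single-threaded read of the same range.
    """
    bounds = [(i * block_size, (i + 1) * block_size) for i in range(n_blocks)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda b: read_block(*b), bounds))


def fake_read_block(start, stop):
    """Hypothetical stand-in for slicing a remote dataset (assumption)."""
    return list(range(start, stop))


if __name__ == "__main__":
    # Plain connection parameters pickle fine; a raw lock does not.
    print(can_pickle({"endpoint": "http://example.invalid", "path": "/data"}))
    print(can_pickle(threading.Lock()))
    blocks = read_concurrently(fake_read_block, n_blocks=8, block_size=10)
    print(len(blocks))
```

For the real test, one would presumably pass an open h5pyd dataset to `can_pickle` and swap `fake_read_block` for a function that slices that dataset; whether those objects actually pass is exactly what the tasks above are meant to find out.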
@rsignell-usgs
Contributor Author

@jreadey, do you think you might be able to take a stab at these?

@jreadey
Member

jreadey commented Jan 16, 2018

@rsignell-usgs - yes I think so, but may need to do a bit of self-education on Dask.

What is KubeCluster?

@mrocklin

mrocklin commented Jan 16, 2018 via email

@mrocklin

mrocklin commented Jan 16, 2018 via email


3 participants