More intelligent sampling #82

adamjstewart · 2021-08-11T13:27:47Z

Here are some ideas:

- All GeoSamplers should take a GeoDataset index as input
- Randomly choose a file, then randomly sample from within the bounds of that file (solves the problem of sampling out of bounds)
- Add a new sampler (RandomBatchGeoSampler) that subclasses BatchSampler and returns a batch of random patches from a single tile
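The second and third ideas could be sketched roughly as follows. This is a minimal illustration in plain Python, not the eventual torchgeo implementation: `BoundingBox`, `random_patch`, `sample_patch`, and `sample_batch` are hypothetical names, and the list of tile bounds stands in for whatever the GeoDataset index would return.

```python
import random
from typing import List, NamedTuple


class BoundingBox(NamedTuple):
    """Hypothetical 2D bounds of a file or patch."""
    minx: float
    maxx: float
    miny: float
    maxy: float


def random_patch(tile: BoundingBox, size: float, rng: random.Random) -> BoundingBox:
    """Return a size-by-size box that lies inside `tile`."""
    if tile.maxx - tile.minx < size or tile.maxy - tile.miny < size:
        raise ValueError("patch is larger than the tile")
    # Restrict the lower-left corner so the patch cannot cross the tile edge.
    minx = rng.uniform(tile.minx, tile.maxx - size)
    miny = rng.uniform(tile.miny, tile.maxy - size)
    return BoundingBox(minx, minx + size, miny, miny + size)


def sample_patch(tiles: List[BoundingBox], size: float, rng: random.Random) -> BoundingBox:
    """Pick a file at random, then a patch within that file's bounds."""
    return random_patch(rng.choice(tiles), size, rng)


def sample_batch(
    tiles: List[BoundingBox], size: float, batch_size: int, rng: random.Random
) -> List[BoundingBox]:
    """Pick one file at random and draw every patch of the batch from it."""
    tile = rng.choice(tiles)
    return [random_patch(tile, size, rng) for _ in range(batch_size)]


rng = random.Random(0)
# Two disjoint tiles; sampling from their union's bounding box could miss both.
tiles = [BoundingBox(0, 100, 0, 100), BoundingBox(500, 600, 0, 100)]
patch = sample_patch(tiles, 10, rng)
batch = sample_batch(tiles, 10, 4, rng)
```

Because the file is chosen first, a patch can never land in the gap between tiles, and a batch drawn this way only needs one tile to be loaded.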

When using ZipDataset with random samplers, the index should come from whichever dataset is tile-based. When using ZipDataset with grid samplers, the index should come from whichever dataset is not tile-based. Not yet sure how to handle something like Landsat + Sentinel, but we can figure that out another day.
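The rule above can be written down as a small selection function. This is only a sketch under assumptions: `DatasetInfo`, its `tile_based` flag, and `index_source` are hypothetical and not part of any existing API. The case where both datasets are tile-based (such as Landsat + Sentinel) is left unresolved and simply raises an error.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DatasetInfo:
    """Hypothetical stand-in for a dataset inside a ZipDataset."""
    name: str
    tile_based: bool  # assumed flag, not an existing attribute


def index_source(datasets: Sequence[DatasetInfo], sampler_kind: str) -> DatasetInfo:
    """Choose which dataset's index the sampler should use."""
    if sampler_kind not in ("random", "grid"):
        raise ValueError(f"unknown sampler kind: {sampler_kind!r}")
    # Random samplers follow the tile-based dataset, grid samplers the other one.
    want_tile_based = sampler_kind == "random"
    matches = [d for d in datasets if d.tile_based == want_tile_based]
    if len(matches) != 1:
        # For example, two tile-based datasets: the issue leaves this case open.
        raise ValueError("cannot decide which dataset should provide the index")
    return matches[0]


imagery = DatasetInfo("imagery", tile_based=True)
labels = DatasetInfo("labels", tile_based=False)
```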

Class hierarchy:

Sampler
- GeoSampler
  - RandomGeoSampler
  - GridGeoSampler
- BatchGeoSampler
  - RandomBatchGeoSampler
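As a rough skeleton, the hierarchy might look like the following. To keep the sketch runnable without PyTorch, `Sampler` here is a local stand-in for `torch.utils.data.Sampler`, and the tuple-based `BBox` type and the constructor arguments are assumptions. Only GridGeoSampler is filled in; the random samplers are declared but their iteration logic is omitted, so they remain abstract.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Tuple

BBox = Tuple[float, float, float, float]  # assumed order: (minx, maxx, miny, maxy)


class Sampler(ABC):
    """Local stand-in for torch.utils.data.Sampler (assumption for this sketch)."""


class GeoSampler(Sampler):
    """Yields one bounding box per iteration step."""

    @abstractmethod
    def __iter__(self) -> Iterator[BBox]:
        ...


class BatchGeoSampler(Sampler):
    """Yields a list of bounding boxes (one whole batch) per iteration step."""

    @abstractmethod
    def __iter__(self) -> Iterator[List[BBox]]:
        ...


class RandomGeoSampler(GeoSampler):
    """Random patches; iteration logic omitted in this skeleton."""


class RandomBatchGeoSampler(BatchGeoSampler):
    """Random patches from a single tile per batch; logic omitted here."""


class GridGeoSampler(GeoSampler):
    """Patches on a regular grid inside each tile."""

    def __init__(self, tiles: List[BBox], size: float, stride: float) -> None:
        self.tiles, self.size, self.stride = tiles, size, stride

    def __iter__(self) -> Iterator[BBox]:
        for minx0, maxx0, miny0, maxy0 in self.tiles:
            # Number of patch positions that fit along each axis.
            rows = int((maxy0 - miny0 - self.size) // self.stride) + 1
            cols = int((maxx0 - minx0 - self.size) // self.stride) + 1
            for i in range(rows):
                miny = miny0 + i * self.stride
                for j in range(cols):
                    minx = minx0 + j * self.stride
                    yield (minx, minx + self.size, miny, miny + self.size)


grid = list(GridGeoSampler([(0.0, 4.0, 0.0, 4.0)], size=2.0, stride=2.0))
```

The point of the split is the return type: a GeoSampler hands the data loader one index at a time, while a BatchGeoSampler hands it a complete batch of indices.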

Make sure to document the difference between samplers and batch samplers and when to use which. We should store samplers and batch samplers in different files and combine them in __init__ like we do with datasets. Add utils.py for things like _to_tuple.

Question: if I'm using an LRU cache, a BatchSampler, and multiple workers, and something isn't yet in the cache, will PyTorch spawn multiple workers that all try to warp the entire tile? It may actually be faster to use a single worker in this case.
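Part of this can be reasoned about without running PyTorch: `functools.lru_cache` keeps its entries in the memory of the process that filled it, and DataLoader workers are separate processes, so a tile cached by one worker is not visible to the others. The sketch below only shows the single-process side of that trade-off. `load_and_warp` is a hypothetical stand-in for the expensive warp, and the file name is made up.

```python
from functools import lru_cache
from typing import List

WARP_CALLS: List[str] = []  # records every time the expensive step actually runs


@lru_cache(maxsize=128)
def load_and_warp(path: str) -> str:
    """Hypothetical stand-in for reading and warping a whole tile."""
    WARP_CALLS.append(path)
    return f"warped:{path}"


# A batch whose patches all come from the same tile, as a batch sampler would produce.
batch_paths = ["tile_a.tif"] * 4
results = [load_and_warp(p) for p in batch_paths]
info = load_and_warp.cache_info()
```

Within one process the tile is warped once and the remaining patches hit the cache. Whether several workers would each repeat that warp depends on how batches are distributed among them, which is the open question above.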


adamjstewart added the samplers label (Samplers for indexing datasets) Aug 11, 2021

This was referenced Aug 11, 2021

0.1.0 release and publication #55

Closed

More intelligent sampling #84

Merged

adamjstewart closed this as completed in #84 Aug 16, 2021

adamjstewart added this to the 0.1.0 milestone Nov 20, 2021
