Structured COG band subsetting #2706

echeipesh · 2018-06-14T14:22:11Z

Current multiband COG readers still produce all of the bands from the GeoTiff.
However, it's possible that the user knows that useful information exists in only a subset of the bands and wishes to read exclusively those bands. GeoTiffs that use band interleaving naturally lend themselves to this optimization.

The main concern is how to deal with invalid band selections. Currently you get "all the bands", whatever that is, so there is no opportunity for errors. Given a band subset of [1,2,100], it's not desirable to fail the whole query if some of the tiffs do not have band 100. The two choices are to fill missing bands with NODATA tiles or to return Array[Option[Tile]]. The latter is more explicit and more efficient because it avoids needing to call .isNoDataTile to verify query results. However, it's not consistent with the current API. Part of the issue here is to think through the implications and possibly establish a new convention for when we are querying for band subsets.
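To make the two choices concrete, here is a minimal sketch using only the standard library. `Band`, `NoData`, `subsetFilled` and `subsetOptional` are hypothetical stand-ins invented for illustration, not GeoTrellis types or methods, and band indices are assumed to be 0-based.

```scala
object BandSubsetSketch {
  // Hypothetical stand-in for a single raster band; not GeoTrellis's Tile.
  final case class Band(cells: Vector[Int])

  // Assumed NODATA marker for this sketch only.
  val NoData: Int = Int.MinValue

  // Choice 1: a missing band is replaced by a band filled with NODATA cells.
  // The caller has to inspect cell values to discover that a band was absent.
  def subsetFilled(bands: Vector[Band], indices: Seq[Int], cellCount: Int): Vector[Band] =
    indices.toVector.map { i =>
      bands.lift(i).getOrElse(Band(Vector.fill(cellCount)(NoData)))
    }

  // Choice 2: a missing band is reported explicitly as None.
  def subsetOptional(bands: Vector[Band], indices: Seq[Int]): Vector[Option[Band]] =
    indices.toVector.map(bands.lift)
}
```

With the second form the caller can tell a missing band apart from a band that genuinely contains only NODATA cells, at the cost of a return type that differs from the existing readers.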

Although we are already in 2.0.0-RC1, this should be developed with the intent to back-port the addition to the 2.1 branch.


jbouffard · 2018-07-05T15:38:41Z

I personally don't like the first option of returning missing bands with NODATA since that just seems like the reader is failing silently. The other alternative seems okay, but I think that throwing an error might be better. If there's a failure in the reading stage, then that means the user has an incorrect assumption about the data they're working with, which should be known as soon as possible.

Regardless of what we decide to return, though, I feel that this feature should just be temporary to the 2.0 release. In my opinion, it doesn't make sense to have two separate read methods for the ValueReaders where one reads the whole Tile while the other just reads a subset of bands. Rather, it should just be a single read method that gives the decided return type. What that type is is something we should discuss more in #2732.

moradology · 2018-07-05T16:07:10Z

The main issue Option has, and a cause for the complaints some have of 'maybeitis' in functional code, is that None is not supposed to serve as a stand-in for ⊥: if it can mean both "exception encountered" and "value not there, no exceptional state encountered", it is doing a disservice to both cases.

Because this code reaches out into the world and does stuff, exceptions aren't really avoidable. Because we need to call out such exceptional cases as well as represent successful computations which return None, I think the best bet would be for us to wrap code like this up in IO. IO[Array[Option[Tile]]] means that any None cases can be given the full semantic treatment and that we can always get at thrown exceptions at our call site (IO.attempt converts IO[A] to IO[Either[Throwable, A]]).
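The separation argued for here (None for "band absent", a distinct channel for "the read itself failed") can be sketched without any extra dependency. The sketch below swaps in `scala.util.Try` from the standard library in place of cats-effect `IO`; `Try` is eager rather than a lazy effect, so it only illustrates the shape of the result type. `Band`, `readBands`, `describe` and the in-memory `store` are hypothetical names made up for this example.

```scala
import java.io.FileNotFoundException
import scala.util.{Failure, Success, Try}

object ReadOutcomeSketch {
  // Hypothetical stand-in for a raster band; not GeoTrellis's Tile.
  final case class Band(cells: Vector[Int])

  // Hypothetical reader over an in-memory "file store".
  // A missing file is an exceptional failure; a missing band is just None.
  def readBands(
    store: Map[String, Vector[Band]],
    path: String,
    indices: Seq[Int]
  ): Try[Vector[Option[Band]]] =
    Try {
      val bands = store.getOrElse(path, throw new FileNotFoundException(path))
      indices.toVector.map(bands.lift)
    }

  // The call site can treat the three situations differently.
  def describe(result: Try[Vector[Option[Band]]]): String =
    result match {
      case Success(bands)                    => s"read ok, ${bands.count(_.isEmpty)} band(s) absent"
      case Failure(e: FileNotFoundException) => s"missing file: ${e.getMessage}"
      case Failure(e)                        => s"unexpected failure: ${e.getMessage}"
    }
}
```

Because failures travel in the outer layer, a None inside the result never has to double as an error signal.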

pomadchin · 2018-07-05T16:13:44Z

@jbouffard so you think that the bandSubset method and the entire tile read should be the same method?

The methods were kept separate because of (1) a different signature and (2) different semantics. I can see that it probably makes sense just to call a bandSubset function from the other read functions.

I also want to add that I'm very happy to see, and agree with, @moradology's suggestion about handling exceptions via the IO monad, but it's more a part of #2732. It looks like the IO thing won't happen in terms of 2.0 though.

I think that in this particular method, potentially not having a band is correct behaviour. We're selecting arbitrary bands from a tiff and we can get a band (Some) or get nothing (None); all other exceptions would be thrown (or caught via the IO monad's attempt).

An alternative example to support the use of Option:
The same story can happen with a raster.crop(extent) function signature, where a non-intersecting case is to be expected. A non-intersecting result is fine and correct, and there is no need to throw an exception in this case. The incorrect behaviour would be something else, unrelated to this non-intersection case.
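The crop analogy can be sketched as follows. The `Extent` here is a hypothetical, simplified stand-in written for this example (axis-aligned boxes, with touching edges treated as non-intersecting), not the GeoTrellis class.

```scala
object CropSketch {
  // Hypothetical, simplified bounding box; not GeoTrellis's Extent.
  final case class Extent(xmin: Double, ymin: Double, xmax: Double, ymax: Double) {
    // Returns the overlapping area, or None when the boxes do not overlap.
    // "No overlap" is an ordinary outcome here, so no exception is thrown.
    def intersection(other: Extent): Option[Extent] = {
      val x0 = math.max(xmin, other.xmin)
      val y0 = math.max(ymin, other.ymin)
      val x1 = math.min(xmax, other.xmax)
      val y1 = math.min(ymax, other.ymax)
      if (x0 < x1 && y0 < y1) Some(Extent(x0, y0, x1, y1)) else None
    }
  }
}
```

The caller pattern-matches on the result just as it would on a missing band, while truly unexpected problems remain free to surface as exceptions.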

jbouffard · 2018-07-05T16:28:20Z

@pomadchin Yeah, I think they should be but not right now. I was thinking that once we decide on the new Tile API and the return type for ValueReader we could merge the two together.

I'm still of the opinion that an error should just be thrown when attempting to read a band that doesn't exist. That way the user will know the data they want to work with is different from what they're expecting, and they can decide how they want to handle the error. IO[Either[Throwable, A]] seems like a lot for the user to deal with, and I think that complexity should be in an application that uses the ValueReader.

pomadchin · 2018-07-05T16:31:04Z

@jbouffard yeah, your idea would work only if this method failed on any incorrect band. I don't buy your love of Exceptions :D

moradology · 2018-07-05T16:36:15Z

Here's how you get/deal with IO[Either[Throwable, A]]:

val someIO: IO[A] = ???
someIO.attempt map {
  case Right(success) => success // what do we do with success?
  // a type pattern is needed if ElementNotFoundException is a class rather than an object
  case Left(_: ElementNotFoundException) => ??? // what do we do if the element is not found?
  case Left(_) => ??? // what do we do when there's some error we don't have a special case for?
}

jbouffard · 2018-07-05T16:44:09Z

@pomadchin I wouldn't say that I have a love for Exceptions haha, just that I think they have a place in IO. I do like the IO monad, but I just think that it adds more complexity than is needed to the API. By having the method just throw an error, the user will be able to decide how they want to handle things.

moradology · 2018-07-05T16:47:07Z

For those poor, confused souls, this might be OK:

val someIO: IO[A] = ???
try {
  // dot notation: with infix `map`, .unsafeRunSync would bind to the block, not to the mapped IO
  someIO.map { result => ??? }.unsafeRunSync()
} catch {
  case MyException(_) => ???
}

pomadchin · 2018-07-05T16:54:46Z

@jbouffard also, from @moradology's last example it's clear how we can get an async API for free (we wanted to investigate it in #2306).
The next thing is that you can compose effects via the IO monad instead of wrapping each particular block in try/catch; the semantics of IO itself mean encoding side effects as pure values, capable of expressing (a)synchronous computations.

It's a big question which is better: to throw a new Exception (which would always be an unexpected thing) or to add some semantics into the type description.

For sure, every time I mention IO it's in terms of our possible future solution. And at this point (in terms of 2.0), throwing exceptions doesn't look too bad.

jbouffard · 2018-07-11T12:13:14Z

@pomadchin Should we close this issue and move the discussion elsewhere?

pomadchin · 2018-07-11T12:14:16Z

@jbouffard not sure yet, will figure it out!

pomadchin · 2019-01-07T13:12:45Z

Closed here #2775

echeipesh added enhancement priority labels Jun 14, 2018

echeipesh added the cog layers reading/writing GeoTiff layers label Jun 21, 2018

jbouffard mentioned this issue Jul 6, 2018

readSubsetBands for COGValueReader and OverzoomingCOGValueReader #2757

Merged

4 tasks

pomadchin closed this as completed Jan 7, 2019
