Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: BlobReadChannel limit parameter doesn't behave as documented in ReadChannel #1512

Closed
liron-upstream opened this issue Jul 17, 2022 · 4 comments · Fixed by googleapis/java-core#880
Assignees
Labels
api: storage Issues related to the googleapis/java-storage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue.

Comments

@liron-upstream
Copy link

liron-upstream commented Jul 17, 2022

Environment details

  1. OS type and version: Ubuntu 20.04
  2. Java version: 17
  3. version(s): 2.9.3

Any additional information below

The limit parameter in ReadChannel is documented as the following:

Limit the maximum number of bytes available to be read from this channel. If the limit is larger than the actual size of the content this will have no material impact.
NOTE:Implementers are not required to return a new instance from this method, however they are allowed to. Users of this method should always use the instance returned from this method.
Default Implementation:By default, this method will simply return this.
Params:
limit – the maximum number of bytes to limit this channel to
Returns:
The instance of channel which will respect the limit.

However, when trying to use the BlobReadChannel to read a range of bytes from object storage, limit is used as end-exclusive:
final int toRead = Math.toIntExact(Math.min(limit - position, Math.max(byteBuffer.remaining(), chunkSize)));

If you look in documentation it is also stated explicitly:
https://cloud.google.com/storage/docs/samples/storage-download-byte-range

Is this unintended behaviour or a documentation issue?

Thanks!

@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/java-storage API. label Jul 17, 2022
@ddelgrosso1 ddelgrosso1 added priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue. labels Jul 18, 2022
@BenWhitehead
Copy link
Collaborator

When performing a range get of an object, the Range header actually specifies the offsets inclusive. So in this case, the intent is that we're limiting a length on number of bytes rather than an end-offset. We automatically translate from length to end-offset for the request, which gives the appearance of end-exclusive.

When running the below code sample, you can see in the further logs Range: bytes=5-34 which will yield a response containing 30 bytes.

  public static void main(String[] args) {
    Storage s = StorageOptions.newBuilder().build().getService();
    BlobInfo obj = BlobInfo.newBuilder(bucket, "obj/62").build();
    String base62 = Streams.concat(
        IntStream.rangeClosed('0', '9'),
        IntStream.rangeClosed('A', 'Z'),
        IntStream.rangeClosed('a', 'z')
    )
        .mapToObj(i -> (char) i)
        .map(Object::toString)
        .collect(Collectors.joining(""));
    Blob blob = s.create(obj, base62.getBytes(StandardCharsets.UTF_8), BlobTargetOption.doesNotExist());
    Long size = blob.getSize();
    System.out.println("size = " + size);

    try (ReadChannel r = s.reader(obj.getBlobId()).limit(35)) {
      r.seek(5);
      ByteStreams.copy(Channels.newInputStream(r), System.out);
    } catch (Exception e) {
      e.printStackTrace(System.err);
    }
  }
2022-07-18 13:28:12,782 INFO  [main] com.google.api.client.http.HttpTransport - -------------- REQUEST  --------------
GET https://storage.googleapis.com/download/storage/v1/b/my_bucket/o/obj%2F62?alt=media
Accept-Encoding: gzip
Authorization: <Not Logged>
Range: bytes=5-34
User-Agent: gcloud-java/ Google-API-Java-Client/1.34.1 Google-HTTP-Java-Client/1.41.8 (gzip)
x-goog-api-client: gl-java/1.8.0 gdcl/1.34.1 linux/5.17.11

2022-07-18 13:28:12,850 INFO  [main] com.google.api.client.http.HttpTransport - -------------- RESPONSE --------------
HTTP/1.1 206 Partial Content
Content-Range: bytes 5-34/62
Pragma: no-cache
X-Goog-Metageneration: 1
Date: Mon, 18 Jul 2022 17:28:12 GMT
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
ETag: CJPOgYr7gvkCEAE=
Content-Disposition: attachment
Vary: X-Origin
Vary: Origin
Expires: Mon, 01 Jan 1990 00:00:00 GMT

2022-07-18 13:28:12,851 INFO  [main] com.google.api.client.http.HttpTransport - Total: 30 bytes
2022-07-18 13:28:12,851 INFO  [main] com.google.api.client.http.HttpTransport - 56789ABCDEFGHIJKLMNOPQRSTUVWXY
56789ABCDEFGHIJKLMNOPQRSTUVWXY

@liron-upstream
Copy link
Author

@BenWhitehead First of all thank you for the response and explanation 😀
Second, doesn't this mean that BlobReadChannel is not doing what the ReadChannel interface says it should regarding the limit parameter?

Sounds like ReadChannel treats it as number of bytes to read

@BenWhitehead
Copy link
Collaborator

Do you feel like the following modification clarifies and matches better with the behavior?

-   * Limit the maximum number of bytes available to be read from this channel. If the limit is
-   * larger than the actual size of the content this will have no material impact.
+   * Limit the maximum number of bytes to be read from the objects content, counting from the
+   * beginning of the object, which will be available to read from this channel. If the limit
+   * is larger than the actual size of the content this will have no material impact.
+   *
+   * If used in conjunction with {@link #seek(long)} the total number of returned bytes from
+   * this channel will be reduced by the number of bytes specified to seek.

@liron-upstream
Copy link
Author

@BenWhitehead That's better I think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/java-storage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants