[Remote Store] Fix Report Stats for MultiStream Downloads #10357

Closed

Rishikesh1159 (Member) commented Oct 4, 2023

Description

This PR fixes stats reporting for multistream downloads. Since copyTo() was added in an earlier PR, the download stats are no longer reported correctly. This PR restores the reporting.
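
For readers unfamiliar with the problem, here is a minimal sketch of one way to record transfer stats around an asynchronous download: wrap the completion callback so that the counters are updated whether the download succeeds or fails. It is not the code in this PR. `TransferStats`, `CompletionListener`, and `withStats` are hypothetical names invented for illustration and are not OpenSearch APIs.

```java
import java.util.concurrent.atomic.AtomicLong;

public class DownloadStatsSketch {

    /** Hypothetical stats holder (an assumption, not an OpenSearch tracker class). */
    public static final class TransferStats {
        public final AtomicLong bytesStarted = new AtomicLong();
        public final AtomicLong bytesSucceeded = new AtomicLong();
        public final AtomicLong bytesFailed = new AtomicLong();
    }

    /** Hypothetical completion callback for an asynchronous download. */
    public interface CompletionListener {
        void onSuccess();

        void onFailure(Exception e);
    }

    /**
     * Records the start of a transfer immediately and returns a listener that
     * records the outcome before handing control to the original listener.
     */
    public static CompletionListener withStats(TransferStats stats, long fileSize, CompletionListener delegate) {
        stats.bytesStarted.addAndGet(fileSize);
        return new CompletionListener() {
            @Override
            public void onSuccess() {
                stats.bytesSucceeded.addAndGet(fileSize);
                delegate.onSuccess();
            }

            @Override
            public void onFailure(Exception e) {
                stats.bytesFailed.addAndGet(fileSize);
                delegate.onFailure(e);
            }
        };
    }

    public static void main(String[] args) {
        TransferStats stats = new TransferStats();
        CompletionListener original = new CompletionListener() {
            @Override
            public void onSuccess() {
                System.out.println("download finished");
            }

            @Override
            public void onFailure(Exception e) {
                System.out.println("download failed: " + e.getMessage());
            }
        };
        // Simulate one successful 100-byte download.
        withStats(stats, 100L, original).onSuccess();
        System.out.println("succeeded bytes: " + stats.bytesSucceeded.get());
    }
}
```

Wrapping the listener keeps the stats logic out of the download path itself, so the counters stay correct even when the download is handed off to an asynchronous multi-stream implementation.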

Related Issues

Resolves #10283

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • GitHub issue/PR created in OpenSearch documentation repo for the required public documentation changes (#[Issue/PR number])

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot (Contributor) commented Oct 4, 2023

Compatibility status:

Checks if related components are compatible with change ee55d6f

Incompatible components

Incompatible components: [https://github.com/opensearch-project/security.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

github-actions bot (Contributor) commented Oct 4, 2023

Gradle Check (Jenkins) Run Completed with:

psychbot and others added 2 commits October 4, 2023 18:18
…repository and mutation of immutable settings of system repository (opensearch-project#9839)

---------

Signed-off-by: Dharmesh 💤 <[email protected]>
kotwanikunal (Member) commented

@Rishikesh1159 I think you need to wait for #10349 and rebase the stats changes on top of it.
Commit 1 in your PR has been factored into the above PR.

codecov bot commented Oct 4, 2023

Codecov Report

Merging #10357 (88ce167) into main (d5a95b8) will decrease coverage by 0.10%.
Report is 16 commits behind head on main.
The diff coverage is 77.35%.

❗ Current head 88ce167 differs from pull request most recent head ee55d6f. Consider uploading reports for the commit ee55d6f to get more accurate results

@@             Coverage Diff              @@
##               main   #10357      +/-   ##
============================================
- Coverage     71.20%   71.10%   -0.10%     
+ Complexity    58298    58254      -44     
============================================
  Files          4832     4832              
  Lines        274711   274719       +8     
  Branches      40031    40033       +2     
============================================
- Hits         195600   195350     -250     
- Misses        62717    63051     +334     
+ Partials      16394    16318      -76     

Files                                                  Coverage           Δ
...rg/opensearch/repositories/s3/S3BlobContainer.java  79.47% <100.00%>   (+0.32%) ⬆️
...bstore/AsyncMultiStreamEncryptedBlobContainer.java  59.18% <100.00%>   (+1.73%) ⬆️
...arch/common/blobstore/stream/read/ReadContext.java  100.00% <100.00%>  (ø)
...tore/stream/read/listener/ReadContextListener.java  100.00% <100.00%>  (ø)
...ices/replication/RemoteStoreReplicationSource.java  90.62% <100.00%>   (+0.14%) ⬆️
...ommon/blobstore/AsyncMultiStreamBlobContainer.java  0.00% <0.00%>      (ø)
...in/java/org/opensearch/index/shard/IndexShard.java  69.25% <87.50%>    (-0.30%) ⬇️
...blobstore/stream/read/listener/FilePartWriter.java  78.78% <0.00%>     (-14.32%) ⬇️
...earch/index/store/RemoteSegmentStoreDirectory.java  89.45% <72.22%>    (-0.23%) ⬇️

... and 461 files with indirect coverage changes

@kotwanikunal kotwanikunal changed the title [Remote Store] Report Stats for MultiStream Downloads [Remote Store] Fix Report Stats for MultiStream Downloads Oct 4, 2023
ashking94 and others added 7 commits October 4, 2023 12:04
…/fixtures/hdfs-fixture (opensearch-project#10299)

* Bump org.xerial.snappy:snappy-java in /test/fixtures/hdfs-fixture

Bumps [org.xerial.snappy:snappy-java](https://github.com/xerial/snappy-java) from 1.1.10.4 to 1.1.10.5.
- [Release notes](https://github.com/xerial/snappy-java/releases)
- [Commits](xerial/snappy-java@v1.1.10.4...v1.1.10.5)

---
updated-dependencies:
- dependency-name: org.xerial.snappy:snappy-java
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
…#10349)

* Refactor read context streams to async streams

Signed-off-by: Kunal Kotwani <[email protected]>

* Refactor multipart download to a more async model

The previous approach of kicking off the stream requests for all parts
of a file did not work well for very large files. For example, a 20GiB
file uploaded in 16MiB parts will consist of 1200+ parts. When we
attempted to initiate streaming for all parts concurrently, some parts
would hit a client timeout after 2 minutes without being able to get a
connection due to the other parts not having been completed in that time
frame. This refactoring adds yet another layer of indirection in order
to allow the code that is actually writing the destination file to
control the rate at which streams are started. This should allow for
downloading files consisting of arbitrarily many parts at any connection
speed.

This commit also wires in the download rate limiter so that the
`indices.recovery.max_bytes_per_sec` is properly honored.

Signed-off-by: Andrew Ross <[email protected]>

---------

Signed-off-by: Kunal Kotwani <[email protected]>
Signed-off-by: Andrew Ross <[email protected]>
Co-authored-by: Kunal Kotwani <[email protected]>
Signed-off-by: Rishikesh1159 <[email protected]>
Signed-off-by: Rishikesh1159 <[email protected]>
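
The commit message above notes that a 20GiB file in 16MiB parts has 1200+ parts and that starting every part stream at once led to client timeouts. The sketch below checks that arithmetic and illustrates the general idea of letting the consumer control how many part streams are active. It uses a plain `Semaphore` as a simplified stand-in and is not the mechanism implemented in #10349. The class and method names are hypothetical, and the part "download" is simulated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedPartDownloadSketch {

    /** Number of parts needed for a file of fileSize bytes, rounding up. */
    public static long partCount(long fileSize, long partSize) {
        return (fileSize + partSize - 1) / partSize;
    }

    /**
     * Runs `parts` simulated part downloads with at most `maxInFlight` active
     * at any time and returns the peak concurrency that was observed.
     */
    public static int downloadAll(int parts, int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < parts; i++) {
                // Block until a slot frees up, so a new stream starts only
                // after an earlier one has completed.
                permits.acquire();
                futures.add(CompletableFuture.runAsync(() -> {
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        // A real implementation would stream the part into
                        // the destination file here.
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                }, pool));
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a download slot", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        long mib = 1024L * 1024;
        System.out.println(partCount(20 * gib, 16 * mib)); // prints 1280
        System.out.println("peak in flight: " + downloadAll(1280, 8));
    }
}
```

Because each permit is acquired before a part starts and released only when it finishes, the number of open streams never exceeds the limit, regardless of how many parts the file has.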
github-actions bot (Contributor) commented Oct 5, 2023

Gradle Check (Jenkins) Run Completed with:

Rishikesh1159 (Member, Author) commented

Closing this PR in favour of another PR: #10402

Labels

  • distributed framework
  • enhancement: Enhancement or improvement to existing feature or request
  • Indexing:Replication: Issues and PRs related to core replication framework eg segrep
Development

Successfully merging this pull request may close these issues.

[Repository] Report proper statistics for multistream downloads
6 participants