Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Tiered Caching] Caching tier policies #11863

Conversation

peteralfonsi
Copy link
Contributor

Description

This PR adds caching tier policies. In a tiered caching setup, we might not want to allow all values into all tiers, so we want to define policies to allow or reject values. For example, we might not want to add values that took a very short time to compute to a disk tier, since the time to recompute them could be shorter than the time to get them back from the disk.

This PR adds an interface for policies and integrates them to restrict access to the disk tier in the TieredSpilloverCache. Currently the only implementation is the took time policy, but it should be easy to add more as they're needed.

When QuerySearchResult values are written into the IndicesRequestCache, we first write a CachePolicyInfoWrapper object, which has any values we might want to provide to policies. This way policies don't have to read the entire QSR. This wrapper is discarded when values come out of the cache.

There is also a new cluster setting to set the threshold for this policy, but right now it has no effect since the TieredSpilloverCache isn't integrated into the IndicesRequestCache yet.

Related Issues

This is part of the tiered caching feature (#10024).

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

github-actions bot commented Jan 11, 2024

Compatibility status:

Checks if related components are compatible with change 24448fc

Incompatible components

Skipped components

Compatible components

Copy link
Contributor

❕ Gradle check result for 4c78763: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing
      1 org.opensearch.indices.replication.SegmentReplicationIT.classMethod

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Jan 11, 2024

Codecov Report

Attention: Patch coverage is 91.83673% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 71.36%. Comparing base (5c82ab8) to head (f907e1b).
Report is 161 commits behind head on main.

❗ Current head f907e1b differs from pull request most recent head 24448fc. Consider uploading reports for the commit 24448fc to get more accurate results

Files Patch % Lines
...arch/common/cache/tier/DiskTierTookTimePolicy.java 90.00% 2 Missing ⚠️
...search/common/cache/tier/TieredSpilloverCache.java 87.50% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11863      +/-   ##
============================================
- Coverage     71.43%   71.36%   -0.08%     
+ Complexity    59407    59296     -111     
============================================
  Files          4921     4921              
  Lines        278989   278956      -33     
  Branches      40543    40548       +5     
============================================
- Hits         199287   199063     -224     
- Misses        63086    63275     +189     
- Partials      16616    16618       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

public class DiskTierTookTimePolicy implements CacheTierPolicy<BytesReference> {
public static final Setting<TimeValue> DISK_TOOKTIME_THRESHOLD_SETTING = Setting.positiveTimeSetting(
"indices.requests.cache.disk.tooktime.threshold",
TimeValue.ZERO,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do some benchmarking tests to better understand what should be the default value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - this is one of the planned benchmarking scenarios for tiered caching. We plan to have a range of queries that take different lengths of time, and see how short that time can get before sending queries to the disk tier stops providing any benefit. The current value is a placeholder. The benchmarking scenarios are on hold right now because of changes being made to some of Sagar's tiered caching PRs.

peternied and others added 27 commits March 1, 2024 14:43
Views, simplify data access and manipulation by providing a virtual layer over one or more indices

Signed-off-by: Peter Nied <[email protected]>
Signed-off-by: Peter Nied <[email protected]>
…ch-project#11193)

When migrating from older versions of OpenSearch index.mapper.dynamic
could be present, use the deprecation system to warn about this state and
ignore this legacy setting so the index is usable.

Signed-off-by: Peter Nied <[email protected]>
…12077)

* Fix flaky test testScrollCreatedOnReplica

Signed-off-by: Marc Handalian <[email protected]>

* Disable scheduled refresh

Signed-off-by: Marc Handalian <[email protected]>

* Clean up segment collection assertions

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>
* Add bwc version 2.12.1

Signed-off-by: GitHub <[email protected]>
Signed-off-by: Kunal Kotwani <[email protected]>

* Fix version identifier for 2.12.1

Signed-off-by: Kunal Kotwani <[email protected]>

---------

Signed-off-by: GitHub <[email protected]>
Signed-off-by: Kunal Kotwani <[email protected]>
Co-authored-by: opensearch-ci-bot <[email protected]>
Co-authored-by: Kunal Kotwani <[email protected]>
…ject#12297)

* Add new method to pick a random replication strategy.

Signed-off-by: Rishikesh1159 <[email protected]>

* replace usage of refresh with refreshandWaitForReplication()

Signed-off-by: Rishikesh1159 <[email protected]>

* Add replication strategy logging.

Signed-off-by: Rishikesh1159 <[email protected]>

---------

Signed-off-by: Rishikesh1159 <[email protected]>
This test was muted due to an [intermediate state][1] during type
removal. It now passes per my local testing.

[1]: opensearch-project#2440 (comment)

Signed-off-by: Andrew Ross <[email protected]>
This PR was committed on main but not backported to 2.x in time for the
2.12 release.

Signed-off-by: Andrew Ross <[email protected]>
* Shard id awareness of SearchLookup

Signed-off-by: Alexander Koval <[email protected]>

* Add unit test for deprecated constructor

Signed-off-by: Andrew Ross <[email protected]>

---------

Signed-off-by: Alexander Koval <[email protected]>
Signed-off-by: Andrew Ross <[email protected]>
Co-authored-by: Andrew Ross <[email protected]>
…ildSrc/src/testKit/thirdPartyAudit (opensearch-project#12464)

* Bump org.apache.logging.log4j:log4j-core

Bumps org.apache.logging.log4j:log4j-core from 2.22.1 to 2.23.0.

---
updated-dependencies:
- dependency-name: org.apache.logging.log4j:log4j-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
…ribution/packages (opensearch-project#12461)

* Bump com.netflix.nebula.ospackage-base in /distribution/packages

Bumps com.netflix.nebula.ospackage-base from 11.8.0 to 11.8.1.

---
updated-dependencies:
- dependency-name: com.netflix.nebula.ospackage-base
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
…ject#12462)

* Bump peter-evans/create-or-update-comment from 3 to 4

Bumps [peter-evans/create-or-update-comment](https://github.com/peter-evans/create-or-update-comment) from 3 to 4.
- [Release notes](https://github.com/peter-evans/create-or-update-comment/releases)
- [Commits](peter-evans/create-or-update-comment@v3...v4)

---
updated-dependencies:
- dependency-name: peter-evans/create-or-update-comment
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update changelog

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
…11531)

* Fix get task API does not refresh resource stats

Signed-off-by: Gao Binlong <[email protected]>

* modify change log

Signed-off-by: Gao Binlong <[email protected]>

---------

Signed-off-by: Gao Binlong <[email protected]>
…pository chunk_size (opensearch-project#12277)

* implement logic of fetching blocks from multiple chunks of snapshot file.

Signed-off-by: Rishikesh1159 <[email protected]>

* Refactor and address comments.

Signed-off-by: Rishikesh1159 <[email protected]>

* apply spotless check

Signed-off-by: Rishikesh1159 <[email protected]>

* Address comments of using a different data structure to fetch blob parts.

Signed-off-by: Rishikesh1159 <[email protected]>

* remove unnecessary code.

Signed-off-by: Rishikesh1159 <[email protected]>

* Refactor outputstream usage.

Signed-off-by: Rishikesh1159 <[email protected]>

* refactor blobpart logic into a separate method and add unit tests.

Signed-off-by: Rishikesh1159 <[email protected]>

* Add new unit tests.

Signed-off-by: Rishikesh1159 <[email protected]>

---------

Signed-off-by: Rishikesh1159 <[email protected]>
Signed-off-by: Sagar Upadhyaya <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

github-actions bot commented Mar 1, 2024

❌ Gradle check result for 24448fc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.