-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace version with reader cache key in IndicesRequestCache #34189
Conversation
Today we use the version of a DirectoryReader as a component of the key of IndicesRequestCache. This usage is perfectly fine since the version is advanced every time a new change is made into IndexWriter. In other words, two DirectoryReader with the same version should have the same content. However, this invariant is only guaranteed in the context of a single IndexWriter because the version is reset to the committed version value when IndexWriter is re-opened. Since elastic#33473, each IndexShard may have more than one IndexWriter, and using the version of a DirectoryReader as a part of the cache key can cause IndicesRequestCache to return stale cached values. For example, in elastic#27650, we rollback the engine (i.e., re-open IndexWriter), index new documents, refresh, then make a count request, but the search layer mistakenly returns the count of the DirectoryReader of the previous IndexWriter because the current DirectoryReader has the same version to the old DirectoryReader. This is possible because these two readers come from different IndexWriters. This commit replaces the the version with the reader cache key of IndexReader as a component of the cache key of IndicesRequestCache. Closes elastic#27650 Relates elastic#33473
Pinging @elastic/es-search-aggs |
Chatted with Boaz on another channel, I'll work on these follow-ups:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me. Though I wonder why we were using getVersion
until now. I don't have the full history so maybe @s1monw can explain if this was needed for something non-obvious.
One reason I can think of is that the reader cache key changes every time you open a new directory reader even if nothing changed in the index but I don't see where this could be an issue since we reopen the directory reader only if something has changed.
@@ -137,7 +139,9 @@ BytesReference getOrCompute(CacheEntity cacheEntity, Supplier<BytesReference> lo | |||
* @param cacheKey the cache key to invalidate | |||
*/ | |||
void invalidate(CacheEntity cacheEntity, DirectoryReader reader, BytesReference cacheKey) { | |||
cache.invalidate(new Key(cacheEntity, reader.getVersion(), cacheKey)); | |||
if (reader.getReaderCacheHelper() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it possible to have a null cache helper here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some reader wrappers will return null here if they change the live-docs of a reader for instance. But I wonder in what case that happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replaced this check with an assertion as we should always have readerCacheKey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change LGTM. I think the version was really just legacy. I am not sure why we used this. I also can't think of a reason why this change would cause issues.
@@ -137,7 +139,9 @@ BytesReference getOrCompute(CacheEntity cacheEntity, Supplier<BytesReference> lo | |||
* @param cacheKey the cache key to invalidate | |||
*/ | |||
void invalidate(CacheEntity cacheEntity, DirectoryReader reader, BytesReference cacheKey) { | |||
cache.invalidate(new Key(cacheEntity, reader.getVersion(), cacheKey)); | |||
if (reader.getReaderCacheHelper() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some reader wrappers will return null here if they change the live-docs of a reader for instance. But I wonder in what case that happens?
Today we use the version of a DirectoryReader as a component of the key of IndicesRequestCache. This usage is perfectly fine since the version is advanced every time a new change is made into IndexWriter. In other words, two DirectoryReaders with the same version should have the same content. However, this invariant is only guaranteed in the context of a single IndexWriter because the version is reset to the committed version value when IndexWriter is re-opened. Since #33473, each IndexShard may have more than one IndexWriter, and using the version of a DirectoryReader as a part of the cache key can cause IndicesRequestCache to return stale cached values. For example, in #27650, we rollback the engine (i.e., re-open IndexWriter), index new documents, refresh, then make a count request, but the search layer mistakenly returns the count of the DirectoryReader of the previous IndexWriter because the current DirectoryReader has the same version of the old DirectoryReader even their documents are different. This is possible because these two readers come from different IndexWriters. This commit replaces the the version with the reader cache key of IndexReader as a component of the cache key of IndicesRequestCache. Closes #27650 Relates #33473
* master: (25 commits) HLRC: ML Adding get datafeed stats API (elastic#34271) Small fixes to the HLRC watcher documentation. (elastic#34306) Tasks: Document that status is not semvered (elastic#34270) HLRC: ML Add preview datafeed api (elastic#34284) [CI] Fix bogus ScheduleWithFixedDelayTests.testRunnableRunsAtMostOnceAfterCancellation Fix error in documentation for activete watch SCRIPTING: Terms set query expression (elastic#33856) Logging: Drop remaining Settings log ctor (elastic#34149) [ML] Remove unused last_data_time member from Job (elastic#34262) Docs: Allow skipping response assertions (elastic#34240) HLRC: Add activate watch action (elastic#33988) [Security] Multi Index Expression alias wildcard exclusion (elastic#34144) [ML] Label anomalies with multi_bucket_impact (elastic#34233) Document smtp.ssl.trust configuration option (elastic#34275) Support PKCS#11 tokens as keystores and truststores (elastic#34063) Fix sporadic failure in NestedObjectMapperTests [Authz] Allow update settings action for system user (elastic#34030) Replace version with reader cache key in IndicesRequestCache (elastic#34189) [TESTS] Set SO_LINGER and SO_REUSEADDR on the mock socket (elastic#34211) Security: upgrade unboundid ldapsdk to 4.0.8 (elastic#34247) ...
* rename-ccr-stats: (25 commits) HLRC: ML Adding get datafeed stats API (elastic#34271) Small fixes to the HLRC watcher documentation. (elastic#34306) Tasks: Document that status is not semvered (elastic#34270) HLRC: ML Add preview datafeed api (elastic#34284) [CI] Fix bogus ScheduleWithFixedDelayTests.testRunnableRunsAtMostOnceAfterCancellation Fix error in documentation for activete watch SCRIPTING: Terms set query expression (elastic#33856) Logging: Drop remaining Settings log ctor (elastic#34149) [ML] Remove unused last_data_time member from Job (elastic#34262) Docs: Allow skipping response assertions (elastic#34240) HLRC: Add activate watch action (elastic#33988) [Security] Multi Index Expression alias wildcard exclusion (elastic#34144) [ML] Label anomalies with multi_bucket_impact (elastic#34233) Document smtp.ssl.trust configuration option (elastic#34275) Support PKCS#11 tokens as keystores and truststores (elastic#34063) Fix sporadic failure in NestedObjectMapperTests [Authz] Allow update settings action for system user (elastic#34030) Replace version with reader cache key in IndicesRequestCache (elastic#34189) [TESTS] Set SO_LINGER and SO_REUSEADDR on the mock socket (elastic#34211) Security: upgrade unboundid ldapsdk to 4.0.8 (elastic#34247) ...
Today we use the version of a DirectoryReader as a component of the key of IndicesRequestCache. This usage is perfectly fine since the version is advanced every time a new change is made into IndexWriter. In other words, two DirectoryReaders with the same version should have the same content. However, this invariant is only guaranteed in the context of a single IndexWriter because the version is reset to the committed version value when IndexWriter is re-opened. Since #33473, each IndexShard may have more than one IndexWriter, and using the version of a DirectoryReader as a part of the cache key can cause IndicesRequestCache to return stale cached values. For example, in #27650, we rollback the engine (i.e., re-open IndexWriter), index new documents, refresh, then make a count request, but the search layer mistakenly returns the count of the DirectoryReader of the previous IndexWriter because the current DirectoryReader has the same version of the old DirectoryReader even their documents are different. This is possible because these two readers come from different IndexWriters. This commit replaces the the version with the reader cache key of IndexReader as a component of the cache key of IndicesRequestCache. Closes #27650 Relates #33473
Today we use the version of a DirectoryReader as a component of the key
of IndicesRequestCache. This usage is perfectly fine since the version
is advanced every time a new change is made into IndexWriter. In other
words, two DirectoryReaders with the same version should have the same
content. However, this invariant is only guaranteed in the context of a
single IndexWriter because the version is reset to the committed version
value when IndexWriter is re-opened.
Since #33473, each IndexShard may have more than one IndexWriter, and
using the version of a DirectoryReader as a part of the cache key can
cause IndicesRequestCache to return stale cached values. For example, in
#27650, we rollback the engine (i.e., re-open IndexWriter), index new
documents, refresh, then make a count request, but the search layer
mistakenly returns the count of the DirectoryReader of the previous
IndexWriter because the current DirectoryReader has the same version of
the old DirectoryReader even their documents are different. This is
possible because these two readers come from different IndexWriters.
This commit replaces the the version with the reader cache key of
IndexReader as a component of the cache key of IndicesRequestCache.
Closes #27650
Relates #33473
/cc @ywelsch