Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce FetchContext #62357

Merged
merged 17 commits into from
Sep 17, 2020
Merged

Conversation

romseygeek
Copy link
Contributor

We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.

@romseygeek romseygeek added :Search/Search Search-related issues that do not fall into other categories >breaking-java >refactoring v8.0.0 v7.10.0 labels Sep 15, 2020
@romseygeek romseygeek self-assigned this Sep 15, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Sep 15, 2020
/**
* The name of the index that documents are being fetched from
*/
public String getIndexName() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is I think only used in error messages, and is duplicated by information in FetchPhaseExecutionException so we may be able to remove it.


/**
* Gets index field data for a specific fieldtype
*/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really nice to be able to replace this with a SearchLookup

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? A method returning search lookup instead of index field data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow 100% how that makes things better but I guess I need to see this updated once #61995 gets merged to understand how the two interact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now gone entirely, but I think it would be useful to move SearchLookup directly into the FetchContext - it's not clear at present how to do that and get rid of the new QueryShardContext.newFetchLookup() method though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed on moving search lookup into fetch context, I think Nik proposed it as well somewhere.

Map<String, Set<String>> storedToRequestedFields = new HashMap<>();
FieldsVisitor fieldsVisitor = createStoredFieldsVisitor(context, storedToRequestedFields);
if (context.docIdsToLoadSize() == 0) {
// no individual hits to process, so we shortcut
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were checking in a couple of subphases for things like "how many docs are we executing against" or "is this a suggest-only response". This pulls this check up to the top level, which also means we don't need to include this information in the FetchContext.

@@ -33,8 +33,8 @@
public final class ExplainPhase implements FetchSubPhase {

@Override
public FetchSubPhaseProcessor getProcessor(SearchContext context) {
if (context.explain() == false || context.hasOnlySuggest()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hasOnlySuggest check is now done at the top level

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slight update; instead of checking for hasOnlySuggest(), we check that query() != null which is more general.


import java.io.IOException;

public class FetchScorePhase implements FetchSubPhase {

@Override
public FetchSubPhaseProcessor getProcessor(SearchContext context) throws IOException {
if (context.trackScores() == false || context.docIdsToLoadSize() == 0 ||
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is all contained in FetchContext#fetchScores()

try {
fieldSnippets = highlighter.highlightField(hitContext.reader(), hitContext.docId(), loadFieldValues);
} catch (IOException e) {
throw new FetchPhaseExecutionException(fieldContext.shardTarget,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOExceptions are handled in the top-level FetchPhase

@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/2 (seems unrelated)

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems really nice to scope down the context passed around during fetch.

I'm wondering about the higher-level plan/ design related to this direction. Will SearchContext eventually be broken into separate objects QueryContext and FetchContext ?

@@ -173,30 +172,4 @@ private HitContext hitExecuteMultiple(XContentBuilder source, boolean fetchSourc
return hitContext;
}

private static class FetchSourcePhaseTestSearchContext extends TestSearchContext {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of deletion is a good sign !

context.parsedQuery() == null) {
return null;
public FetchSubPhaseProcessor getProcessor(FetchContext context) throws IOException {
Map<String, Query> namedQueries = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking I understand, does this actually fix a bug where we wouldn't return named query information if there was only a post filter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's actually possible to have a post filter without a query, but the logic of what gets set where is pretty hairy so I thought it best to be extra defensive here.

hitContext.hit().getId());
TermVectorsResponse termVector = TermVectorsService.getTermVectors(context.indexShard(), termVectorsRequest);
try {
Terms terms = hitContext.reader().getTermVector(hitContext.docId(), field);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking I understand, is this just an optional clean-up ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it means that we don't need to expose the whole index shard to the FetchContext.

/**
* Create a FetchContext based on a SearchContext
*/
public static FetchContext fromSearchContext(SearchContext context) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the private ctor and static factory? I know its nice sometimes when you need the name to describe, but I think we don't here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started out with a few ways of building the FetchContext but have ended up with just the one, so the factory method is unnecessary. I'll get rid.

return new FetchContext(context);
}

private final SearchContext searchContext;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you float it to the top so it doesn't get lost? I get really confused if member variables aren't above member methods. I'm just super used to that.

* implementation should return {@code null}
*/
FetchSubPhaseProcessor getProcessor(SearchContext searchContext, SearchLookup lookup) throws IOException;
FetchSubPhaseProcessor getProcessor(FetchContext fetchContext, SearchLookup lookup) throws IOException;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think lookup should be a member on FetchContext? I know I just added it, but now that you are proposing our own "context" I think I'd be nice to fold it in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I gave it a quick try after the merge but I think we really want to remove the QueryShardContext#newFetchLookup() method at the same time and it's not obvious to me how to do that yet, so I thought I'd leave it for a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be ok moving to a member now if you're happy with it. It can wait too. Either way.

It isn't so much that we want to get rid of newFetchLookup as that we want it to return a FetchLookup instead of a SearchLookup. We just haven't boiled out the common interface for those.

@romseygeek
Copy link
Contributor Author

I'm wondering about the higher-level plan/ design related to this direction. Will SearchContext eventually be broken into separate objects QueryContext and FetchContext ?

I don't really have further plans around SearchContext itself. What I'd like to do here is to move SearchLookup (and the associated SourceLookup) to FetchContext, and then look again at source fetching, because the whole 'loading source and additional stored fields' logic at the moment is very hairy - adding logic to load values from external sources will make it almost impossible to follow as things stand.

@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample-windows

@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample-windows again

@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @romseygeek, all my comments have been addressed.

I don't really have further plans around SearchContext itself. What I'd like to do here is to move SearchLookup (and the associated SourceLookup) to FetchContext, and then look again at source fetching...

That makes sense -- even if we don't push further in making SearchContext more modular, this PR is an improvement. And I'm looking forward to future PRs/ discussions around SearchLookup sharing.

@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

@romseygeek
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample-windows and do it properly this time

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@romseygeek romseygeek merged commit 41bce50 into elastic:master Sep 17, 2020
@romseygeek romseygeek deleted the fetch/fetchcontext branch September 17, 2020 08:46
romseygeek added a commit that referenced this pull request Sep 17, 2020
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
imotov added a commit that referenced this pull request Sep 17, 2020
In #62357 we introduced an additional optimization that allows us to skip the
most of the fetch phase early if no results are found. This change caused
some cancellation test failures that were relying on definitive cancellation
during the fetch phase. This commit adds an additional quick cancellation
check at the very beginning of the fetch phase to make cancellation process
more deterministic.

Fixes #62530
imotov added a commit that referenced this pull request Sep 18, 2020
)

In #62357 we introduced an additional optimization that allows us to skip the
most of the fetch phase early if no results are found. This change caused
some cancellation test failures that were relying on definitive cancellation
during the fetch phase. This commit adds an additional quick cancellation
check at the very beginning of the fetch phase to make cancellation process
more deterministic.

Fixes #62530
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking-java >refactoring :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants