-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Search Pipelines] How should we handle default pipelines for multiple indices? Aliases? Wildcards? #7512
Comments
@msfroh There is a feature request for Views [1], maybe controlling how pipelines are executed could be built into this specification, what do you think? |
Woops! I accidentally duplicated this issue over at #11058, which I've now closed to bring things back over here. As I commented there, I agree with @peternied's suggestion that a new "view" concept is probably the best way forward. Sorry I missed your comment on this issue! |
I would like to see option 1 or 2 implemented. For what it's worth, I expected it to work this way and went searching for this issue when I found that it didn't. I don't think the solution to add "views" to OpenSearch is mutually exclusive with one of these solutions. This should still be done since the behavior for aliases is unexpected, in my opinion Edit: I'm using an alias that only points to 1 index and the default pipeline isn't being triggered. Is this expected? I saw this condition, and I wondered if it was intended to cover this case |
I can take a stab at implementing option 1. Now that I have a better understanding of index name resolution, I think I know where/how to do it. |
I talked w/ @owaiskazi19 on Friday about how to implement this. Just capturing the code that we discussed: } else if (state != null && searchRequest.indices() != null) {
Index[] concreteIndices = indexNameExpressionResolver.concreteIndices(state, searchRequest);
for (Index index : concreteIndices) {
IndexMetadata indexMetadata = state.metadata().index(index);
Settings indexSettings = indexMetadata.getSettings();
if (IndexSettings.DEFAULT_SEARCH_PIPELINE.exists(indexSettings)) {
String curPipelineId = IndexSettings.DEFAULT_SEARCH_PIPELINE.get(indexSettings);
if (NOOP_PIPELINE_ID.equals(pipelineId)) {
pipelineId = curPipelineId;
} else if (pipelineId.equals(curPipelineId) == false) {
pipelineId = NOOP_PIPELINE_ID;
break;
}
}
}
} Putting that in the |
@msfroh , @owaiskazi19 so what is being implemented is if the default pipeline on all the indices are same then the pipeline will be used otherwise it will not. Isn't this a partial solution, should we think like where we merge the request, response and phase_results processor of all the different pipeline to 1 single pipeline. If there are duplicate processors then we just use the first one will be better? Have we thought about this? I know it will very tricky just want to know what are cons of this approach. |
@navneet1v Agreed that it's not a full solution, but as Mike Mccandless always says, "Progress, not perfection". Addressing simple aliases pointing to a single index with a default pipeline is good, and addressing multiple indices with the same default pipeline is logical. Reconciling multiple, maybe incompatible pipelines is a harder problem that requires more thought to do it without surprising (or broken) behavior. I think we should figure that out in a later PR. Edit: specifically, I think any solution to the more complicated cases would do exactly what the linked PR does for the simple cases. So we can fix the complicated cases later with no change in behavior. (Please correct me if I'm wrong!) |
@msfroh I agree this is a step in right direction. But purpose my comment was to ensure that we
This is what exactly I wanted to get opinion on. On thinking more on this I feel yes this is a right step in the long term more complicated solution. |
@owaiskazi19 tracking this issue for 2.14 release. Please have a doc issue linked to this issue. Thanks |
Linked above |
Is your feature request related to a problem? Please describe.
In #7470, we added the ability to specify a search pipeline that will execute by default for queries against a single index (unless another search pipeline is mentioned in the search request). This only kicks in when the search request is targeting a single index, though. A request that spans multiple indices won't apply a default search pipeline even if any (or all) of the indices has a default specified.
Even if an index has a default search pipeline specified, a request made against an alias for that index would also not use the search pipeline.
Describe the solution you'd like
I'm not sure what the correct behavior should be.
I can see a few options:
There are probably other options that make sense that I haven't thought of.
Describe alternatives you've considered
I haven't really considered alternatives (besides the ones listed above).
This issue is here to figure out (and implement or not implement) the right option based on feedback from people who will use search pipelines.
Additional context
N/A
The text was updated successfully, but these errors were encountered: