[Feature Request] relax max Clauses Count limitation of termS query over IP field #16200

Open
mkhludnev opened this issue Oct 5, 2024 · 2 comments · May be fixed by #16202 or #16391
Labels
enhancement Enhancement or improvement to existing feature or request Search:Query Capabilities

Comments

@mkhludnev
Contributor

Is your feature request related to a problem? Please describe

Querying an IP field with a terms query can hit the max clause count limit:
https://forum.opensearch.org/t/terms-search-gives-error-failed-to-create-query-maxclausecount-is-set-to-1024/21729/8

Describe the solution you'd like

Plain IP addresses can be handled efficiently by rewriting them into a bitset. But IP masks with slashes cause a problem, since they can only be handled with a boolean query (and combining a disjunction over many field types is really complex).

I propose to split the IP terms into two lists, masks and concrete IPs, and handle them separately. The terms query would then apply the max clause count limit only to the number of mask values, although we could nest bool queries over the masks deeply to overcome that as well.
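The split proposed above can be sketched in a few lines. This is only an illustration in Python using the standard `ipaddress` module, not the OpenSearch implementation; the function name `split_ip_terms` and the sample terms are hypothetical, and the value 1024 is taken from the error message in the linked forum thread.

```python
import ipaddress

# Value from the forum thread's error message ("maxClauseCount is set to 1024").
MAX_CLAUSE_COUNT = 1024


def split_ip_terms(terms):
    """Partition query terms into concrete addresses and CIDR masks (hypothetical helper)."""
    concrete, masks = [], []
    for term in terms:
        if "/" in term:
            # strict=False tolerates host bits being set, e.g. "10.0.0.5/8".
            masks.append(ipaddress.ip_network(term, strict=False))
        else:
            concrete.append(ipaddress.ip_address(term))
    return concrete, masks


terms = ["192.168.0.1", "10.0.0.0/8", "2001:db8::1", "2001:db8::/32"]
concrete, masks = split_ip_terms(terms)

# Only the masks would still become boolean clauses, so only they count
# against the limit; the concrete addresses could go into a set lookup.
within_limit = len(masks) <= MAX_CLAUSE_COUNT
```

With this split, a query carrying thousands of plain addresses and a handful of masks would stay under the limit, since only `masks` contributes clauses.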

Related component

Search:Query Capabilities

Describe alternatives you've considered

No response

Additional context

No response

@sandeshkr419
Contributor

[Search Triage] Yes, we should review max clause count limits, and not just for IP fields.

@mkhludnev Do you have any further recommendations on this?

@mkhludnev
Contributor Author

mkhludnev commented Oct 9, 2024

Here are the approaches:

  1. PR #16202 ("use terms in set for concrete IPs keep disjunctions over ranges. fix #16200") avoids the max clause count limit for concrete IPs, but /masks are still limited.
  2. PR #16391 ("IP field via MultrangeQuery fix #16200") avoids the limit for both IPs and /masks, but doesn't work for DV-only (doc-values-only) fields. UPD: I prefer this best-effort option, with further improvement via the next one.
  3. Lucene PR apache/lucene#13974 ("DRAFT: SortedSet DV Multi Range query") would handle many masks for DV-only fields as well.

I'm not sure which of them to pursue. WDYT?
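To illustrate why a multi-range approach sidesteps the clause limit, here is a rough sketch in plain Python. It is not the Lucene or OpenSearch code from the PRs above; all function names are hypothetical. It assumes every term, whether a concrete IP or a /mask, can be treated as an inclusive integer range, and it maps IPv4 into the IPv4-mapped IPv6 space so both families share one ordering.

```python
import bisect
import ipaddress

V4_MAPPED_PREFIX = 0xFFFF << 32  # ::ffff:0:0/96, used here to place IPv4 in IPv6 space


def to_range(term):
    """Map a concrete IP or a CIDR mask to an inclusive (low, high) integer range."""
    # A bare address parses as a /32 (IPv4) or /128 (IPv6) network.
    net = ipaddress.ip_network(term, strict=False)
    lo, hi = int(net.network_address), int(net.broadcast_address)
    if net.version == 4:
        lo, hi = lo | V4_MAPPED_PREFIX, hi | V4_MAPPED_PREFIX
    return lo, hi


def merge_ranges(ranges):
    """Sort ranges and merge the ones that overlap or touch."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return merged


def build_matcher(terms):
    """Return a predicate that tests an address against all terms at once."""
    merged = merge_ranges(to_range(t) for t in terms)
    lows = [lo for lo, _ in merged]

    def matches(address):
        value, _ = to_range(address)
        # Find the last range whose lower bound is <= value.
        i = bisect.bisect_right(lows, value) - 1
        return i >= 0 and value <= merged[i][1]

    return matches


matches = build_matcher(["192.168.0.1", "10.0.0.0/8", "10.1.0.0/16", "2001:db8::/32"])
```

Because all terms collapse into one sorted list of ranges, the cost of a lookup grows with the logarithm of the number of ranges rather than requiring one boolean clause per term.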

@mkhludnev mkhludnev linked a pull request Oct 19, 2024 that will close this issue
3 tasks
Projects
Status: 🆕 New
2 participants