Asynchronous processor and exporter for query insights data #11296
Comments
The processor is done as part of the plugin implementation: #11903. We need to follow up on the exporter:
I want to further elaborate on the exporter component in the query insights framework. As mentioned in the architecture diagram (see #11429), we want generic, asynchronous exporters to export the query insights data. In the first phase, we should focus on exporting top n queries data (since that's the only insights data we have in memory now: #11904), while keeping in mind that the exporter should be generic enough to handle other use cases and export to other sinks.
Exporter types and configurations
Configuration endpoints should be provided for exporters for every type of top n queries (by latency, by resource usage, etc.). For development and debugging purposes, a debug exporter that exports to stdout will be provided.
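To make the "generic exporter plus debug exporter" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `QueryRecord`, `QueryInsightsExporter`, and `DebugExporter` are hypothetical names, and the record fields are not the plugin's actual data model.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical record shape; field names are assumptions, not the plugin's model. */
final class QueryRecord {
    final long timestampMillis;
    final long latencyMillis;
    final String source;

    QueryRecord(long timestampMillis, long latencyMillis, String source) {
        this.timestampMillis = timestampMillis;
        this.latencyMillis = latencyMillis;
        this.source = source;
    }

    @Override
    public String toString() {
        return "QueryRecord{timestampMillis=" + timestampMillis
            + ", latencyMillis=" + latencyMillis
            + ", source=" + source + "}";
    }
}

/** Hypothetical generic contract: every sink receives a batch of records. */
interface QueryInsightsExporter {
    void export(List<QueryRecord> records);
}

/** Debug exporter: writes one line per record to a sink, stdout by default. */
final class DebugExporter implements QueryInsightsExporter {
    private final Consumer<String> sink;

    DebugExporter() {
        this(System.out::println);
    }

    /** The sink is injectable so the exporter can be exercised without stdout. */
    DebugExporter(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public void export(List<QueryRecord> records) {
        for (QueryRecord record : records) {
            sink.accept(record.toString());
        }
    }
}
```

Under this assumed shape, a local index exporter or any other sink would be another implementation of the same interface, so the code that triggers an export would not need to know which sink is configured.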
Additionally, the initial exporter release will include a local index exporter with minimal configurations required, such as the rolling index pattern to export to. By default, we can store the top n queries data in a daily rolling index named
Future implementations may include exporters for other sinks such as log4j, webhook, external_opensearch, etc., each with their specific configuration details. But we need to make sure the configuration endpoints under
Proposed Implementation
In the top n queries implementation, search request data is accumulated and, at a fixed interval, drained into an in-memory priority queue. The priority queue is rotated to the "last window snapshot" once the data reaches the end of the window. The asynchronous exporter logic can be triggered during this window rotation.
Further consideration:
@ansjcy Thanks for the proposed solution for the exporter and the PR that followed. The proposal looks good to me overall. For the following: Furthermore, are we planning on having a different index for top queries by CPU and memory in the future, or will we use the same index? If it's the former, we may want to make this clear in the naming here:
Is your feature request related to a problem? Please describe.
We haven't provided an efficient way to export the "top queries with latency" data collected in #11295. A generic, asynchronous processor and exporter should be created to handle that data for query insights.
(Parent RFC: #11186)
Describe the solution you'd like
As part of #11186, we need to implement an asynchronous processor and exporter to handle the data for query insight features. In the first iteration, the processor should be able to handle query latency data asynchronously, enqueue it to the aggregator implemented in #11295, and export the aggregated data to an OpenSearch index. This framework can potentially be reused by other query insight features in the future to avoid adding blocking logic to the core search path.
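As a rough illustration of keeping blocking logic out of the search path, the sketch below has the search thread do a non-blocking `offer` into a bounded queue, while a background task drains the queue into an aggregator. All names are hypothetical, and the top-N min-heap is only a stand-in for the aggregator from #11295, whose real interface is not shown in this issue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical processor; the heap stands in for the real aggregator. */
final class AsyncLatencyProcessor {
    private final LinkedBlockingQueue<Long> pending;
    // Min-heap: the smallest retained latency sits on top and is evicted first.
    private final PriorityQueue<Long> topN = new PriorityQueue<>();
    private final int n;

    AsyncLatencyProcessor(int n, int capacity) {
        this.n = n;
        this.pending = new LinkedBlockingQueue<>(capacity);
    }

    /** Called on the search path. Never blocks; returns false if the queue is full. */
    boolean offer(long latencyMillis) {
        return pending.offer(latencyMillis);
    }

    /** Called from a background thread, e.g. at a fixed interval. */
    synchronized void drain() {
        List<Long> batch = new ArrayList<>();
        pending.drainTo(batch);
        for (long latency : batch) {
            topN.add(latency);
            if (topN.size() > n) {
                topN.poll(); // drop the smallest so only the n largest remain
            }
        }
    }

    /** Returns the aggregated top-N (slowest first) and clears it for the next round. */
    synchronized List<Long> snapshotAndReset() {
        List<Long> snapshot = new ArrayList<>(topN);
        snapshot.sort(Comparator.reverseOrder());
        topN.clear();
        return snapshot;
    }
}
```

In this sketch a full queue drops the record instead of stalling the search request, which is one possible trade-off rather than a decided behavior. The `drain` and `snapshotAndReset` calls could be scheduled with a `ScheduledExecutorService`, with the returned snapshot handed to whichever exporter is configured.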
Describe alternatives you've considered
In the future, we can potentially leverage the OPTL collector when it becomes available. With that we can send traces/spans to OPTL collectors, where the collector takes responsibility for necessary calculations, aggregations and export. This strategy could further reduce the impact on the OpenSearch process.
Additional context
Please refer to RFC: #11186