Introduction
This document outlines the design for implementing Reciprocal Rank Fusion (RRF) within the neural search plugin. RRF, first described in a 2009 paper by Cormack, Clarke, and Büttcher, is a method for combining rankings from several subqueries; in the paper's evaluation it outperforms more complex fusion methods. Risk-reward tradeoffs for different fusion techniques are also analyzed in that paper. The general formula for RRF, where rank_constant = k and query_j_rank is the rank of a document in the results of subquery j of the hybrid query, is as follows:
rank_score(doc) = sum over all subqueries j of 1 / (rank_constant + query_j_rank(doc))
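As a quick illustration of the formula (a minimal sketch, not plugin code), the rank score of a single document is just the sum of reciprocal ranks across the subqueries that returned it:

```python
# The RRF formula above, for a single document (illustrative, not plugin code).
def rrf_rank_score(ranks, rank_constant=60):
    """ranks: the document's 1-based rank in each subquery that returned it."""
    return sum(1.0 / (rank_constant + r) for r in ranks)

# A document ranked 2nd by the match subquery and 5th by the neural subquery:
score = rrf_rank_score([2, 5])  # 1/62 + 1/65
```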
Background
The OpenSearch neural search plugin provides semantic search to users by using ML models to embed text. It supports hybrid queries, which combine text-based search with neural vector search. This design document describes the implementation of the RRF technique to combine results from the different parts of a hybrid query to improve search relevance.
One advantage of RRF is that it is simple and provides a standard way of processing results. It is scalable and doesn’t require tuning (other than optionally adjusting the rank constant). Its simplicity makes it intuitive for users, and the algorithm has already been used successfully by several other organizations. Moreover, incorporating RRF into hybrid query will facilitate customer migration to OpenSearch from platforms where they are already using RRF; some customers have already requested it as a technique. For these reasons, we believe RRF will benefit users running hybrid queries.
Functional Requirements
Reciprocal Rank Fusion (RRF) Implementation: Develop and integrate the RRF technique, which uses document ranks in individual subquery results to calculate rank scores and sums those scores across all subqueries for each document, producing a combined “rank score” used to merge hybrid search results
Hybrid Search Integration: Ensure the RRF technique can be seamlessly integrated into the existing hybrid search framework without interfering with sorting (by different fields), concurrent segment search, or aggregations.
Normalization Adjustment: Modify or enhance the current normalization process to utilize ranks of documents in subquery result lists, reducing potential biases from score-based normalization.
Configuration Options: Allow users to control the RRF algorithm parameter rank constant (which must not be less than 1). Invalid combinations of configuration values must be rejected (e.g. mixing current normalization options with RRF options)
Non Functional Requirements
Smooth integration into the existing neural search interface, with clear usage instructions for customers
Keep latency the same as current hybrid query, within the margin of error
Document Scope
In this document we propose a solution for the questions below:
Incorporating RRF into hybrid query under a new processor
Using 1 parameter, rank constant, with a default value of 60
The only API change is to the search pipeline configuration
Out of Document Scope
Providing the ability to chain more than one processor together
Adding weights as a parameter for combining document results from different subqueries
Solution Overview
Implement RRF within the neural search plugin by adding two classes, RRFProcessor and RRFProcessorFactory, and calling them from the NeuralSearch class. The rank constant can be set by the user in the search pipeline configuration but has a default value. A value of 60 for the rank constant is common and was used in the original paper describing the technique; it must be greater than or equal to 1. We can include guidance for customers showing how the rank scores shrink as the rank constant grows, and vice versa, to help inform their choice.
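To illustrate the guidance mentioned above, here is a small sketch (illustrative only) of how a document's per-subquery contribution 1 / (rank_constant + rank) flattens as the rank constant grows:

```python
# How the per-subquery contribution 1 / (rank_constant + rank) changes
# with the rank constant (illustrative only).
contributions = {
    k: {rank: 1.0 / (k + rank) for rank in (1, 10)}
    for k in (1, 10, 60, 100)
}
for k, by_rank in contributions.items():
    print(f"k={k}: rank 1 -> {by_rank[1]:.4f}, rank 10 -> {by_rank[10]:.4f}")
```

With k=1, a rank-1 hit contributes 0.5 versus about 0.09 for rank 10; with k=100 the two contributions are nearly equal, so a larger rank constant reduces the dominance of top-ranked documents.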
Solution HLD: Architectural and Component Design
Proposed Solution
Modify the hybrid query execution flow to capture the individual result sets from each subquery
Implement the RRF algorithm to calculate a combined score for each document based on its ranks across the different result sets.
Merge and sort the final result set based on the calculated RRF Rank Scores.
Return the blended RRF ranks in the form of document scores.
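The steps above can be sketched end-to-end as follows (illustrative only; the actual plugin works on shard-level result objects rather than plain doc-ID lists):

```python
# End-to-end sketch of the steps above: per-subquery ranks -> combined
# RRF score -> final sorted result (illustrative; the plugin operates on
# shard-level result objects rather than plain doc-ID lists).
def fuse_subquery_results(subquery_results, rank_constant=60):
    combined = {}
    for ranking in subquery_results:  # one best-first list per subquery
        for rank, doc_id in enumerate(ranking, start=1):
            combined[doc_id] = combined.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Blended RRF rank scores are returned as the document scores
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

results = fuse_subquery_results([["d1", "d2", "d3"], ["d3", "d1"]])
```

Note that a document returned by both subqueries ("d1") outranks one returned high by only a single subquery ("d3" barely, "d2" clearly), which is the behavior RRF is designed to produce.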
Pros:
A two-way door: if the implementation has issues, it can easily be rolled back
Can customize to fit RRF requirements if existing architecture isn’t ideal
Cons:
Some duplication of workflows to set up the new processor
Might cause confusion, since RRF can be conceptualized as a normalization/combination technique but is set up outside of NormalizationProcessor
Potential Issues
Risks / Known limitations and Future extensions
Performance impact due to additional computations for RRF scoring and result merging.
Compatibility with existing processors: we need to identify potential areas of interference between processors. For the normalization processor the risk is low; the processor configured first publishes its results, and the next one is ignored and has no effect.
Potential edge cases or corner cases that need to be handled gracefully.
The following diagram shows the high-level approach for implementing RRF in hybrid query
Solution LLD
For more information about hybrid search, see the following blog post: https://opensearch.org/blog/hybrid-search/
The following example uses the data from this tutorial: https://opensearch.org/docs/latest/search-plugins/neural-search-tutorial/
I’ve copied over the scores from the match query and neural query into the table below and used the resulting numbers to calculate the ranks and rank scores.
RRF used in hybrid query, setting up the search pipeline:
Using hybrid query currently, setting up the search pipeline (no changes from current setup):
The scores and ranks are based on the example data:
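The tutorial's actual scores are not reproduced here, so the following worked sketch uses hypothetical match and neural scores to show how ranks and rank scores would be derived from such a table:

```python
# Hypothetical subquery scores standing in for the example data above
# (the actual match/neural query scores are not reproduced here).
match_scores = {"doc1": 11.6, "doc2": 7.2, "doc3": 4.9}
neural_scores = {"doc2": 0.92, "doc4": 0.85, "doc1": 0.61}

def to_ranks(scores):
    # 1-based rank of each document, best score first
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {doc: i for i, doc in enumerate(ordered, start=1)}

rank_constant = 60
rank_score = {}
for subquery_ranks in (to_ranks(match_scores), to_ranks(neural_scores)):
    for doc, r in subquery_ranks.items():
        rank_score[doc] = rank_score.get(doc, 0.0) + 1.0 / (rank_constant + r)

# Final list sorted by combined RRF rank score, best first
final = sorted(rank_score.items(), key=lambda kv: kv[1], reverse=True)
```

Here "doc2" (ranked 2nd and 1st) edges out "doc1" (ranked 1st and 3rd), even though the raw scores are on entirely different scales, which is the point of rank-based fusion.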
Implementation Details
Query portion of the hybrid query can be reused without any changes
RRF algorithm parameters can be set in the pipeline processor configuration
Input: rank constant. Optional; we can use a default of 60 (the value cited in the original paper and used successfully in existing implementations). The rank constant must be >= 1 to avoid a divide-by-zero error
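As a sketch, the search pipeline configuration (created via something like `PUT /_search/pipeline/rrf-pipeline`) could accept the rank constant as shown below; the processor and field names here are illustrative, not final:

```json
{
  "description": "Post-processor for hybrid search using RRF",
  "phase_results_processors": [
    {
      "score-ranker-processor": {
        "combination": {
          "technique": "rrf",
          "rank_constant": 60
        }
      }
    }
  ]
}
```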
First part of RRF algorithm is implemented by scoreRanker
Input: List of query result sets from each shard. Each shard result will have results of each individual subquery
Performs initial processing of documents to calculate rank score for each document in each subquery
Output: Sorted list of query result sets with calculated RRF rank scores for each subquery
The summation part of the RRF algorithm is implemented in the same class by rankResults:
Input: Sorted list of query result sets with calculated RRF rank scores for each subquery
Sums up rank scores for each document across subqueries, combines the lists of document rank scores from all subqueries, and sorts them to produce the final sorted list of documents to send to the fetch phase
Output: Sorted list of combined query result sets (sorted by combined rank score)
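The two phases above could be sketched as follows (a simplification: the actual scoreRanker and rankResults operate on shard-level query result sets, while plain doc-ID lists are used here):

```python
# Simplified sketch of the two phases described above. The real plugin code
# operates on shard-level result objects; plain doc-ID lists are used here.
def score_ranker(subquery_results, rank_constant=60):
    """Phase 1: per-subquery reciprocal rank scores (1-based ranks)."""
    return [
        {doc: 1.0 / (rank_constant + rank)
         for rank, doc in enumerate(ranking, start=1)}
        for ranking in subquery_results
    ]

def rank_results(scored_subqueries):
    """Phase 2: sum each document's rank scores across subqueries and sort."""
    combined = {}
    for scores in scored_subqueries:
        for doc, s in scores.items():
            combined[doc] = combined.get(doc, 0.0) + s
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

fused = rank_results(score_ranker([["a", "b", "c"], ["c", "a"]]))
```

Splitting the calculation this way mirrors the document's structure: phase 1 only needs each subquery's own ordering, while phase 2 is a pure merge-and-sort over the per-subquery outputs.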
Benefits
Possible improved search relevance by combining results from multiple queries effectively.
Increased flexibility in hybrid query construction and result fusion.
Seamless integration with the existing neural search plugin and hybrid query infrastructure.
An additional “normalization and combination” method that customers may find helpful
Alternatives Considered
Alternative 1:
Implementing RRF as a NormalizationTechnique and a CombinationTechnique (RRFNormalizationTechnique and RRFCombinationTechnique classes) that would be called by NormalizationProcessor the same way it works today when configured with, for example, L2NormalizationTechnique and ArithmeticMeanScoreCombinationTechnique. RRFNormalizationTechnique would perform the subquery-level reciprocal rank score calculation and pass the rank scores of all documents in each subquery to RRFCombinationTechnique, which would combine each document's rank scores, then merge and sort the results, returning a list of documents sorted by combined RRF rank score
Pros:
Easy to copy structure of existing classes and add new classes into workflow
Easier adoption by users, setup similar to current setup
Cons:
Potential for creating problems that are hard to contain within just the RRF implementation
Would require intervention to prevent users from mixing and matching normalization and combination techniques
More complicated logic for parameter parsing/validation in the factory: RRF and the existing processors don't share the same parameters, which will require branching logic
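Example pipeline configuration call
The original example call was not captured here; a hypothetical sketch of what configuring RRF as normalization/combination techniques inside the existing normalization-processor could look like (technique names illustrative):

```json
{
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": { "technique": "rrf", "rank_constant": 60 },
        "combination": { "technique": "rrf" }
      }
    }
  ]
}
```

This shape is what makes the invalid-combination problem visible: nothing in the structure itself stops a user from pairing "rrf" normalization with, say, an arithmetic-mean combination.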
Alternative 2:
Implementing RRF as a processor, RRFProcessor, at the same level as NormalizationProcessor, registered by an RRFProcessorFactory, with the RRFProcessor calling an RRFNormalizationTechnique and RRFCombinationTechnique as described above
Pros:
Similar to Alternative 1, easy to copy structure of existing classes and add new classes into workflow
Adding the processor at the NormalizationProcessor level would avoid the need for intervention to prevent users from mixing and matching normalization and combination techniques
Cons:
Potential for creating problems that are hard to contain within just the RRF implementation
Performance impact due to additional computations for RRF scoring and result merging.
Compatibility with existing processors: we need to identify potential areas of interference between processors. For the normalization processor the risk is low; the processor configured first publishes its results, and the next one is ignored and has no effect.
Potential edge cases or corner cases that need to be handled gracefully.
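Example search pipeline configuration call
The original example call was not captured here; a hypothetical sketch for a standalone RRF processor registered at the NormalizationProcessor level (processor name illustrative):

```json
{
  "phase_results_processors": [
    {
      "rrf-processor": {
        "rank_constant": 60
      }
    }
  ]
}
```

Because the processor name alone selects RRF, there are no normalization/combination technique fields for a user to mismatch.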
Backward Compatibility
The change will be backward compatible; we will ensure that omitting RRF-specific parameters in request configurations does not cause problems
Testability
I will be writing unit tests and integration tests for this implementation
Benchmarking
Benchmarking will involve testing across several datasets used in the initial normalization implementation testing (NFCorpus, Trec-Covid, ArguAna, FiQA, Scifact, DBPedia, Quora, Scidocs, CQADupStack, Amazon ESCI) and comparing the average nDCG@10 score against that of the L2 and Min-Max normalization techniques combined with the Arithmetic, Geometric, and Harmonic combination techniques, using BM25 nDCG@10 scores on these datasets as the baseline. We will also update nightly benchmark runs to capture performance.
Feedback Required
Should we consider adding weights for combining rank scores from different subqueries? If so, how would weights be determined?
Are there concerns about or objections to incorporating Reciprocal Rank Fusion (RRF) as a new processor, instead of integrating it into the existing NormalizationProcessor? If there are foreseeable problems with this approach, what would a better alternative look like?
We can definitely add weights; it's a good addition.
As for Alternative 2: that implies RRF is split into a normalization technique and a combination technique, exposed to the user in the same way as the other techniques today. That will not be supported, because parts of RRF cannot be combined with other techniques; I think that's the main issue with that option.