Added Reranker feature #591

Merged
merged 2 commits into from
Feb 6, 2024
@martin-gaievski (Member) commented Feb 6, 2024

Description

This PR merges the re-ranker feature from the feature branch into main; the feature author is @HenryL27. The change meets the intake criteria for main, based on the information provided in this issue.

PRs that are part of this merge:

Issues Resolved

#248

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…anker for improving search relevancy. (#494)

* Add rerank processor interfaces

Signed-off-by: HenryL27 <[email protected]>

* add cross-encoder specific logic and factory

Signed-off-by: HenryL27 <[email protected]>

* add unittests

Signed-off-by: HenryL27 <[email protected]>

* add integration test

Signed-off-by: HenryL27 <[email protected]>

* use string.format() instead of concatenation

Signed-off-by: HenryL27 <[email protected]>

* rename generateScoringContext to generateRerankingContext

Signed-off-by: HenryL27 <[email protected]>

* add name change in test too. whoops

Signed-off-by: HenryL27 <[email protected]>

* start refactoring with contextSourceFetchers

Signed-off-by: HenryL27 <[email protected]>

* refactor to use contextSourceFetchers to get context

Signed-off-by: HenryL27 <[email protected]>

* rename CrossEncoder to TextSimilarity

Signed-off-by: HenryL27 <[email protected]>

* add query_context layer to search ext

Signed-off-by: HenryL27 <[email protected]>

* add javadocs

Signed-off-by: HenryL27 <[email protected]>

* update to new asyncProcessResponse api

Signed-off-by: HenryL27 <[email protected]>

* rename reranktype to ML_OPENSEARCH

Signed-off-by: HenryL27 <[email protected]>

* improve error messages for bad rerank type config

Signed-off-by: HenryL27 <[email protected]>

* simplify configuration/factory logic

Signed-off-by: HenryL27 <[email protected]>

* improve handling for non-flat-string context fields

Signed-off-by: HenryL27 <[email protected]>

* rename TextSimilarity files to MLOpenSearch files

Signed-off-by: HenryL27 <[email protected]>

* apply spotless after rebase

Signed-off-by: HenryL27 <[email protected]>

* update changelog

Signed-off-by: HenryL27 <[email protected]>

* after rebase

Signed-off-by: HenryL27 <[email protected]>

* Address pr comments and fix XContent in search ext

Signed-off-by: HenryL27 <[email protected]>

* move contextSourceFetchers to their own subdirectory

Signed-off-by: HenryL27 <[email protected]>

* Apply suggestions from code review

Co-authored-by: Martin Gaievski <[email protected]>
Signed-off-by: HenryL27 <[email protected]>

* CR changes

Signed-off-by: HenryL27 <[email protected]>

* finish CR comments and fix broken unittest

Signed-off-by: HenryL27 <[email protected]>

* fix unittest names

Signed-off-by: HenryL27 <[email protected]>

---------

Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>
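
The commits above mention a rerank processor of type ML_OPENSEARCH, context source fetchers for document and query context, and a query_context layer in the search ext. As a rough sketch of how request bodies for such a feature might be shaped, the Python below builds them as plain dictionaries. Only `rerank`, `ml_opensearch`, and `query_context` are suggested by this page; the keys `response_processors`, `model_id`, `context`, `document_fields`, and `query_text`, the helper names, and all sample values are assumptions and should be checked against the plugin documentation.

```python
import json


def build_rerank_pipeline(model_id, document_fields):
    """Body for a search pipeline with one rerank response processor (assumed shape)."""
    return {
        "response_processors": [
            {
                "rerank": {
                    # Rerank type named in the commits (ML_OPENSEARCH).
                    "ml_opensearch": {"model_id": model_id},
                    # Assumed key: document fields passed to the model as context.
                    "context": {"document_fields": list(document_fields)},
                }
            }
        ]
    }


def build_search_body(query, query_text):
    """Search body that carries the query context in the search ext (assumed shape)."""
    return {
        "query": query,
        "ext": {"rerank": {"query_context": {"query_text": query_text}}},
    }


# Hypothetical model id, field name, and query text, for illustration only.
pipeline_body = build_rerank_pipeline("example-model-id", ["passage_text"])
search_body = build_search_body(
    {"match": {"passage_text": "how do rerankers work"}},
    "how do rerankers work",
)

print(json.dumps(pipeline_body, indent=2))
print(json.dumps(search_body, indent=2))
```

The split mirrors the two fetchers named in the coverage report: document context would come from fields on each hit, and query context from the ext section of the search request.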
@martin-gaievski added the Features, backport 2.x, and v2.12.0 labels Feb 6, 2024
* add validations from appsec

Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
codecov bot commented Feb 6, 2024

Codecov Report

Attention: 41 lines in your changes are missing coverage. Please review.

Comparison is base (8877db5) 84.39% compared to head (678fd15) 84.43%.

Files                                                  Patch %  Lines
...ssor/rerank/context/QueryContextSourceFetcher.java  77.77%   8 Missing and 4 partials ⚠️
...arch/processor/factory/RerankProcessorFactory.java  75.00%   4 Missing and 4 partials ⚠️
...rch/processor/rerank/RescoringRerankProcessor.java  84.09%   5 Missing and 2 partials ⚠️
...g/opensearch/neuralsearch/plugin/NeuralSearch.java  0.00%    6 Missing ⚠️
...neuralsearch/processor/rerank/RerankProcessor.java  80.00%   4 Missing ⚠️
...arch/neuralsearch/processor/rerank/RerankType.java  86.66%   1 Missing and 1 partial ⚠️
...r/rerank/context/DocumentContextSourceFetcher.java  94.11%   1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #591      +/-   ##
============================================
+ Coverage     84.39%   84.43%   +0.04%     
- Complexity      535      603      +68     
============================================
  Files            40       48       +8     
  Lines          1570     1825     +255     
  Branches        245      275      +30     
============================================
+ Hits           1325     1541     +216     
- Misses          133      161      +28     
- Partials        112      123      +11     
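
The figures in this report can be cross-checked with a little arithmetic. The sketch below recomputes the line totals and coverage percentages from the hits, misses, and partials shown above, and sums the per-file uncovered lines. It assumes percentages are truncated to two decimals rather than rounded, which is what the head figure suggests (1541 / 1825 is about 84.438%, reported as 84.43%); the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageTotals:
    hits: int
    misses: int
    partials: int

    @property
    def lines(self) -> int:
        # Every tracked line is a hit, a miss, or a partial.
        return self.hits + self.misses + self.partials

    def percent_hundredths(self) -> int:
        # Coverage in hundredths of a percent, truncated (assumed behaviour).
        return self.hits * 10_000 // self.lines


def format_percent(hundredths: int) -> str:
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


base = CoverageTotals(hits=1325, misses=133, partials=112)
head = CoverageTotals(hits=1541, misses=161, partials=123)
delta = head.percent_hundredths() - base.percent_hundredths()

# (missing, partial) counts per file, in the order listed in the report.
uncovered = [(8, 4), (4, 4), (5, 2), (6, 0), (4, 0), (1, 1), (1, 1)]
uncovered_total = sum(missing + partial for missing, partial in uncovered)

print(base.lines, format_percent(base.percent_hundredths()))
print(head.lines, format_percent(head.percent_hundredths()))
print(f"+{format_percent(delta)}", uncovered_total)
```

With these inputs the totals come to 1570 and 1825 lines, the percentages to 84.39% and 84.43%, the difference to +0.04%, and the uncovered lines to 41, matching the report.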

@martin-gaievski martin-gaievski merged commit 1bb48e2 into main Feb 6, 2024
106 of 107 checks passed
@opensearch-trigger-bot (Contributor) commented

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-591-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 1bb48e20b67cc8256235c55f5ef5ea10a5993e96
# Push it to GitHub
git push --set-upstream origin backport/backport-591-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-591-to-2.x.

martin-gaievski added a commit to martin-gaievski/neural-search that referenced this pull request Feb 6, 2024
* Adding support for generic re-ranker interface and opensearch ml re-ranker for improving search relevancy. (opensearch-project#494)

Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>

---------

Signed-off-by: HenryL27 <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Co-authored-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
(cherry picked from commit 1bb48e2)
martin-gaievski added a commit to martin-gaievski/neural-search that referenced this pull request Feb 6, 2024
* Adding support for generic re-ranker interface and opensearch ml re-ranker for improving search relevancy. (opensearch-project#494)

Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>

---------

Signed-off-by: HenryL27 <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Co-authored-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
(cherry picked from commit 1bb48e2)
Signed-off-by: Martin Gaievski <[email protected]>
martin-gaievski added a commit that referenced this pull request Feb 6, 2024
* Adding support for generic re-ranker interface and opensearch ml re-ranker for improving search relevancy. (#494)

(cherry picked from commit 1bb48e2)

Signed-off-by: HenryL27 <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Co-authored-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
yuye-aws pushed a commit to yuye-aws/neural-search that referenced this pull request Mar 8, 2024
* Adding support for generic re-ranker interface and opensearch ml re-ranker for improving search relevancy. (opensearch-project#494)

Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>

---------

Signed-off-by: HenryL27 <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Co-authored-by: HenryL27 <[email protected]>
Co-authored-by: Heemin Kim <[email protected]>
Signed-off-by: yuye-aws <[email protected]>
Labels: backport 2.x, Features, v2.12.0