Enable '.' for nested field in text embedding processor #811

@martin-gaievski martin-gaievski commented Jul 3, 2024

Description

Adds support for dot-separated paths to nested fields in the inference processor definition. The main purpose is to improve the user experience when an object has a complex hierarchical structure, so that users can write the processor definition with a simpler syntax.

Example of such new format:

"a.b.c.d": "field"

or

"a.b" : {
   "c.d": "field"
}

In either case, the processor looks for the following structure in the ingest document:

"a" : {
   "b": {
     "c" : {
       "d" : "field"
     }
   }
}

Today only the hierarchical form of the mapping is supported. It must mirror the structure of the document exactly:

"a" : {
   "b": {
     "c" : {
       "d" : "field"
     }
   }
}
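The equivalence between the dotted and the hierarchical forms can be sketched as follows. This is a minimal illustration in Python, not the plugin's actual implementation; the helper names `expand_dotted_keys` and `_merge` are hypothetical, and the sketch assumes that keys are split on every `.` and that overlapping paths are merged.

```python
def _merge(target, source):
    """Recursively merge `source` into `target` (hypothetical helper)."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge(target[key], value)
        else:
            target[key] = value


def expand_dotted_keys(mapping):
    """Expand dot-separated keys into nested dicts (hypothetical helper)."""
    result = {}
    for key, value in mapping.items():
        # Expand nested maps first, so {"c.d": "field"} is handled too.
        if isinstance(value, dict):
            value = expand_dotted_keys(value)
        # Wrap the value in one dict per path segment, innermost first.
        nested = value
        for part in reversed(key.split(".")):
            nested = {part: nested}
        _merge(result, nested)
    return result


hierarchical = {"a": {"b": {"c": {"d": "field"}}}}
fully_dotted = expand_dotted_keys({"a.b.c.d": "field"})
partly_dotted = expand_dotted_keys({"a.b": {"c.d": "field"}})
```

Under these assumptions, both new forms from the description expand to the same hierarchical mapping that is supported today.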

Note: this change affects only the source field (the left side of the mapping, which holds the value that embeddings are generated from). The logic for the destination field (the right side of the mapping, the field that stores the generated embeddings) is unchanged. As with today's logic, the destination field is expected at the same level as the final source field. For example, with "a.b.c": "d" the embeddings are inserted in the following structure:

"a" : {
   "b" : {
     "d" : [0.1, 0.2 ....]
   }
}
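The placement rule for the destination field can be illustrated with a small Python sketch. It is not the processor's code: `insert_embedding` is a hypothetical helper, the embedding values are made up, and the sketch assumes that the source field stays in the document next to the new destination field.

```python
def insert_embedding(document, source_path, destination, embedding):
    """Store `embedding` next to the last segment of `source_path`.

    Hypothetical helper: the destination name is not split on '.',
    it is written as a sibling of the final source field.
    """
    parts = source_path.split(".")
    parent = document
    # Walk down to the object that contains the final source field.
    for part in parts[:-1]:
        parent = parent[part]
    parent[destination] = embedding
    return document


doc = {"a": {"b": {"c": "text to embed"}}}
result = insert_embedding(doc, "a.b.c", "d", [0.1, 0.2])
```

Here `result["a"]["b"]` holds both the original `c` field and the new `d` field with the vector, which matches the "same level as the final source field" rule described in the note.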

Issues Resolved

#110

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@martin-gaievski martin-gaievski added the Enhancements, backport 2.x, and v2.16.0 labels Jul 3, 2024
@martin-gaievski martin-gaievski changed the title from "Treat . in the field name as a nested field in the fields map of text embedding processor" to "Enable '.' for nested field in text embedding processor" Jul 3, 2024
@martin-gaievski martin-gaievski force-pushed the nested_structure_in_normalization_processor_mapping branch from b33f315 to 6d808c9 Compare July 3, 2024 01:14

@vibrantvarun vibrantvarun left a comment

Looks good. Curious how nested field is processed currently without this change?

@martin-gaievski

> Looks good. Curious how nested field is processed currently without this change?

Today we treat a field name that contains . as a single field name. If someone specifies "a.b" : "c", we pick up the field only if the document has a field literally named "a.b", not "a": { "b" }. There is a workaround: the user can write the mapping as structured JSON; I put an example in the PR description. However, the UX isn't great when objects have a complex structure.
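The difference between the two lookups can be sketched in Python. This is an illustration only, with hypothetical helper names (`lookup_literal`, `lookup_path`); it does not show how the processor resolves fields internally.

```python
def lookup_literal(document, field_name):
    """Behavior before the change: the whole name is one key."""
    return document.get(field_name)


def lookup_path(document, field_name):
    """Dotted-path lookup: each segment descends one level."""
    node = document
    for part in field_name.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


flat_doc = {"a.b": "some text"}
nested_doc = {"a": {"b": "some text"}}

literal_on_flat = lookup_literal(flat_doc, "a.b")
literal_on_nested = lookup_literal(nested_doc, "a.b")
path_on_nested = lookup_path(nested_doc, "a.b")
```

With the literal lookup only `flat_doc` yields a value, while the path lookup finds the text inside `nested_doc`.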

@martin-gaievski martin-gaievski merged commit fb1f1fd into opensearch-project:main Jul 9, 2024
81 checks passed
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-811-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 fb1f1fda2755676163935dcc278abede8e82bf87
# Push it to GitHub
git push --set-upstream origin backport/backport-811-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-811-to-2.x.

martin-gaievski added a commit that referenced this pull request Jul 9, 2024
* Added nested structure for text embed processor mapping

Signed-off-by: Martin Gaievski <[email protected]>
(cherry picked from commit fb1f1fd)
vibrantvarun pushed a commit that referenced this pull request Jul 9, 2024
* Added nested structure for text embed processor mapping

Signed-off-by: Martin Gaievski <[email protected]>
(cherry picked from commit fb1f1fd)
vibrantvarun pushed a commit to vibrantvarun/neural-search that referenced this pull request Jul 9, 2024
…roject#811)

* Added nested structure for text embed processor mapping

Signed-off-by: Martin Gaievski <[email protected]>
vibrantvarun added a commit that referenced this pull request Jul 9, 2024
* Adds method_parameters in neural search query to support ef_search (#787) (#814)

Signed-off-by: Tejas Shah <[email protected]>

* Add BWC for batch ingestion (#769)

* Add BWC for batch ingestion

Signed-off-by: Liyun Xiu <[email protected]>

* Update Changelog

Signed-off-by: Liyun Xiu <[email protected]>

* Fix spotlessLicenseCheck

Signed-off-by: Liyun Xiu <[email protected]>

* Fix comments

Signed-off-by: Liyun Xiu <[email protected]>

* Reuse the same code

Signed-off-by: Liyun Xiu <[email protected]>

* Rename some functions

Signed-off-by: Liyun Xiu <[email protected]>

* Rename a function

Signed-off-by: Liyun Xiu <[email protected]>

* Minor change to trigger rebuild

Signed-off-by: Liyun Xiu <[email protected]>

---------

Signed-off-by: Liyun Xiu <[email protected]>

* Neural sparse query two-phase search processor's bwc test (#777)

* Poc of pipeline

Signed-off-by: conggguan <[email protected]>

* Complete some settings for two phase pipeline.

Signed-off-by: conggguan <[email protected]>

* Change the implement of two-phase from QueryBuilderVistor to custom process funciton.

Signed-off-by: conggguan <[email protected]>

* Add It and fix some bug on the state of multy same neuralsparsequerybuilder.

Signed-off-by: conggguan <[email protected]>

* Simplify some logic, and correct some format.

Signed-off-by: conggguan <[email protected]>

* Optimize some format.

Signed-off-by: conggguan <[email protected]>

* Add some test case.

Signed-off-by: conggguan <[email protected]>

* Optimize some logic for zhichao-aws's comments.

Signed-off-by: conggguan <[email protected]>

* Optimize a line without application.

Signed-off-by: conggguan <[email protected]>

* Add some comments, remove some redundant lines, fix some format.

Signed-off-by: conggguan <[email protected]>

* Remove a redundant null check, fix a if format.

Signed-off-by: conggguan <[email protected]>

* Fix a typo for a comment, camelcase format for some variable.

Signed-off-by: conggguan <[email protected]>

* Add some comments to illustrate the influence of the modify on 2-phase search pipeline to neural sparse query builder.

Signed-off-by: conggguan <[email protected]>

* Add restart and rolling upgrade bwc test for neural sparse two phase processor.

Signed-off-by: conggguan <[email protected]>

* Spotless on qa.

Signed-off-by: conggguan <[email protected]>

* Update change log for two-phase BWC test.

Signed-off-by: conggguan <[email protected]>

* Remove redundant lines of two-phase BWC test.

Signed-off-by: conggguan <[email protected]>

* Add changelog.

Signed-off-by: conggguan <[email protected]>

* Add the PR link and number for the CHANGELOG.md.

Signed-off-by: conggguan <[email protected]>

* [Fix] NeuralSparseTwoPhaseProcessorIT created wrong ingest pipeline, fix it to correct API.

Signed-off-by: conggguan <[email protected]>

---------

Signed-off-by: conggguan <[email protected]>
Signed-off-by: conggguan <[email protected]>

* Enable '.' for nested field in text embedding processor (#811)

* Added nested structure for text embed processor mapping

Signed-off-by: Martin Gaievski <[email protected]>

* Fix linux build CI error due to action runner env upgrade node 20 (#821)

* Fix linux build CI error due to action runner env upgrade node 20

Signed-off-by: Varun Jain <[email protected]>

* Fix linux build on additional integ tests

Signed-off-by: Varun Jain <[email protected]>

---------

Signed-off-by: Varun Jain <[email protected]>

---------

Signed-off-by: Tejas Shah <[email protected]>
Signed-off-by: Liyun Xiu <[email protected]>
Signed-off-by: conggguan <[email protected]>
Signed-off-by: conggguan <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: Varun Jain <[email protected]>
Co-authored-by: Tejas Shah <[email protected]>
Co-authored-by: Liyun Xiu <[email protected]>
Co-authored-by: conggguan <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>
vibrantvarun added a commit that referenced this pull request Jul 9, 2024
…827)

* Fix jdk version for CI test secure cluster action (#801) (#806)

Signed-off-by: Martin Gaievski <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>

* [Part 1] Collector for Sorting Results (#797)

* [Part 2] Normalization Phase for Sorting (#802)

* Normalization Phase for Sorting

Signed-off-by: Varun Jain <[email protected]>

* Fixing compile test issue

Signed-off-by: Varun Jain <[email protected]>

* Optimize code

Signed-off-by: Varun Jain <[email protected]>

* Add method description

Signed-off-by: Varun Jain <[email protected]>

* [Part 1] Collector for Sorting Results (#797)

* HybridSearchSortUtil class

Signed-off-by: Varun Jain <[email protected]>

* Add Integ Tests

Signed-off-by: Varun Jain <[email protected]>

* Add Sorting Integ tests

Signed-off-by: Varun Jain <[email protected]>

* Add integ test for Sorting

Signed-off-by: Varun Jain <[email protected]>

* Refactoring normalization processor workflow

Signed-off-by: Varun Jain <[email protected]>

* Fix Unit Tests

Signed-off-by: Varun Jain <[email protected]>

* Refactoring

Signed-off-by: Varun Jain <[email protected]>

* Refactoring

Signed-off-by: Varun Jain <[email protected]>

* Address Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Optimising Normalization

Signed-off-by: Varun Jain <[email protected]>

* Address Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Address Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Addressing Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Addressing Vijay comments

Signed-off-by: Varun Jain <[email protected]>

* Address Vijay Comments

Signed-off-by: Varun Jain <[email protected]>

---------

Signed-off-by: Varun Jain <[email protected]>

* Update bwc workflow to include 2.16.0-SNAPSHOT (#809) (#810)

* Increment BWC version

* Append 2.16.0-SNAPSHOTn in restart upgrade tests

---------

Signed-off-by: Varun Jain <[email protected]>

* [Part 3] Concurrent segment search bug in Sorting (#808)

* Cherry picking Concurrent Segment Search Bug Commit

Signed-off-by: Varun Jain <[email protected]>

* Fix Concurrent Segment Search Bug in Sorting

Signed-off-by: Varun Jain <[email protected]>

* Functional Interface

Signed-off-by: Varun Jain <[email protected]>

* Addressing Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Removing comments

Signed-off-by: Varun Jain <[email protected]>

* Addressing Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Addressing Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Addressing Martin commnents

Signed-off-by: Varun Jain <[email protected]>

* Address Martin Comments

Signed-off-by: Varun Jain <[email protected]>

* Address Martin Comments

Signed-off-by: Varun Jain <[email protected]>

---------

Signed-off-by: Varun Jain <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>

* Rebasing with main (#826)

* Add changelog

Signed-off-by: Varun Jain <[email protected]>

---------

Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: Varun Jain <[email protected]>
Signed-off-by: Tejas Shah <[email protected]>
Signed-off-by: Liyun Xiu <[email protected]>
Signed-off-by: conggguan <[email protected]>
Signed-off-by: conggguan <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>
Co-authored-by: Tejas Shah <[email protected]>
Co-authored-by: Liyun Xiu <[email protected]>
Co-authored-by: conggguan <[email protected]>
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 9, 2024
…827)

(cherry picked from commit d22e1b8)
vibrantvarun added a commit that referenced this pull request Jul 9, 2024
…827) (#829)

(cherry picked from commit d22e1b8)

Co-authored-by: Varun Jain <[email protected]>