[FEATURE] Hybrid search and collapse compatibility #665

Open
qmauret opened this issue Mar 29, 2024 · 4 comments
Assignees
Labels
Features Introduces a new unit of functionality that satisfies a requirement

Comments

@qmauret

qmauret commented Mar 29, 2024

Describe the bug

Using the collapse feature in a hybrid search does not collapse documents.

Related component

Search

To Reproduce

I'm trying to combine hybrid search (semantic + keyword) with the collapse feature to deduplicate products that share the same visual.

I have tried collapse on a basic search, which works great.

With hybrid search, the behaviour is different. Products from the same visual are placed in the inner_hits field but are not collapsed: they are still present at the root level of the search results, which is not the expected behaviour.

Is anyone aware of a compatibility problem between hybrid search and collapse?
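The symptom described above can be checked programmatically. The following is a minimal Python sketch, assuming the search response has already been parsed into a dict with the usual `hits.hits` layout and that `visual` is returned as a plain object inside `_source`. The helper names `field_value` and `duplicated_keys` are hypothetical and not part of any OpenSearch client.

```python
from collections import Counter


def field_value(source, dotted_path):
    """Walk a dotted path such as 'visual.id_visual' through nested dicts.

    Returns None when any part of the path is missing.
    """
    value = source
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def duplicated_keys(response, field):
    """Return the collapse-field values that occur more than once
    among the root-level hits of a parsed search response."""
    counts = Counter(
        field_value(hit.get("_source", {}), field)
        for hit in response["hits"]["hits"]
    )
    # Hits that lack the field are ignored (assumption made for this sketch).
    return [key for key, n in counts.items() if key is not None and n > 1]
```

If collapse worked as expected, `duplicated_keys(response, "visual.id_visual")` would return an empty list; a non-empty list means several root-level hits share the same visual.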

Expected behavior

I expect the same behaviour as performing a collapse on a non-hybrid search.

Additional Details

Host/Environment (please complete the following information):

  • OS: AWS
  • Version : 2.11

Additional context
Basic search with collapse (working as expected) :

GET /product_1/_search
{
  "_source": {
    "includes": ["_id", "name", "category_name", "visual.id_visual"]
  },
  "query": {
    "match": {
      "name": {
        "query": "Ski"
      }
    }
  },
  "collapse": {
    "field": "visual.id_visual",
    "inner_hits": {
      "size": 1,
      "name": "from_same_visual",
      "sort": [
        {
          "_score": "desc"
        }
      ]
    }
  }
}

Hybrid search with collapse (not working) :

GET /product_1/_search?search_pipeline=search_pipeline
{
  "_source": {
    "includes": ["_id", "name", "category_name", "visual.id_visual"]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "neural": {
            "fullname_v": {
              "query_text": "Ski",
              "model_id": "xxx",
              "k": 200
            }
          }
        },
        {
          "multi_match": {
            "query": "Ski",
            "type": "most_fields",
            "fields": ["category.name^2", "name^4", "tags.name^3"],
            "fuzziness": "AUTO",
            "prefix_length": 0,
            "max_expansions": 10
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "visual.id_visual",
    "inner_hits": {
      "size": 1,
      "name": "from_same_visual",
      "sort": [
        {
          "_score": "desc"
        }
      ]
    }
  }
}
@qmauret qmauret added bug Something isn't working untriaged labels Mar 29, 2024
@peternied
Member

peternied commented Apr 3, 2024

[Triage - attendees 1 2 3 4 5 6 7 8]
@opensearch-project/admin Could you transfer this to the neural search repository? It seems related to its functionality.

@bbarani bbarani transferred this issue from opensearch-project/OpenSearch Apr 3, 2024
@martin-gaievski
Member

@qmauret the collapse functionality is not supported by the hybrid query. The team will look into the feasibility of adding it.
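Until collapse is supported for hybrid queries, one possible workaround is to deduplicate the returned hits on the client side. This is a minimal Python sketch and not something proposed in this thread. It assumes the hits arrive sorted by descending score and that `visual` is returned as a plain object inside `_source`; the helper names `field_value` and `collapse_hits` are hypothetical.

```python
def field_value(source, dotted_path):
    """Walk a dotted path such as 'visual.id_visual' through nested dicts.

    Returns None when any part of the path is missing.
    """
    value = source
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def collapse_hits(hits, field):
    """Keep only the first hit for each distinct value of `field`.

    Because the input is assumed to be sorted by descending score,
    the first hit per value is also the best-scoring one.
    """
    seen = set()
    collapsed = []
    for hit in hits:
        key = field_value(hit.get("_source", {}), field)
        if key is None:
            # Assumption for this sketch: hits without the field are kept as-is.
            collapsed.append(hit)
            continue
        if key in seen:
            continue  # a better-ranked hit with the same value was already kept
        seen.add(key)
        collapsed.append(hit)
    return collapsed
```

A call such as `collapse_hits(response["hits"]["hits"], "visual.id_visual")` returns a shorter list than the requested page size whenever duplicates are present, so a caller would probably need to request more hits than it intends to display.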

@getsaurabh02 getsaurabh02 moved this from 🆕 New to Later (6 months plus) in Search Project Board Aug 15, 2024
@sonic182

Hi, I'm having the same issue. For certain customers with products and product variants, it is ugly to have the same result repeated (e.g. once per product size, where the image is the same).

It would be nice to have this feature for hybrid queries 👍

@navneet1v navneet1v moved this from Backlog to Backlog (Hot) in Vector Search RoadMap Sep 14, 2024
@navneet1v
Collaborator

@naveentatikonda naveentatikonda added Features Introduces a new unit of functionality that satisfies a requirement and removed bug Something isn't working labels Sep 18, 2024
@naveentatikonda naveentatikonda changed the title Hybrid search and collapse compatibility [FEATURE] Hybrid search and collapse compatibility Sep 18, 2024
Projects
Status: Later (6 months plus)
Status: Backlog (Hot)
Development

No branches or pull requests

7 participants