Vector field getting "appended" instead of replacing #207

Closed
mattkallo opened this issue Jun 2, 2023 · 4 comments
Labels
bug Something isn't working

mattkallo commented Jun 2, 2023

Describe the bug
Updating a document whose knn_vector field is populated through an ingest pipeline re-computes the vector and tries to "append" it to the vector field, which doubles the dimension. I am not sure whether this is a bug or expected behavior.

Steps to reproduce the behavior:

  1. Create an index with a knn_vector field (e.g. content_vector).
  2. Add an ingest pipeline that uses a model to generate an embedding from the "content" field and writes it to "content_vector".
  3. Index a few documents.
  4. Try to update one document (any field of that document).
  5. It throws the error below:
      "cause": {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse field [content_vector] of type [knn_vector] in document with id 'eb10dee8-0e07-47fa-a338-26865cecf585'. Preview of field's value: 'null'",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "Vector dimension mismatch. Expected: 768, Given: 1536"
        }
      },
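
The numbers in the error (768 expected, 1536 given) are consistent with a second 768-dimensional embedding being added to the existing one instead of replacing it. A minimal sketch of that arithmetic follows. It is a toy model only: `embed`, `ingest_appending`, and `ingest_replacing` are hypothetical helpers, not the plugin's actual code.

```python
# Toy model of the symptom. All helper names here are hypothetical.
EXPECTED_DIM = 768  # dimension reported as "Expected" in the error


def embed(text):
    """Stand-in for the model call; returns a dummy vector of the mapped size."""
    return [0.0] * EXPECTED_DIM


def ingest_appending(source):
    """Assumed faulty behavior: the new embedding is added to the existing field."""
    updated = dict(source)
    updated["content_vector"] = source.get("content_vector", []) + embed(source["content"])
    return updated


def ingest_replacing(source):
    """Expected behavior: the field is overwritten with the fresh embedding."""
    updated = dict(source)
    updated["content_vector"] = embed(source["content"])
    return updated


doc = ingest_replacing({"content": "some text"})  # initial indexing
after_bad_update = ingest_appending(doc)          # update re-runs the pipeline
after_good_update = ingest_replacing(doc)

print(len(after_bad_update["content_vector"]))   # 1536
print(len(after_good_update["content_vector"]))  # 768
```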

Expected behavior
The update should be applied to the document source and the request should return without error.

Plugins
Default plugins only. No additional ones installed.

Host/Environment (please complete the following information):

  • OS: Linux
  • Version 2.5.0

Additional context
Pipeline

PUT _ingest/pipeline/nlp-pipeline
{
  "description": "Pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "gJCndYgBjZUvSFOP3Cwb",
        "field_map": {
          "content": "content_vector"
        },
        "ignore_failure": true
      }
    }
  ]
}
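
The report does not include the index definition from step 1. As a rough sketch, a request body along the following lines would match the described setup. The index name is taken from the update request and the dimension from the error message; the `title` and `content` field types and the use of `index.default_pipeline` to attach the pipeline are assumptions, not details confirmed by the reporter.

```python
import json

# Assumed index definition; only the knn_vector field and its dimension
# (768, from the error message) come from the report itself.
index_body = {
    "settings": {
        "index.knn": True,
        # Assumption: the pipeline runs on updates because it is the index default.
        "index.default_pipeline": "nlp-pipeline",
    },
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "content": {"type": "text"},
            "content_vector": {"type": "knn_vector", "dimension": 768},
        }
    },
}

# Print in the same console style as the other requests in this issue.
print("PUT /content-index")
print(json.dumps(index_body, indent=2))
```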

Update document

POST /content-index/_update_by_query
{
  "script": {
    "source": "ctx._source.title='Autonomous planning'",
    "lang": "painless"
  },
  "query": {
    "match": {
      "_id": "eb10dee8-0e07-47fa-a338-26865cecf585"
    }
  }
}

mattkallo added the bug and untriaged labels on Jun 2, 2023
mattkallo changed the title from "[BUG]" to "Vector field getting appended instead of replacing" on Jun 2, 2023
mattkallo changed the title from "Vector field getting appended instead of replacing" to "Vector field getting "appended" instead of replacing" on Jun 2, 2023
minalsha transferred this issue from opensearch-project/OpenSearch on Jun 7, 2023

minalsha commented Jun 7, 2023

@vamshin would you help triage this issue? Thanks

vamshin commented Jun 22, 2023

This seems to be a neural-search plugin ingest issue.

navneet1v commented

@mattkallo the code fix has been deployed in the 2.9 branch. Once 2.9 is released, this bug will be fixed.

navneet1v commented

Resolving, as 2.9 is released.
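
For anyone verifying the fix after upgrading, one simple check is to fetch an updated document and compare the length of its vector with the mapped dimension. The helper below is a sketch under assumptions: `vector_dimension_ok` is a hypothetical name, and the input is assumed to be a document response shaped like `{"_id": ..., "_source": {...}}`.

```python
def vector_dimension_ok(hit, field="content_vector", expected=768):
    """Return True if the stored vector has exactly the mapped dimension."""
    vector = hit.get("_source", {}).get(field)
    return isinstance(vector, list) and len(vector) == expected


# Illustrative responses built by hand, not real cluster output.
replaced = {"_id": "doc-1", "_source": {"title": "t", "content_vector": [0.0] * 768}}
doubled = {"_id": "doc-2", "_source": {"title": "t", "content_vector": [0.0] * 1536}}
missing = {"_id": "doc-3", "_source": {"title": "t"}}

print(vector_dimension_ok(replaced))  # True
print(vector_dimension_ok(doubled))   # False
print(vector_dimension_ok(missing))   # False
```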
