Fixes #4121: Better error messaging with vectordb query/get procedures #4132

Merged
merged 1 commit into from
Jul 3, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -74,9 +74,9 @@ CALL apoc.vectordb.chroma.get($host, '<collection_id>', ['1','2'], {<optional co
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | null | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null | null
| ...
|===

Expand All @@ -91,9 +91,9 @@ CALL apoc.vectordb.chroma.get($host, '<collection_id>', ['1','2'], {<optional co
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | 1 | [...] | ajeje | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | brazorf | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | 1 | [...] | ajeje | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | brazorf | null | null
| ...
|===

Expand All @@ -113,9 +113,9 @@ CALL apoc.vectordb.chroma.queryAndUpdate($host,
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text
| 1, | {city: "Berlin", foo: "one"} | 1 | [...] | ajeje
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | brazorf
| score | metadata | id | vector | text | errors
| 1 | {city: "Berlin", foo: "one"} | 1 | [...] | ajeje | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | brazorf | null
| ...
|===

Expand Up @@ -63,9 +63,9 @@ CALL apoc.vectordb.custom.get('https://<INDEX-ID>.svc.gcp-starter.pinecone.io/qu
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text
| 1, | {a: 1} | 1 | [1,2,3,4]
| 0.1 | {a: 2} | 2 | [1,2,3,4]
| score | metadata | id | vector | text | errors
| 1 | {a: 1} | 1 | [1,2,3,4] | null | null
| 0.1 | {a: 2} | 2 | [1,2,3,4] | null | null
| ...
|===

Expand Up @@ -76,12 +76,21 @@ CALL apoc.vectordb.milvus.get('http://localhost:19531', 'test_collection', [1,2]
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | null | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null | null
| ...
|===

If the call fails, for example when `apoc.vectordb.milvus.get` is given IDs of the wrong type as its 3rd parameter, the `errors` field is populated. For example:

.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | errors
| null | null | null | null | null | ..please check the primary key and its' type can only in [int, string], error: unable to cast "wrong" of type string to int64..
|===
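
On the caller's side, rows can be separated by whether `errors` is set. The following is a minimal Java sketch, not APOC code: it assumes each result row has already been read into a `Map` keyed by column name, and that the column is called `errors` as in the result tables. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of APOC: splits procedure rows by the `errors` column.
final class VectorDbRows {
    // Column name as shown in the result tables; treated here as an assumption.
    static final String ERRORS = "errors";

    /** Returns the rows whose `errors` column is null, i.e. regular results. */
    static List<Map<String, Object>> successes(List<Map<String, Object>> rows) {
        List<Map<String, Object>> ok = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (row.get(ERRORS) == null) {
                ok.add(row);
            }
        }
        return ok;
    }

    /** Returns the non-null `errors` payloads, one per failed row. */
    static List<Object> errors(List<Map<String, Object>> rows) {
        List<Object> found = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Object error = row.get(ERRORS);
            if (error != null) {
                found.add(error);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // HashMap is used because regular rows hold null in the errors column.
        Map<String, Object> good = new HashMap<>();
        good.put("id", 1L);
        good.put(ERRORS, null);

        Map<String, Object> bad = new HashMap<>();
        bad.put("id", null);
        bad.put(ERRORS, Map.of("message", "unable to cast \"wrong\" of type string to int64"));

        List<Map<String, Object>> rows = List.of(good, bad);
        System.out.println(successes(rows).size() + " ok, " + errors(rows).size() + " failed");
    }
}
```

Keeping the error payload as an opaque `Object` is deliberate here, since its shape may differ between vector databases.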

.Get vectors with `{allResults: true}`
[source,cypher]
----
Expand All @@ -92,9 +101,9 @@ CALL apoc.vectordb.milvus.get('http://localhost:19531', 'test_collection', [1,2]
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

Expand All @@ -115,12 +124,20 @@ CALL apoc.vectordb.milvus.query('http://localhost:19531',
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| 1, | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| 1 | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

If the call fails, for example when `apoc.vectordb.milvus.query` is given a vector of the wrong size as its 3rd parameter, the `errors` field is populated. For example:

.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | errors
| null | null | null | null | null | ..can only accept json format request, error: dimension: 4, but length of []float: 3: invalid parameter[expected=FloatVector][actual=[0.2,0.1,0.9]]..
|===
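
One way to avoid the round trip is to check the vector size before calling the procedure. This is a minimal sketch, assuming the caller already knows the collection's dimension (4 in the error shown above); `VectorDimensionCheck` and `describeMismatch` are hypothetical names, not part of APOC.

```java
import java.util.List;

// Hypothetical client-side guard, not part of APOC.
final class VectorDimensionCheck {

    /**
     * Returns null when the vector has the expected size,
     * otherwise a short description of the mismatch.
     */
    static String describeMismatch(List<Double> vector, int expectedDimension) {
        if (vector.size() == expectedDimension) {
            return null;
        }
        return "expected " + expectedDimension + " dimensions but got " + vector.size();
    }

    public static void main(String[] args) {
        // Same 3-element vector as in the failing query example.
        String problem = describeMismatch(List.of(0.2, 0.1, 0.9), 4);
        System.out.println(problem == null ? "vector size is fine" : problem);
    }
}
```

The check only covers the size mismatch; other failures would still surface through the `errors` field.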

We can define a mapping that uses the vector metadata to automatically create one or more nodes and relationships.

Expand Up @@ -92,9 +92,9 @@ CALL apoc.vectordb.pinecone.get($host, 'test-index', [1,2], {<optional config>})
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | null | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null | null
| ...
|===

Expand All @@ -108,9 +108,9 @@ CALL apoc.vectordb.pinecone.get($host, 'test-index', ['1','2'], {allResults: tru
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

Expand All @@ -129,9 +129,9 @@ CALL apoc.vectordb.pinecone.query($host,
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| 1, | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| 1 | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

Expand Up @@ -75,9 +75,9 @@ CALL apoc.vectordb.qdrant.get($hostOrKey, 'test_collection', [1,2], {<optional c
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | null | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null | null
| ...
|===

Expand All @@ -91,9 +91,9 @@ CALL apoc.vectordb.qdrant.get($hostOrKey, 'test_collection', [1,2], {allResults:
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

Expand All @@ -114,9 +114,9 @@ CALL apoc.vectordb.qdrant.query($hostOrKey,
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| 1, | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| 1 | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

Expand Up @@ -87,9 +87,9 @@ CALL apoc.vectordb.weaviate.get($host, 'test_collection', [1,2], {<optional conf
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | null | null | null | null | null
| null | {city: "Berlin", foo: "two"} | null | null | null | null | null
| ...
|===

Expand All @@ -104,9 +104,9 @@ CALL apoc.vectordb.weaviate.get($host, 'test_collection', [1,2], {allResults: tr
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| score | metadata | id | vector | text | entity | errors
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null | null
| null | {city: "Berlin", foo: "two"} | 2 | [...] | null | null | null
| ...
|===

Expand All @@ -126,12 +126,20 @@ CALL apoc.vectordb.weaviate.query($host,
.Example results
[opts="header"]
|===
| score | metadata | id | vector | text
| 1, | {city: "Berlin", foo: "one"} | 1 | [...] | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null
| score | metadata | id | vector | text | errors
| 1 | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| 0.1 | {city: "Berlin", foo: "two"} | 2 | [...] | null | null
| ...
|===

If the call fails, for example when `apoc.vectordb.weaviate.query` is given a vector of the wrong size as its 3rd parameter, the `errors` field is populated. For example:

.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | errors
| null | null | null | null | null | ..vector search: knn search: distance between entrypoint and query node: vector lengths don't match: 4 vs 3..
|===
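
The shape of the `errors` value appears to depend on the backend: judging from the tests in this PR, Milvus yields a map with a `message` entry, while Weaviate yields a list of such maps. The following is a hedged sketch of a helper that flattens either shape into plain messages. The class and method names are hypothetical, and other backends may return different structures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of APOC: flattens an `errors` payload into messages.
final class ErrorMessages {

    /** Accepts null, a single error map, or a list of error maps. */
    static List<String> messages(Object errors) {
        List<String> out = new ArrayList<>();
        if (errors instanceof Map) {
            addMessage((Map<?, ?>) errors, out);
        } else if (errors instanceof List) {
            for (Object item : (List<?>) errors) {
                if (item instanceof Map) {
                    addMessage((Map<?, ?>) item, out);
                } else if (item != null) {
                    out.add(item.toString());
                }
            }
        } else if (errors != null) {
            // Unknown shape: fall back to its string form.
            out.add(errors.toString());
        }
        return out;
    }

    // Uses the "message" entry when present, otherwise the whole map as text.
    private static void addMessage(Map<?, ?> error, List<String> out) {
        Object message = error.get("message");
        out.add(message != null ? message.toString() : error.toString());
    }

    public static void main(String[] args) {
        Object single = Map.of("message", "invalid parameter");
        Object listed = List.of(Map.of("message", "vector lengths don't match: 4 vs 3"));
        System.out.println(messages(single));
        System.out.println(messages(listed));
    }
}
```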

We can define a mapping that uses the vector metadata to fetch the associated nodes and relationships and, optionally, create them.

40 changes: 40 additions & 0 deletions extended-it/src/test/java/apoc/vectordb/MilvusTest.java
Expand Up @@ -35,6 +35,7 @@
import static apoc.vectordb.VectorDbTestUtil.getAuthHeader;
import static apoc.vectordb.VectorDbTestUtil.ragSetup;
import static apoc.vectordb.VectorEmbeddingConfig.ALL_RESULTS_KEY;
import static apoc.vectordb.VectorEmbeddingConfig.DEFAULT_ERRORS;
import static apoc.vectordb.VectorEmbeddingConfig.FIELDS_KEY;
import static apoc.vectordb.VectorEmbeddingConfig.MAPPING_KEY;
import static apoc.vectordb.VectorMappingConfig.EMBEDDING_KEY;
Expand All @@ -47,6 +48,7 @@
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.neo4j.configuration.GraphDatabaseSettings.DEFAULT_DATABASE_NAME;
import static org.neo4j.configuration.GraphDatabaseSettings.SYSTEM_DATABASE_NAME;

Expand Down Expand Up @@ -171,6 +173,20 @@ public void queryVectors() {
});
}

    @Test
    public void queryVectorsWithWrongVectorSize() {
        // A 3-element vector does not match the collection's dimension,
        // so the row should carry the Milvus error instead of a result.
        testCall(db, "CALL apoc.vectordb.milvus.query($host, 'test_collection', [0.2, 0.1, 0.9], null, 5, $conf)",
                map("host", HOST, "conf", map(FIELDS_KEY, FIELDS, ALL_RESULTS_KEY, true)),
                row -> {
                    Map error = (Map) row.get(DEFAULT_ERRORS);
                    String message = (String) error.get("message");
                    String expected = "invalid parameter";
                    assertTrue("Actual error message is: " + message,
                            message.contains(expected)
                    );
                });
    }

@Test
public void queryVectorsWithoutVectorResult() {
testResult(db, "CALL apoc.vectordb.milvus.query($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], null, 5, $conf)",
Expand Down Expand Up @@ -322,6 +338,30 @@ MAPPING_KEY, map(EMBEDDING_KEY, "vect",
);
}

    @Test
    public void getVectorsWithWrongVectorIdFormat() {
        db.executeTransactionally("CREATE (:Test {readID: 'one'}), (:Test {readID: 'two'})");

        Map<String, Object> conf = map(ALL_RESULTS_KEY, true,
                FIELDS_KEY, FIELDS,
                MAPPING_KEY, map(EMBEDDING_KEY, "vect",
                        NODE_LABEL, "Test",
                        ENTITY_KEY, "readID",
                        METADATA_KEY, "foo"));

        // The IDs are strings that Milvus cannot cast to the primary key type,
        // so the row should carry an error instead of a result.
        testCall(db, "CALL apoc.vectordb.milvus.get($host, 'test_collection', ['wrong', 'id'], $conf)",
                map("host", HOST, "conf", conf),
                r -> {
                    Map error = (Map) r.get(DEFAULT_ERRORS);
                    String message = (String) error.get("message");
                    String expected = "unable to cast";
                    assertTrue("Actual error message is: " + message,
                            message.contains(expected)
                    );
                }
        );
    }

@Test
public void queryVectorsWithCreateNodeUsingExistingNode() {

20 changes: 18 additions & 2 deletions extended-it/src/test/java/apoc/vectordb/WeaviateTest.java
Expand Up @@ -40,6 +40,7 @@
import static apoc.vectordb.VectorDbTestUtil.getAuthHeader;
import static apoc.vectordb.VectorDbTestUtil.ragSetup;
import static apoc.vectordb.VectorEmbeddingConfig.ALL_RESULTS_KEY;
import static apoc.vectordb.VectorEmbeddingConfig.DEFAULT_ERRORS;
import static apoc.vectordb.VectorEmbeddingConfig.FIELDS_KEY;
import static apoc.vectordb.VectorEmbeddingConfig.MAPPING_KEY;
import static apoc.vectordb.VectorMappingConfig.*;
Expand All @@ -48,6 +49,7 @@
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;
import static org.neo4j.configuration.GraphDatabaseSettings.DEFAULT_DATABASE_NAME;
import static org.neo4j.configuration.GraphDatabaseSettings.SYSTEM_DATABASE_NAME;
Expand Down Expand Up @@ -200,7 +202,21 @@ public void queryVectors() {
assertLondonResult(row, ID_2, FALSE);
assertNotNull(row.get("score"));
assertNotNull(row.get("vector"));
});
});
}

    @Test
    public void queryVectorsWithWrongVectorSize() {
        // Weaviate returns its errors as a list, so the first entry's message is checked.
        testCall(db, "CALL apoc.vectordb.weaviate.query($host, 'TestCollection', [0.2, 0.1, 0.9], null, 5, $conf)",
                map("host", HOST, "conf", map(ALL_RESULTS_KEY, true, FIELDS_KEY, FIELDS, HEADERS_KEY, ADMIN_AUTHORIZATION)),
                row -> {
                    List<Map> errors = (List<Map>) row.get(DEFAULT_ERRORS);
                    String message = (String) errors.get(0).get("message");
                    String expected = "vector lengths don't match";
                    assertTrue("Actual error message is: " + message,
                            message.contains(expected)
                    );
                });
    }

@Test
Expand Down Expand Up @@ -377,7 +393,7 @@ public void getReadOnlyVectorsWithMapping() {
METADATA_KEY, "foo")
);

testResult(db, "CALL apoc.vectordb.weaviate.get($host, 'TestCollection', [$id1, $id2], $conf) " +
testResult(db, "CALL apoc.vectordb.weaviate.get($host, 'TestCollection', ['$id1', $id2], $conf) " +
"YIELD vector, id, metadata, node RETURN * ORDER BY id",
MapUtil.map("host", HOST, "id1", ID_1, "id2", ID_2, "conf", conf),
r -> assertReadOnlyProcWithMappingResults(r, "node")
4 changes: 4 additions & 0 deletions extended/src/main/java/apoc/vectordb/Milvus.java
Expand Up @@ -23,6 +23,7 @@
import static apoc.vectordb.VectorDb.getEmbeddingResultStream;
import static apoc.vectordb.VectorDbHandler.Type.MILVUS;
import static apoc.vectordb.VectorDbUtil.*;
import static apoc.vectordb.VectorEmbeddingConfig.DEFAULT_ERRORS;

@Extended
public class Milvus {
Expand Down Expand Up @@ -180,6 +181,9 @@ public Stream<EmbeddingResult> queryAndUpdate(@Name("hostOrKey") String hostOrKe

    private Stream<Map> getMapStream(Map v) {
        var data = v.get("data");
        // No "data" entry in the response: return the whole response as the errors payload.
        if (data == null) {
            return Stream.of(Map.of(DEFAULT_ERRORS, v));
        }

        return ((List<Map>) data).stream()
                .map(i -> {
9 changes: 8 additions & 1 deletion extended/src/main/java/apoc/vectordb/VectorDb.java
Expand Up @@ -114,6 +114,12 @@ public static Stream<EmbeddingResult> getEmbeddingResultStream(VectorEmbeddingCo
}

    public static EmbeddingResult getEmbeddingResult(VectorEmbeddingConfig conf, Transaction tx, boolean hasEmbedding, boolean hasMetadata, VectorMappingConfig mapping, Map m) {
        Object errors = m.get(conf.getErrorsKey());
        // Short-circuit: an error row carries only the errors payload.
        if (errors != null) {
            return new EmbeddingResult(null, null, null, null, null, null, null,
                    errors);
        }

        Object id = conf.isAllResults() ? m.get(conf.getIdKey()) : null;
        List<Double> embedding = hasEmbedding ? (List<Double>) m.get(conf.getVectorKey()) : null;
        Map<String, Object> metadata = hasMetadata ? (Map<String, Object>) m.get(conf.getMetadataKey()) : null;
Expand All @@ -126,7 +132,8 @@ public static EmbeddingResult getEmbeddingResult(VectorEmbeddingConfig conf, Tra
        if (entity != null) entity = Util.rebind(tx, entity);
        return new EmbeddingResult(id, score, embedding, metadata, text,
                mapping.getNodeLabel() == null ? null : (Node) entity,
                mapping.getNodeLabel() != null ? null : (Relationship) entity
                mapping.getNodeLabel() != null ? null : (Relationship) entity,
                errors
        );
    }
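
Taken together, the changes to `Milvus.java` and `VectorDb.java` follow a simple pattern: a response without `data` becomes a single row whose only populated field is the errors payload. The sketch below models that flow with plain maps and a two-field stand-in for `EmbeddingResult`. The `errors` key name, the `Row` class, and the method names are assumptions for illustration, not the actual APOC classes.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Simplified, hypothetical model of the error propagation; not the APOC implementation.
final class ErrorRowSketch {
    // Assumed key under which the error payload travels.
    static final String ERRORS = "errors";

    /** Two-field stand-in for a result row: a regular id or an errors payload. */
    static final class Row {
        final Object id;
        final Object errors;

        Row(Object id, Object errors) {
            this.id = id;
            this.errors = errors;
        }
    }

    /** A response without "data" is wrapped whole into a single error map. */
    static Stream<Map<String, Object>> toRows(Map<String, Object> response) {
        Object data = response.get("data");
        if (data == null) {
            Map<String, Object> errorRow = Map.of(ERRORS, response);
            return Stream.of(errorRow);
        }
        @SuppressWarnings("unchecked")
        List<Map<String, Object>> items = (List<Map<String, Object>>) data;
        return items.stream();
    }

    /** Error maps short-circuit: every field except errors stays null. */
    static Row toResult(Map<String, Object> m) {
        Object errors = m.get(ERRORS);
        if (errors != null) {
            return new Row(null, errors);
        }
        return new Row(m.get("id"), null);
    }

    public static void main(String[] args) {
        Map<String, Object> failure = Map.of("message", "invalid parameter");
        List<Row> rows = toRows(failure).map(ErrorRowSketch::toResult).collect(Collectors.toList());
        System.out.println("rows: " + rows.size() + ", errors: " + rows.get(0).errors);
    }
}
```

Returning the error as a row, rather than throwing, lets a caller process the error alongside regular results within the same query.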
