Commit

Fixes #4090: The apoc.vectordb.*.get/query procedures should search for nodes/relationships with mapping config
vga91 committed May 29, 2024
1 parent 368d62e commit 8f4cc9c
Showing 19 changed files with 330 additions and 192 deletions.
@@ -119,17 +119,6 @@ CALL apoc.vectordb.chroma.queryAndUpdate($host,
| ...
|===

[NOTE]
====
We can use mapping with `apoc.vectordb.chroma.getAndUpdate` procedure as well
====

[NOTE]
====
To optimize performance, we can choose what to `YIELD` with the `apoc.vectordb.chroma.query` and `apoc.vectordb.chroma.get` procedures.
For example, by executing `CALL apoc.vectordb.chroma.query(...) YIELD metadata, score, id`, the REST API request will contain `{"include": ["metadatas", "documents", "distances"]}`,
so that values we do not need are not returned.
====

We can define a mapping, to fetch the associated nodes and relationships and optionally create them, by leveraging the vector metadata.

@@ -199,6 +188,17 @@ which populates the two relationships as: `()-[:TEST {myId: 'one', city: 'Berlin
and `()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-()`,
which will be returned in the `entity` column result.

[NOTE]
====
We can use mapping with `apoc.vectordb.chroma.get*` procedures as well
====

[NOTE]
====
To optimize performance, we can choose what to `YIELD` with the `apoc.vectordb.chroma.query` and `apoc.vectordb.chroma.get` procedures.
For example, by executing `CALL apoc.vectordb.chroma.query(...) YIELD metadata, score, id`, the REST API request will contain `{"include": ["metadatas", "documents", "distances"]}`,
so that values we do not need are not returned.
====

.Delete vectors (it leverages https://docs.trychroma.com/usage-guide#deleting-data-from-a-collection[this API])
[source,cypher]
@@ -189,9 +189,28 @@ which populates the two relationships as: `()-[:TEST {myId: 'one', city: 'Berlin
and `()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-()`,
which will be returned in the `entity` column result.


We can also use a mapping with the `apoc.vectordb.milvus.query` procedure to search for existing nodes or relationships that match the label/type and `metadataKey`, without making any updates.
For example, with the previous relationships, we can execute the following procedure, which only returns the relationships in the `rel` column:

[source,cypher]
----
CALL apoc.vectordb.milvus.query('http://localhost:19531', 'test_collection',
[0.2, 0.1, 0.9, 0.7],
{},
5,
{ mapping: {
embeddingKey: "vect",
relType: "TEST",
entityKey: "myId",
metadataKey: "foo"
}
})
----
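
As a rough mental model of this read-only lookup, the following plain-Java sketch mimics the matching step with in-memory collections instead of a real graph. The names `ReadOnlyMappingSketch`, `Rel` and `findRels` are hypothetical and not part of APOC. The sketch assumes that a relationship matches when its type equals `relType` and its `entityKey` property equals the value stored under `metadataKey` in the vector metadata, and that an unmatched vector simply yields `null`; the actual procedure may behave differently in that last case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ReadOnlyMappingSketch {

    // Hypothetical stand-in for a stored relationship: a type plus its properties.
    static final class Rel {
        final String type;
        final Map<String, Object> props;

        Rel(String type, Map<String, Object> props) {
            this.type = type;
            this.props = props;
        }
    }

    // For each vector's metadata, look up (never create) the first relationship whose
    // type is relType and whose entityKey property equals metadata[metadataKey].
    // Assumption: an unmatched vector yields null in the result list.
    static List<Rel> findRels(List<Rel> graph, List<Map<String, Object>> metadatas,
                              String relType, String entityKey, String metadataKey) {
        List<Rel> result = new ArrayList<>();
        for (Map<String, Object> metadata : metadatas) {
            Object wanted = metadata.get(metadataKey);
            Rel match = null;
            for (Rel rel : graph) {
                if (wanted != null
                        && rel.type.equals(relType)
                        && wanted.equals(rel.props.get(entityKey))) {
                    match = rel;
                    break;
                }
            }
            result.add(match);
        }
        return result;
    }

    public static void main(String[] args) {
        // The two relationships from the previous example, reduced to the relevant properties.
        List<Rel> graph = List.of(
                new Rel("TEST", Map.of("myId", "one", "city", "Berlin")),
                new Rel("TEST", Map.of("myId", "two", "city", "London")));
        // Metadata of two vectors as a vector database might return them.
        List<Map<String, Object>> metadatas = List.of(
                Map.of("foo", "two"),
                Map.of("foo", "missing"));

        List<Rel> rels = findRels(graph, metadatas, "TEST", "myId", "foo");
        System.out.println(rels.get(0).props.get("city")); // London
        System.out.println(rels.get(1));                    // null
    }
}
```

The graph is only read here: nothing is created or modified, which is the difference from the `queryAndUpdate` variant described earlier.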

[NOTE]
====
We can use mapping with `apoc.vectordb.milvus.getAndUpdate` procedure as well
We can use mapping with `apoc.vectordb.milvus.get*` procedures as well
====

[NOTE]
@@ -203,9 +203,28 @@ which populates the two relationships as: `()-[:TEST {myId: 'one', city: 'Berlin
and `()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-()`,
which will be returned in the `entity` column result.


We can also use a mapping with the `apoc.vectordb.pinecone.query` procedure to search for existing nodes or relationships that match the label/type and `metadataKey`, without making any updates.
For example, with the previous relationships, we can execute the following procedure, which only returns the relationships in the `rel` column:

[source,cypher]
----
CALL apoc.vectordb.pinecone.query($host, 'test-index',
[0.2, 0.1, 0.9, 0.7],
{},
5,
{ mapping: {
embeddingKey: "vect",
relType: "TEST",
entityKey: "myId",
metadataKey: "foo"
}
})
----

[NOTE]
====
We can use mapping with `apoc.vectordb.pinecone.getAndUpdate` procedure as well
We can use mapping with `apoc.vectordb.pinecone.get*` procedures as well
====

[NOTE]
@@ -191,9 +191,27 @@ which populates the two relationships as: `()-[:TEST {myId: 'one', city: 'Berlin
and `()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-()`,
which will be returned in the `entity` column result.


We can also use a mapping with the `apoc.vectordb.qdrant.query` procedure to search for existing nodes or relationships that match the label/type and `metadataKey`, without making any updates.
For example, with the previous relationships, we can execute the following procedure, which only returns the relationships in the `rel` column:

[source,cypher]
----
CALL apoc.vectordb.qdrant.query($hostOrKey, 'test_collection',
[0.2, 0.1, 0.9, 0.7],
{},
5,
{ mapping: {
relType: "TEST",
entityKey: "myId",
metadataKey: "foo"
}
})
----

[NOTE]
====
We can use mapping with `apoc.vectordb.qdrant.getAndUpdate` procedure as well
We can use mapping with `apoc.vectordb.qdrant.get*` procedures as well
====

[NOTE]
@@ -205,9 +205,29 @@ and `()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-()`,
which will be returned in the `entity` column result.


We can also use a mapping with the `apoc.vectordb.weaviate.query` procedure to search for existing nodes or relationships that match the label/type and `metadataKey`, without making any updates.
For example, with the previous relationships, we can execute the following procedure, which only returns the relationships in the `rel` column:

[source,cypher]
----
CALL apoc.vectordb.weaviate.query($host, 'test_collection',
[0.2, 0.1, 0.9, 0.7],
{},
5,
{ fields: ["city", "foo"],
mapping: {
relType: "TEST",
entityKey: "myId",
metadataKey: "foo"
}
})
----



[NOTE]
====
We can use mapping with `apoc.vectordb.weaviate.getAndUpdate` procedure as well
We can use mapping with `apoc.vectordb.weaviate.get*` procedures as well
====

[NOTE]
47 changes: 27 additions & 20 deletions extended-it/src/test/java/apoc/vectordb/ChromaDbTest.java
@@ -18,13 +18,15 @@
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

import static apoc.ml.RestAPIConfig.HEADERS_KEY;
import static apoc.util.MapUtil.map;
import static apoc.util.TestUtil.testCall;
import static apoc.util.TestUtil.testResult;
import static apoc.vectordb.VectorDbHandler.Type.CHROMA;
import static apoc.vectordb.VectorDbTestUtil.assertBerlinResult;
import static apoc.vectordb.VectorDbTestUtil.assertLondonResult;
import static apoc.vectordb.VectorDbTestUtil.assertNodesCreated;
import static apoc.vectordb.VectorDbTestUtil.assertReadOnlyProcWithMappingResults;
import static apoc.vectordb.VectorDbTestUtil.assertRelsCreated;
import static apoc.vectordb.VectorDbTestUtil.dropAndDeleteAll;
import static apoc.vectordb.VectorDbTestUtil.EntityType.*;
@@ -294,19 +296,22 @@ MAPPING_KEY, map(EMBEDDING_KEY, "vect",
assertNodesCreated(db);
}


@Test
public void getReadOnlyVectorsWithMapping() {
db.executeTransactionally("CREATE (:Test {readID: 'one'}), (:Test {readID: 'two'})");

Map<String, Object> conf = map(ALL_RESULTS_KEY, true,
MAPPING_KEY, map(EMBEDDING_KEY, "vect"));

try {
testCall(db, "CALL apoc.vectordb.chroma.get($host, $collection, [1, 2], $conf)",
map("host", HOST, "collection", COLL_ID.get(), "conf", conf),
r -> fail()
);
} catch (RuntimeException e) {
Assertions.assertThat(e.getMessage()).contains(ERROR_READONLY_MAPPING);
}
MAPPING_KEY, map(NODE_LABEL, "Test",
ENTITY_KEY, "readID",
METADATA_KEY, "foo")
);

testResult(db, "CALL apoc.vectordb.chroma.get($host, $collection, ['1', '2'], $conf) " +
"YIELD vector, id, metadata, node RETURN * ORDER BY id",
map("host", HOST, "collection", COLL_ID.get(), "conf", conf),
r -> assertReadOnlyProcWithMappingResults(r, "node")
);
}

@Test
@@ -338,17 +343,19 @@ MAPPING_KEY, map(EMBEDDING_KEY, "vect",

@Test
public void queryReadOnlyVectorsWithMapping() {
db.executeTransactionally("CREATE (:Start)-[:TEST {readID: 'one'}]->(:End), (:Start)-[:TEST {readID: 'two'}]->(:End)");

Map<String, Object> conf = map(ALL_RESULTS_KEY, true,
MAPPING_KEY, map(EMBEDDING_KEY, "vect"));

try {
testCall(db, "CALL apoc.vectordb.chroma.query($host, $collection, [0.2, 0.1, 0.9, 0.7], {}, 5, $conf)",
map("host", HOST, "collection", COLL_ID.get(), "conf", conf),
r -> fail()
);
} catch (RuntimeException e) {
Assertions.assertThat(e.getMessage()).contains(ERROR_READONLY_MAPPING);
}
MAPPING_KEY, map(
REL_TYPE, "TEST",
ENTITY_KEY, "readID",
METADATA_KEY, "foo")
);

testResult(db, "CALL apoc.vectordb.chroma.query($host, $collection, [0.2, 0.1, 0.9, 0.7], {}, 5, $conf)",
map("host", HOST, "collection", COLL_ID.get(), "conf", conf),
r -> assertReadOnlyProcWithMappingResults(r, "rel")
);
}

@Test
47 changes: 34 additions & 13 deletions extended-it/src/test/java/apoc/vectordb/MilvusTest.java
@@ -2,7 +2,6 @@

import apoc.util.TestUtil;
import apoc.util.Util;
import org.assertj.core.api.Assertions;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
@@ -27,9 +26,9 @@
import static apoc.vectordb.VectorDbTestUtil.assertBerlinResult;
import static apoc.vectordb.VectorDbTestUtil.assertLondonResult;
import static apoc.vectordb.VectorDbTestUtil.assertNodesCreated;
import static apoc.vectordb.VectorDbTestUtil.assertReadOnlyProcWithMappingResults;
import static apoc.vectordb.VectorDbTestUtil.assertRelsCreated;
import static apoc.vectordb.VectorDbTestUtil.dropAndDeleteAll;
import static apoc.vectordb.VectorDbUtil.ERROR_READONLY_MAPPING;
import static apoc.vectordb.VectorEmbeddingConfig.ALL_RESULTS_KEY;
import static apoc.vectordb.VectorEmbeddingConfig.FIELDS_KEY;
import static apoc.vectordb.VectorEmbeddingConfig.MAPPING_KEY;
@@ -297,6 +296,24 @@ MAPPING_KEY, map(EMBEDDING_KEY, "vect",
assertNodesCreated(db);
}

@Test
public void getReadOnlyVectorsWithMapping() {
db.executeTransactionally("CREATE (:Test {readID: 'one'}), (:Test {readID: 'two'})");

Map<String, Object> conf = map(ALL_RESULTS_KEY, true,
FIELDS_KEY, FIELDS,
MAPPING_KEY, map(EMBEDDING_KEY, "vect",
NODE_LABEL, "Test",
ENTITY_KEY, "readID",
METADATA_KEY, "foo"));

testResult(db, "CALL apoc.vectordb.milvus.get($host, 'test_collection', [1, 2], $conf) " +
"YIELD vector, id, metadata, node RETURN * ORDER BY id",
map("host", HOST, "conf", conf),
r -> assertReadOnlyProcWithMappingResults(r, "node")
);
}

@Test
public void queryVectorsWithCreateNodeUsingExistingNode() {

@@ -336,7 +353,8 @@ public void queryVectorsWithCreateRel() {
MAPPING_KEY, map(EMBEDDING_KEY, "vect",
REL_TYPE, "TEST",
ENTITY_KEY, "myId",
METADATA_KEY, "foo"));
METADATA_KEY, "foo")
);
testResult(db, "CALL apoc.vectordb.milvus.queryAndUpdate($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], null, 5, $conf)",
map("host", HOST, "conf", conf),
r -> {
@@ -356,17 +374,20 @@ MAPPING_KEY, map(EMBEDDING_KEY, "vect",

@Test
public void queryReadOnlyVectorsWithMapping() {
db.executeTransactionally("CREATE (:Start)-[:TEST {readID: 'one'}]->(:End), (:Start)-[:TEST {readID: 'two'}]->(:End)");

Map<String, Object> conf = map(ALL_RESULTS_KEY, true,
MAPPING_KEY, map(EMBEDDING_KEY, "vect"));

try {
testCall(db, "CALL apoc.vectordb.milvus.query($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], {}, 5, $conf)",
map("host", HOST, "conf", conf),
r -> fail()
);
} catch (RuntimeException e) {
Assertions.assertThat(e.getMessage()).contains(ERROR_READONLY_MAPPING);
}
FIELDS_KEY, FIELDS,
MAPPING_KEY, map(
REL_TYPE, "TEST",
ENTITY_KEY, "readID",
METADATA_KEY, "foo")
);

testResult(db, "CALL apoc.vectordb.milvus.query($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], null, 5, $conf)",
map("host", HOST, "conf", conf),
r -> assertReadOnlyProcWithMappingResults(r, "rel")
);
}

@Test

0 comments on commit 8f4cc9c