Fixes #4087: Add vector info procedures
vga91 committed Jul 16, 2024
1 parent c49fe8a commit 3f29732
Showing 16 changed files with 268 additions and 20 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ note that the list and the signature procedures are consistent with the others,
[opts=header, cols="1, 3"]
|===
| name | description
| apoc.vectordb.chroma.info(hostOrKey, collection, $config) | Get information about the specified existing collection or throws an error if it does not exist
| apoc.vectordb.chroma.createCollection(hostOrKey, collection, similarity, size, $config) |
Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`.
The default endpoint is `<hostOrKey param>/api/v1/collections`.
Expand Down Expand Up @@ -38,6 +39,19 @@ With hostOrKey=null, the default is 'http://localhost:8000'.

== Examples

.Get collection info (it leverages https://docs.trychroma.com/reference/py-client#get_collection[this API])
[source,cypher]
----
CALL apoc.vectordb.chroma.info(hostOrKey, 'test_collection', {<optional config>})
----

.Example results
[opts="header"]
|===
| value
| {name=test_collection, metadata={size=4, hnsw:space=cosine}, database=default_database, id=74ebe008-1ccb-4d3d-8c5d-cdd7cfa526c2, tenant=default_tenant}
|===
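
As a rough illustration of the request behind this call, the sketch below builds the endpoint the procedure appears to target, using the `%s/api/v1/collections/%s` template from this commit and the documented default host. The class and method names (`ChromaInfoUrl`, `infoUrl`) are hypothetical, and the sketch assumes `hostOrKey` is a plain URL; resolving it as a configuration key is not shown.

```java
public class ChromaInfoUrl {
    // Default host documented for hostOrKey=null.
    static final String DEFAULT_HOST = "http://localhost:8000";

    // Hypothetical helper: fills the URL template used by apoc.vectordb.chroma.info.
    static String infoUrl(String hostOrKey, String collection) {
        String host = (hostOrKey == null) ? DEFAULT_HOST : hostOrKey;
        return String.format("%s/api/v1/collections/%s", host, collection);
    }

    public static void main(String[] args) {
        System.out.println(infoUrl(null, "test_collection"));
    }
}
```

The collection segment is the second procedure argument (its name here, as an assumption), so a missing collection would surface as an HTTP error rather than an empty result.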

.Create a collection (it leverages https://docs.trychroma.com/usage-guide#creating-inspecting-and-deleting-collections[this API])
[source,cypher]
----
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ Here is a list of all available Milvus procedures:
[opts=header, cols="1, 3"]
|===
| name | description
| apoc.vectordb.milvus.info(hostOrKey, collection, dbName, $config) | Get information about the specified existing collection or returns a response with code 100 if it does not exist
| apoc.vectordb.milvus.createCollection(hostOrKey, collection, similarity, size, $config) |
Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`.
The default endpoint is `<hostOrKey param>/v2/vectordb/collections/create`.
Expand Down Expand Up @@ -39,6 +40,18 @@ With hostOrKey=null, the default host is 'http://localhost:19530'.

Here is a list of examples using a local installation on the default port `19530`.

.Get collection info (it leverages https://milvus.io/docs/manage-collections.md#View-Collections[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.info($host, 'test_collection', '', {<optional config>})
----

.Example results
[opts="header"]
|===
| value
| {data={shardsNum=1, aliases=[], autoId=false, description=, partitionsNum=1, collectionName=test_collection, indexes=[{metricType=COSINE, indexName=vector, fieldName=vector}], load=LoadStateLoading, consistencyLevel=Bounded, fields=[{partitionKey=false, autoId=false, name=id, description=, id=100, type=Int64, primaryKey=true}, {partitionKey=false, autoId=false, name=vector, description=, id=101, params=[{value=4, key=dim}], type=FloatVector, primaryKey=false}], collectionID=451046728334049293, enableDynamicField=true, properties=[]}, message=, code=200}
|===
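
Unlike the other providers, the Milvus procedure takes a database name as its third argument and sends it in the request body. The sketch below mirrors the body built in this commit (`dbName` and `collectionName` keys); the class and method names (`MilvusDescribeBody`, `describeBody`) are hypothetical, and the HTTP call itself is left out.

```java
import java.util.Map;

public class MilvusDescribeBody {
    // Hypothetical helper: the JSON body posted to <host>/collections/describe,
    // shown here as a plain map before serialization.
    static Map<String, Object> describeBody(String dbName, String collection) {
        return Map.of("dbName", dbName, "collectionName", collection);
    }

    public static void main(String[] args) {
        // "default" is the dbName default declared in the procedure signature.
        System.out.println(describeBody("default", "test_collection"));
    }
}
```

Because a missing collection is reported through the `code` field of the response rather than an exception, callers would likely need to check `value.code` themselves.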

.Create a collection (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Collection%20(v2)/Create.md[this API])
[source,cypher]
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ Here is a list of all available Pinecone procedures:
[opts=header, cols="1, 3"]
|===
| name | description
| apoc.vectordb.pinecone.info(hostOrKey, collection, $config) | Get information about the specified existing collection or throws an error if it does not exist
| apoc.vectordb.pinecone.createCollection(hostOrKey, index, similarity, size, $config) |
Creates an index, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`.
The default endpoint is `<hostOrKey param>/indexes`.
Expand Down Expand Up @@ -54,6 +55,13 @@ image::pinecone-index.png[width=800]

The following examples assume we want to create and manage an index called `test-index`.

.Get collection info (it leverages https://docs.pinecone.io/reference/api/control-plane/describe_collection[this API])
[source,cypher]
----
CALL apoc.vectordb.pinecone.info(hostOrKey, 'test-collection', {<optional config>})
----


.Create an index (it leverages https://docs.pinecone.io/reference/api/control-plane/create_index[this API])
[source,cypher]
----
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ note that the list and the signature procedures are consistent with the others,
[opts=header, cols="1, 3"]
|===
| name | description
| apoc.vectordb.qdrant.info(hostOrKey, collection, $config) | Get information about the specified existing collection or throws an error if it does not exist
| apoc.vectordb.qdrant.createCollection(hostOrKey, collection, similarity, size, $config) |
Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`.
The default endpoint is `<hostOrKey param>/collections/<collection param>`.
Expand Down Expand Up @@ -39,6 +40,19 @@ With hostOrKey=null, the default is 'http://localhost:6333'.

== Examples

.Get collection info (it leverages https://qdrant.github.io/qdrant/redoc/index.html#tag/collections/operation/get_collection[this API])
[source,cypher]
----
CALL apoc.vectordb.qdrant.info(hostOrKey, 'test_collection', {<optional config>})
----

.Example results
[opts="header"]
|===
| value
| {result={optimizer_status=ok, points_count=2, vectors_count=2, segments_count=8, indexed_vectors_count=0, config={params={on_disk_payload=true, vectors={size=4, distance=Cosine}, shard_number=1, replication_factor=1, write_consistency_factor=1}, optimizer_config={max_optimization_threads=1, indexing_threshold=20000, deleted_threshold=0.2, flush_interval_sec=5, memmap_threshold=null, default_segment_number=0, max_segment_size=null, vacuum_min_vector_number=1000}, quantization_config=null, hnsw_config={max_indexing_threads=0, full_scan_threshold=10000, ef_construct=100, m=16, on_disk=false}, wal_config={wal_segments_ahead=0, wal_capacity_mb=32}}, status=green, payload_schema={}}, time=1.2725E-4, status=ok}
|===
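
The `value` column is a nested map, so reading a setting such as the vector size means walking several levels. The sketch below shows one way to do that on the Java side; the `QdrantInfoResult` class, its `path` helper and the trimmed `sample` map are hypothetical and only reproduce a few keys from the example result above.

```java
import java.util.Map;

public class QdrantInfoResult {
    // Hypothetical helper: follows a chain of keys through nested maps,
    // returning null as soon as a level is missing or is not a map.
    static Object path(Map<String, Object> map, String... keys) {
        Object current = map;
        for (String key : keys) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // Trimmed-down stand-in for the `value` map returned by the info procedure.
    static Map<String, Object> sample() {
        Map<String, Object> vectors = Map.of("size", 4L, "distance", "Cosine");
        Map<String, Object> config = Map.of("params", Map.of("vectors", vectors));
        return Map.of(
                "result", Map.of("status", "green", "config", config),
                "status", "ok");
    }

    public static void main(String[] args) {
        System.out.println(path(sample(), "result", "config", "params", "vectors", "size"));
    }
}
```

Numeric values are modelled as `Long` here on the assumption that they arrive through Neo4j's type mapping, which matches the `200L` comparison used in the Milvus test in this commit.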

.Create a collection (it leverages https://qdrant.github.io/qdrant/redoc/index.html#tag/collections/operation/create_collection[this API])
[source,cypher]
----
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ note that the list and the signature procedures are consistent with the others,
[opts=header, cols="1, 3"]
|===
| name | description
| apoc.vectordb.weaviate.info($host, $collectionName, $config) | Get information about the specified existing collection or throws an error if it does not exist
| apoc.vectordb.weaviate.createCollection(hostOrKey, collection, similarity, size, $config) |
Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`.
The default endpoint is `<hostOrKey param>/schema`.
Expand Down Expand Up @@ -40,6 +41,19 @@ With hostOrKey=null, the default is 'http://localhost:8080/v1'.

== Examples

.Get collection info (it leverages https://weaviate.io/developers/weaviate/api/rest#tag/schema/get/schema/{className}[this API])
[source, cypher]
----
CALL apoc.vectordb.weaviate.info($host, 'test_collection', {<optional config>})
----

.Example results
[opts="header"]
|===
| value
| {vectorizer=none, invertedIndexConfig={bm25={b=0.75, k1=1.2}, stopwords={additions=null, removals=null, preset=en}, cleanupIntervalSeconds=60}, vectorIndexConfig={ef=-1, dynamicEfMin=100, pq={centroids=256, trainingLimit=100000, encoder={type=kmeans, distribution=log-normal}, enabled=false, bitCompression=false, segments=0}, distance=cosine, skip=false, dynamicEfFactor=8, bq={enabled=false}, vectorCacheMaxObjects=1000000000000, cleanupIntervalSeconds=300, dynamicEfMax=500, efConstruction=128, flatSearchCutoff=40000, maxConnections=64}, multiTenancyConfig={enabled=false}, vectorIndexType=hnsw, replicationConfig={factor=1}, shardingConfig={desiredVirtualCount=128, desiredCount=1, actualCount=1, function=murmur3, virtualPerPhysical=128, strategy=hash, actualVirtualCount=128, key=_id}, class=TestCollection, properties=[{name=city, description=This property was generated by Weaviate's auto-schema feature on Wed Jul 10 12:50:18 2024, indexFilterable=true, tokenization=word, indexSearchable=true, dataType=[text]}, {name=foo, description=This property was generated by Weaviate's auto-schema feature on Wed Jul 10 12:50:18 2024, indexFilterable=true, tokenization=word, indexSearchable=true, dataType=[text]}]}
|===

.Create a collection (it leverages https://weaviate.io/developers/weaviate/api/rest#tag/schema/post/schema[this API])
[source,cypher]
----
13 changes: 11 additions & 2 deletions extended-it/src/test/java/apoc/vectordb/ChromaDbTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@
import static apoc.vectordb.VectorDbTestUtil.assertBerlinResult;
import static apoc.vectordb.VectorDbTestUtil.assertLondonResult;
import static apoc.vectordb.VectorDbTestUtil.assertNodesCreated;
import static apoc.vectordb.VectorDbTestUtil.assertRagWithVectors;
import static apoc.vectordb.VectorDbTestUtil.assertReadOnlyProcWithMappingResults;
import static apoc.vectordb.VectorDbTestUtil.assertRelsCreated;
import static apoc.vectordb.VectorDbTestUtil.dropAndDeleteAll;
Expand All @@ -41,7 +40,6 @@
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.neo4j.configuration.GraphDatabaseSettings.DEFAULT_DATABASE_NAME;
import static org.neo4j.configuration.GraphDatabaseSettings.SYSTEM_DATABASE_NAME;

Expand All @@ -50,6 +48,7 @@ public class ChromaDbTest {
private static final ChromaDBContainer CHROMA_CONTAINER = new ChromaDBContainer("chromadb/chroma:0.4.25.dev137");
private static final String READONLY_KEY = "my_readonly_api_key";
private static final Map<String, String> READONLY_AUTHORIZATION = getAuthHeader(READONLY_KEY);
private static final String COLLECTION_NAME = "test_collection";

private static String HOST;

Expand Down Expand Up @@ -109,6 +108,16 @@ public static void tearDown() throws Exception {
public void before() {
dropAndDeleteAll(db);
}

@Test
public void getInfo() {
testResult(db, "CALL apoc.vectordb.chroma.info($host, $collection, $conf) ",
map("host", HOST, "collection", COLLECTION_NAME, "conf", map(ALL_RESULTS_KEY, true)),
r -> {
Map<String, Object> row = (Map<String, Object>) r.next().get("value");
assertEquals(COLLECTION_NAME, row.get("name"));
});
}

@Test
public void getVectors() {
11 changes: 11 additions & 0 deletions extended-it/src/test/java/apoc/vectordb/MilvusTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,17 @@ public void before() {
dropAndDeleteAll(db);
}

@Test
public void getInfo() {
testResult(db, "CALL apoc.vectordb.milvus.info($host, 'test_collection', '', $conf) ",
map("host", HOST, "conf", map(FIELDS_KEY, FIELDS)),
r -> {
Map<String, Object> row = r.next();
Map value = (Map) row.get("value");
assertEquals(200L, value.get("code"));
});
}

@Test
public void getVectorsWithoutVectorResult() {
testResult(db, "CALL apoc.vectordb.milvus.get($host, 'test_collection', [1], $conf) ",
22 changes: 18 additions & 4 deletions extended-it/src/test/java/apoc/vectordb/QdrantTest.java
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
package apoc.vectordb;

import apoc.ml.Prompt;
import apoc.util.TestUtil;
import apoc.util.Util;
import org.junit.AfterClass;
Expand All @@ -17,15 +18,14 @@
import java.util.List;
import java.util.Map;

import apoc.ml.Prompt;
import static apoc.ml.RestAPIConfig.HEADERS_KEY;
import static apoc.ml.Prompt.API_KEY_CONF;
import static apoc.ml.RestAPIConfig.HEADERS_KEY;
import static apoc.util.MapUtil.map;
import static apoc.util.TestUtil.testCall;
import static apoc.util.TestUtil.testResult;
import static apoc.vectordb.VectorDbHandler.Type.QDRANT;
import static apoc.vectordb.VectorDbTestUtil.EntityType.NODE;
import static apoc.vectordb.VectorDbTestUtil.EntityType.FALSE;
import static apoc.vectordb.VectorDbTestUtil.EntityType.NODE;
import static apoc.vectordb.VectorDbTestUtil.EntityType.REL;
import static apoc.vectordb.VectorDbTestUtil.assertBerlinResult;
import static apoc.vectordb.VectorDbTestUtil.assertLondonResult;
Expand All @@ -43,7 +43,8 @@
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;
import static org.neo4j.configuration.GraphDatabaseSettings.*;
import static org.neo4j.configuration.GraphDatabaseSettings.DEFAULT_DATABASE_NAME;
import static org.neo4j.configuration.GraphDatabaseSettings.SYSTEM_DATABASE_NAME;

public class QdrantTest {
private static final String ADMIN_KEY = "my_admin_api_key";
Expand Down Expand Up @@ -117,6 +118,19 @@ public static void tearDown() throws Exception {
public void before() {
dropAndDeleteAll(db);
}

@Test
public void getInfo() {
testResult(
db,
"CALL apoc.vectordb.qdrant.info($host, 'test_collection', $conf)",
map("host", HOST, "conf", ADMIN_HEADER_CONF),
r -> {
Map<String, Object> res = r.next();
Map value = (Map) res.get("value");
assertEquals("ok", value.get("status"));
});
}

@Test
public void getVectorsWithReadOnlyApiKey() {
24 changes: 18 additions & 6 deletions extended-it/src/test/java/apoc/vectordb/WeaviateTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ public class WeaviateTest {
private static final List<String> FIELDS = List.of("city", "foo");
private static final String ADMIN_KEY = "jane-secret-key";
private static final String READONLY_KEY = "ian-secret-key";
private static final String COLLECTION_NAME = "TestCollection";

private static final WeaviateContainer WEAVIATE_CONTAINER = new WeaviateContainer("semitechnologies/weaviate:1.24.5")
.withEnv("AUTHENTICATION_APIKEY_ENABLED", "true")
Expand Down Expand Up @@ -114,10 +115,10 @@ public static void setUp() throws Exception {
MapUtil.map("host", HOST, "id1", ID_1, "id2", ID_2, "conf", ADMIN_HEADER_CONF),
r -> {
ResourceIterator<Map> values = r.columnAs("value");
assertEquals("TestCollection", values.next().get("class"));
assertEquals("TestCollection", values.next().get("class"));
assertEquals("TestCollection", values.next().get("class"));
assertEquals("TestCollection", values.next().get("class"));
assertEquals(COLLECTION_NAME, values.next().get("class"));
assertEquals(COLLECTION_NAME, values.next().get("class"));
assertEquals(COLLECTION_NAME, values.next().get("class"));
assertEquals(COLLECTION_NAME, values.next().get("class"));
assertFalse(values.hasNext());
});

Expand All @@ -134,8 +135,8 @@ public static void setUp() throws Exception {

@AfterClass
public static void tearDown() throws Exception {
testCallEmpty(db, "CALL apoc.vectordb.weaviate.deleteCollection($host, 'TestCollection', $conf)",
MapUtil.map("host", HOST, "conf", ADMIN_HEADER_CONF)
testCallEmpty(db, "CALL apoc.vectordb.weaviate.deleteCollection($host, $collectionName, $conf)",
MapUtil.map("host", HOST, "collectionName", COLLECTION_NAME, "conf", ADMIN_HEADER_CONF)
);

WEAVIATE_CONTAINER.stop();
Expand All @@ -147,6 +148,17 @@ public void before() {
dropAndDeleteAll(db);
}

@Test
public void getInfo() {
testResult(db, "CALL apoc.vectordb.weaviate.info($host, $collectionName, $conf)",
map("host", HOST, "collectionName", COLLECTION_NAME, "conf", map(ALL_RESULTS_KEY, true, HEADERS_KEY, READONLY_AUTHORIZATION)),
r -> {
Map<String, Object> row = r.next();
Map value = (Map) row.get("value");
assertEquals(COLLECTION_NAME, value.get("class"));
});
}

@Test
public void getVectorsWithReadOnlyApiKey() {
testResult(db, "CALL apoc.vectordb.weaviate.get($host, 'TestCollection', [$id1], $conf)",
16 changes: 16 additions & 0 deletions extended/src/main/java/apoc/vectordb/ChromaDb.java
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,22 @@ public class ChromaDb {
@Context
public URLAccessChecker urlAccessChecker;

@Procedure("apoc.vectordb.chroma.info")
@Description("apoc.vectordb.chroma.info(hostOrKey, collection, $configuration) - Get information about the specified existing collection or throws an error if it does not exist")
public Stream<MapResult> info(@Name("hostOrKey") String hostOrKey, @Name("collection") String collection, @Name(value = "configuration", defaultValue = "{}") Map<String, Object> configuration) throws Exception {
String url = "%s/api/v1/collections/%s";

Map<String, Object> config = getVectorDbInfo(hostOrKey, collection, configuration, url);

methodAndPayloadNull(config);

RestAPIConfig restAPIConfig = new RestAPIConfig( config, Map.of(), Map.of() );

return executeRequest(restAPIConfig, urlAccessChecker)
.map(v -> (Map<String,Object>) v)
.map(MapResult::new);
}

@Procedure("apoc.vectordb.chroma.createCollection")
@Description("apoc.vectordb.chroma.createCollection(hostOrKey, collection, similarity, size, $configuration) - Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`")
public Stream<MapResult> createCollection(@Name("hostOrKey") String hostOrKey,
16 changes: 16 additions & 0 deletions extended/src/main/java/apoc/vectordb/Milvus.java
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
import java.util.Map;
import java.util.stream.Stream;

import static apoc.ml.RestAPIConfig.BODY_KEY;
import static apoc.ml.RestAPIConfig.METHOD_KEY;
import static apoc.vectordb.VectorDb.executeRequest;
import static apoc.vectordb.VectorDb.getEmbeddingResultStream;
Expand All @@ -40,6 +41,21 @@ public class Milvus {
@Context
public URLAccessChecker urlAccessChecker;

@Procedure("apoc.vectordb.milvus.info")
@Description("apoc.vectordb.milvus.info(hostOrKey, collection, dbName, $configuration) - Get information about the specified existing collection or returns a response with code 100 if it does not exist")
public Stream<MapResult> info(@Name("hostOrKey") String hostOrKey, @Name("collection") String collection, @Name(value = "dbName", defaultValue = "default") String dbName, @Name(value = "configuration", defaultValue = "{}") Map<String, Object> configuration) throws Exception {
String url = "%s/collections/describe";
Map<String, Object> config = getVectorDbInfo(hostOrKey, collection, configuration, url);

config.put(BODY_KEY, Map.of("dbName", dbName, "collectionName", collection));

RestAPIConfig restAPIConfig = new RestAPIConfig( config, Map.of(), Map.of() );

return executeRequest(restAPIConfig, urlAccessChecker)
.map(v -> (Map<String,Object>) v)
.map(MapResult::new);
}

@Procedure("apoc.vectordb.milvus.createCollection")
@Description("apoc.vectordb.milvus.createCollection(hostOrKey, collection, similarity, size, $configuration) - Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`")
public Stream<MapResult> createCollection(@Name("hostOrKey") String hostOrKey,