VTS (short for Vector Transport Service) is an open-source tool for moving vectors and unstructured data. It is developed on top of Apache SeaTunnel by the Zilliz team, creators of the open-source Milvus vector database.

VTS

Overview

VTS (short for Vector Transport Service) is an open-source tool for moving vectors and unstructured data. It is developed by Zilliz on top of Apache SeaTunnel.

Why do you need a tool for moving vectors and unstructured data?

  1. Meeting growing data migration needs: VTS evolved from our Milvus Migration Service, which has helped over 100 organizations migrate data between Milvus clusters. User demands have since grown to include migrations to Milvus from other vector databases, traditional search engines such as Elasticsearch and Solr, relational databases, data warehouses, document databases, and even S3 and data lakes.
  2. Supporting real-time data streaming and offline import: As vector database capabilities expand, users need both real-time data streaming and offline batch import.
  3. Simplifying unstructured data transformation: Unlike traditional ETL, transforming unstructured data requires AI and model capabilities. In conjunction with Zilliz Cloud Pipelines, VTS enables vector embedding, tagging, and complex transformations, which reduces data cleaning costs and operational complexity.
  4. Ensuring end-to-end data quality: Data integration and synchronization are prone to data loss and inconsistencies. VTS addresses these data quality concerns with monitoring and alerting mechanisms.

Core Capabilities of VTS

Built on top of Apache SeaTunnel, VTS offers:

  1. Rich, extensible connectors
  2. Unified stream and batch processing for real-time synchronization and offline batch imports
  3. Distributed snapshot support for data consistency
  4. High performance, low latency, and scalability
  5. Real-time monitoring and visual management

[Figure: migration.png]

Additionally, VTS introduces vector-specific capabilities such as multiple data source support, schema matching, and basic data validation.

Roadmap

The roadmap includes incremental sync, combined one-time migration and change data capture, and more advanced data transformation capabilities.

[Figure: roadmap.png]

To learn more about VTS in action, read our blog.

Get Started

To get started with VTS, follow the QuickStart Guide.

QuickStart Guide

This guide shows how to use VTS to transport vector data into Milvus. The following source connectors are currently supported:

  • Milvus
  • Postgres vector
  • Elasticsearch
  • Pinecone
  • Qdrant
  • Tencent VectorDB

1. Build the VTS project

./mvnw install -Dmaven.test.skip

2. Set up the configuration file

Go to ./seatunnel-example/seatunnel-examples/src/main/resources/examples and update the conf file that matches your source:

  • milvus_to_milvus.conf
  • pg_to_milvus.conf
  • es_to_milvus.conf

Here is an example of milvus_to_milvus.conf:

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Milvus {
    url="https://in01-***.aws-us-west-2.vectordb.zillizcloud.com:19530"
    token="***"
    database="default"
    collection="medium_articles"
    batch_size=100
  }
}

sink {
  Milvus {
    url="https://in01-***.aws-us-west-2.vectordb.zillizcloud.com:19542"
    token="***"
    database="default"
    batch_size=10
  }
}
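
The example above masks the cluster ID and token as `***`, so those values have to be replaced before the job can run. As a minimal sketch, the check below reports which lines of a conf file still contain that placeholder. The class and method names are hypothetical helpers written for this guide and are not part of VTS or SeaTunnel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight helper; not part of VTS or SeaTunnel.
class ConfPlaceholderCheck {

    /** Returns the 1-based numbers of the lines that still contain the "***" placeholder. */
    static List<Integer> linesWithPlaceholders(String confText) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = confText.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].contains("***")) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // A shortened copy of the source block from the example above.
        String conf = String.join("\n",
                "source {",
                "  Milvus {",
                "    url=\"https://in01-***.aws-us-west-2.vectordb.zillizcloud.com:19530\"",
                "    token=\"***\"",
                "  }",
                "}");
        System.out.println("Lines still masked: " + linesWithPlaceholders(conf));
    }
}
```

An empty result means no masked values remain; it does not confirm that the URL or token is valid.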

3. Run examples

The example file is located at ./seatunnel-example/seatunnel-examples/src/main/java/com/zilliz/seatunnel/examples/engine/SeatunnelEngineExample.java

Update the configuration file path in SeatunnelEngineExample.java, then run the example.

String configurePath = args.length > 0 ? args[0] : "/examples/****.conf";
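
The line above suggests that the path can also be supplied as the first program argument instead of being edited in the source. A minimal sketch of that selection logic follows. The class and method names are hypothetical, and the default path simply reuses one of the example file names from step 2 for illustration, since the original elides the real value as `****.conf`.

```java
// Hypothetical sketch of the argument handling; not the actual SeatunnelEngineExample code.
class ConfigPathSketch {

    /** Uses the first program argument when present, otherwise the given default path. */
    static String resolve(String[] args, String defaultPath) {
        return args.length > 0 ? args[0] : defaultPath;
    }

    public static void main(String[] args) {
        // Assumption: milvus_to_milvus.conf stands in for the elided "****.conf".
        String configurePath = resolve(args, "/examples/milvus_to_milvus.conf");
        System.out.println("Using config: " + configurePath);
    }
}
```

Run without arguments, the sketch falls back to the default; run with a path as the first argument, it uses that path instead.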

4. Check the data in Milvus

Go to the Milvus console and check the data in the collection.

Tutorial

Beyond the quick start guide, VTS offers more powerful features, such as:

  • many transformers, including TablePathMapper, FieldMapper, and Embedding
  • a cluster mode ready for production use, with a RESTful API to manage jobs
  • Docker deployment

For detailed information, please refer to Tutorial.md

Connectors

VTS supports a variety of connectors to move data between different systems.

Detailed documentation is available for each connector.

Support

If you need assistance or have questions about VTS, please reach out to our support team by email: [email protected]

About Apache SeaTunnel

SeaTunnel is a next-generation, high-performance, distributed data integration tool capable of synchronizing vast amounts of data daily. Numerous companies rely on it for its efficiency and stability. It is released under the Apache License 2.0.

SeaTunnel is a top-level project of the Apache Software Foundation (ASF). For more information, visit the Apache SeaTunnel website.
