The incoming Airbyte data is structured in JSON format and is sent across different stream shards, as determined by the partition key. This connector maps incoming data from a namespace and stream to a unique Kinesis stream. Each Kinesis record sent to the stream consists of the following JSON fields:
- _airbyte_ab_id: Random UUID generated to be used as a partition key for sending data to different shards.
- _airbyte_emitted_at: A timestamp representing when the event was received from the data source.
- _airbyte_data: A JSON text/object representing the data that was received from the data source.
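
The record layout described above can be sketched in Python as follows. This is an illustration only, not the connector's actual code: the helper name `build_kinesis_record` is hypothetical, and the timestamp unit (epoch milliseconds) and the choice to serialize `_airbyte_data` as JSON text are assumptions. The `Data` and `PartitionKey` keys follow the entry shape that the Kinesis PutRecords API expects.

```python
import json
import time
import uuid


def build_kinesis_record(data: dict) -> dict:
    """Wrap one source record in the three Airbyte fields (hypothetical helper)."""
    ab_id = str(uuid.uuid4())  # random UUID, also used as the partition key
    payload = {
        "_airbyte_ab_id": ab_id,
        # Assumption: timestamp expressed as epoch milliseconds.
        "_airbyte_emitted_at": int(time.time() * 1000),
        # Assumption: source data serialized as JSON text.
        "_airbyte_data": json.dumps(data),
    }
    return {
        "Data": json.dumps(payload).encode("utf-8"),
        "PartitionKey": ab_id,
    }
```

Because the partition key is a fresh UUID for every record, records spread roughly evenly across shards rather than being grouped by any field of the source data.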
Feature | Support | Notes |
---|---|---|
Full Refresh Sync | ❌ | |
Incremental - Append Sync | ✅ | Incoming messages are streamed/appended to a Kinesis stream as they are received. |
Incremental - Deduped History | ❌ | |
Namespaces | ✅ | Namespaces will be used to determine the Kinesis stream name. |
Although Kinesis is designed to handle large amounts of real-time data by scaling streams with shards, you should be aware of the Kinesis quotas and limits. The connector buffer size should also be tuned according to your data size and frequency.
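
As a rough aid for sizing a stream against those limits, the sketch below estimates how many shards a given write load needs. The function name `estimate_shard_count` is hypothetical, and the default per-shard write limits (1 MiB per second and 1,000 records per second) are assumptions that should be checked against the current Kinesis quotas before relying on the result.

```python
import math


def estimate_shard_count(
    records_per_second: float,
    avg_record_bytes: int,
    max_bytes_per_shard: int = 1024 * 1024,  # assumed per-shard write limit in bytes/s
    max_records_per_shard: int = 1000,       # assumed per-shard write limit in records/s
) -> int:
    """Return the smallest shard count that satisfies both write limits."""
    by_bytes = math.ceil(records_per_second * avg_record_bytes / max_bytes_per_shard)
    by_count = math.ceil(records_per_second / max_records_per_shard)
    # A stream always has at least one shard.
    return max(1, by_bytes, by_count)
```

For example, 2,500 small records per second is bounded by the record-count limit, while 100 records per second of 50 KB each is bounded by the byte limit.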
- The connector is compatible with the latest Kinesis service version at the time of this writing.
- Configuration
  - Endpoint (Optional): AWS Kinesis endpoint to connect to. The default endpoint is used if none is provided.
  - Region (Optional): AWS Kinesis region to connect to. The default region is used if none is provided.
  - shardCount: The number of shards with which the stream should be created. The number of shards affects the throughput of your stream.
  - accessKey: Access key credential for authenticating with the service.
  - privateKey: Private key credential for authenticating with the service.
  - bufferSize: Buffer size used to increase throughput by sending data in a single request.
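
To illustrate how a buffer size can raise throughput, here is a minimal sketch of batching records and sending each full batch in one request. It is not the connector's implementation: the `RecordBuffer` class and its `send_batch` callback are hypothetical, and treating the buffer size as a record count is an assumption. In a real client, `send_batch` might wrap a call such as boto3's `put_records(StreamName=..., Records=batch)`, which accepts a limited number of records per request.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


class RecordBuffer:
    """Collects records and hands them to send_batch in groups (hypothetical)."""

    def __init__(self, buffer_size: int, send_batch: Callable[[List[Record]], None]) -> None:
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        self._buffer_size = buffer_size
        self._send_batch = send_batch
        self._pending: List[Record] = []

    def add(self, record: Record) -> None:
        """Queue a record and flush once the buffer is full."""
        self._pending.append(record)
        if len(self._pending) >= self._buffer_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is queued as a single batch; no-op when empty."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._send_batch(batch)


# Usage: collect batches in a list instead of calling Kinesis.
sent: List[List[Record]] = []
buffer = RecordBuffer(buffer_size=3, send_batch=sent.append)
for i in range(7):
    buffer.add({"Data": f"event-{i}".encode("utf-8"), "PartitionKey": str(i)})
buffer.flush()  # send the final partial batch
```

A larger buffer means fewer requests but more records held in memory and a longer delay before data reaches the stream, which is why the value should match your data size and frequency.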
######TODO: more info, screenshots?, etc...