This repository has been archived by the owner on Dec 21, 2021. It is now read-only.

Kinesis

Sync overview

Output schema

The incoming Airbyte data is structured in JSON format and is sent across different stream shards, as determined by the partition key. This connector maps incoming data from a namespace and stream to a unique Kinesis stream. Each Kinesis record sent to the stream consists of the following JSON fields:

  • _airbyte_ab_id: a random UUID, used as the partition key for sending data to different shards.
  • _airbyte_emitted_at: a timestamp representing when the event was received from the data source.
  • _airbyte_data: a JSON text/object representing the data that was received from the data source.
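The record layout above can be sketched in a few lines of Python. This is a minimal illustration, not the connector's actual code: the helper name `build_record` is hypothetical, and representing `_airbyte_emitted_at` as epoch milliseconds is an assumption, since the timestamp format is not specified here.

```python
import json
import time
import uuid


def build_record(data, emitted_at_ms=None):
    """Wrap one source row in the three fields listed above (hypothetical helper)."""
    if emitted_at_ms is None:
        # Assumption: the timestamp is expressed as epoch milliseconds.
        emitted_at_ms = int(time.time() * 1000)
    ab_id = str(uuid.uuid4())
    payload = {
        "_airbyte_ab_id": ab_id,
        "_airbyte_emitted_at": emitted_at_ms,
        "_airbyte_data": data,
    }
    # The random UUID doubles as the partition key, which spreads
    # records across the shards of the stream.
    return ab_id, json.dumps(payload).encode("utf-8")


partition_key, body = build_record(
    {"id": 1, "name": "alice"}, emitted_at_ms=1700000000000
)
```

Because the partition key is random rather than derived from the data, records from the same source row are not guaranteed to land on the same shard.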

Features

| Feature | Support | Notes |
| :--- | :--- | :--- |
| Full Refresh Sync | | |
| Incremental - Append Sync | | Incoming messages are streamed/appended to a Kinesis stream as they are received. |
| Incremental - Deduped History | | |
| Namespaces | | Namespaces will be used to determine the Kinesis stream name. |

Performance considerations

Although Kinesis is designed to handle large amounts of real-time data by scaling streams with shards, you should be aware of the Kinesis Quotas and Limits. The connector buffer size should also be tuned according to the size and frequency of your data.
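To illustrate how the buffer size trades latency for throughput, here is a minimal sketch of a record buffer that sends one batch per flush. It is an assumption about the general technique, not the connector's implementation: `RecordBuffer` and `send_batch` are hypothetical names, and the cap of 500 reflects the per-request record limit of the Kinesis `PutRecords` API, which you should verify against the current quotas.

```python
from typing import Callable, List, Tuple

Record = Tuple[str, bytes]  # (partition key, payload)

# Per-request record limit of the Kinesis PutRecords API; check current quotas.
MAX_RECORDS_PER_REQUEST = 500


class RecordBuffer:
    """Collects records and hands them to `send_batch` in groups (hypothetical)."""

    def __init__(self, buffer_size: int, send_batch: Callable[[List[Record]], None]):
        if not 1 <= buffer_size <= MAX_RECORDS_PER_REQUEST:
            raise ValueError("buffer_size must be between 1 and 500")
        self._buffer_size = buffer_size
        self._send_batch = send_batch
        self._records: List[Record] = []

    def add(self, partition_key: str, payload: bytes) -> None:
        self._records.append((partition_key, payload))
        # Flush as soon as the buffer is full, so each request carries
        # at most `buffer_size` records.
        if len(self._records) >= self._buffer_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is pending; call this once more at the end of a sync.
        if self._records:
            batch, self._records = self._records, []
            self._send_batch(batch)


sent: List[List[Record]] = []
buffer = RecordBuffer(buffer_size=3, send_batch=sent.append)
for i in range(7):
    buffer.add(f"key-{i}", b"{}")
buffer.flush()
```

In a real setup, `send_batch` would wrap a client call such as boto3's `put_records`. A larger buffer means fewer requests but a longer wait before low-volume data reaches the stream.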

Getting started

Requirements

  • The connector is compatible with the latest Kinesis service version at the time of this writing.
  • Configuration
    • Endpoint (Optional): AWS Kinesis endpoint to connect to. The default endpoint is used if none is provided.
    • Region (Optional): AWS Kinesis region to connect to. The default region is used if none is provided.
    • shardCount: The number of shards with which the stream should be created. The number of shards affects the throughput of your stream.
    • accessKey: Access key credential for authenticating with the service.
    • privateKey: Private key credential for authenticating with the service.
    • bufferSize: Buffer size used to increase throughput by sending data in a single request.
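The configuration options above can be grouped as follows. This is only a sketch for orientation: the class `KinesisConfig`, its validation rules, and the example values are assumptions, and the field names simply mirror the option names listed above rather than Python naming conventions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KinesisConfig:
    """Hypothetical container mirroring the options listed above."""

    shardCount: int
    accessKey: str
    privateKey: str
    bufferSize: int
    endpoint: Optional[str] = None  # None means: use the default endpoint
    region: Optional[str] = None    # None means: use the default region

    def __post_init__(self):
        # Assumed sanity checks; the connector may enforce different bounds.
        if self.shardCount < 1:
            raise ValueError("shardCount must be at least 1")
        if self.bufferSize < 1:
            raise ValueError("bufferSize must be at least 1")


# Placeholder credentials and values, for illustration only.
config = KinesisConfig(
    shardCount=2,
    accessKey="EXAMPLE_ACCESS_KEY",
    privateKey="EXAMPLE_PRIVATE_KEY",
    bufferSize=100,
    region="us-east-1",
)
```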

Setup guide

######TODO: more info, screenshots?, etc...