The Airbyte Dotnet CDK is a framework for rapidly developing production-grade Airbyte connectors. The CDK currently offers helpers specifically for creating Airbyte source connectors for:
- HTTP APIs (REST APIs, GraphQL, etc.)
- Generic .NET sources (anything not covered by the above)

The CDK improves the developer experience by providing a basic implementation structure and abstracting away low-level glue boilerplate.
Generate a boilerplate connector using the code generator. Create a new directory and run:
```shell
git clone https://github.com/mrhamburg/airbyte.cdk.dotnet.git
cd airbyte.cdk.dotnet
dotnet run init --project Airbyte.Cdk
```
| Component | Type | Code | Installation | Version |
|---|---|---|---|---|
| Airbyte CDK | CDK | Airbyte.Cdk | `dotnet add package Airbyte.Cdk` | |
| Exchange Rates Free | Source | source-exchange-rates-free | `docker pull airbytedotnet/source-exchange-rates-free` | |
This readme is a general introduction to the CDK. Readers should have basic familiarity with the Airbyte Specification before proceeding.
If you have any issues with troubleshooting or want to learn more about the CDK from the Airbyte team, head to the #connector-development channel in Slack to inquire further!
If you want to learn more about the classes required to implement an Airbyte Source, head to the basic concepts doc described in the Python CDK.
A `Stream` is the atomic unit for reading data from a Source. A stream can read data from anywhere: a relational database, an API, or even a scraped web page (although that might be stretching the limits of what a connector should do).
To implement a stream, there are two minimum requirements:
1. Define the stream's schema
2. Implement the logic for reading records from the underlying data source

Schemas should be stored in the `schemas` folder of your project.
Using the fluent builder, we can define a source. The short example below does the following:
- Creates a string `url` as the base URL for all subsequent requests
- Creates a base implementation by converting the string to an `HttpStream` and sets a default response parser that selects the whole object (this function expects a JsonPath expression to extract the data)
- Creates a new HTTP stream using the `Create` statement and names it `symbols` (the name `symbols` is also appended to the URL when executing the request, so the example below calls the endpoint https://api.exchangerate.host/symbols)
```csharp
public override Stream[] Streams(JsonElement config)
{
    string url = "https://api.exchangerate.host";
    var baseimpl = url.HttpStream().ParseResponseObject("$");
    return new Stream[] { baseimpl.Create("symbols") };
}
```
An incremental Stream is a stream which reads data incrementally. That is, it only reads data that was generated or updated since the last time it ran, and is thus far more efficient than a stream which reads all the source data every time it runs. If possible, developers are encouraged to implement incremental streams to reduce sync times and resource usage.
Several new pieces are essential to understand how incrementality works with the CDK:
- Cursor fields
- `AirbyteStateMessage`
- `Stream.GetUpdatedState`

The `AirbyteStateMessage` is sent based on the `StateCheckpointInterval` setting of a stream object: every N requests, the stream sends out an `AirbyteStateMessage`. In the example below, this is every 25th request for the `symbols` stream.
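The checkpointing rule described above can be illustrated with a minimal, self-contained sketch. `CheckpointRequests` is a hypothetical helper and not part of the CDK; it only models "emit state after every Nth request" and assumes requests are counted from 1. How the CDK counts requests internally may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a CDK API): returns the 1-based request numbers
// after which a state message would be emitted.
static List<int> CheckpointRequests(int totalRequests, int checkpointInterval)
{
    var checkpoints = new List<int>();
    for (int request = 1; request <= totalRequests; request++)
    {
        // The request would be executed and its records emitted here.
        if (request % checkpointInterval == 0)
        {
            // Stand-in for sending an AirbyteStateMessage.
            checkpoints.Add(request);
        }
    }
    return checkpoints;
}

// With an interval of 25, a sync of 60 requests checkpoints twice.
Console.WriteLine(string.Join(", ", CheckpointRequests(60, 25))); // prints "25, 50"
```

A smaller interval means less work is repeated after a failed sync, at the cost of more state messages.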
The `CursorField` refers to the field in the stream's output records used to determine the "recency" or ordering of records. An example is a `date` field in an API, as shown in the example below.
Cursor fields can be input by the user (e.g. a user can choose to use an auto-incrementing `id` column in a DB table) or they can be defined by the source (e.g. where an API defines that `date` is what determines the ordering of records).
In the context of the CDK, setting the `Stream.CursorField` property to a non-empty value informs the framework that this stream is incremental.
`Stream.GetUpdatedState` helps the stream keep track of the latest state by inspecting every record output by the stream (as returned by the `Stream.ReadRecords` method) and comparing it against the most recent state object. This allows a sync to resume from where the previous sync last stopped, regardless of success or failure. This function typically compares the cursor field of the state object with that of the latest record and picks the later one.
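The comparison that such a function typically performs can be sketched as follows. This is an illustration only, not the CDK implementation: `LatestCursor` and `ParseDate` are hypothetical helpers, and the sketch assumes the state and the record are flat JSON objects whose cursor field holds an ISO 8601 date string.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper (not a CDK API): returns the more recent cursor value,
// either the one saved in the state or the one in the latest record.
static DateTime LatestCursor(JsonElement currentState, JsonElement latestRecord, string cursorField)
{
    DateTime recordValue = ParseDate(latestRecord.GetProperty(cursorField));

    // On a first sync there is no saved cursor yet, so the record's value is used.
    if (currentState.ValueKind != JsonValueKind.Object ||
        !currentState.TryGetProperty(cursorField, out JsonElement saved))
    {
        return recordValue;
    }

    DateTime stateValue = ParseDate(saved);
    return recordValue > stateValue ? recordValue : stateValue;
}

// Parses a JSON string value as a culture-independent date.
static DateTime ParseDate(JsonElement value) =>
    DateTime.Parse(
        value.GetString() ?? throw new FormatException("cursor value is null"),
        CultureInfo.InvariantCulture,
        DateTimeStyles.RoundtripKind);

JsonElement state = JsonDocument.Parse("{\"date\": \"2022-01-10\"}").RootElement;
JsonElement record = JsonDocument.Parse("{\"date\": \"2022-01-12\", \"rate\": 1.13}").RootElement;
DateTime latest = LatestCursor(state, record, "date");
Console.WriteLine(latest.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // prints "2022-01-12"
```

Keeping the larger of the two values means that an out-of-order record cannot move the saved cursor backwards.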
```csharp
public override Stream[] Streams(JsonElement config)
{
    ...
    // Latest known cursor value per symbol
    Dictionary<string, DateTime> _currentstate = new Dictionary<string, DateTime>();
    string basesymbol = config.GetProperty("symbol").GetString();
    var incremental = baseimpl
        .CursorField(new[] {"date"})
        .GetUpdatedState((_, _) => _currentstate.AsJsonElement())
        .RequestParams((_, _, _) => new Dictionary<string, object> {{ "date", _currentstate[basesymbol] }})
        // On HTTP 429 wait i * 15 seconds, otherwise wait one minute
        .BackoffTime((i, response) =>
            response.StatusCode == 429 ? i * TimeSpan.FromSeconds(15) : TimeSpan.FromMinutes(1))
        .HttpMethod(HttpMethod.Get)
        .PageSize(150)
        // Send an AirbyteStateMessage after every 25th request
        .StateCheckpointInterval(25)
        .MaxRetries(15)
        .ShouldRetry(exc => exc.StatusCode > 300)
        .WithAuth(new BasicAuth(new[] {config.GetProperty("api-token").GetString()}))
        .Create("symbols");
    ...
}
```
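To make the backoff rule in the example above easier to reason about, here is a stand-alone sketch of the same calculation. `BackoffFor` is a hypothetical function, not a CDK API, and it assumes that the first callback argument `i` is the retry attempt number and that the status is available as an `HttpStatusCode`.

```csharp
using System;
using System.Net;

// Hypothetical stand-alone version of the backoff rule from the example:
// rate-limited responses (HTTP 429) wait 15 seconds per attempt,
// any other response waits one minute.
static TimeSpan BackoffFor(int attempt, HttpStatusCode status) =>
    status == HttpStatusCode.TooManyRequests
        ? TimeSpan.FromSeconds(15 * attempt)
        : TimeSpan.FromMinutes(1);

Console.WriteLine(BackoffFor(3, HttpStatusCode.TooManyRequests));     // prints "00:00:45"
Console.WriteLine(BackoffFor(3, HttpStatusCode.InternalServerError)); // prints "00:01:00"
```

Under these assumptions, combined with `MaxRetries(15)`, the longest single wait for a rate-limited request would be 15 × 15 seconds, or 3 minutes 45 seconds.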
Airbyte recommends using the CDK template generator to develop with the CDK. The template generator creates all the required scaffolding, with convenient TODOs, allowing developers to truly focus on implementing the API.
Make sure the latest version of dotnet is installed; see Dotnet SDK. Also make sure Docker is installed; see Get Docker.
Run the following commands to get a boilerplate for creating a new connector:
```shell
git clone https://github.com/mrhamburg/airbyte.cdk.dotnet.git
cd airbyte.cdk.dotnet
dotnet run init --project Airbyte.Cdk
```
Once ready, check the README.md file in your connector directory for instructions. Also add your connector to the components table at the top of this readme.
- Iterate on the code locally
- Run tests via `dotnet test`
- Try to build the connector using docker: `docker build .`

All tests are located in the `*.Tests` directory where your connector is created. Run `dotnet test` to run them.
- Open a PR
- Once ready, update the CHANGELOG.md file, which should include a new version number
- Connectors are automatically published to Docker Hub after updates to the main branch. If the connector is updated without incrementing the version, GitHub will NOT overwrite the existing image in Docker Hub.
Make sure the latest version of dotnet is installed; see Dotnet SDK. Also make sure Docker is installed; see Get Docker.
- Iterate on the code locally
- Run tests via `dotnet test`
- Try to build the nuget package using docker: `docker build --target build .`

All tests are located in the `Airbyte.Cdk.Test` directory. Run `dotnet test` to run them.
- Open a PR
- Testing and releasing is part of the CI/CD process
- Don't see a feature you need? Create an issue and let us know how we can help!