Add AirbyteTraceMessage to Airbyte protocol #12458

Merged: 6 commits, May 3, 2022
@@ -26,6 +26,7 @@ definitions:
- SPEC
- CONNECTION_STATUS
- CATALOG
- TRACE
log:
description: "log message: any kind of logging you want the platform to know about."
"$ref": "#/definitions/AirbyteLogMessage"
@@ -43,6 +44,9 @@ definitions:
state:
description: "schema message: the state. Must be the last message produced. The platform uses this information"
"$ref": "#/definitions/AirbyteStateMessage"
trace:
description: "trace message: a message to communicate information about the status and performance of a connector"
"$ref": "#/definitions/AirbyteTraceMessage"
AirbyteRecordMessage:
type: object
additionalProperties: true
@@ -94,6 +98,45 @@ definitions:
message:
description: "the log message"
type: string
AirbyteTraceMessage:
type: object
additionalProperties: true
required:
- type
- emitted_at
properties:
type:

Review comment (Contributor):

I was going to leave a comment here suggesting that we use the json schema oneOf to implement the different types here instead of having a type property, and a separate property for each trace message type. However, I found this comment in our code base which indicates that jsonschema2pojo does not support oneOf. Since we use that to generate pojos for our platform, this means we can't use that here, so we'll have to stick with this type enum approach.

description: "the type of trace message"
type: string
enum:
- ERROR
emitted_at:
description: "the time in ms that the message was emitted"
type: number
error:
description: "error trace message: the error object"
"$ref": "#/definitions/AirbyteErrorTraceMessage"
AirbyteErrorTraceMessage:
type: object
additionalProperties: true

Review comment (Contributor):

👍 on additionalProperties - The thought was that we might take any additional properties and append them to the FailureReason's "metadata" property.

cc @pedroslopez

Reply (Contributor):

recap from discussion on the doc: this will not be the case - additionalProperties is true so that it will be ok if a connector does erroneously set an additional property, and it's easier to make backwards-compatible changes to the protocol later.

No further properties are expected to be included in AirbyteErrorTraceMessage and the connector logs should give additional context.

required:
- message
properties:
message:
description: A user-friendly message that indicates the cause of the error
type: string
internal_message:
description: The internal error that caused the failure
type: string
stack_trace:
description: The full stack trace of the error
type: string
failure_type:
description: The type of error
type: string
enum:
- system_error
- config_error
AirbyteConnectionStatus:
description: Airbyte connection status
type: object
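
To make the schema added above concrete, here is a minimal sketch of how a connector might build an `AirbyteMessage` of type `TRACE` carrying an `ERROR` trace and print it to stdout as JSON. The helper name `build_error_trace_message` is hypothetical and not part of this PR, and treating `emitted_at` as milliseconds since the Unix epoch is an assumption based on the field's description ("the time in ms"). Only the field names and enum values are taken from the schema.

```python
import json
import time
import traceback

# Values of the failure_type enum in AirbyteErrorTraceMessage.
FAILURE_TYPES = ("system_error", "config_error")


def build_error_trace_message(exc, user_message, failure_type="system_error"):
    """Hypothetical helper: wrap an exception in an AirbyteMessage of type TRACE."""
    if failure_type not in FAILURE_TYPES:
        raise ValueError(f"unknown failure_type: {failure_type}")
    stack_trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return {
        "type": "TRACE",  # value added to the AirbyteMessage type enum
        "trace": {
            "type": "ERROR",  # only trace type defined so far
            # Assumption: milliseconds since the Unix epoch.
            "emitted_at": time.time() * 1000,
            "error": {
                "message": user_message,  # required, user-friendly
                "internal_message": str(exc),
                "stack_trace": stack_trace,
                "failure_type": failure_type,
            },
        },
    }


if __name__ == "__main__":
    try:
        raise KeyError("api_key")
    except KeyError as exc:
        message = build_error_trace_message(
            exc, "The API key is missing from the config.", "config_error"
        )
        # Connectors emit one JSON-serialized AirbyteMessage per line on stdout.
        print(json.dumps(message))
```

The envelope keeps a `type` enum plus one property per trace type, mirroring the schema, rather than a `oneOf` union, for the reason given in the review comment on `type`.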
5 changes: 3 additions & 2 deletions docs/understanding-airbyte/airbyte-specification.md
@@ -214,8 +214,9 @@ For the sake of brevity, we will not re-describe `spec` and `check`. They are ex
## The Airbyte Protocol

* All messages passed to and from connectors must be wrapped in an `AirbyteMessage` envelope and serialized as JSON. The JsonSchema specification for these messages can be found [here](https://github.com/airbytehq/airbyte/blob/922bfd08a9182443599b78dbb273d70cb9f63d30/airbyte-protocol/models/src/main/resources/airbyte_protocol/airbyte_protocol.yaml#L13-L45).
* Even if a record is wrapped in an `AirbyteMessage` it will only be processed if it appropriate for the given command. e.g. If a source `read` action includes AirbyteMessages in its stream of type Catalog for instance, these messages will be ignored as the `read` interface only expects `AirbyteRecordMessage`s and `AirbyteStateMessage`s. The appropriate `AirbyteMessage` types have been described in each command above.
* **ALL** actions are allowed to return `AirbyteLogMessage`s on stdout. For brevity, we have not mentioned these log messages in the description of each action, but they are always allowed. An `AirbyteLogMessage` wraps any useful logging that the connector wants to provide. These logs will be written to Airbyte's log files and output to the console.
* Even if a record is wrapped in an `AirbyteMessage` it will only be processed if it is appropriate for the given command. e.g. If a source `read` action includes AirbyteMessages in its stream of type Catalog for instance, these messages will be ignored as the `read` interface only expects `AirbyteRecordMessage`s and `AirbyteStateMessage`s. The appropriate `AirbyteMessage` types have been described in each command above.
* **ALL** actions are allowed to return `AirbyteLogMessage`s and `AirbyteTraceMessage`s on stdout. For brevity, we have not mentioned these messages in the description of each action, but they are always allowed. An `AirbyteLogMessage` wraps any useful logging that the connector wants to provide. These logs will be written to Airbyte's log files and output to the console. An `AirbyteTraceMessage` provides structured information about the performance and status of a connector, such as the failure reason in the event of an error.

* I/O:
* Connectors receive arguments on the command line via JSON files. `e.g. --catalog catalog.json`
* They read `AirbyteMessage`s from stdin. The destination `write` action is the only command that consumes `AirbyteMessage`s.
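
The filtering rule described in the updated spec text (messages are only processed if appropriate for the command, while `LOG` and `TRACE` are always allowed) can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: the names `COMMAND_TYPES`, `ALWAYS_ALLOWED`, and `accepted_messages` are hypothetical, the per-command mapping for `spec`, `check`, and `discover` is assumed from the message type names, and silently skipping non-JSON lines is a simplification.

```python
import json

# Per the updated spec text, these types are allowed for every action.
ALWAYS_ALLOWED = {"LOG", "TRACE"}

# Assumed mapping of command -> message types it expects.
# Only the `read` entry is stated in the spec text shown in this diff.
COMMAND_TYPES = {
    "spec": {"SPEC"},
    "check": {"CONNECTION_STATUS"},
    "discover": {"CATALOG"},
    "read": {"RECORD", "STATE"},
}


def accepted_messages(command, lines):
    """Yield parsed AirbyteMessages from `lines` that are valid for `command`."""
    allowed = COMMAND_TYPES[command] | ALWAYS_ALLOWED
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: lines that are not JSON are skipped
        # Messages of a type not expected by this command are ignored.
        if isinstance(message, dict) and message.get("type") in allowed:
            yield message
```

For example, a `CATALOG` message emitted during `read` is dropped, while a `TRACE` message passes through for any command.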