-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AirbyteTraceMessage to Airbyte protocol #12458
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! I would recommend we also add a blurb in the docs describing this new message https://docs.airbyte.com/understanding-airbyte/airbyte-specification
@@ -43,6 +44,9 @@ definitions: | |||
state: | |||
description: "schema message: the state. Must be the last message produced. The platform uses this information" | |||
"$ref": "#/definitions/AirbyteStateMessage" | |||
trace: | |||
description: "trace message: a message to provide human readable information about a sync" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could also be about non-syncs like check and discover operations right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, you're right, I was going off of the spec but will update.
@@ -43,6 +44,9 @@ definitions: | |||
state: | |||
description: "schema message: the state. Must be the last message produced. The platform uses this information" | |||
"$ref": "#/definitions/AirbyteStateMessage" | |||
trace: | |||
description: "trace message: a message to provide human readable information to users" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is important to not call it human readable. It is really about providing structured information about the connector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps alternative language might be:
"trace message: a structured message to share information about connector and sync performance with external consumers"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps
"trace message: a structured message to communicate information about the status and performance of a connector"
I would be happy with either
- message | ||
properties: | ||
message: | ||
description: The human readable message that will be shown to the user in the UI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the mention of the UI? the protocol doesn't know anything about the UI.
For the protocol changes, it is important to not mix the what the data is with how it is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: The human readable message that will be shown to the user in the UI | |
description: A user-friendly message that indicates the cause of the error |
Maybe using the wording user-friendly
here is a good middle ground, as it indicates we want this to be understandable by users of the system, without referencing another part of the platform (UI)
description: The human readable message that will be shown to the user in the UI | ||
type: string | ||
internal_message: | ||
description: The less human readable, more technical message showing the reason for the failure, e.g. the first line of a stack trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can probably remove the concept of human readability. Also, why do we need this field? The example you're giving mentions the stack grace so I assume we could just read the stack_trace field first line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you. Evan and I discussed this in the spec - @evantahler I think you were thinking that it's not necessarily going to be the first line of a stack trace and could be some other kind of message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the conversation again - @evantahler maybe you only wanted this field for AirbyteTraceMessages that are not errors? https://docs.google.com/document/d/1ctrj3Yh_GjtQ93aND-WH3ocqGxsmxyC3jfiarrF6NY0/edit?disco=AAAAYr7ZeRQ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is backwards thinking, but the inspiration for the message once again comes from our (future) usage of these messages to create FailureReasons, and we've got all 3: message
, internal_message
, and stack_trace
in FailureReasons.
I do think there is value for message
, internal_message
, and stack_trace
. An error like
message: API Key not accepted - check your credentials
internal_message: 403: Access Denied
stack_trace: "403: Access Denied" from connect() on line 101...
could come from a number of different places in the code (really, any time we make an HTTP request). Analyzing these by internal message (which will always be the same) even though they come from different stack traces will be helpful. Grouping by the full stack trace without doing some extraction won't be helpful. The "message" might be decorated better to say "...check your username" vs "... check your apiKey" if we are able to do that.
Also, I think the connector, using the same programming language that generated the stack trace, is the best place to "extract" the most meaningful message from the stack, rather than the platform/airbyte-worker trying to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on Evan's comment I think internal_message makes sense to keep, but I agree that we should remove less human readable
from the description, and probably also remove e.g. the first line of a stack trace
. Intead this could more simple like The message of the internal error that caused the failure
or something along those lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :)
Can you just removed the concept of "human readable" from PR desc?
Trace message is really not about human readability. It is just about providing structured information about the connectors
@@ -43,6 +44,9 @@ definitions: | |||
state: | |||
description: "schema message: the state. Must be the last message produced. The platform uses this information" | |||
"$ref": "#/definitions/AirbyteStateMessage" | |||
trace: | |||
description: "trace message: a message to provide human readable information to users" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps alternative language might be:
"trace message: a structured message to share information about connector and sync performance with external consumers"
"$ref": "#/definitions/AirbyteErrorTraceMessage" | ||
AirbyteErrorTraceMessage: | ||
type: object | ||
additionalProperties: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 on additionalProperties
- The thought was that we might take any additional properties and append them to the FailureReason's "metadata" property.
cc @pedroslopez
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recap from discussion on the doc: this will not be the case - additionalProperties
is true
so that it will be ok if a connector does erroneously set an additional property, and it's easier to make backwards-compatible changes to the protocol later.
No further properties are expected to be included in AirbyteErrorTraceMessage
and the connector logs should give additional context.
failure_type: | ||
description: The type of error | ||
type: string | ||
enum: | ||
- SYSTEM_ERROR | ||
- CONFIG_ERROR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that these enum values and https://github.com/airbytehq/airbyte/blob/master/airbyte-config/models/src/main/resources/types/FailureReason.yaml#L20-L26 have different cases. Not sure if that matters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we will need to convert between the POJO type generated from this json schema to the type generated by the FailureReason json schema, it should be pretty trivial to lowercase these values when performing the conversion.
However, given that there are already other lowercase string enum values in this protocol file (e.g. SyncMode), we might as well make these enum values lowercase as well to have them be consistent with the FailureReason values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alovew I left some comments with my feedback on these changes and other discussions. This looks very close! As Sherif suggested, we should also update the airbyte-specification.md docs file with a note describing this new message type added to The Airbyte Protocol
section.
Once that is done and all of the open conversations here are addressed, this should be pretty much ready to go
@@ -43,6 +44,9 @@ definitions: | |||
state: | |||
description: "schema message: the state. Must be the last message produced. The platform uses this information" | |||
"$ref": "#/definitions/AirbyteStateMessage" | |||
trace: | |||
description: "trace message: a message to provide human readable information to users" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps
"trace message: a structured message to communicate information about the status and performance of a connector"
I would be happy with either
- type | ||
- emitted_at | ||
properties: | ||
type: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to leave a comment here suggesting that we use the json schema oneOf
to implement the different types here instead of having a type
property, and a separate property for each trace message type. However, I found this comment in our code base which indicates that jsonschema2pojo does not support oneOf
. Since we use that to generate pojos for our platform, this means we can't use that here, so we'll have to stick with this type
enum approach.
description: "the time in ms that the message was emitted" | ||
type: number | ||
error: | ||
description: "error data from the sync" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: "error data from the sync" | |
description: "error trace message: the error object" |
Suggesting this description change to use similar wording as the AirbyteMessage's various sub-type descriptions, since we have a similar structure here
- message | ||
properties: | ||
message: | ||
description: The human readable message that will be shown to the user in the UI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
description: The human readable message that will be shown to the user in the UI | |
description: A user-friendly message that indicates the cause of the error |
Maybe using the wording user-friendly
here is a good middle ground, as it indicates we want this to be understandable by users of the system, without referencing another part of the platform (UI)
description: The human readable message that will be shown to the user in the UI | ||
type: string | ||
internal_message: | ||
description: The less human readable, more technical message showing the reason for the failure, e.g. the first line of a stack trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on Evan's comment I think internal_message makes sense to keep, but I agree that we should remove less human readable
from the description, and probably also remove e.g. the first line of a stack trace
. Intead this could more simple like The message of the internal error that caused the failure
or something along those lines
failure_type: | ||
description: The type of error | ||
type: string | ||
enum: | ||
- SYSTEM_ERROR | ||
- CONFIG_ERROR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we will need to convert between the POJO type generated from this json schema to the type generated by the FailureReason json schema, it should be pretty trivial to lowercase these values when performing the conversion.
However, given that there are already other lowercase string enum values in this protocol file (e.g. SyncMode), we might as well make these enum values lowercase as well to have them be consistent with the FailureReason values.
This is ready for re-review - just want to give everyone a chance to look at this again before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last small suggestion, otherwise LGTM
Co-authored-by: Lake Mossman <lake@airbyte.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍🏼
"$ref": "#/definitions/AirbyteErrorTraceMessage" | ||
AirbyteErrorTraceMessage: | ||
type: object | ||
additionalProperties: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recap from discussion on the doc: this will not be the case - additionalProperties
is true
so that it will be ok if a connector does erroneously set an additional property, and it's easier to make backwards-compatible changes to the protocol later.
No further properties are expected to be included in AirbyteErrorTraceMessage
and the connector logs should give additional context.
The "Connectors Base" failed after this PR is merged. Looks like some of the auto generated files need to be updated. @alovew, can you take a look? Update: never mind. It should be fixed. |
* Add AirbyteTraceMessage to protocol Co-authored-by: Lake Mossman <lake@airbyte.io>
What
AirbyteTraceMessages is a new type of AirbyteMessage that will display information about syncs to users. The first use case is to display error messages, and in the future there will be other AirbyteTraceMessage types to display different kinds of information in a user friendly way. More information about the error use case is outlined in these two documents:
https://docs.google.com/document/d/1ctrj3Yh_GjtQ93aND-WH3ocqGxsmxyC3jfiarrF6NY0
https://docs.google.com/document/d/1nDVU-6_lDHkNVHPFHwfozaEoGl9hmV42RaAbv__6Bqo