🎉 Source Slack migration to low code #35477
This reverts commit c128690.
@irvingpop Thanks again for the feedback.
We are not migrating the threads stream to low-code for now, but we plan to migrate it once the CDK is updated with the new state feature in slices.
@alafanechere Hi, I have run the regression tests for the current changes:
- TestDataIntegrity.test_record_count_with_state
- TestDataIntegrity.test_all_pks_are_produced_in_target_version_with_state
- TestDataIntegrity.test_all_records_are_the_same_with_state
- TestDataIntegrity.test_record_count_without_state
- TestDataIntegrity.test_all_records_are_the_same_without_state
I also tested locally with connection credentials. The dev and latest versions returned the same number of records, except for channel messages with state: the low-code version starts from the previous cursor value of each partition and returns one record per partition, which is expected behavior.
latest without state
dev with state
latest with state
Missing records: both of the 2 missing records are present in the dev local test result for the channel messages stream.
The ts values in attachments are of the same type; the regression tool compared different records. These values are not present in the stream schema, so we should consider changing the stream schema to fix this in the next Slack release.
Hey @darynaishchenko!
Yes, unfortunately Slack's API heavily rate limits us all. The rate limiting with the old connector was atrocious (thousands of rate limit backoffs per sync), but it is much, much better with the new connector (still a couple hundred rate limit messages during the initial sync, but a huge improvement).
Okay, good to know! Out of curiosity, is the plan to release the new low-code connector first, and then update the threads functionality at a later time? From what I can tell, the threads sync has suffered a big regression. Feel free also to message me in the Airbyte Slack if you need any other details.
airbyte-integrations/connectors/source-slack/source_slack/manifest.yaml
My testing with the regression test tool leads to the following conclusion:
- Record count is correct on all streams. This new version should not lead to data loss
- Record schemas are matching.
- There is indeed a breaking change on the `channel_messages` stream, as the state format has changed.
Here are a couple of additional questions I'd like answered before giving an additional review:
- The checkpointing logic has slightly changed. The new version produces ~30% fewer state messages than the current one. Is that expected? Should it be concerning? Do we have sufficient control in the low-code CDK to make the checkpointing logic match?
- I noticed that the catalog schema is incomplete. For instance, `channel_messages.attachment` does not have all the properties returned by the API. I don't think it is problematic for this PR, but do we plan on improving the catalog schema to match the API response? (You can check the following artifact if you want to introspect the inferred schema on `channel_messages`: `command_execution_artifacts/source-slack/read/dev/stream_schemas/channel_messages.json`)
Why I'm requesting changes?
I think you should use `scopedImpact` breaking changes if the only impacted stream is `channel_messages`.
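To make the scoped breaking change suggested above concrete, here is a minimal Python sketch of how a stream-scoped entry could be interpreted. The field names `scopeType` and `impactedScopes` reflect my understanding of the connector metadata format and should be checked against the metadata documentation; the helper `is_connection_affected` and the in-memory dictionary are hypothetical and are not Airbyte platform code.

```python
from typing import Any, Iterable, Mapping

# Hypothetical in-memory copy of a breakingChanges entry with a stream scope.
# Field names (scopeType, impactedScopes) are assumptions to verify.
BREAKING_CHANGE: Mapping[str, Any] = {
    "message": "State format changed for the channel_messages stream.",
    "scopedImpact": [
        {"scopeType": "stream", "impactedScopes": ["channel_messages"]},
    ],
}


def is_connection_affected(
    breaking_change: Mapping[str, Any], synced_streams: Iterable[str]
) -> bool:
    """Return True if a connection syncing `synced_streams` is in scope."""
    scopes = breaking_change.get("scopedImpact")
    if not scopes:
        # No scope declared: treat every connection as affected.
        return True
    synced = set(synced_streams)
    for scope in scopes:
        impacted = set(scope.get("impactedScopes", []))
        if scope.get("scopeType") == "stream" and synced & impacted:
            return True
    return False


print(is_connection_affected(BREAKING_CHANGE, ["users", "channels"]))
print(is_connection_affected(BREAKING_CHANGE, ["channel_messages"]))
```

With a scope like this, only connections that sync `channel_messages` would need to act on the breaking change.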
Yes, it's expected. The declarative stream disables checkpointing, and we don't have a way to change that from the source implementation.
Yes, I also noticed undeclared fields for the channel messages stream. I will create a follow-up ticket and discuss with my team when we can work on it. PS: working on the requested changes.
🎉🎉🎉🎉
Congrats @darynaishchenko for taking this to the finish line.
Thanks for your patience and multiple iterations.
lgtm
message:
  The source slack connector is being migrated from the Python CDK to our declarative low-code CDK.
  Due to changes in the handling of state format for incremental substreams, this migration constitutes a breaking change for the channel_messages stream.
  Users will need to retest source configuration, refresh the source schema and reset the channel_messages stream after upgrading.
s/retest/reset
updated
breakingChanges:
  1.0.0:
    message:
      The source slack connector is being migrated from the Python CDK to our declarative low-code CDK.
s/slack/Slack
updated
## Upgrading to 1.0.0

We're continuously striving to enhance the quality and reliability of our connectors at Airbyte.
As part of our commitment to delivering exceptional service, we are transitioning source slack from the
s/slack/Slack
updated
docs/integrations/sources/slack.md
| Version | Date | Pull Request | Subject |
|:--------|:-----------|:---------------------------------------------------------|:-------------------------------------------------------------------------------------|
| 1.0.0 | 2024-04-02 | [35477](https://github.com/airbytehq/airbyte/pull/35477) | Migration to low code |
low-code CDK
updated
docs/integrations/sources/slack.md
| Version | Date | Pull Request | Subject |
|:--------|:-----------|:---------------------------------------------------------|:-------------------------------------------------------------------------------------|
| 1.0.0 | 2024-04-02 | [35477](https://github.com/airbytehq/airbyte/pull/35477) | Migration to low code CDK |
Migration to low-code CDK
thanks, updated
What
resolved: https://github.com/airbytehq/airbyte-internal-issues/issues/2920
Migrate Slack to the low-code CDK.
How
Deleted the Python code and moved the main streams to manifest.yaml.
- Channels stream: custom component join_channels.py. The JoinChannel stream is used in ChannelsRetriever to join a channel if is_member is False and the user enabled this feature in the config. The retriever checks whether the channel should be joined; JoinChannelStream performs the join logic and returns the joined channel.
- Users stream: also used to check the connection, so error_handler was extended with the needed status codes.
- Channel Members stream: custom component channel_members_extractor.py for extracting records from the response: from ['aa', 'bb'] to [{'member_id': 'aa'}, {'member_id': 'bb'}].
- Channel messages stream: no custom components; retrieves the messages in a channel by channel ID and the provided time period.
- Threads stream: still uses the old Python-based implementation. For now we can't migrate it to low-code, because we can't operate on state in the substream partitioner, which prevents incremental sync from working correctly.
Also added a legacy config migration to make the config compatible with Selective Auth. The legacy config does not have the path needed to select an authenticator.
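The Channel Members transformation described above can be sketched in plain Python. This is only an illustration of the reshaping step, not the code in channel_members_extractor.py: the function name `extract_member_records` is hypothetical, and the sketch assumes the decoded response body carries the IDs under a `members` key.

```python
from typing import Any, Dict, List, Mapping


def extract_member_records(response_body: Mapping[str, Any]) -> List[Dict[str, str]]:
    """Wrap each plain member ID in a record keyed by `member_id`."""
    # Assumption: the list of IDs lives under the `members` key of the response.
    member_ids = response_body.get("members", [])
    return [{"member_id": member_id} for member_id in member_ids]


# Usage: a list of bare IDs becomes a list of one-field records.
records = extract_member_records({"ok": True, "members": ["aa", "bb"]})
print(records)
```

Wrapping each ID in a dictionary gives every record a named field, which lets the stream declare `member_id` in its schema instead of emitting bare strings.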
Recommended reading order
airbyte-integrations/connectors/source-slack/source_slack/manifest.yaml
airbyte-integrations/connectors/source-slack/source_slack/components
🚨 User Impact 🚨
Breaking change due to update of state format in sub streams.
Pre-merge Actions
Update date in breaking changes.