feat(low-code): add use check availability flag to dynamic check #293

lazebnyi · 2025-01-29T12:17:52Z

What

The connection check verifies availability by attempting to fetch the first slice of data. For some dynamic streams, however, this slice may be empty, so the connection check fails even though the stream has been created and we have sufficient permissions to access the data (e.g., source-google-sheets will create a stream if there is at least one row in the sheet, but it won’t create a slice if there are no additional rows).

How

Add a use_check_availability flag (default true). When it is set to false, the availability check is skipped and the connection check passes as long as the streams have been created successfully.
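
A minimal sketch of the behavior described above, not the CDK's actual implementation. `FakeStream`, `first_slice_is_readable`, and this standalone `check_connection` function are hypothetical stand-ins, and the error strings are assumptions; only the `stream_count` and `use_check_availability` names come from this PR.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeStream:
    """Hypothetical stand-in for a dynamically generated stream."""

    name: str
    records: List[dict]

    def first_slice_is_readable(self) -> bool:
        # Assumption: "available" means the first slice yields at least one record.
        return len(self.records) > 0


def check_connection(
    streams: List[FakeStream],
    stream_count: int = 1,
    use_check_availability: bool = True,
) -> Tuple[bool, Optional[str]]:
    """Return (ok, error), mirroring the shape of a connection-check result."""
    if not streams:
        # Assumed message; the real wording may differ.
        return False, "No streams to connect to"

    # With the flag off, successfully created streams are enough to pass.
    if not use_check_availability:
        return True, None

    # Otherwise probe the first `stream_count` streams.
    for stream in streams[:stream_count]:
        if not stream.first_slice_is_readable():
            return False, f"Stream {stream.name} is not available"
    return True, None


# A sheet with a header row only: the stream exists but has no slice to read.
empty_sheet = FakeStream(name="sheet_with_header_only", records=[])
print(check_connection([empty_sheet]))
print(check_connection([empty_sheet], use_check_availability=False))
```

The second call passes because the flag only asks whether streams were created, which is the google-sheets case described in the What section.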

Summary by CodeRabbit

New Features
- Added use_check_availability option to control stream availability checks.
- Introduced more flexible configuration for dynamic stream checking.
Improvements
- Enhanced error handling and logging for stream availability checks.
- Improved schema resolution for complex field types.
Bug Fixes
- Refined test cases to handle various scenarios for availability checks.

coderabbitai · 2025-01-29T12:20:03Z

📝 Walkthrough

Walkthrough

This pull request introduces a new use_check_availability feature for the CheckDynamicStream class across multiple files in the Airbyte CDK. The changes enable more flexible stream availability checking by adding a boolean flag that allows developers to control whether stream availability is verified during connection processes. The modifications span the implementation, schema definition, and test cases to support this new configuration option.

Changes

| File | Change Summary |
|------|----------------|
| `airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py` | Added `use_check_availability` attribute, modified `check_connection` method to conditionally skip availability checks. |
| `airbyte_cdk/sources/declarative/declarative_component_schema.yaml` | Added new `use_check_availability` property with default `true`. |
| `airbyte_cdk/sources/declarative/models/declarative_component_schema.py` | Introduced optional `use_check_availability` field with default `True`. |
| `airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` | Updated `create_check_dynamic_stream` method to handle `use_check_availability`. |
| `unit_tests/sources/declarative/checks/test_check_dynamic_stream.py` | Modified test function to include `use_check_availability` parameter. |
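
To illustrate how the new property might be set, here is a hedged sketch of a manifest's `check` section written as a Python dict, together with a helper that applies the default of `true` listed above. The dict shape is an assumption based on the names in this PR, and `resolve_use_check_availability` is a hypothetical helper, not a CDK function.

```python
from typing import Any, Dict

# Assumed shape of the `check` section of a declarative manifest.
check_config: Dict[str, Any] = {
    "type": "CheckDynamicStream",
    "stream_count": 1,
    "use_check_availability": False,  # opt out of the availability check
}


def resolve_use_check_availability(config: Dict[str, Any]) -> bool:
    """Hypothetical helper: fall back to the documented default of True."""
    return bool(config.get("use_check_availability", True))


print(resolve_use_check_availability(check_config))
print(resolve_use_check_availability({"type": "CheckDynamicStream", "stream_count": 1}))
```

Because the default is `true`, manifests that omit the property would keep the existing behavior.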

Possibly related PRs

feat(low-code): add check dynamic stream #223: Related to CheckDynamicStream class modifications.
feat(low-code): add items mappings to dynamic schemas #256: Enhances schema capabilities with ComplexFieldType, aligning with changes in the main PR.

Suggested reviewers

maxi297
darynaishchenko
natikgadzhi

Hey there! 👋 I noticed you've added a neat feature for controlling stream availability checks. Quick question: have you considered adding some documentation about the new use_check_availability flag to help other developers understand its purpose and usage? Wdyt? 🤔

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dca1dbf and 296cc0c.

📒 Files selected for processing (1)

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: Check: 'source-pokeapi' (skip=false)
GitHub Check: Check: 'source-the-guardian-api' (skip=false)
GitHub Check: Check: 'source-shopify' (skip=false)
GitHub Check: Check: 'source-hardcoded-records' (skip=false)
GitHub Check: Pytest (All, Python 3.12, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Analyze (python)

lazebnyi · 2025-01-29T12:20:40Z

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (8)

airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (2)
Line range hint 228-233: The implementation looks solid! Would you consider a few enhancements? 🤔

The recursive resolution of complex types is well implemented. Here are some suggestions that might make it even better:

Maybe we could add type hints to the dictionary values? Something like:
-        field_type["items"] = (
+        field_type["items"]: Mapping[str, Any] = (
Would you consider adding a docstring to explain the recursive nature? For example:
def _resolve_complex_type(self, complex_type: ComplexFieldType) -> Mapping[str, Any]:
    """Resolves a complex field type into its corresponding Airbyte type definition.
    
    For array types, recursively resolves the items property which can be either:
    - A simple type string (e.g., "string", "integer")
    - Another ComplexFieldType for nested arrays
    
    Args:
        complex_type: The complex field type to resolve
        
    Returns:
        A mapping containing the resolved Airbyte type definition
    """
What do you think? 🙂

Line range hint 228-233: Would you like help with test cases for the recursive type resolution? 🧪

To ensure the robustness of the complex type resolution, we might want to add test cases for:

Simple array types (e.g., array of strings)

Nested array types (e.g., array of arrays)

Edge cases (e.g., invalid item types)

I'd be happy to help write these tests if you'd like! What do you think? 😊
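
The recursive resolution and the three suggested test scenarios can be sketched as follows. This is an illustration under stated assumptions, not the loader's real code: the simplified `ComplexFieldType` dataclass and the standalone `resolve_complex_type` function are hypothetical, and the output shape is an assumed JSON-schema-like mapping.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class ComplexFieldType:
    """Hypothetical simplified model: a field type with optional nested items."""

    field_type: str
    items: Union[str, "ComplexFieldType", None] = None


def resolve_complex_type(complex_type: ComplexFieldType) -> Dict[str, Any]:
    """Recursively turn a ComplexFieldType into a JSON-schema-like mapping."""
    resolved: Dict[str, Any] = {"type": complex_type.field_type}
    if complex_type.field_type == "array" and complex_type.items is not None:
        if isinstance(complex_type.items, ComplexFieldType):
            # Nested array: resolve the inner type with the same function.
            resolved["items"] = resolve_complex_type(complex_type.items)
        elif isinstance(complex_type.items, str):
            # Simple item type such as "string" or "integer".
            resolved["items"] = {"type": complex_type.items}
        else:
            raise ValueError(f"Invalid items type: {type(complex_type.items).__name__}")
    return resolved


# Simple array, then nested array.
print(resolve_complex_type(ComplexFieldType("array", "string")))
print(resolve_complex_type(ComplexFieldType("array", ComplexFieldType("array", "integer"))))
```

An invalid item type (for example an integer) raises `ValueError` in this sketch, which covers the edge case mentioned in the review.
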
unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (3)
124-125: Consider using more descriptive test case IDs.

The test case IDs could be more descriptive to better indicate what's being tested, wdyt?
-        pytest.param(200, True, True, [], id="test_stream_available"),
-        pytest.param(200, True, False, [], id="test_stream_available"),
+        pytest.param(200, True, True, [], id="test_stream_available_with_availability_check"),
+        pytest.param(200, True, False, [], id="test_stream_available_bypass_availability_check"),
153-155: Consider adding test cases for edge cases.

Would it be helpful to add test cases for edge scenarios? For example:

When stream count is 0

When use_check_availability is False but stream count is 0
import pytest


@pytest.mark.parametrize(
    "stream_count, use_check_availability, expected_messages",
    [
        pytest.param(0, True, ["No streams to connect to"], id="test_no_streams_with_availability_check"),
        pytest.param(0, False, ["No streams to connect to"], id="test_no_streams_bypass_availability_check"),
    ],
)
def test_check_dynamic_stream_edge_cases(stream_count, use_check_availability, expected_messages):
    # Placeholder body: the review comment did not include a test implementation.
    pytest.skip("Test implementation not provided")
108-134: How about making the test IDs more descriptive?

The test cases look comprehensive! Would you consider making the test IDs more descriptive by including the use_check_availability value? For example:
id="test_stream_available_with_check_enabled"  # for use_check_availability=True
id="test_stream_available_with_check_disabled"  # for use_check_availability=False
This could make it easier to understand test results at a glance. What do you think? 🤔
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
906-914: Consider adding a descriptive assertion message.

The assertion could benefit from an error message to help developers understand what went wrong, wdyt?
-        assert model.use_check_availability is not None  # for mypy
+        assert model.use_check_availability is not None, "use_check_availability must be defined in CheckDynamicStream"  # for mypy
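
A hedged sketch of what the suggested assertion does: it narrows an optional field to `bool` for the type checker and fails with a readable message otherwise. Both dataclasses and this standalone `create_check_dynamic_stream` function are simplified, hypothetical stand-ins for the generated model and the factory method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckDynamicStreamModel:
    """Hypothetical stand-in for the generated schema model."""

    stream_count: int
    use_check_availability: Optional[bool] = True


@dataclass
class CheckDynamicStream:
    """Hypothetical stand-in for the runtime component."""

    stream_count: int
    use_check_availability: bool


def create_check_dynamic_stream(model: CheckDynamicStreamModel) -> CheckDynamicStream:
    # The assertion narrows Optional[bool] to bool and explains the failure.
    assert model.use_check_availability is not None, (
        "use_check_availability must be defined in CheckDynamicStream"
    )
    return CheckDynamicStream(
        stream_count=model.stream_count,
        use_check_availability=model.use_check_availability,
    )


print(create_check_dynamic_stream(CheckDynamicStreamModel(stream_count=1)))
```

Note that `assert` statements are removed when Python runs with `-O`, so the check is a typing aid rather than runtime validation.
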
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

323-327: LGTM! Would you consider enhancing the description? 🤔

The schema changes look good and align well with the PR objectives. The property is well-defined with all required fields. However, would you consider making the description more informative by explaining why and when this flag would be set to false? Something like:

"Enables stream check availability. When set to false, the connection check will be bypassed if the streams have already been successfully created. This field is automatically set by the CDK."

wdyt?
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (1)
26-26: Consider enhancing the class documentation to include the new attribute?

The new use_check_availability flag looks good! Would you consider adding it to the class docstring under Attributes to help other developers understand its purpose? Something like:
     Attributes:
         stream_count (int): numbers of streams to check
+        use_check_availability (bool): when True, checks stream availability by fetching data. When False, skips availability check if streams are created successfully.
wdyt? 🤔

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3af96dc and 331dc24.

📒 Files selected for processing (7)

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2 hunks)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1 hunks)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py (1 hunks)
airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (1 hunks)
unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2 hunks)

✅ Files skipped from review due to trivial changes (1)

airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py

⏰ Context from checks skipped due to timeout of 90000ms (8)

GitHub Check: Check: 'source-pokeapi' (skip=false)
GitHub Check: Check: 'source-the-guardian-api' (skip=false)
GitHub Check: Check: 'source-shopify' (skip=false)
GitHub Check: Check: 'source-hardcoded-records' (skip=false)
GitHub Check: Pytest (All, Python 3.12, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Pytest (Fast)

🔇 Additional comments (5)

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

62-66: LGTM! Well-documented schema changes.

The schema definition for use_check_availability is well-structured with:

Clear description

Appropriate default value

Good title

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2)

38-40: LGTM! Early return looks good.

The early return when use_check_availability is False is well-placed and maintains the expected return type signature. Nice work! 👍

51-56: Error handling improvements look great!

Including the stream name in the error message and using exc_info=True will make debugging much easier. Excellent attention to detail! 🎯
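
The logging pattern praised here can be sketched with the standard `logging` module. The `check_stream` function, its `read_first_record` callable, and the message wording are hypothetical; only the ideas of naming the stream and passing `exc_info=True` come from the review.

```python
import logging
from typing import Callable, Optional, Tuple


def check_stream(
    stream_name: str,
    read_first_record: Callable[[], object],
    logger: logging.Logger,
) -> Tuple[bool, Optional[str]]:
    """Try to read from a stream and report failures with the stream name."""
    try:
        read_first_record()
    except Exception as error:
        # exc_info=True attaches the current traceback to the log record.
        logger.error(
            "Encountered an error trying to connect to stream %s. Error: %s",
            stream_name,
            error,
            exc_info=True,
        )
        return False, f"Unable to connect to stream {stream_name} - {error}"
    return True, None


def failing_reader() -> object:
    raise RuntimeError("permission denied")


sketch_logger = logging.getLogger("check_dynamic_stream_sketch")
print(check_stream("sheet_1", failing_reader, sketch_logger))
```

With the stream name in both the log line and the returned message, a failing check on one of many dynamic streams can be traced without re-running it.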

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2)

138-155: Test implementation looks solid!

Great job on the test implementation! Using deepcopy for the manifest and conditionally adjusting the request count based on use_check_availability is a clean approach. The HTTP mock setup is clear and well-structured. 👏

166-168: Assertions are thorough and well-placed!

The request count assertions effectively validate the behavior of use_check_availability. Nice touch checking both item and items requests! 💯
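
The idea of asserting on request counts can be sketched without the CDK's HTTP mocker. `FakeApi`, `run_connection_check`, and `item_request_count` are hypothetical; the `/items` URL is an assumption, while `/items/1` is the URL quoted elsewhere in this review.

```python
from typing import Dict, List


class FakeApi:
    """Hypothetical in-memory API that records how often each URL is requested."""

    def __init__(self) -> None:
        self.request_counts: Dict[str, int] = {}

    def get(self, url: str) -> List[dict]:
        self.request_counts[url] = self.request_counts.get(url, 0) + 1
        return [{"id": 1}]


def run_connection_check(api: FakeApi, use_check_availability: bool) -> bool:
    # Listing the parent resource stands in for dynamic stream creation.
    api.get("https://api.test.com/items")
    if use_check_availability:
        # The availability check reads from the generated stream.
        api.get("https://api.test.com/items/1")
    return True


def item_request_count(use_check_availability: bool) -> int:
    """Run the check and return how many times the item URL was requested."""
    api = FakeApi()
    assert run_connection_check(api, use_check_availability)
    return api.request_counts.get("https://api.test.com/items/1", 0)


print(item_request_count(True), item_request_count(False))
```

Counting requests checks the flag's effect directly: the check result alone is the same in both cases, so only the number of calls distinguishes them.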

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2)
108-133: Consider adding edge cases to improve test coverage?

The test cases look good! To make them even more robust, what do you think about adding these scenarios?

Error responses (500, 502, etc.) with use_check_availability=False

Empty response with status 200

Malformed JSON response
wdyt?

147-150: Consider extracting the item ID as a constant?

The URL "/items/1" uses a magic number. What do you think about making it more explicit? Something like:
+TEST_ITEM_ID = 1
-item_request = HttpRequest(url="https://api.test.com/items/1")
+item_request = HttpRequest(url=f"https://api.test.com/items/{TEST_ITEM_ID}")

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 331dc24 and f50f38b.

📒 Files selected for processing (1)

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (9)

GitHub Check: Check: 'source-pokeapi' (skip=false)
GitHub Check: Check: 'source-the-guardian-api' (skip=false)
GitHub Check: Check: 'source-shopify' (skip=false)
GitHub Check: Check: 'source-hardcoded-records' (skip=false)
GitHub Check: Pytest (All, Python 3.12, Ubuntu)
GitHub Check: Pytest (All, Python 3.11, Ubuntu)
GitHub Check: Pytest (Fast)
GitHub Check: Pytest (All, Python 3.10, Ubuntu)
GitHub Check: Analyze (python)

🔇 Additional comments (4)

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (4)

135-137: LGTM! Clear and well-structured function signature.

The parameter names are descriptive and follow a consistent pattern.

141-145: LGTM! Clean HTTP mock setup.

The request and response setup is clear and well-structured.

152-154: LGTM! Clear conditional logic.

The feature flag handling is straightforward and the comment explains the behavior well.

165-166: LGTM! Good verification of request count.

The assertion ensures the feature flag correctly controls whether the request is made.

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py

maxi297

Preemptively approving

lazebnyi · 2025-01-30T12:26:34Z

/autofix

✅ Changes applied successfully.

lazebnyi and others added 22 commits January 23, 2025 17:50

Added items handling to dynamic schemas

1953fba

Auto-fix lint and format issues

2a34b81

Merge branch 'main' into lazebnyi/add-array-items-handling-to-dynamic…

f3896b4

…-schemas

Fix typo

d181da3

Rollback properties for objects

bf913dd

Fix typo

2ff96b0

Auto-fix lint and format issues

f42e98b

Fix mypy

1616df0

Merge branch 'lazebnyi/add-array-items-handling-to-dynamic-schemas' o…

52a2fe8

…f github.com:airbytehq/airbyte-python-cdk into lazebnyi/add-array-items-handling-to-dynamic-schemas

Auto-fix lint and format issues

4412b72

Rollback dynamic schema loader

d21a122

Update to complex type resolving

52683a4

Merge master to branch

12a653d

Auto-fix lint and format issues

6682d9f

Fix mypy

73bda55

Merge branch 'lazebnyi/add-array-items-handling-to-dynamic-schemas' o…

1bda3d4

…f github.com:airbytehq/airbyte-python-cdk into lazebnyi/add-array-items-handling-to-dynamic-schemas

Merge branch 'main' into lazebnyi/add-array-items-handling-to-dynamic…

736bf28

…-schemas

Fix complex type resolving

0ee84d5

Merge branch 'lazebnyi/add-array-items-handling-to-dynamic-schemas' o…

85262e2

…f github.com:airbytehq/airbyte-python-cdk into lazebnyi/add-array-items-handling-to-dynamic-schemas

Fix typo

f59cd42

Add use_check_availability to CheckDynamicStream

7b148ce

Merge branch 'main' into lazebnyi/add-use-check-availability-flag-to-…

202b0d0

…dynamic-check

github-actions bot added the enhancement New feature or request label Jan 29, 2025

Auto-fix lint and format issues

331dc24

coderabbitai bot reviewed Jan 29, 2025

coderabbitai bot approved these changes Jan 29, 2025

lazebnyi temporarily deployed to DockerHub January 29, 2025 12:34 — with GitHub Actions Inactive

lazebnyi temporarily deployed to PyPi January 29, 2025 12:34 — with GitHub Actions Inactive

lazebnyi requested a review from darynaishchenko January 29, 2025 17:14

Remove items request count assert

f50f38b

coderabbitai bot reviewed Jan 29, 2025

lazebnyi requested a review from maxi297 January 29, 2025 17:50

maxi297 reviewed Jan 29, 2025

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (resolved)

lazebnyi requested a review from maxi297 January 29, 2025 19:32

lazebnyi mentioned this pull request Jan 30, 2025

feat(source-google-sheets): migrate low code airbytehq/airbyte#50843

Merged

2 tasks

maxi297 approved these changes Jan 30, 2025

Add comment about stream names in check dynamic stream

dca1dbf

Auto-fix lint and format issues

296cc0c

lazebnyi merged commit 4e7f94a into main Jan 30, 2025
19 of 24 checks passed

lazebnyi deleted the lazebnyi/add-use-check-availability-flag-to-dynamic-check branch January 30, 2025 13:44

coderabbitai bot mentioned this pull request Jan 31, 2025

fix(low-code): add wrong dynamic stream name type validation #305

Merged
