feat(low-code): add use check availability flag to dynamic check #293

Merged

Conversation

@lazebnyi lazebnyi commented Jan 29, 2025

What

The availability check works by attempting to fetch the first slice of data during the connection check. For some dynamic streams, however, this slice may be empty, which fails the connection check even though the stream has been created and we have sufficient permissions to access the data (e.g., source-google-sheets creates a stream if the sheet has at least one row, but it does not create a slice if there are no additional rows).

How

Add a use_check_availability flag. When it is set to false, the availability check is skipped and the connection check passes as long as the streams have been created successfully.
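As a rough illustration of the behavior described above, the sketch below shows how such a flag might short-circuit the check. `FakeStream`, `read_first_record`, and this `check_connection` function are hypothetical stand-ins, not the CDK's actual implementation, and the error messages are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, List, Mapping, Optional, Tuple


@dataclass
class FakeStream:
    """Hypothetical stand-in for a dynamically generated stream."""

    name: str
    records: List[Mapping[str, Any]] = field(default_factory=list)

    def read_first_record(self) -> Optional[Mapping[str, Any]]:
        # Returns None when the first slice is empty.
        return self.records[0] if self.records else None


def check_connection(
    streams: List[FakeStream], use_check_availability: bool = True
) -> Tuple[bool, Optional[str]]:
    """Return (success, error_message), loosely mirroring a connection check."""
    if not streams:
        return False, "No streams to connect to"
    if not use_check_availability:
        # The streams were created successfully, so skip fetching data.
        return True, None
    for stream in streams:
        if stream.read_first_record() is None:
            return False, f"Stream {stream.name} is not available: first slice is empty"
    return True, None


# A sheet with a header row only: the stream exists, but it yields no records.
empty_sheet = FakeStream(name="sheet_with_header_only")
print(check_connection([empty_sheet]))
print(check_connection([empty_sheet], use_check_availability=False))
```

With the flag left at its default, the empty stream fails the check; with the flag set to false, the same stream passes because only stream creation is verified.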

Summary by CodeRabbit

  • New Features

    • Added use_check_availability option to control stream availability checks.
    • Introduced more flexible configuration for dynamic stream checking.
  • Improvements

    • Enhanced error handling and logging for stream availability checks.
    • Improved schema resolution for complex field types.
  • Bug Fixes

    • Refined test cases to handle various scenarios for availability checks.

@github-actions github-actions bot added the enhancement New feature or request label Jan 29, 2025
Contributor

coderabbitai bot commented Jan 29, 2025

📝 Walkthrough

This pull request introduces a new use_check_availability feature for the CheckDynamicStream class across multiple files in the Airbyte CDK. The changes enable more flexible stream availability checking by adding a boolean flag that allows developers to control whether stream availability is verified during connection processes. The modifications span the implementation, schema definition, and test cases to support this new configuration option.

Changes

| File | Change Summary |
| --- | --- |
| airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py | Added use_check_availability attribute, modified check_connection method to conditionally skip availability checks. |
| airbyte_cdk/sources/declarative/declarative_component_schema.yaml | Added new use_check_availability property with default true. |
| airbyte_cdk/sources/declarative/models/declarative_component_schema.py | Introduced optional use_check_availability field with default True. |
| airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py | Updated create_check_dynamic_stream method to handle use_check_availability. |
| unit_tests/sources/declarative/checks/test_check_dynamic_stream.py | Modified test function to include use_check_availability parameter. |
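To make the schema and model changes summarized above more concrete, here is a minimal sketch of an optional flag that defaults to true. `CheckDynamicStreamModel` and `parse_check` are hypothetical, simplified names; the real model is generated from the YAML schema, and the manifest fragments are written as Python dicts purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class CheckDynamicStreamModel:
    """Hypothetical, simplified model of the check component."""

    stream_count: int
    use_check_availability: Optional[bool] = True


def parse_check(definition: Mapping[str, Any]) -> CheckDynamicStreamModel:
    # Omitting the flag keeps the default behavior: availability is checked.
    return CheckDynamicStreamModel(
        stream_count=definition["stream_count"],
        use_check_availability=definition.get("use_check_availability", True),
    )


default_check = parse_check({"type": "CheckDynamicStream", "stream_count": 1})
relaxed_check = parse_check(
    {"type": "CheckDynamicStream", "stream_count": 1, "use_check_availability": False}
)
print(default_check.use_check_availability, relaxed_check.use_check_availability)
```

Because the default is true, existing manifests that do not mention the flag would keep their current behavior under this sketch.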

Possibly related PRs

Suggested reviewers

  • maxi297
  • darynaishchenko
  • natikgadzhi

Hey there! 👋 I noticed you've added a neat feature for controlling stream availability checks. Quick question: have you considered adding some documentation about the new use_check_availability flag to help other developers understand its purpose and usage? Wdyt? 🤔


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dca1dbf and 296cc0c.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Analyze (python)

Contributor Author

lazebnyi commented Jan 29, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formatting issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (8)
airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (2)

Line range hint 228-233: The implementation looks solid! Would you consider a few enhancements? 🤔

The recursive resolution of complex types is well implemented. Here are some suggestions that might make it even better:

  1. Maybe we could add type hints to the dictionary values? Something like:
-        field_type["items"] = (
+        field_type["items"]: Mapping[str, Any] = (
  2. Would you consider adding a docstring to explain the recursive nature? For example:
def _resolve_complex_type(self, complex_type: ComplexFieldType) -> Mapping[str, Any]:
    """Resolves a complex field type into its corresponding Airbyte type definition.
    
    For array types, recursively resolves the items property which can be either:
    - A simple type string (e.g., "string", "integer")
    - Another ComplexFieldType for nested arrays
    
    Args:
        complex_type: The complex field type to resolve
        
    Returns:
        A mapping containing the resolved Airbyte type definition
    """

What do you think? 🙂


Line range hint 228-233: Would you like help with test cases for the recursive type resolution? 🧪

To ensure the robustness of the complex type resolution, we might want to add test cases for:

  1. Simple array types (e.g., array of strings)
  2. Nested array types (e.g., array of arrays)
  3. Edge cases (e.g., invalid item types)

I'd be happy to help write these tests if you'd like! What do you think? 😊
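The three scenarios listed above could be exercised along these lines. This is only a sketch: `ComplexType`, `SIMPLE_TYPES`, and `resolve` are hypothetical stand-ins for the CDK's ComplexFieldType and its resolver, and rejecting unknown item types with a ValueError is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class ComplexType:
    """Hypothetical stand-in for the CDK's ComplexFieldType."""

    field_type: str
    items: Optional[Union[str, "ComplexType"]] = None


# Assumed set of simple type names, for illustration only.
SIMPLE_TYPES = {"string", "integer", "number", "boolean", "object"}


def resolve(field: Union[str, ComplexType]) -> Dict[str, Any]:
    """Recursively turn a type description into a JSON-schema-like mapping."""
    if isinstance(field, str):
        if field not in SIMPLE_TYPES:
            raise ValueError(f"Invalid item type: {field}")
        return {"type": field}
    resolved: Dict[str, Any] = {"type": field.field_type}
    if field.field_type == "array" and field.items is not None:
        # Items may be a simple type name or another complex type.
        resolved["items"] = resolve(field.items)
    return resolved


# 1. Simple array, 2. nested array, 3. invalid item type.
simple = resolve(ComplexType("array", items="string"))
nested = resolve(ComplexType("array", items=ComplexType("array", items="integer")))
try:
    resolve(ComplexType("array", items="not_a_type"))
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```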

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (3)

124-125: Consider using more descriptive test case IDs.

The test case IDs could be more descriptive to better indicate what's being tested, wdyt?

-        pytest.param(200, True, True, [], id="test_stream_available"),
-        pytest.param(200, True, False, [], id="test_stream_available"),
+        pytest.param(200, True, True, [], id="test_stream_available_with_availability_check"),
+        pytest.param(200, True, False, [], id="test_stream_available_bypass_availability_check"),

153-155: Consider adding test cases for edge cases.

Would it be helpful to add test cases for edge scenarios? For example:

  • When stream count is 0
  • When use_check_availability is False but stream count is 0
@pytest.mark.parametrize(
    "stream_count, use_check_availability, expected_messages",
    [
        pytest.param(0, True, ["No streams to connect to"], id="test_no_streams_with_availability_check"),
        pytest.param(0, False, ["No streams to connect to"], id="test_no_streams_bypass_availability_check"),
    ],
)
def test_check_dynamic_stream_edge_cases(stream_count, use_check_availability, expected_messages):
    ...  # Test implementation

108-134: How about making the test IDs more descriptive?

The test cases look comprehensive! Would you consider making the test IDs more descriptive by including the use_check_availability value? For example:

id="test_stream_available_with_check_enabled"  # for use_check_availability=True
id="test_stream_available_with_check_disabled"  # for use_check_availability=False

This could make it easier to understand test results at a glance. What do you think? 🤔

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

906-914: Consider adding a descriptive assertion message.

The assertion could benefit from an error message to help developers understand what went wrong, wdyt?

-        assert model.use_check_availability is not None  # for mypy
+        assert model.use_check_availability is not None, "use_check_availability must be defined in CheckDynamicStream"  # for mypy
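For context, a small sketch of what the suggested message buys. `Model` and `flag_from_model` are hypothetical names; only the assertion itself is taken from the suggestion above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Model:
    """Hypothetical model with an optional flag, as a type checker would see it."""

    use_check_availability: Optional[bool] = True


def flag_from_model(model: Model) -> bool:
    # The assert narrows Optional[bool] to bool for the type checker, and the
    # message explains the failure if the value is ever None at runtime.
    assert model.use_check_availability is not None, (
        "use_check_availability must be defined in CheckDynamicStream"
    )
    return model.use_check_availability


try:
    flag_from_model(Model(use_check_availability=None))
    message = ""
except AssertionError as error:
    message = str(error)
print(message)
```

Note that assert statements are stripped when Python runs with the -O flag, so the message helps during development and testing rather than acting as a runtime guard.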
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

323-327: LGTM! Would you consider enhancing the description? 🤔

The schema changes look good and align well with the PR objectives. The property is well-defined with all required fields. However, would you consider making the description more informative by explaining why and when this flag would be set to false? Something like:

"Enables stream check availability. When set to false, the connection check will be bypassed if the streams have already been successfully created. This field is automatically set by the CDK."

wdyt?

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (1)

26-26: Consider enhancing the class documentation to include the new attribute?

The new use_check_availability flag looks good! Would you consider adding it to the class docstring under Attributes to help other developers understand its purpose? Something like:

     Attributes:
         stream_count (int): numbers of streams to check
+        use_check_availability (bool): when True, checks stream availability by fetching data. When False, skips availability check if streams are created successfully.

wdyt? 🤔

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3af96dc and 331dc24.

📒 Files selected for processing (7)
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2 hunks)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
  • airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py (1 hunks)
  • airbyte_cdk/sources/declarative/schema/dynamic_schema_loader.py (1 hunks)
  • unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py
🔇 Additional comments (5)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

62-66: LGTM! Well-documented schema changes.

The schema definition for use_check_availability is well-structured with:

  • Clear description
  • Appropriate default value
  • Good title
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2)

38-40: LGTM! Early return looks good.

The early return when use_check_availability is False is well-placed and maintains the expected return type signature. Nice work! 👍


51-56: Error handling improvements look great!

Including the stream name in the error message and using exc_info=True will make debugging much easier. Excellent attention to detail! 🎯
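A minimal sketch of the logging pattern praised here, using only the standard library. `read_first_slice`, `check_stream`, the logger name, and the message wording are all hypothetical; the point is only that `exc_info=True` attaches the traceback of the active exception to the log record.

```python
import io
import logging

logger = logging.getLogger("availability_check_demo")
log_output = io.StringIO()
logger.addHandler(logging.StreamHandler(log_output))
logger.setLevel(logging.ERROR)
logger.propagate = False


def read_first_slice(stream_name: str) -> None:
    # Hypothetical failure, standing in for an error raised while reading a stream.
    raise RuntimeError("403 Forbidden")


def check_stream(stream_name: str) -> bool:
    try:
        read_first_slice(stream_name)
        return True
    except Exception as error:
        # exc_info=True appends the traceback of the active exception to the log output.
        logger.error(
            f"Encountered an error trying to connect to stream {stream_name}. Error: {error}",
            exc_info=True,
        )
        return False


check_stream("items")
print(log_output.getvalue())
```

The captured output names the failing stream and includes the full traceback, which is what makes the failure easier to trace back to its source.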

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2)

138-155: Test implementation looks solid!

Great job on the test implementation! Using deepcopy for the manifest and conditionally adjusting the request count based on use_check_availability is a clean approach. The HTTP mock setup is clear and well-structured. 👏


166-168: Assertions are thorough and well-placed!

The request count assertions effectively validate the behavior of use_check_availability. Nice touch checking both item and items requests! 💯
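The testing approach described in these comments (copy the manifest, then derive the expected request count from the flag) might look roughly like the sketch below. `BASE_MANIFEST`, `build_manifest`, and `run_check` are hypothetical; the fake check records URLs in a list instead of using the CDK's HTTP mocker.

```python
from copy import deepcopy
from typing import Any, Dict, List

# Hypothetical base manifest shared by several test cases.
BASE_MANIFEST: Dict[str, Any] = {
    "check": {"type": "CheckDynamicStream", "stream_count": 1},
}


def build_manifest(use_check_availability: bool) -> Dict[str, Any]:
    # deepcopy keeps the nested "check" dict of BASE_MANIFEST untouched.
    manifest = deepcopy(BASE_MANIFEST)
    manifest["check"]["use_check_availability"] = use_check_availability
    return manifest


def run_check(manifest: Dict[str, Any], requests: List[str]) -> None:
    """Fake check that records the URLs it would call."""
    requests.append("https://api.test.com/items")  # resolves the dynamic streams
    if manifest["check"]["use_check_availability"]:
        requests.append("https://api.test.com/items/1")  # reads the first record


for flag in (True, False):
    calls: List[str] = []
    run_check(build_manifest(flag), calls)
    expected_item_requests = 1 if flag else 0
    assert calls.count("https://api.test.com/items/1") == expected_item_requests
    assert calls.count("https://api.test.com/items") == 1
```

A shallow copy would share the nested "check" dict between test cases, so one parametrized case could leak its flag into the next; deepcopy avoids that.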

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2)

108-133: Consider adding edge cases to improve test coverage?

The test cases look good! To make them even more robust, what do you think about adding these scenarios?

  • Error responses (500, 502, etc.) with use_check_availability=False
  • Empty response with status 200
  • Malformed JSON response
    wdyt?

147-150: Consider extracting the item ID as a constant?

The URL "/items/1" uses a magic number. What do you think about making it more explicit? Something like:

+TEST_ITEM_ID = 1
-item_request = HttpRequest(url="https://api.test.com/items/1")
+item_request = HttpRequest(url=f"https://api.test.com/items/{TEST_ITEM_ID}")
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 331dc24 and f50f38b.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (2 hunks)
🔇 Additional comments (4)
unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (4)

135-137: LGTM! Clear and well-structured function signature.

The parameter names are descriptive and follow a consistent pattern.


141-145: LGTM! Clean HTTP mock setup.

The request and response setup is clear and well-structured.


152-154: LGTM! Clear conditional logic.

The feature flag handling is straightforward and the comment explains the behavior well.


165-166: LGTM! Good verification of request count.

The assertion ensures the feature flag correctly controls whether the request is made.

@lazebnyi lazebnyi requested a review from maxi297 January 29, 2025 17:50
Contributor

@maxi297 maxi297 left a comment

Preemptively approving

Contributor Author

lazebnyi commented Jan 30, 2025

/autofix

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@lazebnyi lazebnyi merged commit 4e7f94a into main Jan 30, 2025
19 of 24 checks passed
@lazebnyi lazebnyi deleted the lazebnyi/add-use-check-availability-flag-to-dynamic-check branch January 30, 2025 13:44
3 participants