[source-postgres]Non CDC CAT test for postgres #41650
I didn't review the database-specific code too closely, but overall it's looking good! Just a few comments/questions.
airbyte-integrations/connectors/source-postgres/integration_tests/Dockerfile
airbyte-integrations/connectors/source-postgres/integration_tests/abnormal_state_copy.json
airbyte-integrations/connectors/source-postgres/integration_tests/seed/hook.py
...grations/bases/connector-acceptance-test/connector_acceptance_test/tests/test_incremental.py
airbyte-integrations/connectors/source-postgres/integration_tests/abnormal_state.json
today = datetime.datetime.now()
yesterday = today - timedelta(days=1)
formatted_yesterday = yesterday.strftime('%y%m%d')
delete_schemas_with_prefix(connection, formatted_yesterday)
Typically, teardown is for deleting what this test created. Is it possible to remove only the schema we just created?
It's tricky, because the incremental test suite runs multiple tests at the same time. Deleting the schema could cause other tests to fail.
Another question: you mentioned that we work on the same schema. If that's the case, would this deletion affect other tests when we run them concurrently?
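A minimal sketch of the date-prefix selection discussed in this thread, assuming schema names begin with the run date in '%y%m%d' form (the name layout and both helper names are hypothetical, not taken from hook.py). Only schemas carrying yesterday's prefix are selected, so schemas created by today's concurrent runs are left alone, except possibly for runs that span midnight.

```python
import datetime
from typing import Iterable, List


def yesterday_prefix(today: datetime.date) -> str:
    """Return yesterday's date in the '%y%m%d' form used by the teardown snippet."""
    return (today - datetime.timedelta(days=1)).strftime("%y%m%d")


def stale_schemas(schema_names: Iterable[str], today: datetime.date) -> List[str]:
    """Select only the schemas whose name starts with yesterday's prefix."""
    prefix = yesterday_prefix(today)
    return [name for name in schema_names if name.startswith(prefix)]


# Hypothetical schema names: one from yesterday, one from today, one unrelated.
names = ["240714_abc", "240715_def", "public"]
print(stale_schemas(names, datetime.date(2024, 7, 15)))
```

Keeping the selection separate from the deletion makes the rule easy to unit test without a database.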
    with open("./generated_schema.txt", "r") as f:
        return f.read()


def remove_all_write_files():
Do we need to perform this step? I thought these files would be gone once the container is destroyed after the test, since they are part of the container.
We ask the setup container to copy the files back to the directory because the config needs to be generated dynamically (to incorporate the schema), so this step is needed. However, this is not working right now, and @clnoll is working on that!
elif command == "teardown":
    teardown()
elif command == "prepare":
    prepare()
To simplify, maybe we should just make prepare() part of setup()? To me, setup means all the preparation work...
prepare is supposed to run once before the CAT suite starts, while setup runs before each test.
The reason is that we realized it's easier to work on a single schema across the entire test suite, which avoids conflicts between the asynchronous incremental tests.
if connection:
    # Create the schema
    create_schema(connection, schema_name)
Ideally, we should check the return status of each step here; if a step fails, we should log it and close the connection without running the remaining steps.
I think each step closes the connection whether or not it succeeds. And since we don't have a try/except, exceptions will always propagate. I think exceptions are easier to debug.
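One way to get both properties raised in this thread (stop at the first failing step, and always close the connection) is to wrap the steps in contextlib.closing and let exceptions propagate. The sketch below is an illustration under assumptions: FakeConnection, run_steps, and the step functions are hypothetical stand-ins, not code from hook.py.

```python
from contextlib import closing
from typing import Callable, List


class FakeConnection:
    """Stand-in for a real database connection (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


Step = Callable[[FakeConnection], None]


def run_steps(connection: FakeConnection, steps: List[Step]) -> None:
    """Run steps in order; the first exception skips the rest and propagates.

    closing() guarantees connection.close() is called on success and on failure.
    """
    with closing(connection):
        for step in steps:
            step(connection)


executed: List[str] = []


def step_create(conn: FakeConnection) -> None:
    executed.append("create")


def step_seed(conn: FakeConnection) -> None:
    raise RuntimeError("seed failed")  # simulated failure


def step_never_runs(conn: FakeConnection) -> None:
    executed.append("never_runs")


conn = FakeConnection()
try:
    run_steps(conn, [step_create, step_seed, step_never_runs])
except RuntimeError as error:
    print(f"step failed: {error}")
print(executed, conn.closed)
```

With this shape there is no return status to check: a failed step surfaces as a traceback, and the cleanup still happens.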
airbyte-integrations/connectors/source-postgres/integration_tests/seed/hook.py
airbyte-integrations/connectors/source-postgres/integration_tests/seed/init.sql
...ns/connectors/source-postgres/integration_tests/incremental_configured_catalog_template.json
Looks mostly ready to me. @xiaohansong, please run a couple of tests in parallel to make sure this doesn't cause any issues.
Looks good to me if the concurrent tests are able to pass.
Implementing refreshes for destination-postgres: we're bumping the CDK version to the latest and modifying a whole lot of jsonl files for tests (in both regular and strict-encrypt).
Looks good!
Nice @xiaohansong! Just a few more comments for you.
airbyte-integrations/connectors/source-postgres/integration_tests/seed/hook.py
        return connection
    except Exception as error:
        print(f"Error connecting to the database: {error}")
        return None
Is there any reason not to just let the exception get raised here, or exit(1)? Otherwise the None will cause a failure downstream anyway.
Removing the None return value will also help simplify the downstream code, since we won't have to check that the return value exists before use.
Also (especially if you're going to sometimes return an optional value) I'd really recommend using type annotations. It's not too burdensome to add them, and they'll offer protection from a large category of unhandled exceptions.
You can find examples of how to use them throughout airbyte_cdk.
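The two suggestions above can be combined roughly as follows: an annotated helper that either returns a connection or raises, so callers never see None. This is a sketch, not the code that landed in hook.py; connect_or_raise, DatabaseConnectionError, and the injected connect_fn are hypothetical names, and the real script would pass its actual database driver call as connect_fn.

```python
from typing import Callable, TypeVar

ConnT = TypeVar("ConnT")


class DatabaseConnectionError(RuntimeError):
    """Raised when the seed script cannot reach the database (hypothetical name)."""


def connect_or_raise(connect_fn: Callable[[], ConnT]) -> ConnT:
    """Return a live connection or raise; the return type is never Optional."""
    try:
        return connect_fn()
    except Exception as error:
        # Re-raise with context instead of printing and returning None.
        raise DatabaseConnectionError(
            f"Error connecting to the database: {error}"
        ) from error


def failing_connect() -> object:
    """Simulates a driver call that cannot reach the server."""
    raise OSError("connection refused")


try:
    connect_or_raise(failing_connect)
except DatabaseConnectionError as error:
    print(error)
```

Because the annotation is ConnT rather than Optional[ConnT], a type checker would flag any leftover "if connection:" guard downstream as unnecessary.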
    delete_schemas_with_prefix(connection, formatted_yesterday)


if __name__ == "__main__":
    command = sys.argv[1]
Nit: we should probably raise an exception if the command isn't "setup", "teardown", or "prepare", so the script doesn't exit silently if there's a regression.
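A possible shape for the nit above is a dispatch table that raises on anything it does not recognize. The handler bodies here are placeholders that only return their own name so the sketch can run on its own; the real setup, teardown, and prepare in hook.py do database work, and main and COMMANDS are hypothetical names.

```python
from typing import Callable, Dict, List


def setup() -> str:
    return "setup"  # placeholder body


def teardown() -> str:
    return "teardown"  # placeholder body


def prepare() -> str:
    return "prepare"  # placeholder body


COMMANDS: Dict[str, Callable[[], str]] = {
    "setup": setup,
    "teardown": teardown,
    "prepare": prepare,
}


def main(argv: List[str]) -> str:
    """Dispatch argv[1] to its handler, raising on a missing or unknown command."""
    if len(argv) < 2:
        raise ValueError("expected a command: setup, teardown, or prepare")
    command = argv[1]
    handler = COMMANDS.get(command)
    if handler is None:
        raise ValueError(
            f"Unknown command {command!r}; expected one of {sorted(COMMANDS)}"
        )
    return handler()


print(main(["hook.py", "prepare"]))
try:
    main(["hook.py", "tear-down"])  # typo on purpose
except ValueError as error:
    print(error)
```

In the script itself the entry point would call main(sys.argv), so a mistyped command fails the run with a non-zero exit instead of passing silently.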
…sts/seed/hook.py Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
What
Acceptance test for the non-CDC Postgres config.
How
Review guide
User Impact
Can this PR be safely reverted and rolled back?