Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[source-postgres]Non CDC CAT test for postgres #41650

Merged
merged 1,084 commits into from
Jul 24, 2024
Merged

[source-postgres]Non CDC CAT test for postgres #41650

merged 1,084 commits into from
Jul 24, 2024

Conversation

xiaohansong
Copy link
Contributor

@xiaohansong xiaohansong commented Jul 11, 2024

What

acceptance test for non cdc postgres config.

How

  • hook.py as a setup script will generate a schema name. Acceptance tests are running async thus it could cause some confliction if we setup a new schema each time for a new test. Instead we have more flexibility in hook.py and we may be able to generate multiple schema if needed.
  • All tests in one CAT test run will share the same schema.
  • teardown logic will only clean up schema from previous days.
  • Fix a bug on incremental test where we made incorrect assumption on DB source cursor state

Review guide

User Impact

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

Copy link

vercel bot commented Jul 11, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Jul 24, 2024 7:08pm

@CLAassistant
Copy link

CLAassistant commented Jul 12, 2024

CLA assistant check
All committers have signed the CLA.

@octavia-squidington-iii octavia-squidington-iii added area/documentation Improvements or additions to documentation CDK Connector Development Kit and removed area/documentation Improvements or additions to documentation CDK Connector Development Kit labels Jul 12, 2024
@xiaohansong xiaohansong requested review from clnoll and theyueli July 12, 2024 21:31
Copy link
Contributor

@clnoll clnoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the database-specific code too closely, but overall it's looking good! Just a few comments/questions.

today = datetime.datetime.now()
yesterday = today - timedelta(days=1)
formatted_yesterday = yesterday.strftime('%y%m%d')
delete_schemas_with_prefix(connection, formatted_yesterday)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typically, teardown is for deleting stuff we create in this test. Is it possible to just remove the schema we just created?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's tricky, because incremental test have multiple tests running at the same time. Deleting the schema could cause other tests to fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another question, you mentioned we work on the same schema, if that's the case, when we run concurrent testing, would this deletion affects other tests?

with open("./generated_schema.txt", "r") as f:
return f.read()

def remove_all_write_files():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to perform this step? I thought when container gets destroyed after test is done, these files will be gone as they are part of the container.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We ask the setup container to port files back to directory because config needs to be dynamically generated (to incorporate schema). thus this is needed. However this is not working right not and @clnoll is working on that!

elif command == "teardown":
teardown()
elif command == "prepare":
prepare()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to simplify, maybe we shall just make prepare() as part of setup()? setup to me means all the preparation work...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepare is supposed to run once before we start CAT test, while setup is going to run before each test.

Reason being we realized it's easier to work on a single schema among the entire test suite, to avoid that async incremental test conflictions.


if connection:
# Create the schema
create_schema(connection, schema_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally, we shall check the return status of each step here, and if one step failed, we shall log, close the connection without the rest of the steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in each step it will close connection whether or not it would success. And since we don't have try catch we would always throw exceptions - I think exceptions are easier to debug

@xiaohansong xiaohansong marked this pull request as ready for review July 18, 2024 20:59
@xiaohansong xiaohansong requested a review from a team as a code owner July 18, 2024 20:59
@theyueli
Copy link
Contributor

looks ready mostly to me, @xiaohansong please make sure we run a couple of tests in parallel to make sure this will not have any issue.

Copy link
Contributor

@theyueli theyueli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me if the concurrent tests are able to pass.

Copy link
Contributor

@theyueli theyueli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Contributor

@clnoll clnoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @xiaohansong! Just a few more comments for you.

return connection
except Exception as error:
print(f"Error connecting to the database: {error}")
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to just let the exception get raised here, or exit(1)? Otherwise the None will cause a failure downstream anyway.

Removing the None return value will also help simplify the downstream code since we won't have to check that the return value exists before use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also (especially if you're going to sometimes return an optional value) I'd really recommend using type annotations. It's not too burdensome to add them and they'll offer protection from a large category of unhandled exceptions.

You can find examples of how to use them throughout airbyte_cdk.

delete_schemas_with_prefix(connection, formatted_yesterday)

if __name__ == "__main__":
command = sys.argv[1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we should probably raise an exception if the command isn't "setup", "teardown", or "prepare", so the script doesn't exit silently if there's a regression.

@xiaohansong xiaohansong merged commit 8bf87e8 into master Jul 24, 2024
36 checks passed
@xiaohansong xiaohansong deleted the xiaohan/test branch July 24, 2024 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.