[Bug] Fix K8s test on main branch (currently failing) #1319

Closed
1 task
tatiana opened this issue Nov 13, 2024 · 2 comments
Assignees
Labels
area:ci Related to CI, Github Actions, or other continuous integration tools area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc bug Something isn't working triage-needed Items need to be reviewed / assigned to milestone

Comments

@tatiana
Collaborator

tatiana commented Nov 13, 2024

Astronomer Cosmos Version

N/A

dbt-core version

N/A

Versions of dbt adapters

Cosmos K8s integration tests are failing in our main branch.

This is visible in new PRs, since the issue started happening after the last merge to main:
https://github.com/astronomer/astronomer-cosmos/actions/runs/11796817624/job/32867560902

LoadMode

AUTOMATIC

ExecutionMode

AWS_EKS

InvocationMode

None

airflow version

N/A

Operating System

N/A

If you think it's a UI issue, what browsers are you seeing the problem on?

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened?

K8s tests stopped working in the CI

Relevant log output

How to reproduce

Create a branch from main and see the k8s tests failing in the CI

Anything else :)?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Contact Details

No response

@tatiana tatiana added bug Something isn't working triage-needed Items need to be reviewed / assigned to milestone labels Nov 13, 2024
@dosubot dosubot bot added area:ci Related to CI, Github Actions, or other continuous integration tools area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc labels Nov 13, 2024
pankajkoti added a commit that referenced this issue Nov 13, 2024
It appears we have a flaky Kubernetes test that failed in PR #1313. As
shown in the error log
[here](https://github.com/astronomer/astronomer-cosmos/actions/runs/11796817624/job/32867560902?pr=1313#step:7:473),
the PostgreSQL pod did not reach the ready state and instead entered an
error status. Since the cause of the error status is unclear, this PR
introduces a status check for the PostgreSQL pod to ensure it becomes
fully running and healthy. If the pod enters an ERROR state, we now run
a `kubectl describe` command on the pod to capture the event logs for
debugging. The test will also exit with an error code of 1 to prevent
further execution.

related: #1319
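
For illustration, the status check described in the commit message above could be sketched roughly as follows. This is not the code from the PR: the function names, the `default` namespace, the timeout values, and the choice to treat the `Failed` pod phase as the error state are all assumptions made for this example. It also checks only the pod phase, not container readiness.

```python
import subprocess
import sys
import time


def get_pod_phase(pod_name: str, namespace: str = "default") -> str:
    """Return the pod phase (Pending, Running, Failed, ...) reported by kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "pod", pod_name, "-n", namespace,
         "-o", "jsonpath={.status.phase}"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def describe_pod(pod_name: str, namespace: str = "default") -> str:
    """Return the output of `kubectl describe pod`, which includes pod events."""
    result = subprocess.run(
        ["kubectl", "describe", "pod", pod_name, "-n", namespace],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def wait_for_pod(pod_name, get_phase=get_pod_phase, describe=describe_pod,
                 timeout_s=300.0, poll_s=5.0, sleep=time.sleep) -> bool:
    """Poll until the pod is Running; on failure or timeout, dump its description.

    Returns True if the pod reached Running, False otherwise.
    """
    elapsed = 0.0
    while True:
        phase = get_phase(pod_name)
        if phase == "Running":
            return True
        if phase == "Failed":
            # Assumption: a Failed phase corresponds to the error status seen in CI.
            print(f"Pod {pod_name} failed; describing it for debugging:", file=sys.stderr)
            print(describe(pod_name), file=sys.stderr)
            return False
        if elapsed >= timeout_s:
            print(f"Timed out waiting for pod {pod_name} (last phase: {phase!r})",
                  file=sys.stderr)
            print(describe(pod_name), file=sys.stderr)
            return False
        sleep(poll_s)
        elapsed += poll_s
```

In a CI step, the return value could drive the exit code, for example `sys.exit(0 if wait_for_pod("postgres") else 1)`, where `postgres` is a placeholder pod name.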
@pankajkoti
Contributor

We merged PR #1320, which checks the status of the PostgreSQL pod. Previously, the pod failed and entered an error state, as observed in https://github.com/astronomer/astronomer-cosmos/actions/runs/11796817624/job/32867560902, which led to a late failure in the K8s tests. Let's observe the CI for a while and, if all is well, close the issue.

@pankajkoti
Contributor

After discussing with @tatiana we agreed that we could close this ticket & re-open it if we observe any more failures.

2 participants