-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add solution for quarantining flaky tests #9427
Conversation
- Inspired by Airflow's Quarantined tests solution, https://github.com/apache/airflow/blob/master/TESTING.rst#quarantined-tests - Quarantined tests are not run by default. - Quarantined tests are executed when "runQuarantinedTests" system property is set to true - this is similar feature as Airflow's "--include-quarantined" flag - No other tests than Quarantined tests are executed when "runOnlyQuarantinedTests" system property is set to true. - this is similar feature as Airflow CI's "-m quarantined" flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am +1 with this framework, but we must create a high priority issue for any test marked as Quarantined
otherwise we are going to lose code coverage.
Code coverage with flaky tests is not very meaningful...so it is okay to disable them.
probably we could have a separate CI job, not bound to PRs that runs quarantined tests
Good points @eolivelli. yes that would be required. I believe Airflow CI has such a job. |
Yes we have in airflow a separate job. Failure of that job does not stop anything (it has Since those tests are flaky but not "broken", they do succeed more often than they don't so we notice and can react if we see that those tests in "Quarantine" start to fail consistently. This is easy to spot actually if you are active committer who merges a number of commits a day/week. We also have a separate workflow which is "scheduled" and only runs quarantined tests, but this did not work as well as I wanted. We were submitting status of such quarantined tests to apache/airflow#10118 automatically (last 20 runs were kept there). This however turned out to be flaky on its own. Also yeah - we have a bug for every quarantined test: https://github.com/apache/airflow/issues?q=is%3Aopen+is%3Aissue+label%3AQuarantine but - due to 2.0.0 release (2 years in the making) and follow up 2.0.1 we are working on on where we fix teething problems of 2.0.0 those are a little neglected - they are all part of the "2.0 Cleanup milestone" and I actually hope to get them fixed as soon as we release 2.0.1 |
BTW. Glad that you liked the idea :). And the name for those tests is very, very appropriate taking into account the external circumstances ;) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@lhotari can you please merge with current master and Kickstart CI again? |
I'm postponing this PR until I have time to continue with this. It doesn't make sense to merge this in it's current form. Matteo provided some detailed requirements for the quarantined tests solution in the last Pulsar Community meeting. One of the ideas was to have a solution where the retries could be kept for quarantined tests. The rational behind this is that even though some of the tests are flaky, they provide some value to the reviewer. Removing the tests completely would leave a gap and if the tests are flaky, they won't be useful without retries. Having this kind of hybrid solution in place would require that the quarantined test results are shown on the PR to the reviewer in a certain format. GitHub Actions has this feature called "annotations" which means that you can add custom test results that are displayed in the results for the test runs all the way on the PR. It would be necessary to revisit the solution and take this into account before merging the changes in this PR. I'm making progress on the CI clean, I'll hopefully have some public updates later today or tomorrow. |
@lhotari I see. thanks for the explanation |
Motivation
Pulsar contains a large amount of flaky tests. It will be useful to have a solution for quarantining tests.
This solution takes inspiration from Airflow's Quarantined tests solution. Quarantining (or categorizing) flaky tests has been suggested by @potiuk in the draft PIP "Changes to GitHub Actions based Pulsar CI" and also by @dlg99 in in the draft PIP "Changes to flaky test handling". Thank you @potiuk and @dlg99 for suggesting this.
Modifications
Adds
@Quarantined
annotation for quarantining flaky testsImplements TestNG
IAnnotationTransformer
for integrating with TestNG inQuarantinedTestNGAnnotationTransformer
.Quarantined tests are not run by default.
Quarantined tests are executed when
runQuarantinedTests
system property is set totrue
--include-quarantined
flagNo other tests than Quarantined tests are executed when
runOnlyQuarantinedTests
system property is set to
true
.-m quarantined
flag