
CI: prevent false negatives in builds for pull requests #39835

Closed
2 of 3 tasks
afeld opened this issue Feb 16, 2021 · 6 comments
Labels
Bug CI Continuous Integration Needs Discussion Requires discussion from core team before further action

Comments

@afeld
Member

afeld commented Feb 16, 2021

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandas.
  • (optional) I have confirmed this bug exists on the master branch of pandas.

Problem description

This issue is a bit vague, because I'm not sure that I completely understand the problem. And/or it's multiple problems. Anyway:

I have opened a handful of pull requests on pandas recently, and it seems that each time I do (or when I rebase), tests fail for a different reason…unrelated to my change. In #39364, for example, the build was failing because of:

  1. A failing ipython cell run in the whatsnew entry for v0.8.0 (fixed in DOC: skip evaluation of code in v0.8.0 release notes #39801), then
  2. An AttributeError: 'float' object has no attribute 'value' error

Anyway, the point is not those specific errors, but that the pandas build is relatively fragile. Upstream issues cause builds for pull requests to fail with errors that have nothing to do with the change, which is confusing for people trying to contribute. This could be because:

  • Changes that break the build get merged
  • Builds are flaky
  • ?

Some ideas of how to help:

  • Enable branch protection on master
  • Make the continuous integration runs conditional, running only against the part of the repository that changed
  • Do some deeper analysis on what tests are failing most often to see if they are flaky, etc.

Curious to hear ideas from maintainers about if there are patterns of things going on that could be fixed. I imagine it makes sense to split more actionable issues and close this one after some discussion; defer to maintainers on that as well. Thanks!

@afeld afeld added Bug Needs Triage Issue that has not been reviewed by a pandas team member CI Continuous Integration Needs Discussion Requires discussion from core team before further action and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 16, 2021
@attack68
Contributor

I don't have a lot of experience but I'll share my pattern FWIW.

Roughly a third of the time I will have some test failure that is unrelated to my changes. I assume that this is just a random fail (from a flaky system) or that bugs from other core devs' merges will get patched, so I just wait a day or so, re-merge upstream and then see how the chips fall.

I recently also had a failing ipython cell run in the whatsnew for v0.20.0. It was a nightmare to find. I ended up changing the ipython cell to a code-block cell - should these old whatsnew docs be converted so they don't actually execute code?

I rely on cross checks to ensure my code is reliable. I'm not sure how this fits into the CI runs, but defining which tests to run based on source code changes would be very difficult, I think, and occasionally I have found errors in test areas I would not have expected to be related, which led to corrective action.

@MarcoGorelli
Member

Hi @afeld

Enable branch protection on master

I recall @jorisvandenbossche saying this was already the case (though I haven't tried force pushing to master to check)

Make the continuous integration runs conditional, to only the part of the repository changed

This would hide a lot of errors - even just not having merged the latest changes recently enough can hide errors (example), so I don't think this is a possibility

@jreback
Contributor

jreback commented Feb 16, 2021

We do not force push to master, and it would fail anyway as the branch is protected.

@jorisvandenbossche
Member

An AttributeError: 'float' object has no attribute 'value' error

I know the point of the issue is not this specific case of a failure, but just to explain what it was (because I think the pattern is useful for understanding how such things can happen and could be prevented): that was a bug in a PR merged to master (#39769), which was quickly thereafter fixed by the author of the PR (#39829). You updated your PR in the few hours between those two PRs, and thus pulled in the failures.
So this was a case of "Changes that break the build get merged".

The reason that the bug was not caught on the original PR (CI was green on that PR) is probably a bad interplay with another change that was merged just before that PR. Ideally, you would always merge master to have a fully up-to-date CI run before merging a PR. In practice we often do that, but it's not always practical when merging multiple PRs (because for each PR it takes quite a while to merge master / run CI).
Now, from my experience, this doesn't happen that often (or at least not enough to make it a hard rule).

(I think if we want to solve this problem, we should look at some kind of "auto-merge bot" that ensures CI is green with latest master before merging)
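The gate such a bot would enforce can be stated very simply: merge only when the PR's last CI run was green and was based on the current master head. A minimal sketch, with hypothetical names (this is not how any real bot is implemented):

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    ci_green: bool    # did the last CI run pass?
    tested_base: str  # master commit the last CI run was based on

def can_merge(pr: PullRequest, master_head: str) -> bool:
    """Merge only if CI passed *and* it ran against the current master head."""
    return pr.ci_green and pr.tested_base == master_head

# A green PR tested against a stale master must wait for a fresh run.
print(can_merge(PullRequest(True, "abc123"), master_head="def456"))  # -> False
```

The second condition is what plain "require status checks" branch protection does not give you: a check can be green yet stale.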

Builds are flaky

I think this is something much more common, and indeed quite annoying (but often such failures are also annoying to figure out / fix). Recently I have regularly seen some unrelated failures due to ResourceWarnings.
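One common mitigation for flakiness is automatically retrying failed tests (the pytest-rerunfailures plugin's `--reruns` option works along these lines, if I recall correctly). A minimal sketch of the retry idea:

```python
import time

def retry(func, attempts=3, delay=0.0):
    """Call func, retrying on any exception up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise          # exhausted: surface the real failure
            time.sleep(delay)  # back off before trying again

# Simulated flaky check: fails twice, then succeeds.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(retry(flaky))  # -> ok
```

The obvious trade-off: retries hide genuinely intermittent bugs as well as infrastructure noise, so they are a stopgap, not a substitute for tracking down the flaky tests.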

Make the continuous integration runs conditional, to only the part of the repository changed

That would mainly reduce CI time, but not necessarily make it more reliable I think. As @attack68 mentioned, determining which tests to run is not that easy, and will probably regularly give false negatives, potentially causing a broken master more often.

I recently also had a failing ipython cell run in the whatsnew for v0.20.0. It was a nightmare to find. I ended up changing the ipython cell to a code-block cell - should these old whatsnew docs be converted so they don't actually execute code?

I think one problem might be that the error message is not that easy to find in the build log? (this might be something that could be improved)

About not executing these old whatsnew files anymore, there is an old issue about this: #6856

@afeld
Member Author

afeld commented Feb 17, 2021

auto-merge bot

GitHub just added an option to automatically merge on green. Probably useful for you maintainers regardless!

ensures CI is green with latest master before merging

The protected branch setting I was thinking of was requiring checks to pass before merging. Sounds like we don't think that (accidentally?) merging pull requests with failing checks is a problem, so that wouldn't necessarily help.

There doesn't seem to be a built-in feature for ensuring merges would be a fast-forward (that master is merged). A /rebase slash command is an option.

determining which tests to run is not that easy, and will probably regularly give false negatives, potentially causing a broken master more often

Fair.

the error message is not that easy to find in the build log

Couldn't hurt! Even so, in the example above, once I found the error, I was confused as to why it was appearing.


All that to say: sounds like the problems don't have one easy fix. I'll close this issue since it isn't actionable, but happy to continue the conversation and track issues that make sense to split out of it. Thanks for chiming in, all!

@afeld afeld closed this as completed Feb 17, 2021
@jorisvandenbossche
Member

jorisvandenbossche commented Feb 17, 2021

The protected branch setting I was thinking of was requiring checks to pass before merging. Sounds like we don't think that (accidentally?) merging pull requests with failing checks is a problem, so that wouldn't necessarily help.

If we know the failure is unrelated (e.g. a flaky build that is not yet solved), we currently indeed might merge without having a fully green CI. But note that in the specific example given above, the PR that caused master to fail was actually green on the PR. So requiring checks to pass wouldn't have solved such a case.

Even so, in the example above, even once I found the error, I was confused as to why it would be appearing.

You ran into a particularly strange issue, for which I also don't have an explanation ... Sphinx is notoriously hard (but in my experience, most of the time doc build issues are easier to understand, at least if you have some familiarity with the pandas doc build)

sounds like the problems don't have one easy fix.

Yeah, and in addition, CI work like tracking down a flaky build is neither the easiest nor the most satisfying issue to work on ...
(which doesn't mean we shouldn't think about ways to improve this!)
