Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscellaneous changes for CI, Docker and compiletest. #45783

Merged
merged 7 commits into from
Nov 10, 2017

Conversation

kennytm
Copy link
Member

@kennytm kennytm commented Nov 5, 2017

This PR contains 7 independent commits that improves interaction with CI, Docker and compiletest.

  1. a4e5c91 — Forces a newline every 100 dots when testing in quiet mode. Prevents spurious timeouts when abusing the CI to test Android jobs.

  2. 1b5aaf2 — Use vault.centos.org for dist-powerpc64le-linux, see [beta] ci: Fix broken link in build-powerpc64le-toolchain.sh  #45744.

  3. 33400fb — Modify src/ci/docker/run.sh so that the docker images can be run from Docker Toolbox for Windows on Windows 7. I haven't checked the behavior of the newer Docker for Windows on Windows 10. Also, "can run" does not mean all the test can pass successfully (the UDP tests failed last time I checked)

  4. d517668 — Don't emit a real warning the linker segfault, which affects UI tests like Fix a quadradic duplication in json for multi-suggestions #45489 (comment). Log it instead.

  5. 51e2247 — During run-pass, trim the output if stdout/stderr exceeds 416 KB (top 160 KB + bottom 256 KB). This is an attempt to avoid spurious failures like add TerminatorKind::FalseEdges and use it in matches #45384 (comment)

  6. 9cfdaba — Force gem update --system before deploy. This is an attempt to prevent spurious error Spurious failure while installing deploy dependencies #44159.

  7. eee10cc — Tries to print the crash log on macOS on failure. This is an attempt to debug Spurious LLDB segfault (signal 11) #45230.

@kennytm kennytm added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Nov 5, 2017
.travis.yml Outdated
@@ -286,6 +286,7 @@ before_deploy:
rm -rf obj/build/dist/doc &&
cp -r obj/build/dist/* deploy/$TRAVIS_COMMIT;
fi
- gem update --system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This presumably accesses network, so needs to be put in a retry loop of some sort.

Copy link
Member Author

@kennytm kennytm Nov 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will add a travis_retry after CI completed.

@@ -670,12 +670,6 @@ fn link_natively(sess: &Session,
if !out.contains(msg) {
break
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this intentionally removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mark-Simulacrum Yes, it is explained in d993ffe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I wonder if we can instead send the warning to stdout or something -- non ideal, I agree, but I feel like saying nothing, always, isn't a good idea. At least maybe adding a call to info!()? @alexcrichton probably has some thoughts here too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mark-Simulacrum Since the stdout is also tested in the UI test, probably not possible either. Printing to the log crate should be good since logs are not enabled by UI test, will do that after CI is completed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm sort of ok with whatever. I was originally very wary to do something like this without notifying the user as it seems like it could baloon compilation times or otherwise be very odd if it happens silently by accident (e.g. we retry by accident too many times).

Could we perhaps only ignore emitting this if we're on CI or in UI testing mode? Because this happens so rarely I'd still want, by default, a user to see what happened and say "wait what that's not supposed to happen!" and then send a bug report to us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton The linker segfault retry won't be enabled unless RUSTC_RETRY_LINKER_ON_SEGFAULT=1, so this is already CI-specific.

@@ -71,6 +71,7 @@ use std::thread;
use std::time::{Instant, Duration};

const TEST_WARN_TIMEOUT_S: u64 = 60;
const QUIET_MODE_MAX_COLUMN: usize = 100; // insert a '\n' after 100 tests in quiet mode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, I thought we printed "test xxx has run longer than 60s..." and that'd flush stdout, right? Or is that not getting printed in quiet mode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton The problem is that in quiet mode all dots were in a single line. We have >2000 run-pass tests, and only 30 minutes to spend, which means every test must complete within <0.9s to not to trigger Travis.

(Alternatively, we could fix stamp to output the dot immediately, instead of waiting for a \n.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear that's a good point! Sounds like we should definitely keep this.

I'd also be ok changing stamp though if you'd prefer. I'd imagine that we could basically print the timestamp for the first byte of each line instead of for the newline character of each line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of updating stamp too, but that would require total rewrite in nonblocking mode I think.

Printing the timestamp when seeing the first byte of a line doesn't work, because Travis will still need to wait for the second to the last byte of the line. So stamp actually needs to do something like

loop {
    let output = await!(stdin.read_as_much_as_possible());
    if output.is_eof() { break; }
    println!("{}", output.text().replace("\n", timestamp));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I'll send a PR, I don't think we'll have to do this but I'll see if I can game it out.

// option. This file may not be copied, modified, or distributed
// except according to those terms.

// FIXME: This is a complete copy of `cargo/src/cargo/util/read2.rs`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option (which Cargo originally did) is to just spawn threads (libstd actually used to do this too). This was only switched to read2 as an optimization later, but compiletest can probably get away with threads and not worry to much about perf if you're looking to fix this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton Well copy-and-paste here is simpler than writing new code :p

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense!

@alexcrichton
Copy link
Member

This looks awesome, thanks so much @kennytm!

@kennytm kennytm force-pushed the compiler-test-fixes branch 6 times, most recently from 0977c46 to ed92102 Compare November 5, 2017 19:39
@kennytm
Copy link
Member Author

kennytm commented Nov 5, 2017

@bors try

Test if the "gem update" thing is successful.

@bors
Copy link
Contributor

bors commented Nov 5, 2017

⌛ Trying commit ed9210264c924ec6ff13e3399557f6221f4d1b2d with merge 2cb674002e7cbd7184dd490343ae2415d3064720...

@bors
Copy link
Contributor

bors commented Nov 5, 2017

💔 Test failed - status-travis

@kennytm kennytm force-pushed the compiler-test-fixes branch from ed92102 to e88adea Compare November 5, 2017 19:48
@bors
Copy link
Contributor

bors commented Nov 5, 2017

⌛ Trying commit e88adeab9231f3e8301068d22df6faab4160f9ff with merge ee9cdb582535812f9ba1b785a46c3c954ddeb585...

@bors
Copy link
Contributor

bors commented Nov 5, 2017

☀️ Test successful - status-travis
State: approved= try=True

Rationale:

We use --quiet mode when testing a PR in the CI. Also, we use `stamp` to
prefix every line with a timestamp. Previously, when testing in --quiet
mode, we will only print a dot for each test without any line breaks.
Combined with `stamp`, this means we'd need to wait for all tests to
complete before writing the output. On Travis CI, if we don't print
anything within 30 minutes, the job will be forcefully canceled. This
makes it very easy to spuriously-timeout when testing non-default images
like arm-android using the CI. This commit tries to workaround the issue
by printing a new line every 100 dots, forcing `stamp` to emit something
to reset Travis's countdown.
This commit reverts rust-lang#45734 and applies rust-lang#45744. We expect the vault links
to be more stable than mirror.centos.org.
This is intended to prevent the spurious OOM error from
run-pass/rustc-rust-log.rs, by skipping the output in the middle when the
size is over 416 KB, so that the log output will not be overwhelmed.
@kennytm kennytm force-pushed the compiler-test-fixes branch from e88adea to eee10cc Compare November 5, 2017 19:56
@kennytm kennytm changed the title [WIP] Miscellaneous changes for CI, Docker and compiletest. Miscellaneous changes for CI, Docker and compiletest. Nov 5, 2017
@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 5, 2017
@kennytm kennytm removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Nov 5, 2017
@kennytm
Copy link
Member Author

kennytm commented Nov 6, 2017

r? @alexcrichton

@alexcrichton
Copy link
Member

@bors: r+

Thanks!

@bors
Copy link
Contributor

bors commented Nov 6, 2017

📌 Commit eee10cc has been approved by alexcrichton

@kennytm kennytm added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 7, 2017
@bors
Copy link
Contributor

bors commented Nov 10, 2017

⌛ Testing commit eee10cc with merge 32cd584965c0d8121d87d2f39c07bc07a268e104...

kennytm added a commit to kennytm/rust that referenced this pull request Nov 10, 2017
…crichton

Miscellaneous changes for CI, Docker and compiletest.

This PR contains 7 independent commits that improves interaction with CI, Docker and compiletest.

1. a4e5c91 — Forces a newline every 100 dots when testing in quiet mode. Prevents spurious timeouts when abusing the CI to test Android jobs.

2. 1b5aaf2 — Use vault.centos.org for dist-powerpc64le-linux, see rust-lang#45744.

3. 33400fb — Modify `src/ci/docker/run.sh` so that the docker images can be run from Docker Toolbox for Windows on Windows 7. I haven't checked the behavior of the newer Docker for Windows on Windows 10. Also, "can run" does not mean all the test can pass successfully (the UDP tests failed last time I checked)

4. d517668 — Don't emit a real warning the linker segfault, which affects UI tests like rust-lang#45489 (comment). Log it instead.

5. 51e2247 — During run-pass, trim the output if stdout/stderr exceeds 416 KB (top 160 KB + bottom 256 KB). This is an attempt to avoid spurious failures like rust-lang#45384 (comment)

6. 9cfdaba — Force `gem update --system` before deploy. This is an attempt to prevent spurious error rust-lang#44159.

7. eee10cc — Tries to print the crash log on macOS on failure. This is an attempt to debug rust-lang#45230.
@kennytm
Copy link
Member Author

kennytm commented Nov 10, 2017

@bors retry

GitHub may be broken. The Travis build already passed 3 hours ago but the status is still yellow. (Putting this into rollup #45907 since it already passed)

bors added a commit that referenced this pull request Nov 10, 2017
Rollup of 9 pull requests

- Successful merges: #45783, #45856, #45863, #45869, #45878, #45882, #45887, #45895, #45901
- Failed merges:
@bors
Copy link
Contributor

bors commented Nov 10, 2017

⌛ Testing commit eee10cc with merge c0d326f...

@bors bors merged commit eee10cc into rust-lang:master Nov 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants