[job failure] gce-master-1.8-downgrade-cluster-parallel #56879
Now tracking against v1.9.0 (kubernetes/sig-release#40). All automated downgrade jobs are failing; this could really use some attention. Maybe the same issue as #56244?
I think I've fixed issues with the non-parallel one (both node and master downgrade failures), but this seems weird. I think there's an error in how it's configured.
From the normal downgrade (https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-master-new-downgrade-cluster/178?log#log):
W1211 12:18:26.502] 2017/12/11 12:18:26 util.go:155: Running: ./hack/ginkgo-e2e.sh --ginkgo.focus=\[Feature:ClusterDowngrade\] --upgrade-target=ci/k8s-stable1 --report-dir=/workspace/_artifacts --disable-log-dump=true --report-prefix=upgrade
W1211 12:18:26.506] Project: kubernetes-es-logging
W1211 12:18:26.506] Network Project: kubernetes-es-logging
W1211 12:18:26.506] Zone: us-central1-f
W1211 12:18:26.507] Trying to find master named 'bootstrap-e2e-master'
W1211 12:18:26.507] Looking for address 'bootstrap-e2e-master-ip'
I1211 12:18:26.608] Setting up for KUBERNETES_PROVIDER="gce".
W1211 12:18:27.388] Using master: bootstrap-e2e-master (external IP: 35.225.8.199)
I1211 12:18:28.652] Dec 11 12:18:28.652: INFO: Overriding default scale value of zero to 1
I1211 12:18:28.653] Dec 11 12:18:28.652: INFO: Overriding default milliseconds value of zero to 5000
I1211 12:18:28.777] I1211 12:18:28.776762 5867 e2e.go:384] Starting e2e run "64fefedf-de6d-11e7-9b62-0a580a3d0e17" on Ginkgo node 1
I1211 12:18:28.803] Running Suite: Kubernetes e2e suite
I1211 12:18:28.804] ===================================
I1211 12:18:28.804] Random Seed: 1512994707 - Will randomize all specs
I1211 12:18:28.804] Will run 1 of 699 specs
From this job's log (https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-master-new-downgrade-cluster-parallel/893?log#log):
W1212 01:41:20.197] 2017/12/12 01:41:20 util.go:155: Running: ./hack/ginkgo-e2e.sh --ginkgo.focus=\[Feature:ClusterDowngrade\] --upgrade-target=ci/k8s-stable1 --report-dir=/workspace/_artifacts --disable-log-dump=true --report-prefix=upgrade
W1212 01:41:20.199] Project: k8s-jkns-e2e-gce-gci
W1212 01:41:20.200] Network Project: k8s-jkns-e2e-gce-gci
W1212 01:41:20.200] Zone: us-central1-f
W1212 01:41:20.200] Trying to find master named 'bootstrap-e2e-master'
W1212 01:41:20.200] Looking for address 'bootstrap-e2e-master-ip'
I1212 01:41:20.301] Setting up for KUBERNETES_PROVIDER="gce".
W1212 01:41:21.064] Using master: bootstrap-e2e-master (external IP: 35.202.181.15)
I1212 01:41:24.401] Running Suite: Kubernetes e2e suite
I1212 01:41:24.401] ===================================
I1212 01:41:24.402] Random Seed: 1513042881 - Will randomize all specs
I1212 01:41:24.403] Will run 699 specs
What worries me is the last line. For some reason, this is running every e2e test we have, which just won't work.
edit: config is here https://github.com/kubernetes/test-infra/blob/master/jobs/config.json#L2906
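For anyone trying to reproduce this outside CI, here is a minimal sketch of the focused downgrade invocation, using the same flags that appear in the logs above; the provider export and the report directory are illustrative assumptions rather than values taken from the job config:

```sh
# Focused downgrade run against an existing GCE test cluster (sketch).
# The ginkgo focus regex is meant to restrict the run to the single
# ClusterDowngrade spec, so a serial run reports "Will run 1 of N specs".
export KUBERNETES_PROVIDER=gce      # assumption: same provider the job uses
./hack/ginkgo-e2e.sh \
  '--ginkgo.focus=\[Feature:ClusterDowngrade\]' \
  --upgrade-target=ci/k8s-stable1 \
  --report-dir=/tmp/_artifacts \
  --disable-log-dump=true \
  --report-prefix=upgrade
```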
[MILESTONENOTIFIER] Milestone Issue Needs Attention
@spiffxp @kubernetes/sig-cluster-lifecycle-misc
Action required: During code freeze, issues in the milestone should be in progress.
@BenTheElder any ideas on the above? ^
This was a wild goose chase. That message doesn't mean it's running all the specs; the reporting is just slightly different for parallel runs... I think.
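To illustrate the difference, a rough sketch assuming the behavior of hack/ginkgo-e2e.sh at the time; the GINKGO_PARALLEL / GINKGO_PARALLEL_NODES variables and the exact wording of the summary lines are stated from memory, not taken from this job's config:

```sh
# Serial run: the focus regex is reflected in the summary line, e.g.
#   "Will run 1 of 699 specs"
./hack/ginkgo-e2e.sh '--ginkgo.focus=\[Feature:ClusterDowngrade\]'

# Parallel run: the reporting differs and the summary can show the full spec
# count, e.g. "Will run 699 specs", even though the focus regex still limits
# which specs actually execute.
GINKGO_PARALLEL=y GINKGO_PARALLEL_NODES=25 \
  ./hack/ginkgo-e2e.sh '--ginkgo.focus=\[Feature:ClusterDowngrade\]'
```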
ACK, meetings all morning, catching up on these things now. I think this probably was caused by flipping on parallel, actually; @krzyzacy can you confirm?
We've rolled out a change (@krousey wrote it, I just deployed it) that hopefully will be safe and will flip these to not run in parallel. It should take effect on any future runs.
Just to clarify @BenTheElder's update: the downgrade step won't run in parallel, but the tests that follow will still honor the parallel flag.
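So the job should now behave roughly like the following two-phase sketch (illustrative only; the real change lives in the test-infra/kubetest job configuration, and the skip regex in the second phase is a placeholder):

```sh
# Phase 1: the cluster-downgrade step itself runs serially.
./hack/ginkgo-e2e.sh \
  '--ginkgo.focus=\[Feature:ClusterDowngrade\]' \
  --upgrade-target=ci/k8s-stable1

# Phase 2: the post-downgrade test pass still honors the job's parallel flag.
GINKGO_PARALLEL=y \
  ./hack/ginkgo-e2e.sh '--ginkgo.skip=\[Serial\]'
```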
OK, from the new logs I can see that the parallel and non-parallel jobs are now getting hung at the same points. That also helped me quickly determine that my latest fix wasn't sufficient for the test environment.
thanks @krousey!
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-master-new-downgrade-cluster-parallel/904 successfully downgraded, and all tests passed. If this continues overnight, I say we close this issue.
@krousey awesome! We should also wait for https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-master-1.8-downgrade-cluster to turn green. But I believe it will :)
That has a separate tracking issue. No need to wait for it.
SGTM :)
/close
Thank you all
/priority critical-urgent
/priority failing-test
/kind bug
/status approved-for-milestone
@kubernetes/sig-cluster-lifecycle-test-failures
This job has been failing since at least 2017-11-21. It's on the sig-release-master-upgrade dashboard, and prevents us from cutting v1.9.0-beta.2 (kubernetes/sig-release#39). Is there work ongoing to bring this job back to green?
https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-master-1.8-downgrade-cluster-parallel