Cannot save/edit node-test-commit-linux-containered CI job #2602

Closed
richardlau opened this issue Mar 31, 2021 · 10 comments

@richardlau
Member

Refs: #2601

I'm currently unable to edit/save https://ci.nodejs.org/job/node-test-commit-linux-containered or a copy of it. Neither the Save nor the Apply button appears to do anything in Edge. On Firefox, the Save button doesn't do anything, but Apply pops up an empty error window:
[Screenshot: empty error window]

There are a few issues on Jenkins' issue tracker, e.g. https://issues.jenkins.io/browse/JENKINS-65247, which all suggest reviewing https://www.jenkins.io/doc/upgrade-guide/2.277/ and checking plugin compatibility with the table-to-div layout migration.

@richardlau
Member Author

I can edit other jobs, e.g. https://ci.nodejs.org/job/nodereport-continuous-integration-latest/ and since the recent Jenkins update (#2593) I have been able to edit jobs to replace the Multiple SCMs plugin usage (#2595).

I'll grep through the config for https://ci.nodejs.org/job/node-test-commit-linux-containered and enumerate the plugins and then search to see if any of them have known issues to do with the Jenkins update.

@richardlau
Member Author

For investigation:

$ sed -rn 's/^.* plugin="(.*)".*$/\1/p' jobs/node-test-commit-linux-containered.xml | sort -u
build-blocker-plugin@1.7.3
build-timeout@1.20
conditional-buildstep@1.4.1
envinject@2.3.0
git@4.4.5
github@1.32.0
jira@3.1.3
join@1.21
junit@1.48
matrix-combinations-parameter@1.3.1
matrix-groovy-execution-strategy@1.0.7
matrix-project@1.18
metadata@1.1.0b
naginator@1.18.1
node-version-jenkins-plugin@1.0-SNAPSHOT
parameterized-trigger@2.39
periodic-reincarnation@1.13
postbuild-task@1.9
rebuild@1.31
run-condition@1.3
script-security@1.75
throttle-concurrents@2.1
timestamper@1.11.8
$
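The same enumeration can be sketched in Python with the standard library's XML parser instead of `sed`. This is only an illustrative alternative: the function name `plugins_in_config` and the sample fragment are made up for the example, and it assumes the config text parses cleanly with `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def plugins_in_config(xml_text: str) -> list[str]:
    """Return the sorted, de-duplicated plugin="name@version" values."""
    root = ET.fromstring(xml_text)
    # Element.iter() visits the root element and every descendant.
    found = {el.get("plugin") for el in root.iter() if el.get("plugin")}
    return sorted(found)


# Hypothetical, heavily trimmed fragment; not the real job config.
sample = """<matrix-project plugin="matrix-project@1.18">
  <publishers>
    <join.JoinTrigger plugin="join@1.21"/>
    <hudson.tasks.junit.JUnitResultArchiver plugin="junit@1.48"/>
    <join.JoinTrigger plugin="join@1.21"/>
  </publishers>
</matrix-project>"""

print(plugins_in_config(sample))
```

Unlike the regular expression, this only picks up `plugin` attributes on actual elements, so text that merely contains the string `plugin="..."` is ignored.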

@rvagg
Member

rvagg commented Mar 31, 2021

Confirmed, although I don't see that "Error" box

One thing I can think of that may impact this job is the removal of the Joyent Docker host and therefore its associated Jenkins nodes, which were attached to this job. Maybe those weren't cleanly removed all the way through to this job, and it's got some hang-overs that are causing a save conflict? This might have to be resolved by stopping Jenkins, editing config.xml and starting it again. There may also be interesting things in the Jenkins log file when you try to save it.

You have access to the server, don't you @richardlau? Are you able to have a go at tackling it from that end?

@richardlau
Member Author

Yes, I have access to the server. I haven't found anything that looks relevant yet. For example, https://ci.nodejs.org/administrativeMonitor/OldData/manage has entries corresponding to the jobs affected by the removal of the Multiple SCMs plugin (#2595) but nothing about this particular job. Also, no related errors show up in https://ci.nodejs.org/log/all. I'll keep looking (including directly on the Jenkins server itself).

I'll also compare the plugins in use by the job (#2602 (comment)) to those used by other jobs to see if there are any unique to this job that might warrant further investigation. In the worst case I'll try recreating the job from scratch.
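The comparison described here amounts to a set difference. A minimal sketch, assuming the plugin names for each job have already been collected into a mapping; the job names and plugin sets below are invented for illustration and are not the real CI data:

```python
def unique_plugins(job_plugins: dict[str, set[str]], target: str) -> set[str]:
    """Return plugins used by `target` and by no other job in the mapping."""
    # Union of the plugin sets of every job except the target.
    others = set().union(
        *(plugins for name, plugins in job_plugins.items() if name != target)
    )
    return job_plugins[target] - others


# Hypothetical data: "job-a" stands in for the job that cannot be saved.
job_plugins = {
    "job-a": {"git", "junit", "join"},
    "job-b": {"git", "junit"},
    "job-c": {"git", "timestamper"},
}

print(unique_plugins(job_plugins, "job-a"))
```

A plugin that shows up only in the broken job is a candidate for closer inspection, although a plugin shared with other jobs could still be at fault if those jobs are broken too.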

I don't think it's the removal of the Joyent Docker host because I haven't yet removed any of them from Jenkins (#2572) -- they're still there but offline:
[Screenshot: Joyent Docker host nodes still listed in Jenkins, offline]

@rvagg
Member

rvagg commented Mar 31, 2021

Watching devtools, I can see that my button presses aren't even getting to the server. I think there may be a UI problem with this job, maybe from a plugin I had to remove that was being used here?

@richardlau
Member Author

Based on the list in #2602 (comment) and https://issues.jenkins.io/secure/Dashboard.jspa?selectPageId=20741 my current suspect is the Join plugin -- it wasn't removed but may have issues with the Jenkins update: https://issues.jenkins.io/browse/JENKINS-64639

@richardlau
Member Author

I don't recall if we use the Join plugin in any of the other jobs. I'll investigate further tomorrow (including whether it's possible to not use it).

@richardlau
Member Author

This is definitely the Join plugin.

I took a copy of node-test-commit-linux-containered as node-test-commit-linux-containered-richard and verified that I was unable to save edits to the copy. I then directly edited /var/lib/jenkins/jobs/node-test-commit-linux-containered-richard/config.xml on the CI server to remove the XML fragment referencing the Join plugin and reloaded Jenkins' config. After that I was able to save edits to node-test-commit-linux-containered-richard.

-    <join.JoinTrigger plugin="join@1.21">
-      <joinProjects/>
-      <joinPublishers>
-        <hudson.plugins.parameterizedtrigger.BuildTrigger plugin="parameterized-trigger@2.39">
-          <configs>
-            <hudson.plugins.parameterizedtrigger.BuildTriggerConfig>
-              <configs>
-                <hudson.plugins.parameterizedtrigger.PredefinedBuildParameters>
-                  <properties>IDENTIFIER=node-test-commit-linux-containered
-STATUS=success
-URL=${BUILD_URL}
-COMMIT=${GIT_COMMIT}
-REF=${GIT_REMOTE_REF}</properties>
-                  <textParamValueOnNewLine>false</textParamValueOnNewLine>
-                </hudson.plugins.parameterizedtrigger.PredefinedBuildParameters>
-              </configs>
-              <projects>post-build-status-update</projects>
-              <condition>UNSTABLE_OR_BETTER</condition>
-              <triggerWithNoParameters>false</triggerWithNoParameters>
-              <triggerFromChildProjects>false</triggerFromChildProjects>
-            </hudson.plugins.parameterizedtrigger.BuildTriggerConfig>
-          </configs>
-        </hudson.plugins.parameterizedtrigger.BuildTrigger>
-      </joinPublishers>
-      <resultThreshold>
-        <name>SUCCESS</name>
-        <ordinal>0</ordinal>
-        <color>BLUE</color>
-        <completeBuild>true</completeBuild>
-      </resultThreshold>
-    </join.JoinTrigger>
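The fragment above was removed by hand. For reference, the same kind of removal could be scripted; the following is a rough sketch rather than what was actually run. The helper name `remove_elements` and the sample document are hypothetical, and it assumes the config text parses with Python's `xml.etree.ElementTree` (which also rewrites the formatting, so it is best tried on a copy of the job first).

```python
import xml.etree.ElementTree as ET


def remove_elements(xml_text: str, tag: str) -> str:
    """Return xml_text with every element named `tag` (and its subtree) removed."""
    root = ET.fromstring(xml_text)
    # ElementTree elements have no parent pointer, so walk every potential
    # parent and drop matching children. Snapshot the lists before mutating.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down stand-in for a job's config.xml.
sample = """<matrix-project>
  <publishers>
    <join.JoinTrigger plugin="join@1.21">
      <joinProjects/>
    </join.JoinTrigger>
    <other/>
  </publishers>
</matrix-project>"""

cleaned = remove_elements(sample, "join.JoinTrigger")
print(cleaned)
```

As with the manual edit, Jenkins would still need to reload its configuration before the change takes effect.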

@richardlau
Member Author

FWIW these are the jobs that use the Join plugin:

$ grep join jobs/*.xml | cut -d ':' -f 1 | sort -u
jobs/node-test-commit-linux-containered.xml
jobs/node-test-commit-linux-mhdawson.xml
jobs/node-test-commit-linux-richardlau.xml
jobs/node-test-commit-linux-sam-github.xml
jobs/node-test-commit-linux.xml
jobs/node-test-commit-v8-linux.xml
jobs/rvagg-test-commit-linux-containered.xml
$
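Note that `grep join` matches any occurrence of the string, not only references to the plugin. A slightly stricter variant, sketched here in Python, looks for a `plugin="join@` attribute instead. The function name `jobs_using_plugin` and the temporary files in the demo are assumptions made for the example:

```python
import re
import tempfile
from pathlib import Path


def jobs_using_plugin(jobs_dir: Path, plugin: str) -> list[str]:
    """Return sorted names of *.xml files whose text has plugin="<plugin>@..."."""
    pattern = re.compile(r'plugin="' + re.escape(plugin) + "@")
    return sorted(
        path.name
        for path in jobs_dir.glob("*.xml")
        if pattern.search(path.read_text(encoding="utf-8"))
    )


# Demo against throwaway files; the names and contents are invented.
with tempfile.TemporaryDirectory() as tmp:
    jobs = Path(tmp)
    (jobs / "uses-join.xml").write_text(
        '<project><join.JoinTrigger plugin="join@1.21"/></project>',
        encoding="utf-8",
    )
    (jobs / "no-join.xml").write_text(
        "<project><description>join the queue</description></project>",
        encoding="utf-8",
    )
    matches = jobs_using_plugin(jobs, "join")

print(matches)
```

In the demo, the second file mentions the word "join" in its description but is not reported, because it carries no `plugin="join@..."` attribute.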

@richardlau richardlau self-assigned this Apr 1, 2021
@richardlau
Member Author

richardlau commented Apr 1, 2021

I've manually edited the configs for

to remove the references to the Join plugin and reloaded the Jenkins config.

We've been using the Join plugin on those jobs to post a status update when all of the child jobs in the matrix have finished. Without the Join plugin, the status update is posted at the end of each child job in the matrix, which leads to a race condition: the updates overwrite each other, except in the linux-containered case, where we have logic to rename the job name to represent the matrix combination being tested. However, this race condition is already present in other matrix jobs we have in the CI, and for those started by node-test-commit the parent job will still pass/fail and post the appropriate status to GitHub.
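The overwrite behaviour can be illustrated with a toy model. This is a simplification under stated assumptions: it treats the status store as a mapping keyed by identifier where the last write wins, and the identifiers and results below are invented, not taken from real builds.

```python
def post_statuses(updates: list[tuple[str, str]]) -> dict[str, str]:
    """Apply (identifier, state) updates in order; the last write per identifier wins."""
    statuses: dict[str, str] = {}
    for identifier, state in updates:
        statuses[identifier] = state
    return statuses


# Two child jobs posting under the same identifier: whichever finishes
# last decides the visible status, so the failure is hidden here.
shared = post_statuses([
    ("matrix-job", "failure"),
    ("matrix-job", "success"),
])

# Children posting under an identifier that includes the matrix
# combination (hypothetical names) do not collide.
renamed = post_statuses([
    ("matrix-job (combo-a)", "failure"),
    ("matrix-job (combo-b)", "success"),
])

print(shared)
print(renamed)
```

With a shared identifier the outcome depends on completion order, which is the race; with one identifier per matrix combination each result is kept.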
