Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extensible background build discarders #4368

Merged
merged 12 commits into from
Feb 14, 2020

Conversation

daniel-beck
Copy link
Member

@daniel-beck daniel-beck commented Nov 22, 2019

(Description updated 2019-12-15)

This is the extensible implementation of #4336, allowing to pull out regex matching into a plugin.

A new extension point for configuring global build discarders is added to core. These are run:

  1. After a build finishes, for the given job.
  2. Periodically.

It provides two simple implementations of a new extension point in core:

  • Run the already configured per-project build discarder (can probably be considered a bug fix, and is therefore enabled by default).
  • Run a specific, globally configured build discarder on all jobs.

Additionally, the API for the new extension point BackgroundBuildDiscarderStrategy should be flexible enough to support regexes and other complex criteria as discussed in the original issue #4336.

Screenshot:

Configuration UI:

Screenshot

Manual testing hint: Trigger periodic execution manually using ExtensionList.lookupSingleton(BackgroundBuildDiscarder.class).doRun() in script console.

On classification: Major bug if the lack of periodic execution of the configured build rotator is considered a bug at all, as it could result in data loss with carelessly configured projects relying on the previous behavior.

TODO List:

  • Internalization
  • Basic test coverage
  • Help files
  • Better symbols ("global" is overloaded)
  • Rename classes and update labels as needed since it's no longer just "periodic"

Open Questions (feedback welcome):

  • Should the "Job" Global Build Discarder be implemented differently, without a UI option? If we actually consider it to be a bug fix (see above and below), why would there be a UI that boils down to "[x] fix the bug"?
  • Should we really limit each descriptor to one occurrence? This should make it more difficult to implement more flexible strategies, e.g. the proposed "regex match" one would need its own list of (regex/build discarder) pairs, rather than just one of each as a property, and then have N instances of the extension in the top level list…

Proposed changelog entries

  • Major RFE: Add globally configured build discarders that delete old builds not marked as "keep forever" even if there is no, or a less aggressive, per-project build discarder configured, executed periodically and after a build finishes.
  • Major Bug: Jenkins will by default execute the configured per-project build discarder periodically even if no build is currently finishing. This may delete old builds of projects that got a more aggressive build discarder configuration since the last build was run.
  • RFE: Developer: Add new extension point BackgroundBuildDiscarderStrategy to allow more flexible build discarding strategies for the global build discarder configuration.

Submitter checklist

  • JIRA issue is well described
  • Changelog entry appropriate for the audience affected by the change (users or developer, depending on the change). Examples
    * Use the Internal: prefix if the change has no user-visible impact (API, test frameworks, etc.)
  • Appropriate autotests or explanation to why this change has no tests
  • For dependency updates: links to external changelogs and, if possible, full diffs

Desired reviewers

@awittha (author of #4336)
@res0nance (reviewer of #4336)

Default implementation: Run per-job configured discarder periodically
@daniel-beck daniel-beck added major-rfe For changelog: Major enhancement. Will be highlighted on the top major-bug For changelog: Major bug. Will be highlighted on the top of the changelog developer Changes which impact plugin developers plugin-api-changes Changes the API of Jenkins available for use in plugins. needs-testcase Test automation is required for this pull request work-in-progress The PR is under active development, not ready to the final review labels Nov 22, 2019
strategy.apply(job);
} catch (Exception ex) {
listener.error("An exception occurred when executing " + displayName + ": " + ex.getMessage());
LOGGER.log(Level.WARNING, "An exception occurred when executing " + displayName, ex);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 for logging with listener and the Java logger 👍

try {
apply(run);
} catch (IOException|InterruptedException ex) {
// TODO should these actually be caught, or just thrown up to stop applying?
Copy link

@awittha awittha Nov 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, IOException should be logged well, but chomped. One "bad" build in a project with 1000 builds shouldn't block attempts to clean up the other 999.

InterruptedException is a little less-certain: Since this would be happening in the background, what can raise an InterruptedException in a background task? If there's anything that thinks it will stop a BG task by interrupting it, we'd like to honor that semantic.

If it's more of a "well, threads get interrupted sometimes and that checked exception is part of the Java interface," then, chomp & log along with IOException.

@@ -27,7 +27,7 @@ THE SOFTWARE.
-->
<?jelly escape-by-default='true'?>
<j:jelly xmlns:j="jelly:core" xmlns:st="jelly:stapler" xmlns:d="jelly:define" xmlns:l="/lib/layout" xmlns:t="/lib/hudson" xmlns:f="/lib/form" xmlns:i="jelly:fmt">
<j:if test="${it.parent.buildDiscarder!=null and it.canToggleLogKeep()}">
<j:if test="${it.canToggleLogKeep()}">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 for remembering to enable Keep Build Forever button in the case where a explicit build discarder isn't set for the job

this.discarder = discarder;
}

public BuildDiscarder getDiscarder() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: @CheckForNull to catch potential bugs

@timja
Copy link
Member

timja commented Nov 23, 2019

I've tested this with jcasc, it works, but it's not possible to remove build discarders :(

i.e. this:

unclassified:
  globalBuildDiscarders:
    configuredBuildDiscarders:
    - globalBuildDiscarder:
        discarder:
          logRotator:
            artifactDaysToKeepStr: "15"
            artifactNumToKeepStr: "12"
            daysToKeepStr: "11"
            numToKeepStr: "10"
    - "jobBuildDiscarder"

to

unclassified:
  globalBuildDiscarders:
    configuredBuildDiscarders:
    - globalBuildDiscarder:
        discarder:
          logRotator:
            artifactDaysToKeepStr: "15"
            artifactNumToKeepStr: "12"
            daysToKeepStr: "11"
            numToKeepStr: "10"

doesn't remove jobBuildDiscarder

@daniel-beck
Copy link
Member Author

@timja Interesting. Did you try to change some of the values to see whether any kind of update works at all? Or what happens when you go from only globalBuildDiscarder to only jobBuildDiscarder?

@daniel-beck
Copy link
Member Author

Note to self: Come up with some symbols that don't result in

  globalBuildDiscarders:
    configuredBuildDiscarders:
    - globalBuildDiscarder:

@timja
Copy link
Member

timja commented Nov 24, 2019

@timja Interesting. Did you try to change some of the values to see whether any kind of update works at all? Or what happens when you go from only globalBuildDiscarder to only jobBuildDiscarder?

I tried changing values and that worked

@timja
Copy link
Member

timja commented Nov 24, 2019

@timja Interesting. Did you try to change some of the values to see whether any kind of update works at all? Or what happens when you go from only globalBuildDiscarder to only jobBuildDiscarder?

removing was ignored

@daniel-beck
Copy link
Member Author

@timja Thanks for that. Weird.

I basically use the same approach as @Symbol(value="artifactManager") does, to get databinding from f:repeatableHeteroProperty for free. Does the same problem exist there as well (hinting at a problem with core or JCasC rather than this PR)? Implementations there are from https://jenkins.io/doc/developer/extensions/jenkins-core/#artifactmanagerfactory

@timja
Copy link
Member

timja commented Nov 26, 2019

@timja Thanks for that. Weird.

I basically use the same approach as @Symbol(value="artifactManager") does, to get databinding from f:repeatableHeteroProperty for free. Does the same problem exist there as well (hinting at a problem with core or JCasC rather than this PR)? Implementations there are from jenkins.io/doc/developer/extensions/jenkins-core/#artifactmanagerfactory

No the same problem does not exist there,
I installed all 3 plugins,

Configured with this:

unclassified:
  artifactManager:
    artifactManagerFactories:
    - "compressing"
    - azure:
        config:
          container: "abcd"
          prefix: "abc"

And then I updated it to:

unclassified:
  artifactManager:
    artifactManagerFactories:
    - azure:
        config:
          container: "abcdf"
          prefix: "abc"

It removed compressing and updated the azure container config correctly

@timja
Copy link
Member

timja commented Nov 26, 2019

Weird I just tested this PR again and JCasC wise everything is working, (except some symbols would be nice to shorten the names)

@daniel-beck
Copy link
Member Author

@timja

some symbols would be nice to shorten the names

Which symbols are missing/wrong? Based on #4368 (comment) it looks OK to me. The ambiguous double use of "global" I already know and plan to fix.

@timja
Copy link
Member

timja commented Nov 26, 2019

@timja

some symbols would be nice to shorten the names

Which symbols are missing/wrong? Based on #4368 (comment) it looks OK to me. The ambiguous double use of "global" I already know and plan to fix.

original:

unclassified:
  globalBuildDiscarders:
    configuredBuildDiscarders:
    - globalBuildDiscarder:
        discarder:
          logRotator:
            artifactDaysToKeepStr: "15"
            artifactNumToKeepStr: "12"
            daysToKeepStr: "11"
            numToKeepStr: "10"
    - "jobBuildDiscarder"

possibly something like this (if symbols don't clash):

unclassified:
  globalBuildDiscarders:
    configured:
    - global:
        discarder:
          logRotator:
            artifactDaysToKeepStr: "15"
            artifactNumToKeepStr: "12"
            daysToKeepStr: "11"
            numToKeepStr: "10"
    - "job"

Ideally it would be flattened to not need the discarder and configuredBuildDiscarders or configured renamed to something else

but it works so 🤷‍♂

@timja timja mentioned this pull request Nov 28, 2019
4 tasks
@daniel-beck
Copy link
Member Author

daniel-beck commented Dec 8, 2019

This update now applies the globally configured build discarders after a build finishes so they're no longer only running once an hour. Labels (and probably class names…) are not yet updated accordingly and will still need to be.

Naming is hard 😦

@daniel-beck daniel-beck removed the needs-testcase Test automation is required for this pull request label Dec 8, 2019
@daniel-beck
Copy link
Member Author

Ideally it would be flattened to not need the discarder and configuredBuildDiscarders

@timja
I don't think the former is possible at all. The latter would require the extension class itself to be a List, which doesn't look like a great approach. Or how else would I do this?

@timja
Copy link
Member

timja commented Dec 8, 2019

Ideally it would be flattened to not need the discarder and configuredBuildDiscarders

@timja
I don't think the former is possible at all. The latter would require the extension class itself to be a List, which doesn't look like a great approach. Or how else would I do this?

Hmm, been taking a look around,
I can't see a way, it looks like it just comes down to naming to make it sensible.

This is what the configuration now looks like:

unclassified:
  buildDiscarders:
    configuredBuildDiscarders:
      - "jobBuildDiscarder"
      - globalBuildDiscarder:
          discarder:
            logRotator:
              daysToKeepStr: "7"
              numToKeepStr: "10"

@timja timja requested a review from fcojfernandez December 16, 2019 00:05
@timja
Copy link
Member

timja commented Dec 16, 2019

Requested review from everyone who reviewed the previous PR #4336

String displayName = strategy.getDescriptor().getDisplayName();
listener.getLogger().println("Offering " + job.getFullName() + " to " + displayName);
if (strategy.isApplicable(job)) {
listener.getLogger().println(job.getFullName() + " accepted by " + displayName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly concerned that this might be excessive logging on very large instances

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but it's all going to a rotated set of log files, so will never fill up a disk (except when it's tiny to begin with):

if (f.isFile()) {
if ((lastRotateMillis + logRotateMillis < System.currentTimeMillis())
|| (logRotateSize > 0 && f.length() > logRotateSize)) {
lastRotateMillis = System.currentTimeMillis();
File prev = null;
for (int i = 5; i >= 0; i--) {
File curr = i == 0 ? f : new File(f.getParentFile(), f.getName() + "." + i);
if (curr.isFile()) {
if (prev != null && !prev.exists()) {
if (!curr.renameTo(prev)) {
logger.log(getErrorLoggingLevel(), "Could not rotate log files {0} to {1}",
new Object[]{curr, prev});
}
} else {
if (!curr.delete()) {
logger.log(getErrorLoggingLevel(), "Could not delete log file {0} to enable rotation",
curr);
}
}
}
prev = curr;
}
}

Copy link
Contributor

@res0nance res0nance left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks fine to me

@daniel-beck daniel-beck requested a review from a team December 16, 2019 09:59
@daniel-beck
Copy link
Member Author

Added another Open Question to the PR comment for reviewers' consideration.

try {
apply(run);
} catch (IOException|InterruptedException ex) {
// TODO should these actually be caught, or just thrown up to stop applying?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever you catch an InterruptedException, it's usually a good idea to re-set the interrupt flag again via Thread.currentThread().interrupt(); Otherwise, you can end up with hung threads if they continue running (which in this case, it very well could).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would not actually want to catch the InterruptedException here? The idea was that if there's something wrong with deleting one build's artifacts, e.g. it taking too long to delete a billion files, an initial abort would continue with the next build, rather than just stop this process entirely.

Obviously the downside of that is that it's fairly difficult to really interrupt this thread while it's iterating here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fairly difficult to really interrupt this thread while it's iterating here.

What processes will intentionally attempt such an interrupt? Is there any jenkins-initiated or user-initiated workflow that can target a specific AsyncPeriodicWork and try to stop it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the interrupt flag doesn't re-throw the interrupted exception btw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to leave a comment about this and noticed that I already did.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daniel-beck Are you planning on making this change? either not catching the InterruptedException or setting the interrupt flag?

@jvz jvz self-requested a review December 16, 2019 19:50
@@ -0,0 +1,3 @@
blurb = Build discarders configured for a job are only run after a build finishes. \
This option runs jobs'' configured build discarders periodically, applying configuration changes even when no new builds are run. \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applying configuration changes even when no new builds are run.

This wording seems a bit confusing to me. From where will configuration changes that can be applied to the job come, if not from the job itself?

What kind of configuration change could exist but in an "unapplied" state, that this option would apply?

Should it read

applying build discarder rules even when no new builds are run

or is it really talking about job configuration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or is it really talking about job configuration?

Well, build discarder rules are a part of the job configuration. So in a sense, yes.

Currently, you'd need to run a new build to apply a changed build discarder configuration, since it's only consulted after a build finishes. So without new builds, you can change this option and it doesn't have any consequences. For an obsolete/archived job, it would never actually delete builds/artifacts.

Perhaps

…applying changes to a job's build discarder configuration even when…

?


@Override
public void apply(Job<?, ?> job) throws IOException, InterruptedException {
job.logRotate();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Job is a non-final, public, abstract class, and logRotate() is a non-final, public method on Job - and neither are annotated with @NoExternalUse.

It is not guaranteed that job.logRotate() will actually use the build discarder that was checked in isApplicable(...). This may cause the documentation around this step to fail to align with what actually happens.

Is there a reason to prefer job.logRotate() over job.getBuildDiscarder().perform( job )?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think the expectation is that #logRotate() not be overridden, but it's not enforced, true.

I am unsure which is the more correct option.

#logRotate() is called by Run#execute(RunExecution), so an argument can be made that this is the expected way to rotate logs, and if the job overrides it in a way that differs from what getBuildDiscarder().perform(…) does, #logRotate() should be take precedence.

OTOH, the way this feature is explained on the UI would indicate we want getBuildDiscarder().perform(…) instead. Obviously, much of that is by necessity.

private static final Logger LOGGER = Logger.getLogger(BackgroundGlobalBuildDiscarder.class.getName());

public BackgroundGlobalBuildDiscarder() {
super("Periodic background build discarder"); // TODO i18n
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the TODO required for merge?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I'm not sure we're even localizing these.

@timja
Copy link
Member

timja commented Dec 20, 2019

Should we really limit each descriptor to one occurrence?

Would be good to allow multiple so that I.e a regex based extension could have different rules for master branch and other branches / prs

@daniel-beck
Copy link
Member Author

Would be good to allow multiple so that I.e a regex based extension could have different rules for master branch and other branches / prs

This is still possible, just requires more complex individual discarders (as they'd need to support N patterns/rules instead of 1).

Forgot to mention a downside: Especially the default (unconfigurable) jobBuildDiscarder could be added multiple times, possibly leading to user confusion; it seems to me it also potentially be more difficult to write build discarders that have rules that depend on a larger context (e.g. builds across jobs).

@timja
Copy link
Member

timja commented Dec 20, 2019

hmm I guess that would be fine

@oleg-nenashev oleg-nenashev self-requested a review December 21, 2019 01:16
@timja timja requested a review from a team January 7, 2020 08:24
@timja timja added the needs-more-reviews Complex change, which would benefit from more eyes label Jan 7, 2020
@daniel-beck
Copy link
Member Author

hmm I guess that would be fine

Could you clarify which behavior is preferable in your opinion?

@timja
Copy link
Member

timja commented Jan 9, 2020

More complex individual discarders is fine,
so that you don’t get multiples of discarders that shouldn’t be added multiple times

Copy link
Member

@jvz jvz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. Would certainly like to see this feature in production where disk space errors like to hang out.

try {
apply(run);
} catch (IOException|InterruptedException ex) {
// TODO should these actually be caught, or just thrown up to stop applying?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to leave a comment about this and noticed that I already did.


f = namespace(lib.FormTagLib)

f.section(title: _("Global Build Discarders")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: does this view really need to be Groovy? Seems fairly trivial to convert to Jelly.

@timja
Copy link
Member

timja commented Jan 28, 2020

@daniel-beck are the questions in the PR description still open / needing feedback?

Open Questions (feedback welcome):

Should the "Job" Global Build Discarder be implemented differently, without a UI option? If we actually consider it to be a bug fix (see above and below), why would there be a UI that boils down to "[x] fix the bug"?
Should we really limit each descriptor to one occurrence? This should make it more difficult to implement more flexible strategies, e.g. the proposed "regex match" one would need its own list of (regex/build discarder) pairs, rather than just one of each as a property, and then have N instances of the extension in the top level list…

There's 4 approvals on this, just checking if everything is in order to move forward on it

@daniel-beck
Copy link
Member Author

@timja

are the questions in the PR description still open / needing feedback?

I have received no responses to the first question. I'm fairly OK with the way it works, just flagging it for attention if someone cares. It doesn't look like it so far.

I have only received one response to the second question (from you), and it was approving of the current implementation after a short conversation. FWIW this can always be opened up further, but restricting would be difficult.

There's one unaddressed feedback of the code catching an InterruptedException when it shouldn't, but have held off addressing that until I got more feedback to limit iteration count. It doesn't look like that will be forthcoming, so perhaps it's time to address that so this can be merged.

@timja timja added ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback and removed needs-more-reviews Complex change, which would benefit from more eyes labels Feb 8, 2020
@timja
Copy link
Member

timja commented Feb 8, 2020

There appears to be no outstanding issues and 4 approvals, looks like it's time to get this in.

After a 24 hour timeout for last minute feedback, this can be merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
developer Changes which impact plugin developers major-bug For changelog: Major bug. Will be highlighted on the top of the changelog major-rfe For changelog: Major enhancement. Will be highlighted on the top plugin-api-changes Changes the API of Jenkins available for use in plugins. ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback squash-merge-me Unclean or useless commit history, should be merged only with squash-merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants