-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Workflow pruning task #16533
Workflow pruning task #16533
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deanmarcussen I directly committed into your branch to ensure the code in the PR following the standard we have been doing in OC. Also, I used some new extensions to help cleanup the code.
Just few minor requests.
src/OrchardCore.Modules/OrchardCore.Workflows/Views/WorkflowPruning.Fields.Edit.cshtml
Outdated
Show resolved
Hide resolved
src/OrchardCore.Modules/OrchardCore.Workflows/Views/WorkflowPruning.Fields.Edit.cshtml
Outdated
Show resolved
Hide resolved
...OrchardCore.Modules/OrchardCore.Workflows/WorkflowPruning/Services/WorkflowPruningManager.cs
Outdated
Show resolved
Hide resolved
...ardCore.Modules/OrchardCore.Workflows/WorkflowPruning/ViewModels/WorkflowPruningViewModel.cs
Outdated
Show resolved
Hide resolved
…Services/WorkflowPruningManager.cs
…uning.Fields.Edit.cshtml
…uning.Fields.Edit.cshtml
src/OrchardCore.Modules/OrchardCore.Workflows/WorkflowPruning/AdminMenu.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bit too quick of a merge @MikeAlhayek. If there's some larger PR with multiple reviewers requested, please leave some time for the others too. I usually also @mention them to see if they have anything else to add. Just to make sure everyone's on the same page.
@deanmarcussen could you please open a second PR to address my remarks?
@(Model.LastRunUtc.HasValue ? await DisplayAsync(await New.DateTime(Utc: | ||
Model.LastRunUtc.Value, Format: "g")) : T["Never"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This middle of the argument line break is curious.
@(Model.LastRunUtc.HasValue ? await DisplayAsync(await New.DateTime(Utc: | |
Model.LastRunUtc.Value, Format: "g")) : T["Never"]) | |
@(Model.LastRunUtc.HasValue | |
? await DisplayAsync(await New.DateTime(Utc: Model.LastRunUtc.Value, Format: "g")) | |
: T["Never"]) |
model.LastRunUtc = settings.LastRunUtc; | ||
model.Disabled = settings.Disabled; | ||
model.Statuses = | ||
settings.Statuses ?? new WorkflowStatus[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
settings.Statuses ?? new WorkflowStatus[] | |
settings.Statuses ?? |
var prunedQty = await workflowCleanUpManager.PruneWorkflowInstancesAsync( | ||
TimeSpan.FromDays(workflowPruningSettings.RetentionDays) | ||
); | ||
logger.LogDebug("Pruned {PrunedCount} workflow instances.", prunedQty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var prunedQty = await workflowCleanUpManager.PruneWorkflowInstancesAsync( | |
TimeSpan.FromDays(workflowPruningSettings.RetentionDays) | |
); | |
logger.LogDebug("Pruned {PrunedCount} workflow instances.", prunedQty); | |
var prunedQuantity = await workflowCleanUpManager.PruneWorkflowInstancesAsync( | |
TimeSpan.FromDays(workflowPruningSettings.RetentionDays) | |
); | |
logger.LogDebug("Pruned {PrunedCount} workflow instances.", prunedQuantity); |
); | ||
logger.LogDebug("Pruned {PrunedCount} workflow instances.", prunedQty); | ||
|
||
workflowPruningSettings.LastRunUtc = clock.UtcNow; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what in particular do you feel is unnecesary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is writing to a read-only object while the actual update happens a few lines below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a note about the feature to the release 2.0 notes.
|
||
public async Task<int> PruneWorkflowInstancesAsync(TimeSpan retentionPeriod) | ||
{ | ||
var dateThreshold = _clock.UtcNow.AddDays(1) - retentionPeriod; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why AddDays(1)
? If the retention period is 3 days, and now is 8 AM on the 10th, then I'd expect those workflows to be deleted which are older or the same age as 8 AM on the 7th. This will only delete ones older than 8 AM on the 8th.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know our guy copied this from the audit trail module (because I cleaned up some left over audit trail comments when I pulled it out), so that logic comes directly from there, if its an issue, needs to be fixed in both places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so, but I don't understand why this logic would make sense.
var workflowInstances = await _session | ||
.Query<Workflow, WorkflowIndex>(x => x.WorkflowStatus.IsIn(statuses) && x.CreatedUtc <= dateThreshold) | ||
.ListAsync(); | ||
|
||
var total = 0; | ||
foreach (var item in workflowInstances) | ||
{ | ||
_session.Delete(item); | ||
total++; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about how this will fair when it's run the first time, with hundreds of thousands of items for one of our customer's site. I'll try it with a snapshot of the DB, but exporting it will take a day, so I'll only be able to report back tomorrow.
My guess is though that it'll time out and in such cases one would need to do manual pruning, gradually with smaller time frames, before letting the background task run here. This would be useful to document too, with a warning that with too many workflows your app may fail when this first runs.
I did a local test, with 260k generated simple dummy workflows with the below code in WorkflowController, using a local SQL Server, and it took less than 2s. That's really promising, though with a remote DB server this will surely be orders of magnitude slower.
[Admin("Workflows/CreateTestWorkflows", "CreateTestWorkflows")]
public async Task<ActionResult> CreateTestWorkflows()
{
var workflowType = await _session.GetAsync<WorkflowType>(61);
for (int i = 0; i < 9999; i++)
{
var workflow = _workflowManager.NewWorkflow(workflowType);
await _workflowStore.SaveAsync(workflow);
}
return Content("<html><body><script>location.reload();</script></body></html>", "text/html");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, yeah we did batching on yessql some time ago, which kinda solved that.
It didn't cause us a problem when we released this code 6 months ago, I would hesitate a guess that we were killing of 100K+ workflows on that initial run, but we have reasonable performance on the boxes too.
May pay to limit it to say 10,000 per run, and they will clean up over a matter of days, rather than that original hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, we actually have batching! sebastienros/yessql#310 I totally forgot about this. The default page size for that is 500, so the 860k workflows I mentioned under #16447 should be less than 2000 queries. Even if these take like 5 ms each, which they shouldn't by far, this will only take 10s, which should be safe.
I'll still do a test with the real DB once the export is ready, to see if there are any surprises with some actual data. I'll report back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let me know what you find, and I'll get another pr going
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now checked, and it took 90s with a local SQL Server instance to delete ~870k Workflows. Apparently, my mock workflows were too simple to provide an accurate model. And it didn't succeed even locally, despite the connection timeout being larger in the connstring (I don't know actually what happens, because there's no exception, control just never goes to the next line after here).
So, the DB connection will definitely time out for this in prod (where it'll take at least 10x or 100x as long due to the DB server being not local but over the network).
Adding Take(50000)
brings down the runtime of the deletion to ~4s, but it still gets stuck on LoadSiteSettingsAsync
as mentioned above, 10k worked). So, limiting the number here can definitely be a workaround. I'd add an IOptions
config and use something lower like 5k (which takes <1s) as the default. Then, we can advise people to increase this if their hosting environment allows, and to temporarily increase the background task frequency when they first deploy this to an existing site to get rid of the garbage faster (running this once every minute shouldn't be an issue for any proper server, really, the problem is just the timeout of doing too much at once)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW let it run with 10k to completion, it worked well.
I totally agree with Zoltan to have some time in PRs, but Mike don't want to make PRs take so long then they will be deactive or ignored over the time, which the same thing I am facing in some of my PRs |
|
||
namespace OrchardCore.Workflows.WorkflowPruning.Services; | ||
|
||
internal sealed class WorkflowStatusBuilder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a bad change, whoever added that.
Orchard is suposed to be an extensible framework, and I have far too much code we have to copy in from Orchard because someone decided to make something internal, and it needed tweaking
|
||
builder.Add(S["Configuration"], configuration => configuration | ||
.Add(S["Settings"], settings => settings | ||
.Add(S["Workflow Pruning"], S["Workflow Pruning"], pruning => pruning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason to call this "pruning" vs "trimming" what we have for Audit Trail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely just a comment from Seb on the original issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think trimming would be better, not to have a new term for the same concept.
Will you open a new PR any time soon @deanmarcussen? Otherwise, I'll tackle the above feedback. |
<div class="form-check"> | ||
<input type="checkbox" class="form-check-input workflow-status" id="@Html.IdFor(m => m.Statuses[i])" | ||
name="@Html.NameFor(x => x.Statuses)" checked="@status[i].IsSelected" value="@status[i].Value"> | ||
<label class="form-check-label" for="@Html.IdFor(m => m.Statuses[i])">@T[status[i].Value]</label> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the meeting today: this doesn't allow the PO extractor to find all the strings, since it's not coming from a hard-coded string. Also, the WorkflowStatus
values are technical identifiers, not necessarily suitable to be displayed to users directly.
So, better would be to list the statuses explicitly here.
yeah where were we at with it @Piedone ? is it just some formatting things, or where there concerns around functionality? |
Well, yes, see the comments above. |
Opened a PR for this: #16565. |
Fixes #11609
One of our guys, Jose, added this feature to our project some time ago, but left before he was able to contribute it back, so doing so on his behalf.
It is similar to the audit trail settings, however you can select a range of workflow statuses to prune.
Currently its part of the main workflows feature, and by default will be on, but on a site where the statuses have not been set, it will set them to all at 90 days when the background task runs.
Will let the community decide what behaviour they want there.
Perhaps it would be better to make it a feature.