[ML] JIndex: Prevent updates to migrating configs and upgrade tests #36425
Pinging @elastic/ml-core
There are intermittent failures when testing that configs have been migrated: if there isn't a cluster change event to trigger the migration, it will never happen, so waiting for migration is redundant. Currently migration is triggered by a change in persistent tasks. Because a job is left open during the upgrade, re-assigning that task will usually trigger the migration, provided it happens after the nodes have been upgraded. The problem is that this change rejects updates to jobs that have not been migrated, and updates include opening a job. Opening a job causes a cluster state change, which in turn would trigger migration, but we can't pull that trigger because updates are prevented. Therefore I've widened the set of events that can trigger migration to any cluster state change event (this was what we originally discussed). This does make me think that a manual trigger may be required for the case where there are no cluster state updates.
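To illustrate the difference between the two trigger conditions, here is a toy model in Python. It is only a sketch of the reasoning in this comment, not the actual Elasticsearch code: `ClusterChangedEvent`, `MigrationTrigger`, and the `any_change_triggers` flag are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterChangedEvent:
    """Hypothetical stand-in for a cluster state change notification."""
    persistent_tasks_changed: bool


class MigrationTrigger:
    """Toy listener that decides when config migration starts (assumed design)."""

    def __init__(self, any_change_triggers: bool):
        # False models the old behaviour (persistent task changes only),
        # True models the widened behaviour (any cluster state change).
        self.any_change_triggers = any_change_triggers
        self.migrated = False

    def on_cluster_changed(self, event: ClusterChangedEvent) -> None:
        if self.migrated:
            return  # nothing left to migrate
        if self.any_change_triggers or event.persistent_tasks_changed:
            self.migrated = True


# A cluster state change that does not touch persistent tasks.
unrelated_change = ClusterChangedEvent(persistent_tasks_changed=False)

narrow = MigrationTrigger(any_change_triggers=False)
narrow.on_cluster_changed(unrelated_change)

wide = MigrationTrigger(any_change_triggers=True)
wide.on_cluster_changed(unrelated_change)
```

With the narrow condition, the unrelated change leaves the configs unmigrated, and since opening a job is rejected until migration has happened, nothing produces the persistent task change that would unblock it. With the wide condition, the same event is enough.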
With this change (as you alluded to), we may need to add a manual trigger for the migration, `ml/_upgrade` or something.
Triggering on ALL state changes may work just fine, I honestly don't know enough about how often state changes occur, or if it is feasible for somebody to try and do something with an ML configuration between moving to 6.6 and the cluster state being updated in some way.
There are other tests running in the upgrade cluster
Updates to jobs and datafeeds are blocked if the config is eligible for migration. The conditions that satisfy these criteria are:
A 503 service unavailable status code is returned in this situation.
Because of this, the rolling upgrade tests that modify a job or datafeed must accept either a successful update or a 503 error as a test pass. This is impossible with the yml tests, so I've created `MlMigrationIT` as a more flexible test.
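The pass condition described above can be sketched as follows. This is an illustrative Python snippet, not the code in `MlMigrationIT`; the function name `update_outcome_is_acceptable` is hypothetical, and it assumes that any 2xx status counts as a successful update.

```python
SERVICE_UNAVAILABLE = 503


def update_outcome_is_acceptable(status: int) -> bool:
    """Return True if an update response should count as a test pass.

    Assumption: a 2xx status means the update was applied, and a 503 means
    the update was rejected because the config is awaiting migration.
    Any other status is treated as a genuine failure.
    """
    return 200 <= status < 300 or status == SERVICE_UNAVAILABLE
```

A test built this way stays stable whether or not migration has completed by the time the update is sent, while still failing on unexpected errors such as a 400 or 500.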