Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

[feature request] Skip rollover on non-write index #261

Open
jgprogramming opened this issue Jul 28, 2020 · 10 comments
Open

[feature request] Skip rollover on non-write index #261

jgprogramming opened this issue Jul 28, 2020 · 10 comments
Labels
enhancement An improvement on the existing feature’s functionalities

Comments

@jgprogramming
Copy link

Hey,

We've noticed a small issue during rollover action. If we rollover an alias manually (for whatever reason) we see that we have multiple managed indices in ISM trying to rollover. This causes alias to rollover (based on time) more often than it should.

Ideally, ISM would either:

  • throw an error when rolling over alias due to non-write index ISM policy
  • provide configuration to skip rollover if it was already done and managed index is not the write index.

Regards

@jgprogramming jgprogramming added the enhancement An improvement on the existing feature’s functionalities label Jul 28, 2020
@jgprogramming
Copy link
Author

Concrete example of this problem,
image

Previously we've rolled over this index multiple times and let it run for a few hours. Right now each index in hot creates another rollover action.

@dbbaughe
Copy link
Contributor

dbbaughe commented Aug 3, 2020

Hi @jgprogramming,

If you have a managed index on the rollover action and you manually rollover the alias I would expect it to fail when it executes rollover as it's not the write alias. Is that not happening?

To be more specific it should fail when it actually executes rollover. We moved the condition checking to be done prior to the rollover API call so that we could add our own extra conditions to it in the future. So the current implementation does something like [Evaluate conditions] -[true?]-> [Call Rollover API with no conditions].

We can look into exposing errors that will happen during the rollover API call before evaluating the conditions.

@jgprogramming
Copy link
Author

Hey @dbbaughe , no, it is not happening. See above screenshot, we have multiple manually rolled over indices that continue to rollover every day or so, based on age condition.

I suspect it it because rollover step only checks if index was rolled over by ISM (https://github.com/opendistro-for-elasticsearch/index-management/blob/master/src/main/kotlin/com/amazon/opendistroforelasticsearch/indexstatemanagement/step/rollover/AttemptRolloverStep.kt#L55) and if it was rolled over manually, this flag is not going to be set in managedIndexMetaData.

After that, you rollover on an alias (https://github.com/opendistro-for-elasticsearch/index-management/blob/master/src/main/kotlin/com/amazon/opendistroforelasticsearch/indexstatemanagement/step/rollover/AttemptRolloverStep.kt#L115) which doesn't throw an error ever, even if index is not a write-index.

I'd suggest to:

  1. Create a check after condition check to see if managed index is the write index of the alias
  2. Throw an error if managed index is not the write index for the alias
  3. Add a boolean parameter to skip rollover instead of throwing error.

I'm suggesting the additional parameter because I don't see a reason to block pipeline if index is already in a state that we wanted it to be in - why would we throw an error during rollover because index was already rolled over.

@dbbaughe
Copy link
Contributor

dbbaughe commented Aug 4, 2020

Ah, I forgot it was rolling over on the alias directly. That makes sense then.

For your suggestions if we did add an optional boolean to skip in this scenario, from the sounds of it your team would always have this enabled?

Trying to think of ways this could behave badly.. off the top of my head if someone had an index (and only one index) with the alias then it automatically treats it as the write index without explicitly having to specify it. And then if someone or some system accidentally created a new index also with that alias then neither would be the write index and your policy might move forward by accident. Perhaps having the check for skipping also confirm that there is a concrete write index for the alias before allowing skipping. Do you see anyway that would negatively affect your scenario?

As for the check if the current index is the write index and throwing an error. Any reason you're suggesting it go after the condition check? My first thought was it would be the first thing to check before even evaluating any conditions as it might be an error you want to fix instead of potentially not being notified of it for a long time (if the condition is large).

@jgprogramming
Copy link
Author

jgprogramming commented Aug 4, 2020

I think it would be good if checked for existence for actual write index for an alias before skipping but I don't see how it could affect our usecase except for some kind of node failure where we only have one replica on write index (which again, could happen so it's good you bring it up).

You're right, looking at code I thought it was more natural to put this check right before rolling over but it would lead to failure in some scenarious, for example when conditions are only based on index size.

For starters, I think we can just throw an error when rolling index that is not the write index as that alone would allow us to fix our issues with some cron for policy restarting and we can think about adding the parameters later. If you'd like I can prepare a PR for that.

@dbbaughe
Copy link
Contributor

dbbaughe commented Aug 5, 2020

Hey @jgprogramming ,

Yes, it would be great if you could get a PR started.

Thanks!

@jgprogramming
Copy link
Author

PR is ready, let me know what you think.

@dbbaughe
Copy link
Contributor

Hey @jgprogramming,

Left some comments on PR, if you're able to get to it this week that would be great as we are planning to merge in some changes that will require some merge conflicts for any outstanding PRs.

Thanks!

@jgprogramming
Copy link
Author

Hey @dbbaughe ,

Thank you for your comments, I didn't have capacity to resolve them yet. I'll try to do it during this weekend but if I'm not able I'm fine with merging this PR with any changes that you introduce, so no need to wait for me.

@jsirianni
Copy link

This would be very nice to have. My team performs manual rollover sometimes, and we expect that the old index would not be rolled over by ISM.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement An improvement on the existing feature’s functionalities
Projects
None yet
Development

No branches or pull requests

3 participants