Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using ep_item_sync_kill filter breaks indexing #2755

Closed
juliecampbell opened this issue May 11, 2022 · 4 comments · Fixed by #2761
Closed

Using ep_item_sync_kill filter breaks indexing #2755

juliecampbell opened this issue May 11, 2022 · 4 comments · Fixed by #2761
Assignees
Labels
confirmed bug module:sync Issues related to the Sync functionality
Milestone

Comments

@juliecampbell
Copy link

We are working on upgrading a site from using Elasticpress 3.6.5 to 4.1.0. Before the upgrade, Elasticpress/Elasticsearch worked 100%. After the upgrade, the indexer is only indexing a small subset of the posts that we would expect to be indexed.

Command used for indexing:
wp elasticpress index --setup --nobulk --show-bulk-errors

Beginning subset of the indexing output:

Indexing with setup option needs to delete Elasticsearch index first, are you sure you want to delete your Elasticsearch index? [y/n] y
Success: Indexing posts...
Success: Mapping sent
Processed posts 0 - 1 of 25577. Last Object ID: 93582
Processed posts 1 - 2 of 25577. Last Object ID: 93581
Processed posts 2 - 3 of 25577. Last Object ID: 93580
Processed posts 3 - 4 of 25577. Last Object ID: 93579
Processed posts 4 - 5 of 25577. Last Object ID: 93578
Processed posts 5 - 6 of 25577. Last Object ID: 93577
Processed posts 6 - 7 of 25577. Last Object ID: 93489
Processed posts 7 - 8 of 25577. Last Object ID: 93488
Processed posts 8 - 9 of 25577. Last Object ID: 93487
Processed posts 9 - 10 of 25577. Last Object ID: 93486
Time elapsed: 3.51
Memory Usage: 39.45mb (Peak: 40.71mb)
Processed posts 10 - 11 of 25577. Last Object ID: 93485
Processed posts 11 - 12 of 25577. Last Object ID: 93484
Processed posts 12 - 13 of 25577. Last Object ID: 93427
Processed posts 13 - 14 of 25577. Last Object ID: 93427
Processed posts 14 - 15 of 25577. Last Object ID: 93427
Processed posts 15 - 16 of 25577. Last Object ID: 93427
Processed posts 16 - 17 of 25577. Last Object ID: 93427
Processed posts 17 - 18 of 25577. Last Object ID: 93427
Processed posts 18 - 19 of 25577. Last Object ID: 93427
Processed posts 19 - 20 of 25577. Last Object ID: 93427
Time elapsed: 3.96 (+0.45)
Memory Usage: 39.57mb (Peak: 40.83mb)

Please note that after item 12 the Last Object ID never changes. Allowed to run, the indexer only indexes 7 posts when previously it indexed thousands. Terms index as expected.

We have done some investigation and found that usage of the ep_item_sync_kill filter is part of this issue. We do not want to index images, certain categories, and some custom post types, so we had added the filter function per the Elasticpress documentation. Disabling our ep_item_sync_kill function allows the indexer to index more than 7 posts.

We have further found that index_meta['current_sync_item']['last_processed_object_id'] is not updated when a post is skipped due to the filter. As a result the first skipped post is continued to be passed into the indexer functions until the indexer thinks it has gone through the appropriate number of posts, when the indexer has mostly just been looking at the same skipped post over and over.

Environment information

  • WordPress version: 5.9.3
@juliecampbell juliecampbell added the bug Something isn't working label May 11, 2022
@felipeelia
Copy link
Member

Hi @juliecampbell,

Thanks for opening the issue. I am partially able to reproduce the problem, but as far as I could test, it seems to be much more related to the --no-bulk flag rather than the usage of the ep_item_sync_kill filter. Do you mind trying to run wp elasticpress index --setup --show-errors instead and letting us know the results? Also, would it be possible for you to share the code being used to skip images, some cpts, etc.?

Regarding the --no-bulk, we've made some changes in the sync algorithm, and some problems that made that necessary may be gone now. You can try wp elasticpress index --setup --show-errors --per-page=2 too?

@felipeelia
Copy link
Member

Actually, I take that back, sorry. It is indeed a mix of the --no-bulk and the ep_item_sync_kill filter.

@felipeelia
Copy link
Member

I think I've fixed the problem in #2761, @juliecampbell. Do you mind giving it a try? I've added a zip there to make it easier to test. Thanks!

@felipeelia felipeelia added confirmed bug module:sync Issues related to the Sync functionality and removed bug Something isn't working labels May 13, 2022
@felipeelia felipeelia self-assigned this May 13, 2022
@felipeelia felipeelia added this to the 4.2.0 milestone May 13, 2022
@juliecampbell
Copy link
Author

Thanks @felipeelia! Adding this fix allows the indexer to index all the expected posts again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed bug module:sync Issues related to the Sync functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants