Consider dropping throttling in Ingestion Server filtered index creation #3977
Labels
💻 aspect: code
Concerns the software code in the repository
✨ goal: improvement
Improvement to an existing user-facing feature
🟨 priority: medium
Not blocking but should be addressed soon
🧱 stack: ingestion server
Related to the ingestion/data refresh server
🔧 tech: elasticsearch
Involves Elasticsearch
🐍 tech: python
Involves Python
Problem
Due to previous performance problems with Elasticsearch (ES), a requests_per_second limit was set to prevent index creation from affecting search performance (#2975). Since indexes are created before they are promoted to live usage, this limit should no longer be necessary.
openverse/ingestion_server/ingestion_server/indexer.py
Lines 495 to 500 in a675f46
ES CPU usage has been consistently below 50% for several weeks. In the graph below, the highest peaks correspond to the creation of the image index, and the following one to the capped filtered index. Letting ES self-regulate the number of items to ingest should optimize resource use and speed up the process.
Description
Remove the requests_per_second setting from the reindex call in the previously shown code block.
Alternatives
The other option is to keep tuning ES_FILTERED_INDEX_THROTTLING_RATE until it reaches a value that allows more ingestions per second without compromising cluster stability.
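For illustration only, here is a minimal sketch of how both options could sit side by side. It is not the code from indexer.py: the helper names build_reindex_kwargs and throttle_rate_from_env are hypothetical, reading ES_FILTERED_INDEX_THROTTLING_RATE from the environment is an assumption, and the keyword layout assumes the elasticsearch-py 7.x style of calling reindex with body= plus request parameters. When requests_per_second is omitted, Elasticsearch falls back to its default of -1, which means no throttling.

```python
import os


def throttle_rate_from_env(environ=os.environ):
    """Return the throttling rate as an int, or None when it is unset.

    Assumption: the rate comes from an environment variable named
    ES_FILTERED_INDEX_THROTTLING_RATE; the real project may load it differently.
    """
    raw = environ.get("ES_FILTERED_INDEX_THROTTLING_RATE")
    return int(raw) if raw else None


def build_reindex_kwargs(source_index, dest_index, query, throttle_rate=None):
    """Assemble keyword arguments for a reindex call (hypothetical helper).

    With throttle_rate=None the requests_per_second parameter is left out,
    so Elasticsearch applies its default (-1, unthrottled).
    """
    kwargs = {
        "body": {
            "source": {"index": source_index, "query": query},
            "dest": {"index": dest_index},
        },
    }
    if throttle_rate is not None:
        # Alternative from the issue: keep a limit, but make it tunable.
        kwargs["requests_per_second"] = throttle_rate
    return kwargs


# Proposed behaviour: no throttling parameter at all.
unthrottled = build_reindex_kwargs("image", "image-filtered", {"match_all": {}})

# Alternative: keep the limit with an explicit, adjustable rate.
throttled = build_reindex_kwargs(
    "image", "image-filtered", {"match_all": {}}, throttle_rate=20000
)
```

The resulting dictionary would then be unpacked into the client call, for example es.reindex(**unthrottled), where es is an already configured client. The index names, the match_all query, and the rate of 20000 are placeholders, not values taken from the repository.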