Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve post-checkout reindexing #152

Closed
bob2021 opened this issue Jan 5, 2017 · 17 comments
Closed

Improve post-checkout reindexing #152

bob2021 opened this issue Jan 5, 2017 · 17 comments

Comments

@bob2021
Copy link
Contributor

bob2021 commented Jan 5, 2017

Many of the indexers call Mage_Index_Model_Resource_Abstract::disableTableKeys(), which runs a statement like this: ALTER TABLE ... DISABLE KEYS. Then, after some rows are inserted into the index table, enableTableKeys() is called.

As far I can see DISABLE/ENABLE KEYS only works with MyISAM tables, so it isn't used by modern Magento installations. What's worse is in MySQL 5.5+ the ALTER TABLE statement requires a table metadata lock.

As the statement doesn't actually do anything it could be removed and the lock would be avoided (though its possible it might cause problems for older installations that might still have MyISAM tables)

It may seem like it wouldn't have that much impact as the lock should only be for a very short time, but in production I've seen large queues of queries 'Waiting for table metadata lock' on tables like cataloginventory_stock_status and catalog_product_index_price when the ALTER TABLE statement is blocked by some other very slow transaction.

@colinmollenhour
Copy link
Member

Interesting.. Also worth noting is that ALTER TABLE causes in-progress transactions to be implicitly committed. I didn't look to see if these happen in the middle of a transaction or not but I'm definitely in favor of completely removing support for MyISAM wherever possible (see #105 for all reasons).

@LeeSaferite
Copy link
Contributor

LeeSaferite commented Jan 5, 2017 via email

@bob2021
Copy link
Contributor Author

bob2021 commented Jan 30, 2017

Following up on this I've been looking into the indexing that happens during checkout (specifically price & stock). I was seeing a couple of database errors when many orders were placed simultaneously:

  • A deadlock on the event_* tables
  • A duplicate insert error on the price indexer *_tmp tables

There are a few requirements to see this behaviour:

  • The order must bring a product's inventory level below the 'Qty for Item's Status to Become Out of Stock' or 'Notify for Quantity Below'
  • Ordered products must have backorders set to 'No backorders' or 'Allow Qty Below 0 and Notify Customer'
  • The config option 'Display out of stock products' must be set to 'No'

While looking for the solution I found a few issues:

Some database transactions with missing/unsafe try catch blocks (though most are probably harmless)

The rest of the issues are related to Mage_CatalogInventory_Model_Observer::reindexQuoteInventory():

Stock & price indexers are called directly, bypassing the indexer locking and mode check.

  • No locking can cause duplicate insert errors in the price index tmp tables
  • The indexers always run, even if the indexer update mode is set to manual

If a product's stock status is likely to change (ie. it has fallen below the minimum qty) then its stock item is saved

  • The stock items were loaded before the order was placed - they potentially contain stale data
  • Saving the stock item triggers a full reindex (by all indexers) of the ordered products.

That last point is where things get very confusing - products are re-indexed multiple times:

  1. Stock indexer is run directly in reindexQuoteInventory()
  2. Stock indexer is run after the stock item is saved.
    2a. When the stock indexer runs, it also registers a mass_action catalog_product index event.
    2b. Once the stock indexer has finished, it forces the mass_action event to be processed
  3. The price indexer is run directly in reindexQuoteInventory()

Step 2a. results in calls to Mage_Index_Model_Indexer::logEvent(), which causes deadlocks if it is run at the same time as step 2b. (or any other indexer)

So the stock indexer will run 3 times, the price indexer 2 times, and any other indexers set to real time mode will run once.

EDIT: The stock indexer will run 2 times, the price indexer 2 times, and if set to real-time mode these indexers will run once: catalog_product_attribute, catalogsearch_fulltext

I'm just guessing really, but I think the original intention was to make sure the price indexer runs after the stock indexer, because the price index is responsible for hiding out of stock products. But it looks like the logic has been duplicated and is being run twice, once by reindexQuoteInventory() and again by $stockItem->save(). I don't think there is any reason for the other indexers to run, its just a side effect of the mass_action event.

I've created a couple of branches with a proof of concept alternative solution:
added index locking and removed stock item save
removed mass_action index event from stock item save

@bob2021
Copy link
Contributor Author

bob2021 commented Feb 23, 2017

Here's a version I've been using in production

I doubt it can be accepted in its current state, but I'm unable to spend any more time on it so I'm leaving it here in case anyone else is interested.

The main benefits I've seen are:

  • Reduced database errors under heavy checkout load
  • Increased checkout performance (orders/min), especially for backordered/low stock items

@bob2021 bob2021 changed the title avoid ALTER TABLE statement during indexing improve post-checkout reindexing Feb 23, 2017
@colinmollenhour
Copy link
Member

This is excellent work, @bob2021! I don't have time to play with it at the moment but I would love to see this type of improvement get merged. I'd encourage you to go ahead and submit a PR, but just don't expect it to be merged really soon since it is pretty deep.

@bob2021
Copy link
Contributor Author

bob2021 commented Feb 27, 2017

Thanks, I've split it up into a few PRs to try and make it easier

@ProxiBlue
Copy link
Contributor

@colinmollenhour @bob2021

Firstly, thanks for the effort here.

Wanted to share a thought/idea with you, related to this topic, and get some feedback.
If promising (and not WTF are you doing!, are you MAD? results), I will go through effort to get a PR going.

So, my thought is as thus: If a product is not setup to do stock management, then is there any need to even initiate a re-index for stock (and price) for said product? I would think no.
Thus, in a store setup, where stock management is not done at all, or is limited to less products than the total products available, eliminating the final stock/price indexing in this routine (for some/all products in order) should improve matters.

Resulting code/patch that I done to eliminate those products:

diff --git a/docroot/app/code/core/Mage/CatalogInventory/Model/Observer.php b/docroot/app/code/core/Mage/CatalogInventory/Model/Observer.php
index 1e5f1664b..ab0daa8f4 100755
--- a/docroot/app/code/core/Mage/CatalogInventory/Model/Observer.php
+++ b/docroot/app/code/core/Mage/CatalogInventory/Model/Observer.php
@@ -737,6 +737,15 @@ class Mage_CatalogInventory_Model_Observer
             }
         }

+        /**
+         * get product stock setup.
+         * We only want to re-index products that actually have stock management setup
+        **/
+        $stockCollection = Mage::getModel('cataloginventory/stock_item')->getCollection()
+            ->addFieldToFilter('product_id' , array('in' => $productIds))
+            ->addFieldToFilter('manage_stock' , array('eq' => 1));
+        $productIds = $stockCollection->getColumnValues('product_id');
+
         if (count($productIds)) {
             Mage::getResourceSingleton('cataloginventory/indexer_stock')->reindexProducts($productIds);
         }

Thoughts? See any issue I did not consider?

@seansan
Copy link
Contributor

seansan commented Sep 5, 2018

@ProxiBlue @bob2021

Any thoughts on this? Did you maybe find some time to test this / or even run in a production env?

@ProxiBlue
Copy link
Contributor

ProxiBlue commented Sep 5, 2018

@seansan

I have been using this in a production environment for about 7 months now. Not implemented as a patch, but extended the actual observer with a custom module.

We used to have a lot of checkout lock issues, which was identified as stock indexing locking the tables, and caused issues with checkout (ye old lock table timeout error) especially under big load (major promotional periods)

I had actually forgotten about this code, most likely as we have never had any lock timeouts again.

I have this in my custom module for overriding the observer

<sales_model_service_quote_submit_success>
                <observers>
                    <inventory>
                        <type>disabled</type>
                    </inventory>
                    <enjo_inventory>
                        <class>enjo_catalog/inventory_observer</class>
                        <method>reindexQuoteInventory</method>
                    </enjo_inventory>
                </observers>
            </sales_model_service_quote_submit_success>

and my custom class obviously extends the core one

class Enjo_Catalog_Model_Inventory_Observer extends Mage_CatalogInventory_Model_Observer

and my method:

public function reindexQuoteInventory($observer)
    {
        // Reindex quote ids
        $quote = $observer->getEvent()->getQuote();
        $productIds = array();
        foreach ($quote->getAllItems() as $item) {
            $productIds[$item->getProductId()] = $item->getProductId();
            $children = $item->getChildrenItems();
            if ($children) {
                foreach ($children as $childItem) {
                    $productIds[$childItem->getProductId()] = $childItem->getProductId();
                }
            }
        }

        /**
         * get product stock setup.
         * We only want to re-index products that actually have stock management setup
         **/
        $stockCollection = Mage::getModel('cataloginventory/stock_item')->getCollection()
            ->addFieldToFilter('product_id', array('in' => $productIds))
            ->addFieldToFilter('manage_stock', array('eq' => 1));
        $productIds = $stockCollection->getColumnValues('product_id');

        if (count($productIds)) {
            Mage::getResourceSingleton('cataloginventory/indexer_stock')->reindexProducts($productIds);
        }

        // Reindex previously remembered items
        $productIds = array();
        foreach ($this->_itemsForReindex as $item) {
            $item->save();
            $productIds[] = $item->getProductId();
        }
        Mage::getResourceSingleton('catalog/product_indexer_price')->reindexProductIds($productIds);

        $this->_itemsForReindex = array(); // Clear list of remembered items - we don't need it anymore

        return $this;
    }

It is from this I derived the previously noted patch
Note: unless theer is a stock-out sale on the site, none of the products ever have stock management setup. So, for us, stock re-indexing hardly ever runs, which I think is a great improvement.

colinmollenhour pushed a commit that referenced this issue Nov 21, 2018
#562)

* Limit post-checkout stock re-index to only products that manage stock. ref #152

* Restrict collection to just one column in result. Cast id values to int prior to usage in SQL IN statement
edannenberg pushed a commit to edannenberg/magento-lts that referenced this issue Dec 6, 2018
…ducts that manage stock

* Limit post-checkout stock re-index to only products that manage stock.

* Restrict collection to just one column in result. Cast id values to int
  prior to usage in SQL IN statement

refs: OpenMage#152
edannenberg pushed a commit to edannenberg/magento-lts that referenced this issue Feb 14, 2019
…ducts that manage stock

* Limit post-checkout stock re-index to only products that manage stock.

* Restrict collection to just one column in result. Cast id values to int
  prior to usage in SQL IN statement

refs: OpenMage#152
fplantinet pushed a commit to fplantinet/magento-lts that referenced this issue Mar 27, 2019
OpenMage#562)

* Limit post-checkout stock re-index to only products that manage stock. ref OpenMage#152

* Restrict collection to just one column in result. Cast id values to int prior to usage in SQL IN statement
edannenberg pushed a commit to edannenberg/magento-lts that referenced this issue Apr 1, 2019
…ducts that manage stock

* Limit post-checkout stock re-index to only products that manage stock.

* Restrict collection to just one column in result. Cast id values to int
  prior to usage in SQL IN statement

refs: OpenMage#152
rafaelpatro pushed a commit to rafaelpatro/magento-lts that referenced this issue May 14, 2019
OpenMage#562)

* Limit post-checkout stock re-index to only products that manage stock. ref OpenMage#152

* Restrict collection to just one column in result. Cast id values to int prior to usage in SQL IN statement
edannenberg pushed a commit to edannenberg/magento-lts that referenced this issue Aug 22, 2019
…ducts that manage stock

* Limit post-checkout stock re-index to only products that manage stock.

* Restrict collection to just one column in result. Cast id values to int
  prior to usage in SQL IN statement

refs: OpenMage#152
edannenberg pushed a commit to edannenberg/magento-lts that referenced this issue Oct 25, 2019
…ducts that manage stock

* Limit post-checkout stock re-index to only products that manage stock.

* Restrict collection to just one column in result. Cast id values to int
  prior to usage in SQL IN statement

refs: OpenMage#152
@doctea
Copy link

doctea commented Jun 10, 2020

@bob2021 Suffering quite badly with timeouts at checkout in our shop and we have a lot of stock that needs inventory managed. Do you know if your modifications (the one you share as having been run in production) still works OK and is believed safe to put into production?

Thanks for yours (and everyone else's) hard work on figuring this out!

@bob2021
Copy link
Contributor Author

bob2021 commented Jun 11, 2020

Hi @doctea, I left the job where I was working on this a few years ago so I can't say for sure if the changes still work ok. They were certainly running in production at the time I originally created this issue, but unfortunately that doesn't mean they will work well in your specific Magento deployment or that problems weren't found after I left that I just don't know about. From the comments on some of the PRs it seems like these patches have issues with 3rd party indexing modules.

What tables are you seeing lock timeouts on?

@doctea
Copy link

doctea commented Jun 11, 2020

Hi @doctea, I left the job where I was working on this a few years ago so I can't say for sure if the changes still work ok. They were certainly running in production at the time I originally created this issue, but unfortunately that doesn't mean they will work well in your specific Magento deployment or that problems weren't found after I left that I just don't know about. From the comments on some of the PRs it seems like these patches have issues with 3rd party indexing modules.

What tables are you seeing lock timeouts on?

Thanks for getting back to me, @bob2021 ! Seeing sporadic timeout problems in many different tables, mostly customer and sales related. (Perhaps I'm conflating multiple issues but my investigations lead me here anyway -- timeouts seemed to surround order creation/indexing mostly).

I also found some problems when using the MantiCorp_SphinxSearch module (DDL statements can't be executed in a transaction..) but I think I found a way to work around that.

In the end I found that hundreds of downloadable products had stock management enabled unnecessarily, so I've disabled all of those and hopefully that'll solve the issue. I've got a local branch based on your patch ready to try if that doesn't work though :)

Thanks again!

edannenberg pushed a commit to edannenberg/magento-lts that referenced this issue Aug 20, 2020
…ducts that manage stock

* Limit post-checkout stock re-index to only products that manage stock.

* Restrict collection to just one column in result. Cast id values to int
  prior to usage in SQL IN statement

refs: OpenMage#152
@fballiano
Copy link
Contributor

Did this improvement materialize in a PR or did it remain in the discussion stage?

there are many PRs linked to this issue, thanks to the original author ;-)

@kanevbg
Copy link
Contributor

kanevbg commented Mar 15, 2023

@ProxiBlue
Copy link
Contributor

Not 100% correct. This is an overall discussion on the topic. The merge you are refering to was here: #562

@kanevbg
Copy link
Contributor

kanevbg commented Mar 15, 2023

@ProxiBlue Okay, thanks for making it clear.

@addison74 addison74 changed the title improve post-checkout reindexing Improve post-checkout reindexing Aug 11, 2023
@fballiano
Copy link
Contributor

Some PRs were merged, some were closed, I think the whole thing would have to be rediscussed based on the current codebase. For the time being I'll close this. Such wide, complex and old topics are impossible for us to re-start now. In case new PRs arise I'll make sure to test them as soon as possible but leaving this open doesn't help anybody anymore.

@fballiano fballiano closed this as not planned Won't fix, can't repro, duplicate, stale May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants