Magento 2 Indexer speed optimization

Hello,

The biggest bottleneck in a Magento 2 application shows up as your database grows. A huge number of products, orders, invoices, quotes, attributes, and categories can all make the application resource-hungry. On top of that, the application was designed to be "single" threaded by default :( which makes it even slower. To address this and make the Magento 2 indexing process fast again, today we will talk about something that is not new in Magento 2 (it arrived with version 2.2.6) but was announced rather quietly and, for unknown reasons, is still not enabled by default: many indexers (catalog price, for example) are now scoped and multithreaded.

Magento explains really nicely on the https://devdocs.magento.com/guides/v2.4/extension-dev-guide/indexer-optimization.html page how to optimize indexers by adjusting batch sizes, in other words the amount of data processed per run. Besides that, and besides configuring scopes, which was explained a long time ago on the https://devdocs.magento.com/guides/v2.2/config-guide/cli/config-cli-subcommands-index.html#config-cli-subcommands-index-reindex-parallel page, we can ask ourselves: what if we have many CPU cores available but indexing still runs as a single process? Right... :(

The best fix is to edit the app/etc/env.php file and add the following line as a top-level entry of the returned array:

'MAGE_INDEXER_THREADS_COUNT' => (int)`nproc`,  

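This makes your deployment check how many CPU cores are available and use all of them where possible. The backticks are PHP's shell execution operator, so the nproc command is run whenever env.php is loaded (this requires shell_exec to be enabled in PHP).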
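If you would rather not put a shell call inside env.php, the Magento documentation linked above also describes passing the same MAGE_INDEXER_THREADS_COUNT setting as an environment variable for a single run. Below is a minimal sketch of that approach. It assumes you run it from the Magento root, and the helper name indexer_threads is made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: print the number of CPU cores, or 1 when nproc is missing.
indexer_threads() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

# Assumption: the current directory is the Magento root.
# The variable only applies to this one command, env.php stays untouched.
if [ -f bin/magento ]; then
    MAGE_INDEXER_THREADS_COUNT="$(indexer_threads)" php bin/magento indexer:reindex
fi
```

This is handy for comparing runs with different thread counts before you commit a value to env.php.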

I tested this on a fresh Magento 2.4.3 installation using Performance Fixtures (medium.xml), and here are the results:

Magento System Information

+------------------+----------------------------------+
| name             | value                            |
+------------------+----------------------------------+
| Name             | Magento                          |
| Version          | dev-2.4-develop                  |
| Edition          | Community                        |
| Root             | /workspace/magento2gitpod        |
| Application Mode | default                          |
| Session          | redis                            |
| Crypt Key        | 63fc9be49098a73ac467066d8c798b98 |
| Install Date     | Wed, 29 Sep 2021 18:12:03 +0000  |
| Cache Backend    | Cm_Cache_Backend_File            |
| Vendors          | Magento                          |
| Attribute Count  | 393                              |
| Customer Count   | 2000                             |
| Category Count   | 302                              |
| Product Count    | 40000                            |
+------------------+----------------------------------+

Without the option:

real    4m22.295s  
user    0m54.003s  
sys     0m6.056s  

With the option added to the env.php file:

real    2m38.801s  
user    0m59.741s  
sys     0m7.201s  
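To put the two "real" values into perspective, here is a small sketch that converts them to seconds and works out the speedup. The helper name to_seconds is made up for this example, and it only handles the XmY.YYYs format shown above:

```shell
#!/bin/sh
# Hypothetical helper: convert a `time`-style value such as 4m22.295s to seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f", $1 * 60 + $2 }'
}

before=$(to_seconds 4m22.295s)   # 262.295 seconds without the option
after=$(to_seconds 2m38.801s)    # 158.801 seconds with the option

awk -v b="$before" -v a="$after" 'BEGIN {
    printf "speedup: %.2fx, wall-clock time saved: %.1f%%\n", b / a, (b - a) / b * 100
}'
# prints: speedup: 1.65x, wall-clock time saved: 39.5%
```

Note that the CPU time (user + sys) went up slightly, from about 60 to about 67 seconds, so the gain appears to come from running work side by side rather than from doing less of it.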

The whole mechanism lives in the Magento Indexer module, specifically in the Model/ProcessManager.php file. From there I checked whether the number of threads could be raised.

Conclusion:
Today we have learned that Magento 2 indexing can be optimized in several ways. If multiple websites are used, indexing can be scoped by website. You can also set dimensions per customer group, which is especially useful for B2B stores. Finally, you can speed things up by running indexing in parallel, raising the number of threads to the number of CPU cores that the Linux nproc command reports for your server.
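For reference, the scoping mentioned in the conclusion is switched from the CLI with the indexer:set-dimensions-mode command described on the DevDocs page linked earlier. A minimal sketch, assuming you run it from the Magento root; the wrapper name set_price_dimensions is made up for this example and only guards against typos in the mode:

```shell
#!/bin/sh
# Hypothetical wrapper: validate the mode, then apply it to the price indexer.
set_price_dimensions() {
    case "$1" in
        none|website|customer_group|website_and_customer_group) ;;
        *) echo "unknown dimensions mode: $1" >&2; return 1 ;;
    esac
    # Assumption: the current directory is the Magento root.
    php bin/magento indexer:set-dimensions-mode catalog_product_price "$1"
}

# Example (commented out so nothing runs by accident):
# set_price_dimensions website_and_customer_group
```

You can check the currently active mode with php bin/magento indexer:show-dimensions-mode.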

Hope this will help someone struggling. Wishing you all the best until the next article. Salute o/