Project

General

Profile

Actions

Bug #88134

open

Lots a database queries to sys_refindex cause slow page saving

Added by Patrick Broens about 5 years ago. Updated 11 months ago.

Status:
New
Priority:
Must have
Assignee:
-
Category:
Performance
Target version:
-
Start date:
2019-04-10
Due date:
% Done:

0%

Estimated time:
TYPO3 Version:
9
PHP Version:
7.3
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

I'm maintaining a news website with intensive use of categories, which are assigned to pages. The amount of categories is not much, but some categories are assigned to a lot of pages. Some categories are assigned to over 10.000 pages.

When saving a page, the saving time can be up to 30 seconds, in which the editor has to wait until he/she can continue work.

This is due to the following:

The category references are stored in the table sys_refindex by the class ReferenceIndex. When adding or removing a reference from a page to a category, the DataHandler calls the ReferenceIndex to update the references for this certain category.

In the ReferenceIndex the following happens:
  • Fetches all current references with the right table name and uid and puts the hash in the array $currentRelationHashes.
  • Recalculates all references to the category with the uid and gives it a sorting integer according to the loading order the page has been found. This loading order defines the sorting integer for all these recalculated references. Since one is inserted somewhere, be it in the beginning, middle or end, all existing references after the new one get a loading order + 1. A hash is calculated with the loading order included.
  • A comparison is made between the hashes in $currentRelationHashes and the recalculated references

Since a lot of recalculated references have got another sorting integer, hashes don't match anymore. Non existing hashes from the recalculated references are inserted in the database, hashes in $currentRelationHashes but not in the recalculated ones are deleted.

In our case it sometimes happens a new page-category reference is added somewhere at the beginning and 10.000 relations after that get a new sorting integer, which will cause 10.000 references deleted and 10.001 (including new reference) are inserted in the database.

Related to this issue is that the deleted references are put in an ->in() statement with 10.000 hashes. At one point the amount of characters put in the ->in() statement will exceed the max_allowed_packet value, causing errors.

This happens both in v8 and v9


Files

typo3-core-refindex.diff (960 Bytes) typo3-core-refindex.diff T K, 2022-11-04 11:59

Related issues 2 (1 open1 closed)

Related to TYPO3 Core - Task #92356: Deferred reference index updatingClosed2020-09-21

Actions
Related to TYPO3 Core - Epic #93547: Collection of problems with large sitesAccepted2021-02-19

Actions
Actions #1

Updated by Patrick Broens about 5 years ago

  • Related to Bug #83160: Remove unused column sys_file_reference.sorting added
Actions #2

Updated by Patrick Broens about 5 years ago

  • Related to deleted (Bug #83160: Remove unused column sys_file_reference.sorting)
Actions #3

Updated by Christian Kuhn over 3 years ago

  • Related to Task #92356: Deferred reference index updating added
Actions #4

Updated by Riccardo De Contardi over 3 years ago

  • Related to Epic #93547: Collection of problems with large sites added
Actions #5

Updated by T K over 1 year ago

Attached is a patch that fixes the problem on TYPO3-v9.

Actions #6

Updated by Hannes Bochmann 11 months ago

I'm facing the same issue although sorting isn't exactly the problem. The proposed patch doesn't change anything except causing duplicated entry exceptions.

In my case the problem is simply the amount of data. I'm running into out of memory errors after a long time not even reaching a timeout. There are ~220000 reference index entries for a given category and around ~120000 related records (the difference is because there is not just one field which holds relations to categories). Saving a record with a relation to said category takes ages before running into a out of memory error.

I guess there is not much to tune like adding indexes to the database table or is there? Any ideas are appreciated as raising timeouts and memory limits just doesn't feel right.

Actions

Also available in: Atom PDF