Project

General

Profile

Actions

Bug #61814

closed

Caching framework garbage collection scheduler task taking ages with database backend and large websites

Added by Patrick Broens over 9 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Should have
Assignee:
-
Category:
Caching
Target version:
-
Start date:
2014-09-23
Due date:
% Done:

100%

Estimated time:
TYPO3 Version:
6.2
PHP Version:
Tags:
Complexity:
Is Regression:
No
Sprint Focus:

Description

When using the database backend in the caching framework, having a large website, will have the garbage collection scheduler task take forever. This is related to a few issues:

The scheduler task will call collectGarbage() in Typo3DatabaseBackend. In this method the first query will get the identifiers for records, f.i. cf_cache_hash, where the "expires" field is lower than the current timestamp. The expires field is indexed, but not in the right way. It is a multiple column index, cache_id (identifier,expires). MySQL cannot use the index to perform lookups if the columns do not form a leftmost prefix of the index. Since the SELECT statement only has the "expires" field in the WHERE, the index is not used. When having a large amount of records in CF tables, like cf_cache_hash, this will take a long time.

The query above will get the identifiers for all records found. In a huge installation this can be a large amount of identifiers. These are all put in an IN clause to delete tag rows connected to expired cache entries. In this case the InnoDB buffer pool can be too small or it's possible you do not have enough physical memory to hold your working set. This can be avoided by using a multiple DELETE by using joins.

The solution would be to optimize the method collectGarbage(), combining the queries into one multi table DELETE statement using joins and redefining the keys for the caching tables. Furthermore performance tests should be done with large amounts of records in the caching tables.

In my case the site is holding over 100.000 pages, having a default caching time of 2 hours. A side effect of the above is that the cache is filled faster than the garbage collection task can delete, getting an ever growing database.


Related issues 2 (0 open2 closed)

Related to TYPO3 Core - Bug #77204: Typo3DatabaseBackend flushByTag() and collectGarbage()Closed2016-07-21

Actions
Precedes TYPO3 Core - Bug #70928: Garbage collection of entries without tags failClosed2015-10-21

Actions
Actions #1

Updated by Patrick Broens over 9 years ago

  • Description updated (diff)
Actions #2

Updated by Gerrit Code Review over 9 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #3

Updated by Stefan Froemken over 9 years ago

Hello t3batman :-)

Thank you for that information, that was a pretty good hint. I have added a new index with new ordering. Maybe you're interested to test that patch.
Nice greetings from germany.

Frömmi (Stefan)

Actions #4

Updated by Gerrit Code Review over 9 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #5

Updated by Gerrit Code Review over 9 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #6

Updated by Stephan Großberndt over 9 years ago

Patrick, could you please test if the current patch works better for you?

Actions #7

Updated by Patrick Broens over 9 years ago

Hi Stefan and Stephan,

Are you sure the index does not need to be changed?

Actions #8

Updated by Patrick Broens over 9 years ago

Did my check with an EXPLAIN

id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
'1','SIMPLE','cf_cache_hash_tags','ALL','cache_id',NULL,NULL,NULL,'10',''
'1','SIMPLE','cf_cache_hash','ref','cache_id','cache_id','252','incubator.cf_cache_hash_tags.identifier','1','Using where'

Seems to be good. For cf_cache_hash the index is used.

Actions #9

Updated by Gerrit Code Review over 9 years ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #10

Updated by Gerrit Code Review over 9 years ago

Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #11

Updated by Gerrit Code Review over 9 years ago

Patch set 6 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #12

Updated by Gerrit Code Review over 9 years ago

Patch set 7 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #13

Updated by Gerrit Code Review over 9 years ago

Patch set 8 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #14

Updated by Gerrit Code Review over 9 years ago

Patch set 9 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/33254

Actions #15

Updated by Gerrit Code Review over 9 years ago

Patch set 1 for branch TYPO3_6-2 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/36105

Actions #16

Updated by Stephan Großberndt over 9 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100
Actions #17

Updated by Benni Mack over 5 years ago

  • Status changed from Resolved to Closed
Actions

Also available in: Atom PDF