Task #55541
Closed
Avoid redundant data in page cache
0%
Description
When a page is cached, no logic is applied to what gets stored. Pages are cached "stupidly", without taking the data they contain into account.
This results in the following issues:
- When a page (HTML) is the same for logged-in and not-logged-in users, the whole HTML code is still stored twice.
- When two or more pages share the same TypoScript template, it is stored again and again with each page cache entry.
As a result, the page cache grows linearly with each cached page, although large parts of the data (TypoScript) are redundant. If FE login is possible the situation is even worse: normal content pages without access restrictions are cached twice, doubling the amount of cached data.
A large page cache hurts performance, because MySQL (and the filesystem) cannot keep as many page cache entries in memory.
Solution:
Create a caching frontend and backend specifically for caching pages. This frontend and backend can contain the logic for handling those redundancies.
A proposed solution is attached and has been pushed to the review server. It creates a dedicated frontend/backend cache. When a page is cached, the HTML and the serialized TypoScript are each stored separately, using a hash of their data as the identifier. The original page cache entry then contains only a surrogate (class PageCacheEntry) in place of each large data value.
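To illustrate the idea, here is a minimal, language-neutral sketch in Python of hash-addressed storage. It is not the attached patch (which targets the TYPO3 caching framework): only the name PageCacheEntry comes from the description above, while DedupPageCache, the field names, the use of SHA-1, and JSON as the serialization format are assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PageCacheEntry:
    """Surrogate stored instead of the large values (field names are hypothetical)."""
    html_hash: str
    typoscript_hash: str


class DedupPageCache:
    """Toy in-memory page cache that stores each distinct blob only once."""

    def __init__(self):
        self.blobs = {}    # content hash -> raw bytes
        self.entries = {}  # page cache identifier -> PageCacheEntry

    def _store_blob(self, data: bytes) -> str:
        # The hash of the data is the identifier, so identical data
        # always maps to the same key and is stored a single time.
        digest = hashlib.sha1(data).hexdigest()
        self.blobs.setdefault(digest, data)
        return digest

    def set(self, identifier: str, html: str, typoscript: dict) -> None:
        # sort_keys makes the serialization deterministic for equal templates.
        serialized = json.dumps(typoscript, sort_keys=True).encode("utf-8")
        self.entries[identifier] = PageCacheEntry(
            html_hash=self._store_blob(html.encode("utf-8")),
            typoscript_hash=self._store_blob(serialized),
        )

    def get(self, identifier: str):
        entry = self.entries.get(identifier)
        if entry is None:
            return None
        html = self.blobs[entry.html_hash].decode("utf-8")
        typoscript = json.loads(self.blobs[entry.typoscript_hash])
        return html, typoscript


cache = DedupPageCache()
template = {"config": {"doctype": "html5"}}
cache.set("page1-anonymous", "<html>Home</html>", template)
cache.set("page1-logged-in", "<html>Home</html>", template)
cache.set("page2-anonymous", "<html>About</html>", template)
```

In this sketch three page cache entries share three blobs (two HTML documents and one template) instead of six. Removing blobs that are no longer referenced by any entry is deliberately left out; a real backend would need some form of reference tracking or garbage collection.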
The attached PDF shows the difference in storage usage for a 40-page site. Page requests 0-40 are made without login. Requests 40-80 (~) are made with a logged-in user.
Files