gr_list concept needs to be improved
TYPO3 uses the "gr_list", which lists the groups a frontend user is a member of.
This information is used to identify cache records, indexed records, and probably more.
If user A is a member of group 1 while user B is a member of groups 1 AND 2, neither can use the cached page that was built for the other's gr_list, even if the content is exactly the same.
The resulting problems are:
- page needs to be built multiple times (which is not needed in many situations)
- users cannot find indexed content before someone else having the same gr_list has visited (=> indexed) the page
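To make the problem concrete, here is a minimal Python sketch. It is not TYPO3 code: the function name `cache_key` and the hashing scheme are assumptions for illustration only. It shows a cache identifier that includes the visitor's full gr_list, so two users who would see identical content still get separate entries.

```python
import hashlib


def cache_key(page_id: int, gr_list: str) -> str:
    """Hypothetical cache identifier: page id plus the visitor's full gr_list."""
    return hashlib.md5(f"{page_id}:{gr_list}".encode()).hexdigest()


# User A is a member of group 1, user B of groups 1 and 2.
key_a = cache_key(42, "0,-2,1")
key_b = cache_key(42, "0,-2,1,2")

# The keys differ, so the page is built (and indexed) once per gr_list,
# even if nothing on the page depends on group 2.
print(key_a != key_b)
```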
(issue imported from #M6403)
Updated by Michael Stucki over 13 years ago
Idea #1: Use "loginMode" flag to disable login information for a whole page.
Drawbacks: No personalized content is possible at all!
- I've made a patch already which can restore the login information for INT objects which are never cached anyways. However, this requires loading the full template (which is luckily stored in the cache, so it should not need to be re-parsed again).
Updated by Michael Stucki over 13 years ago
Idea #2: Introduce a new flag which indicates whether a page needs personalization, authorization (on page level), or neither.
The flag has a default value and can be changed by extensions (similar to $TSFE->set_no_cache()).
If neither is needed, then the gr_list of the page in the cache is set to "0,-3" (todo: maybe "0" is enough?) which means: No page access check is needed.
If only authorization is needed, then the gr_list of the page in the cache is set to "0,-4" (or "0,-3", see above) which means "lookup page access again".
If personalization is needed, the old behaviour will be used ("0,-1" applies to users who are not logged in, "0,-2,<groups>" applies to any logged-in users matching all specified groups).
- not sure how well it will work
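A rough Python sketch of how such a flag could select the gr_list stored with the cache entry. The enum `PageMode` and the function `cache_gr_list` are hypothetical names, and the values "0,-3" and "0,-4" are the tentative ones from the idea above, not an existing TYPO3 convention.

```python
from enum import Enum


class PageMode(Enum):
    NONE = "none"                        # neither personalization nor authorization
    AUTHORIZATION = "authorization"      # page-level access check only
    PERSONALIZATION = "personalization"  # content depends on the user's groups


def cache_gr_list(mode: PageMode, user_gr_list: str) -> str:
    """Return the gr_list to store with the cached page (hypothetical helper)."""
    if mode is PageMode.NONE:
        return "0,-3"  # no page access check is needed
    if mode is PageMode.AUTHORIZATION:
        return "0,-4"  # look up page access again on every hit
    return user_gr_list  # old behaviour: one entry per gr_list


# Two users with different groups share one entry unless personalization is on.
print(cache_gr_list(PageMode.NONE, "0,-2,1"), cache_gr_list(PageMode.NONE, "0,-2,1,2"))
```

Only pages flagged for personalization would still produce one cache entry per gr_list.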
Updated by Steffen Kamper over 13 years ago
This really is a problem: for proper user access, for example, things have to be done with USER_INT to keep this information out of the cache. I don't have a real idea at the moment how to solve it.
It would help if $TSFE->fe_user were never cached, but this is a complex thing. Maybe it's better to do only this instead of a whole-page method.
There have been so many discussions about this that it's difficult to extract the essence. I hope that a solution can be found.
Updated by Ernesto Baschny over 13 years ago
We struggle with the same problems on several customer sites. On most we have chosen "custom" search solutions over indexed_search exactly because of this limitation.
Some ideas also occurred in our team to improve that (specifically for indexed_search, but maybe they can also be used generally). Basically it would mean that the admin has to tell the indexer/cacher more about the page tree structure and which type of fe_group usage it has. For example:
- Be able as an admin to specify which fe_groups are relevant for a certain page tree. So when saying for a certain page tree "only consider group -1 (anonymous) and 2 (fullaccess)", the cacher/indexer wouldn't care if the user is also in group 1, 3, 4 or whatever. We will always have a maximum of 2 cached entries and 2 indexed-search variants for any phash on that tree.
- Maybe it can help if we could say (in TypoScript for a specific page tree again): We don't use "user-specific content elements" in this page tree. So please just cache and index pages as a whole if the user has access to the page at all.
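The first idea could look roughly like the following Python sketch. The function `effective_gr_list` and its parameters are hypothetical. The fallback to the anonymous group for users who hold none of the relevant groups is an assumption made here so that the "maximum 2 entries" claim holds; the comment above does not spell that case out.

```python
def effective_gr_list(user_groups, relevant_groups, fallback=-1):
    """Reduce a user's groups to those declared relevant for a page tree.

    `fallback` (assumption): group used when the user has none of the
    relevant groups, so such users share the anonymous variant.
    """
    kept = sorted(set(user_groups) & set(relevant_groups))
    if not kept:
        kept = [fallback]
    return ",".join(str(group) for group in kept)


relevant = {-1, 2}  # anonymous and "fullaccess", as in the example above

print(effective_gr_list([0, -1], relevant))           # anonymous visitor
print(effective_gr_list([0, -2, 1, 2, 3], relevant))  # fullaccess user, other groups ignored
print(effective_gr_list([0, -2, 1, 3, 4], relevant))  # logged in, but no relevant group
```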
What we also usually require is a "fulltext search" that is able to scan the complete fulltext database without regarding gr_list at all. If the admin configures it that way, the search form in "indexed search" will always find all content on the site (and never show "different" results for different user groups). Following the link will then either not show the content, redirect to a login page, or whatever. Some more complex configuration could "fine-tune" that behaviour (like a list of fe_group uids to always include in the fulltext search results).
Also, regarding the gr_list problem, there must be an easy way to "crawl" the whole site with different permission settings. Is there? Using "crawler" I couldn't find an "easy" way to do it.
Well, so much for the "brainstorming"... :) Maybe some of the ideas here are not implementable at all, maybe they are.
Updated by Michael Stucki over 7 years ago
- Status changed from Closed to New
The problem still exists, and I would really like to get back to it one day. It would also be a nice GSOC project, for example.
Therefore, please leave it open for the time being. It would still be great to have this one day...
Updated by Alexander Opitz about 7 years ago
My idea to solve this problem:
- We save which user groups a view is used for, instead of which user groups the user who viewed it had.
- This may collide with plugins which depend on the existing cache handling.
Examples of how it should work, to see if this is manageable:
On write:
Page Access + User => Access List
No FE Groups => 0
-1 + 0, -1 => -1
-2 + 0, -2 => -2
2 + 0, -2, 2 => 2
2, 3 + 0, -2, 2, 3 => 2, 3

On read:
Access List + User => read/no access
0 + No FE Groups => read
-1 + 0, -1 => read
-1 + 0, -2 => no access
-2 + 0, -2 => read
-2 + 0, -2, 2, 3 => read
2 + 0, -2, 2 => read
2, 3 + 0, -2, 2 => no access
2, 3 + 0, -2, 2, 3 => read
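Read as a rule, the "on read" rows that give the user's groups as an explicit list behave like a subset test: the entry can be read if every group in its access list is also in the user's gr_list. A minimal Python sketch under that interpretation follows. The function `can_read` is a hypothetical name, and the "No FE Groups" row is not modelled.

```python
def can_read(access_list, user_gr_list):
    """True if every group stored with the cache entry is held by the user."""
    return set(access_list) <= set(user_gr_list)


# Rows taken from the read examples above: (access list, user gr_list)
rows = [
    ([-1], [0, -1]),
    ([-1], [0, -2]),
    ([-2], [0, -2]),
    ([-2], [0, -2, 2, 3]),
    ([2], [0, -2, 2]),
    ([2, 3], [0, -2, 2]),
    ([2, 3], [0, -2, 2, 3]),
]
for access, user in rows:
    print(access, user, "read" if can_read(access, user) else "no access")
```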
A page can now have different content elements with different user groups. Situation: one page has four content elements:
Element A: Show for logged out users (-1)
Element B: Show for logged in users (-2)
Element C: Show for users with fe_group 2
Element D: Show for users with fe_group 3
User => positive Access List + negative Access List
0, -1 => -1 + (empty)
0, -2 => -2 + 2, 3
0, -2, 2 => 2 + 3
0, -2, 2, 3 => 2, 3 + (empty)
0, -2, 3 => 3 + 2
We need the negative list because for user "0, -2, 2, 3" we would otherwise not know whether the cache entry with the positive Access List "2" is all he can get, or whether we need to generate a new page.
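The role of the negative list can be sketched in Python as a lookup: a cache entry fits a user only if the user has all groups of the positive list and none of the negative list. `CacheEntry` and `find_entry` are hypothetical names. The lists are the ones from the four-element example above, and the `content` labels (which elements are visible) are derived from the element definitions for illustration.

```python
from typing import NamedTuple, Optional


class CacheEntry(NamedTuple):
    positive: frozenset  # groups the user must have
    negative: frozenset  # groups the user must NOT have
    content: str         # elements contained in this cached variant


def find_entry(entries, user_groups) -> Optional[CacheEntry]:
    """Return the cached variant that fits the user, or None (page must be built)."""
    user = set(user_groups)
    for entry in entries:
        if entry.positive <= user and not (entry.negative & user):
            return entry
    return None


entries = [
    CacheEntry(frozenset({-1}), frozenset(), "A"),
    CacheEntry(frozenset({-2}), frozenset({2, 3}), "B"),
    CacheEntry(frozenset({2}), frozenset({3}), "B,C"),
    CacheEntry(frozenset({2, 3}), frozenset(), "B,C,D"),
    CacheEntry(frozenset({3}), frozenset({2}), "B,D"),
]

# With only the "2 + 3" variant cached, user 0,-2,2,3 is rejected by the
# negative list, so a new page has to be generated for him.
print(find_entry([entries[2]], [0, -2, 2, 3]))
print(find_entry(entries, [0, -2, 2, 3]).content)
```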
A page with content elements which have multiple user groups:
Element A: Show for users with fe_group 2 or 3
Element B: Show for users with fe_group 3 or 4
User => positive Access List + negative Access List
0, -1 => 0 + 2, 3, 4
0, -2 => 0 + 2, 3, 4
0, -2, 2 => 2 + 3, 4
0, -2, 2, 3 => 3 + (empty)
0, -2, 3 => 3 + (empty)
0, -2, 4 => 4 + 2, 3
0, -2, 2, 4 => 3 + (empty)
=> So we get a maximum of 4 pages in Cache.
Updated by Benni Mack 3 months ago
OK. This is my proposal:
- We should cache on "CE level" or "per entity" level. Because if a CE does not have any restrictions (fe_group), it should be "ALWAYS CACHED". This should be cached away, and fetched.
- When "0,-1" is called, it should be possible to fetch everything statically already and just put the pieces in.
- If a page itself has pages.fe_group set, we need to differentiate between "Is allowed" or "is not allowed".
Basically we work like a "firewall" for the cache. Yes, cached under these circumstances. CEs / Pages should be based on the content, not on the visitor!