Task #89058: PageRouter::matchRequest could use a cache - TYPO3 Core - TYPO3 Forge

PageRouter::matchRequest could use a cache

Added by Andreas Kienast almost 5 years ago. Updated 12 months ago.

Status: Needs Feedback

Priority: Should have

Category: Link Handling, Site Handling & Routing

Start date: 2019-09-02


Description

Currently, PageRouter::matchRequest collects all slug candidates on every request and picks the best one. Depending on the number of records in the database, this may lead to a lot of queries.

For a route that has already been resolved, the result could be cached to speed up subsequent lookups. This needs a proper concept for cache invalidation.
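
To make the idea concrete, here is a minimal cache-aside sketch in plain PHP. It is not TYPO3 code: the class `CachedRouteResolver`, its methods and the shape of the resolved result are hypothetical, and the in-memory array only stands in for a persistent cache. It assumes a resolved route depends only on the site identifier, the language and the request path.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical cache-aside wrapper around an expensive route resolver.
 * Not part of TYPO3; all names are illustrative.
 */
final class CachedRouteResolver
{
    /** @var array<string, array|null> resolved results keyed by cache identifier */
    private array $entries = [];

    /** @var callable expensive lookup: (string $site, int $languageId, string $path): ?array */
    private $resolver;

    public function __construct(callable $resolver)
    {
        $this->resolver = $resolver;
    }

    public function resolve(string $siteIdentifier, int $languageId, string $path): ?array
    {
        // Assumption: these three values are everything the result depends on.
        $key = sha1($siteIdentifier . '|' . $languageId . '|' . $path);
        // array_key_exists() so that "no page found" (null) is cached as well.
        if (!array_key_exists($key, $this->entries)) {
            $this->entries[$key] = ($this->resolver)($siteIdentifier, $languageId, $path);
        }
        return $this->entries[$key];
    }

    /** Drops everything; a real concept would invalidate selectively. */
    public function flush(): void
    {
        $this->entries = [];
    }
}

// Usage: count how often the expensive lookup actually runs.
$lookups = 0;
$resolver = new CachedRouteResolver(
    function (string $site, int $languageId, string $path) use (&$lookups): ?array {
        $lookups++;
        return $path === '/about-us' ? ['uid' => 42, 'slug' => $path] : null;
    }
);

$first = $resolver->resolve('main', 0, '/about-us');
$second = $resolver->resolve('main', 0, '/about-us');
```

The second call is served from the array, so the expensive lookup runs once. The open question from the description remains: which entries must be dropped when a page, its slug or the site configuration changes.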

Files

PageRouter.patch (1.71 KB), Carlos Meyer, 2019-11-12 16:24
typo3-core-speedup-slug-resolution-with-many-sites.patch (1.55 KB), Xavier Perseguers, 2023-04-13 08:24
typo3-core-89058-speed-up-slug-resolution-with-many-sites.patch (2.45 KB), Xavier Perseguers, 2023-04-13 09:44

Related issues 1 (1 open — 0 closed)

Updated by Benni Mack almost 5 years ago

Hey Andy,

where exactly are the multiple DB queries located?

AFAIK getPagesFromDatabaseForCandidates() runs only one query per language. If workspaces are enabled, there are more, of course.

Updated by Christian Eßl almost 5 years ago

Category set to Link Handling, Site Handling & Routing

Updated by Benni Mack over 4 years ago

Status changed from New to Needs Feedback

Updated by Carlos Meyer over 4 years ago

We currently have the same performance problem in an installation with 1600 domains. In our case we fixed it by extending the query in getPagesFromDatabaseForCandidates() so that only pages below the current site's root page are returned. But I'm not sure whether this could cause problems with mount points.

Changes in PageRouter.php, line 334 and below:

    /**
     * Get the uids of the page tree starting at the given page uid.
     *
     * @param int $pageUID
     * @return int[]
     */
    private function getRecursivePageUIDs(int $pageUID): array
    {
        $depth = 999999;
        $queryGenerator = GeneralUtility::makeInstance(\TYPO3\CMS\Core\Database\QueryGenerator::class);
        // getTreeList() returns a comma-separated string of page uids
        $treeList = $queryGenerator->getTreeList($pageUID, $depth, 0, 1);
        // Convert the list into an array of integers, dropping empty values
        return GeneralUtility::intExplode(',', (string)$treeList, true);
    }

    /**
     * Check for records in the database which match one of the slug candidates.
     *
     * @param array $slugCandidates
     * @param int $languageId
     * @return array
     */
    protected function getPagesFromDatabaseForCandidates(array $slugCandidates, int $languageId): array
    {
        $context = GeneralUtility::makeInstance(Context::class);
        $searchLiveRecordsOnly = $context->getPropertyFromAspect('workspace', 'isLive');
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('pages');
        $queryBuilder
            ->getRestrictions()
            ->removeAll()
            ->add(GeneralUtility::makeInstance(DeletedRestriction::class))
            ->add(GeneralUtility::makeInstance(FrontendWorkspaceRestriction::class, null, null, $searchLiveRecordsOnly));
        // Restrict the query to pages below the current site's root page
        $uids = $this->getRecursivePageUIDs($this->site->getRootPageId());
        $statement = $queryBuilder
            ->select('uid', 'l10n_parent', 'pid', 'slug')
            ->from('pages')
            ->where(
                // $uids contains integers only, so it is passed without a named parameter
                $queryBuilder->expr()->in('uid', $uids),
                $queryBuilder->expr()->eq(
                    'sys_language_uid',
                    $queryBuilder->createNamedParameter($languageId, \PDO::PARAM_INT)
                ),
                $queryBuilder->expr()->in(
                    'slug',
                    $queryBuilder->createNamedParameter(
                        $slugCandidates,
                        Connection::PARAM_STR_ARRAY
                    )
                )
            )
            // Exact match will be first, that's important
            ->orderBy('slug', 'desc')
            ->execute();

Updated by Carlos Meyer over 4 years ago

File PageRouter.patch added

Updated by Carlos Meyer over 4 years ago

Please use this patch source: https://raw.githubusercontent.com/protos1575/PageRouter.patch/master/PageRouter.patch

Updated by Susanne Moog over 4 years ago

Sprint Focus set to On Location Sprint

Updated by Yohann CERDAN over 4 years ago

Same problem here: we have a large website with 100+ domains and some mount points.
The method getPagesFromDatabaseForCandidates() iterates over a lot of unnecessary pages (/home/about-us/offices/, /home/about-us/, /home/, / for each site) and calls a lot of methods for each of them.
For example, in my case it iterates over 500+ pages to find the slug that appears in the first row of the query result. This results in a big performance issue.

So even if we find a page that is on the same site, we still iterate over all the other pages.
If we have the right site ID and the exact same slug, maybe we could stop the iteration there?
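
The early exit suggested above could look roughly like the following plain PHP sketch. It is only an illustration: the function `pickPage()` and the `site` key on each row are hypothetical (the core determines the site of a page differently), the rows are assumed to be ordered by slug descending as in the query, and mount points are deliberately ignored.

```php
<?php
declare(strict_types=1);

/**
 * Pick the page for a request path, stopping at the first exact match
 * that belongs to the current site.
 *
 * @param array $rows rows with 'uid', 'slug' and a hypothetical 'site' key, ordered by slug DESC
 * @return array the chosen row (or null) and the number of rows inspected
 */
function pickPage(array $rows, string $currentSite, string $requestedSlug): array
{
    $fallback = null;
    $inspected = 0;
    foreach ($rows as $row) {
        $inspected++;
        if ($row['site'] !== $currentSite) {
            continue; // page of another site that shares the same slug
        }
        if ($row['slug'] === $requestedSlug) {
            return [$row, $inspected]; // exact match on the right site: stop here
        }
        // Longest slug comes first, so the first partial match is kept as fallback.
        $fallback ??= $row;
    }
    return [$fallback, $inspected];
}

// Usage: three sites share the slug /about-us.
$rows = [
    ['uid' => 11, 'slug' => '/about-us', 'site' => 'site-a'],
    ['uid' => 21, 'slug' => '/about-us', 'site' => 'site-b'],
    ['uid' => 31, 'slug' => '/about-us', 'site' => 'site-c'],
    ['uid' => 10, 'slug' => '/', 'site' => 'site-a'],
    ['uid' => 20, 'slug' => '/', 'site' => 'site-b'],
];
[$page, $inspected] = pickPage($rows, 'site-b', '/about-us');
```

Here the loop stops at the second row instead of walking all five. Whether such a shortcut is safe once mount points are involved is exactly what would need checking.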

Updated by Xavier Perseguers over 1 year ago

File typo3-core-speedup-slug-resolution-with-many-sites.patch added

I investigated this problem this morning and implemented a simple cache which tremendously speeds up all my websites. The difference is night and day!

From my Slack investigations (https://typo3.slack.com/archives/C025BQLFA/p1681370066698969):

- Around 300 domains in the install
- Mount points are used for shared pages (such as the authentication page, which is the same for every site)
- Resolving the candidates for /share/authentication for a given domain takes around 3.5 sec (locally, needs to loop over 600+ rows)
- Resolving the root page such as /fr for a given domain takes around 1.6 sec.
- With the simple cache I added, it only takes a few ms (!!!)

Please see the attached patch.

I'm using a cache from a custom extension that automatically flushes entries based on the modified table name + uid by hooking into DataHandler. That's why I think I can safely use a 90-day cache lifetime. I generate tags of the form pages%UID
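
The tagging approach described here can be sketched in plain PHP as follows. This is not the attached patch and not a TYPO3 API: the class `TaggedCache`, the function `onRecordChanged()` and the concrete tag format `pages_42` (table name, underscore, uid) are assumptions made for illustration, and the current time is passed in explicitly to keep the example deterministic.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-memory cache with tags and lifetimes; not a TYPO3 API. */
final class TaggedCache
{
    /** @var array<string, array> entries with 'value', 'expires' and 'tags' */
    private array $entries = [];

    public function set(string $id, $value, array $tags, int $lifetime, int $now): void
    {
        $this->entries[$id] = ['value' => $value, 'expires' => $now + $lifetime, 'tags' => $tags];
    }

    /** Returns the cached value, or null if it is missing or expired. */
    public function get(string $id, int $now)
    {
        $entry = $this->entries[$id] ?? null;
        if ($entry === null || $entry['expires'] <= $now) {
            unset($this->entries[$id]);
            return null;
        }
        return $entry['value'];
    }

    public function flushByTag(string $tag): void
    {
        foreach ($this->entries as $id => $entry) {
            if (in_array($tag, $entry['tags'], true)) {
                unset($this->entries[$id]);
            }
        }
    }
}

/** Stand-in for a hook that runs after a record was modified. */
function onRecordChanged(TaggedCache $cache, string $table, int $uid): void
{
    // Assumed tag format: table name, underscore, uid.
    $cache->flushByTag($table . '_' . $uid);
}

// Usage: cache a resolved route for 90 days, then modify the page.
$cache = new TaggedCache();
$now = 1700000000;
$ninetyDays = 90 * 24 * 60 * 60;
$id = 'site-a|0|/share/authentication';

$cache->set($id, ['uid' => 42], ['pages_42'], $ninetyDays, $now);
$hit = $cache->get($id, $now + 60);

onRecordChanged($cache, 'pages', 42);
$afterFlush = $cache->get($id, $now + 60);
```

A long lifetime is only safe because every change to the tagged record removes the entry. Without such a hook, a renamed slug would keep resolving to the old page until the entry expires.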

#10

Updated by Xavier Perseguers over 1 year ago

After adapting the original patch from @Carlos Meyer as provided in #89058-6 to work with current TYPO3 v10, it turns out that restricting the query to the subpages of a given site does not work at all when you work with mount points.

Too bad, I liked the idea, even though the very first call then took much longer to complete since it had to compute the full page tree of my domain once.

So I'll go back to my caching solution for the time being.

#11

Updated by Xavier Perseguers over 1 year ago

File typo3-core-89058-speed-up-slug-resolution-with-many-sites.patch added

Please find enclosed (typo3-core-89058-speed-up-slug-resolution-with-many-sites.patch) a generic solution that doesn't rely on my own caching mechanism.

#12

Updated by Christian Eßl over 1 year ago

Related to Epic #95690: Performance issues when hosting a large amount of websites, and optimizations propositions added

#13

Updated by Benni Mack 12 months ago

Sprint Focus deleted (~~On Location Sprint~~)
