Task #89058

open

PageRouter::matchRequest could use a cache

Added by Andreas Kienast over 4 years ago. Updated 9 months ago.

Status: Needs Feedback
Priority: Should have
Assignee: -
Category: Link Handling, Site Handling & Routing
Target version: -
Start date: 2019-09-02
% Done: 0%
TYPO3 Version: 10

Description

Currently, on every request, PageRouter::matchRequest collects all slug candidates and chooses the best one. Depending on the number of records in the database, this may lead to a lot of queries.

For a route that has been resolved previously, the result could be cached so that it resolves faster next time. This needs a proper concept for cache invalidation.
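
To make the idea concrete, here is a minimal sketch in plain PHP. It is not core code: the class name `SlugResolutionCache`, the cache key layout (site identifier, language ID, path) and the resolver callback are assumptions chosen for illustration. Invalidation is deliberately left out, since that is the open question of this issue.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory cache in front of an expensive slug resolver.
 * Not part of TYPO3; the names and the key layout are assumptions.
 */
final class SlugResolutionCache
{
    /** @var array<string, int|null> resolved page uid (or null) per cache key */
    private array $entries = [];

    /** @var callable(string, int, string): ?int the expensive lookup */
    private $resolver;

    public function __construct(callable $resolver)
    {
        $this->resolver = $resolver;
    }

    public function resolve(string $siteIdentifier, int $languageId, string $path): ?int
    {
        $key = sha1($siteIdentifier . '|' . $languageId . '|' . $path);
        // array_key_exists() so that "no page found" (null) is cached as well
        if (!array_key_exists($key, $this->entries)) {
            $this->entries[$key] = ($this->resolver)($siteIdentifier, $languageId, $path);
        }
        return $this->entries[$key];
    }
}
```

The resolver runs only for the first request of a given site, language and path; later requests are served from the array. A real implementation would need a persistent cache backend and a way to drop entries when a page or its slug changes.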


Related issues 1 (1 open, 0 closed)

Related to TYPO3 Core - Epic #95690: Performance issues when hosting a large amount of websites, and optimizations propositions (Accepted, 2021-10-18)

Actions #1

Updated by Benni Mack over 4 years ago

Hey Andy,

where exactly are the multiple DB queries located?

AFAIK getPagesFromDatabaseForCandidates() runs only one query per language. If workspaces are enabled, there are more, of course.

Actions #2

Updated by Christian Eßl over 4 years ago

  • Category set to Link Handling, Site Handling & Routing
Actions #3

Updated by Benni Mack over 4 years ago

  • Status changed from New to Needs Feedback
Actions #4

Updated by Carlos Meyer over 4 years ago

We currently have the same performance problem in an installation with 1600 domains. In our case, we fixed it by extending the query in getPagesFromDatabaseForCandidates() so that only pages of the current rootline are returned. I'm not sure whether this could cause problems with mountpoints, though.

Changes in PageRouter.php, line 334 and below:

    /**
     * Get the uids of all pages in the tree below the given page uid.
     *
     * @param int $pageUID
     * @return int[]
     */
    private function getRecursivePageUIDs(int $pageUID): array
    {
        $depth = 999999;
        $queryGenerator = GeneralUtility::makeInstance(\TYPO3\CMS\Core\Database\QueryGenerator::class);
        // getTreeList() returns a comma-separated string of uids
        $treeList = $queryGenerator->getTreeList($pageUID, $depth, 0, 1);
        // Convert the string into an array of integers, dropping empty values
        return GeneralUtility::intExplode(',', (string)$treeList, true);
    }

    /**
     * Check for records in the database which match one of the slug candidates.
     *
     * @param array $slugCandidates
     * @param int $languageId
     * @return array
     */
    protected function getPagesFromDatabaseForCandidates(array $slugCandidates, int $languageId): array
    {
        $context = GeneralUtility::makeInstance(Context::class);
        $searchLiveRecordsOnly = $context->getPropertyFromAspect('workspace', 'isLive');
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('pages');
        $queryBuilder
            ->getRestrictions()
            ->removeAll()
            ->add(GeneralUtility::makeInstance(DeletedRestriction::class))
            ->add(GeneralUtility::makeInstance(FrontendWorkspaceRestriction::class, null, null, $searchLiveRecordsOnly));
        // Restrict the query to pages in the page tree of the current site
        $uids = $this->getRecursivePageUIDs($this->site->getRootPageId());
        $statement = $queryBuilder
            ->select('uid', 'l10n_parent', 'pid', 'slug')
            ->from('pages')
            ->where(
                // $uids contains integers only, so it is safe to pass it unquoted
                $queryBuilder->expr()->in('uid', $uids),
                $queryBuilder->expr()->eq(
                    'sys_language_uid',
                    $queryBuilder->createNamedParameter($languageId, \PDO::PARAM_INT)
                ),
                $queryBuilder->expr()->in(
                    'slug',
                    $queryBuilder->createNamedParameter(
                        $slugCandidates,
                        Connection::PARAM_STR_ARRAY
                    )
                )
            )
            // Exact match will be first, that's important
            ->orderBy('slug', 'desc')
            ->execute();
Actions #7

Updated by Susanne Moog over 4 years ago

  • Sprint Focus set to On Location Sprint
Actions #8

Updated by Yohann CERDAN about 4 years ago

Same problem here: we have a large website with 100+ domains and some mountpoints.
The method getPagesFromDatabaseForCandidates() iterates over a lot of unnecessary pages (/home/about-us/offices/, /home/about-us/, /home/ and / for each site) and calls a lot of methods for each of them.
For example, in my case it iterates over 500+ pages to find a slug that appears in the first row of the query result. This causes a big performance issue.

So even when we find a page that is on the same site, we still iterate over all the other pages.
If we have the right site ID and exactly the same slug, maybe we could stop the iteration there?
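
A rough sketch of such an early exit, in plain PHP and independent of the core: the function name `findExactMatchOnSite` and the `$belongsToSite` callback are hypothetical, and the sketch ignores mount points and route enhancers that consume part of the path, so it shows only the control flow.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the first candidate row whose slug equals the
 * requested path and whose page belongs to the current site.
 *
 * @param array<int, array{uid: int, slug: string}> $rows candidate rows from the query
 * @param callable(int): bool $belongsToSite assumed check "is this page part of the current site?"
 * @return array{uid: int, slug: string}|null
 */
function findExactMatchOnSite(array $rows, string $requestedSlug, callable $belongsToSite): ?array
{
    foreach ($rows as $row) {
        if ($row['slug'] !== $requestedSlug) {
            continue; // cheap string comparison first
        }
        if ($belongsToSite($row['uid'])) {
            return $row; // stop here, the remaining rows are never inspected
        }
    }
    return null; // the caller falls back to the full candidate evaluation
}
```

The site check, which is the expensive part, runs only for rows with an exactly matching slug, and the loop ends at the first hit.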

Actions #9

Updated by Xavier Perseguers about 1 year ago

I investigated this problem this morning and implemented a simple cache, which speeds up all my websites tremendously. The difference is night and day!

From my Slack investigations (https://typo3.slack.com/archives/C025BQLFA/p1681370066698969):

- Around 300 domains in the install
- Using mount point for shared pages (such as the authentication page which is the same for every site)
- Resolving the candidates for /share/authentication for a given domain takes around 3.5 sec (locally, needs to loop over 600+ rows)
- Resolving the root page such as /fr for a given domain takes around 1.6 sec.
- With my simple cache, the same resolution takes only a few ms (!!!)

Please see the attached patch.

I'm using a cache from a custom extension that hooks into DataHandler and automatically flushes entries based on the modified table name + uid. That's why I think I can safely use a cache lifetime of 90 days. I generate tags of the form pages%UID.
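
The tagging scheme described above can be sketched in plain PHP as follows. This is not the attached patch and does not use the TYPO3 caching framework: the class `TaggedRouteCache`, its method names and the `_` separator in the tag are assumptions for illustration. Only the 90-day lifetime and the table name + uid tagging come from the comment.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical tagged cache, loosely modelled on the comment above.
 * Plain PHP only; all names are assumptions.
 */
final class TaggedRouteCache
{
    public const LIFETIME = 90 * 86400; // 90 days in seconds

    /** @var array<string, array{value: int, expires: int, tags: string[]}> */
    private array $entries = [];

    public function set(string $key, int $pageUid, array $tags, int $now): void
    {
        $this->entries[$key] = ['value' => $pageUid, 'expires' => $now + self::LIFETIME, 'tags' => $tags];
    }

    public function get(string $key, int $now): ?int
    {
        $entry = $this->entries[$key] ?? null;
        if ($entry === null || $entry['expires'] <= $now) {
            return null; // unknown or expired
        }
        return $entry['value'];
    }

    /** Drops every entry carrying the tag built from table name and uid. */
    public function flushByRecord(string $table, int $uid): void
    {
        $tag = self::tagFor($table, $uid);
        $this->entries = array_filter(
            $this->entries,
            static fn (array $entry): bool => !in_array($tag, $entry['tags'], true)
        );
    }

    public static function tagFor(string $table, int $uid): string
    {
        return $table . '_' . $uid; // the separator is an assumption
    }
}
```

In this sketch, a hook that fires whenever a record is modified would call `flushByRecord('pages', $uid)`, so a long lifetime is acceptable because stale entries are removed as soon as the page changes.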

Actions #10

Updated by Xavier Perseguers about 1 year ago

After adapting the original patch from @Carlos Meyer (provided in #89058-6) to work with current TYPO3 v10, it turns out that restricting the query to the subpages of a given site does not work at all when mount points are in use.

Too bad, I liked the idea, even though the very first call then took much longer to complete because it had to compute the full page tree of my domain once.

So I'll go back to my caching solution for the time being.

Actions #11

Updated by Xavier Perseguers about 1 year ago

Please find enclosed (typo3-core-89058-speed-up-slug-resolution-with-many-sites.patch) a generic solution that doesn't rely on my own caching mechanism.

Actions #12

Updated by Christian Eßl about 1 year ago

  • Related to Epic #95690: Performance issues when hosting a large amount of websites, and optimizations propositions added
Actions #13

Updated by Benni Mack 9 months ago

  • Sprint Focus deleted (On Location Sprint)