Bug #93308


Routing with Chinese letters as language parameter does lead to 404

Added by Stefano Kowalke over 3 years ago. Updated over 1 year ago.

Status: Closed
Priority: Should have
Assignee:
Category: Link Handling, Site Handling & Routing
Target version: -
Start date: 2021-01-18
Due date:
% Done: 100%
Estimated time:
TYPO3 Version: 10
PHP Version: 7.4
Tags: routing, unicode, symfony
Complexity:
Is Regression:
Sprint Focus:

Description

We have the following site configuration for Simplified Chinese:

title: 'Simplified Chinese'
enabled: true
base: /简/
typo3Language: cn
locale: zh_CN.UTF-8
iso-639-1: zh
navigationTitle: 简
hreflang: zh-Hans
direction: ltr
fallbackType: fallback
fallbacks: '0'
flag: cn
languageId: '4'
websiteTitle: ''

This configuration leads to a 404 error.

What happens

When a user navigates to a URL that includes 简 as the language parameter, the path is automatically URL-encoded by the browser or the server (we use Nginx). In the `Request` object the character becomes `%E7%AE%80`.
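To make the encoding step concrete, here is a minimal PHP sketch of what happens to the character. It only uses the built-in `rawurlencode()` and `rawurldecode()` functions; the variable names are illustrative and not taken from the TYPO3 code base.

```php
<?php
// The single path segment from the site base "/简/" (UTF-8 source file assumed).
$segment = '简';

// rawurlencode() percent-encodes every byte of the UTF-8 sequence (RFC 3986).
$encoded = rawurlencode($segment);
echo $encoded, PHP_EOL; // %E7%AE%80

// rawurldecode() reverses the transformation and restores the original character.
$decoded = rawurldecode($encoded);
var_dump($decoded === $segment); // bool(true)
```

The three escape sequences correspond to the three UTF-8 bytes of U+7B80.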

Middleware SiteResolver

This middleware tries to match the request URL against the site configuration to determine the correct site and language for the request.

SiteMatcher

It delegates the matching to `SiteMatcher::matchRequest()`. Within `SiteMatcher::getRouteCollectionForAllSites()` the site configuration is loaded from YAML and a `\TYPO3\CMS\Core\Routing\Route` is created from the routing settings. The `$path` variable holds the decoded Chinese character.

UrlMatcher

The method returns a collection of routes built from the site configuration, which is passed to `Symfony\Component\Routing\Matcher\UrlMatcher::__construct()`. Right after that, `SiteMatcher` delegates the matching to `Symfony\Component\Routing\Matcher\UrlMatcher::match($pathinfo)`, with the current path from the `Request` object.

When passed to `match()` the value is `%E7%AE%80`. Before it is handed further down to `Symfony\Component\Routing\Matcher\UrlMatcher::matchCollection()`, however, the path is decoded by `rawurldecode()`:

if ($ret = $this->matchCollection(rawurldecode($pathinfo) ?: '/', $this->routes)) {
    return $ret;
}

Inside `Symfony\Component\Routing\Matcher\UrlMatcher::matchCollection()` the matching finally happens by comparing `$trimmedPathinfo` with `$staticPrefix`, or in other words the encoded path `%E7%AE%80` with the decoded path 简.
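The effect of comparing an encoded path with a decoded one can be reproduced without Symfony or TYPO3. The following is only a sketch: `hasPrefix()` and the variable names are hypothetical stand-ins for the prefix comparison, not the actual matcher code.

```php
<?php
// Hypothetical stand-in for a static prefix check (PHP 7.4 compatible).
function hasPrefix(string $path, string $prefix): bool
{
    return strpos($path, $prefix) === 0;
}

// One side is percent-encoded, the other holds the plain character.
$encodedPath = '/%E7%AE%80/some-page';
$decodedBase = '/简/';

// Byte-wise comparison of the two representations never matches.
$rawMatch = hasPrefix($encodedPath, $decodedBase);

// Bringing both sides into the decoded form makes the prefix match.
$normalizedMatch = hasPrefix(rawurldecode($encodedPath), rawurldecode($decodedBase));

var_dump($rawMatch);        // bool(false)
var_dump($normalizedMatch); // bool(true)
```

As long as only one of the two values is decoded, no route is found, which matches the observed 404.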

Patch

My patch resolves the problem by decoding the base path from the site configuration before it is stored in `\TYPO3\CMS\Core\Routing\Route`.
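The idea can be sketched as follows. This is not the attached patch: `normalizeBasePath()` is a hypothetical helper, and the real change lives in the TYPO3 route building code. It only assumes that the base path may arrive either percent-encoded or as plain UTF-8.

```php
<?php
// Hypothetical helper: decode a site base path before using it as a route prefix.
// An empty path falls back to "/", mirroring the "?: '/'" idiom quoted above.
function normalizeBasePath(string $basePath): string
{
    return rawurldecode($basePath) ?: '/';
}

// Both representations of the same base end up as the same decoded prefix.
$fromEncoded = normalizeBasePath('/%E7%AE%80/');
$fromDecoded = normalizeBasePath('/简/');

var_dump($fromEncoded === $fromDecoded); // bool(true)
var_dump(normalizeBasePath(''));         // string(1) "/"
```

Decoding an already decoded path is harmless here because the base contains no literal `%` sequences; a base path with a literal percent sign would need separate handling.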


Files

fix-broken-chinese-urls.patch (1.04 KB), Stefano Kowalke, 2021-01-18 10:59
Actions #1

Updated by Stefan Bürk over 2 years ago

  • Assignee set to Stefan Bürk
Actions #2

Updated by Stefan Bürk over 2 years ago

  • % Done changed from 0 to 70
Actions #3

Updated by Gerrit Code Review over 2 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71513

Actions #4

Updated by Gerrit Code Review over 2 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71513

Actions #5

Updated by Stefan Bürk over 2 years ago

Thanks for reporting this, Stefano Kowalke.

I have taken your patch and created a core patch from it, adding some tests to verify the fix and to guard against regressions in the future.

I could not find a way to test the first case, the site base variant where you also added rawurldecode(), because the language routes always kick in. I would love to have a test for this too.

Maybe you can describe the case that led you to add rawurldecode() for the first, non-language base route as well?

Actions #6

Updated by Gerrit Code Review over 2 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71513

Actions #7

Updated by Gerrit Code Review over 2 years ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71513

Actions #8

Updated by Gerrit Code Review over 2 years ago

Patch set 1 for branch 10.4 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71578

Actions #9

Updated by Stefan Bürk over 2 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 70 to 100
Actions #10

Updated by Gerrit Code Review over 2 years ago

  • Status changed from Resolved to Under Review

Patch set 2 for branch 10.4 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/71578

Actions #11

Updated by Stefan Bürk over 2 years ago

  • Status changed from Under Review to Resolved
Actions #12

Updated by Benni Mack over 1 year ago

  • Status changed from Resolved to Closed