Bug #87295
Closed
Chinese Language url not working with TYPO3 9.5
100%
Description
I have a site with Chinese-language pages on TYPO3 9.5, but Chinese page titles in URLs are replaced with strings like e5b7a5e4bd9ce4b88ee8818ce4b89a.
With realurl the URL is /cn/应用/, while with the new TYPO3 9 slug it is /cn/e5ba94e794a8.
So is Chinese supported in TYPO3 9 slugs?
According to the documentation at https://docs.typo3.org/typo3cms/extensions/core/Changelog/9.4/Feature-84729-NewTCATypeSlug.html, Unicode is supported.
It seems that line 126, $slug = rawurlencode($slug);, in \TYPO3\CMS\Core\DataHandling\SlugHelper->sanitize() encodes it.
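To illustrate the symptom, here is a minimal sketch in plain PHP (no TYPO3 involved) of what rawurlencode() does to the title from this report. The observed slug e5ba94e794a8 matches the hex form of the UTF-8 bytes of 应用, which is consistent with a percent-encoded value whose percent signs were dropped later on. Where exactly that happens is not shown in this thread, so treat that part as an assumption.

```php
<?php
// Plain PHP illustration; this is not the actual SlugHelper code.
$title = '应用';

// rawurlencode() percent-encodes every byte of the UTF-8 string.
$encoded = rawurlencode($title);
echo $encoded, "\n"; // %E5%BA%94%E7%94%A8

// The hex of the raw UTF-8 bytes equals the slug seen in the report.
$hex = bin2hex($title);
echo $hex, "\n"; // e5ba94e794a8

// Decoding restores the original title, so no information is lost.
echo rawurldecode($encoded), "\n"; // 应用
```

The encoding itself is reversible. The problem described here is only that the encoded form ends up in the stored slug.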
Updated by Riccardo De Contardi almost 6 years ago
- Category set to Site Handling, Site Sets & Routing
Updated by Ricky Mathew almost 6 years ago
- Priority changed from Should have to Must have
- Target version set to Candidate for patchlevel
Any updates?
Updated by Lars Peter Søndergaard almost 6 years ago
The rawurlencode line can probably be removed.
A few lines earlier there is the regular expression:
$slug = preg_replace('/[^\p{L}0-9\/' . preg_quote($fallbackCharacter) . ']/u', '', $slug);
That line removes anything that is not a Letter (using unicode properties), not a digit (0-9), not a forward-slash and not the fallback character.
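As a rough check of that description, the following sketch wraps only the quoted preg_replace() line in a hypothetical helper, stripNonSlugCharacters(), and assumes the fallback character is '-'. It is not the real sanitize() method, which does more than this.

```php
<?php
// Hypothetical helper that mirrors only the quoted preg_replace() line.
// The default fallback character '-' is an assumption for this example.
function stripNonSlugCharacters(string $slug, string $fallbackCharacter = '-'): string
{
    // Keep Unicode letters (\p{L}), digits 0-9, '/' and the fallback character.
    return (string)preg_replace(
        '/[^\p{L}0-9\/' . preg_quote($fallbackCharacter) . ']/u',
        '',
        $slug
    );
}

// Chinese letters match \p{L} and survive unchanged.
echo stripNonSlugCharacters('应用'), "\n"; // 应用

// '?', '"' and '&' are neither letters nor digits and are removed.
echo stripNonSlugCharacters('cn/应用-2019?"&'), "\n"; // cn/应用-2019
```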
Letters within ASCII are only a-z and A-Z. Any other letter has a code point of U+0080 or higher, outside of ASCII, and those characters should not cause trouble for URLs. None of them are used as URL or HTML delimiters, as far as I know.
Letters, digits and slashes are safe in the path segments. Only the fallback character might be unsafe if it is configured to something unusual.
The only characters that could cause trouble in the path segment of a URL are '?', '&', '"' and "'" (for HTML).
Those characters, however, are properly encoded by the Symfony UrlGenerator class:
Symfony\Component\Routing\Generator\UrlGenerator::doGenerate
It uses rawurlencode when building the URL, but decodes a set of characters predefined as protected $decodedChars.
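The encode-then-selectively-decode approach can be sketched in plain PHP as below. The function name encodePath() is hypothetical, and the $decodedChars array here is only an illustrative subset that I am assuming for the example. The authoritative list is the one in the Symfony class itself.

```php
<?php
// Illustrative subset only; the real list is UrlGenerator::$decodedChars.
$decodedChars = [
    '%2F' => '/',
    '%40' => '@',
    '%3A' => ':',
];

// Hypothetical helper: encode everything, then restore the allowed characters.
function encodePath(string $path, array $decodedChars): string
{
    return strtr(rawurlencode($path), $decodedChars);
}

// Slashes are restored, the Chinese letters stay percent-encoded on the wire.
echo encodePath('/cn/应用', $decodedChars), "\n"; // /cn/%E5%BA%94%E7%94%A8

// The problematic characters are encoded regardless of what the slug contains.
echo encodePath('/a?b&c"d', $decodedChars), "\n"; // /a%3Fb%26c%22d
```

Since the encoding happens at URL generation time, the stored slug itself would not need to be encoded a second time.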
I only followed the case where a Site configuration is available, so I wouldn't know what happens in "traditional" setups, or if those slugs are even relevant in that case.
Greetings.
Updated by Ricky Mathew almost 6 years ago
I also think there is no need for rawurlencode(), as the preg_replace() before it handles everything perfectly. If rawurlencode() can't be avoided, then I have a workaround: call rawurldecode() at the end of the sanitize() function, provided sanitize() is only used for URL generation.
What are everyone else's thoughts? I think this must be considered for the next patch level, as the TYPO3 URL system currently does not support Asian languages at all.
Updated by Sven Juergens almost 6 years ago
Same problem here with the Arabic language.
Updated by Gerrit Code Review over 5 years ago
- Status changed from New to Under Review
Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/59796
Updated by Gerrit Code Review over 5 years ago
Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/59796
Updated by Gerrit Code Review over 5 years ago
Patch set 1 for branch 9.5 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/59822
Updated by Guido Schmechel over 5 years ago
- Status changed from Under Review to Resolved
- % Done changed from 0 to 100
Applied in changeset 0299afec77ad32449b2a929f3c66d630af50cf68.