Bug #87295

Chinese Language url not working with TYPO3 9.5

Added by bharat parmar 7 months ago. Updated 3 months ago.

Status: Closed
Priority: Must have
Assignee: -
Category: Link Handling, Site Handling & Routing
Start date: 2018-12-26
Due date:
% Done: 100%
TYPO3 Version: 9
PHP Version: 7.2
Tags: slug
Complexity:
Is Regression:
Sprint Focus:

Description

I have a site in Chinese running TYPO3 9.5, but Chinese page titles in the URL are replaced with a string like e5b7a5e4bd9ce4b88ee8818ce4b89a.

With realurl the URL was /cn/应用/, and with the new TYPO3 9 slug it is /cn/e5ba94e794a8.

So is Chinese supported in TYPO3 9 slugs?

According to the documentation at https://docs.typo3.org/typo3cms/extensions/core/Changelog/9.4/Feature-84729-NewTCATypeSlug.html, Unicode is supported.

It seems that line 126 of \TYPO3\CMS\Core\DataHandling\SlugHelper->sanitize() encodes it:
$slug = rawurlencode($slug);
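
To make the symptom easier to see, here is a minimal sketch of what rawurlencode() does to the title from this report. The step that strips the percent signs and lowercases the result is an assumption made only to match the observed slug; where exactly that happens in the core is not confirmed here.

```php
<?php
// Page title from the report.
$title = '应用';

// rawurlencode() percent-encodes every byte of the UTF-8 sequence.
$encoded = rawurlencode($title);

// Assumption: percent signs are removed and the string is lowercased
// somewhere later in the pipeline. This only reproduces the observed slug.
$observed = strtolower(str_replace('%', '', $encoded));

// The observed slug is identical to a plain hex dump of the UTF-8 bytes.
$hex = bin2hex($title);

echo $encoded, PHP_EOL;   // %E5%BA%94%E7%94%A8
echo $observed, PHP_EOL;  // e5ba94e794a8
echo $hex, PHP_EOL;       // e5ba94e794a8
```

So /cn/e5ba94e794a8 is the hex form of the UTF-8 bytes of 应用, which is consistent with an encoding step being applied to the slug.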

slug.png View (182 KB) bharat parmar, 2018-12-26 06:47

Associated revisions

Revision 0299afec (diff)
Added by Guido Schmechel 5 months ago

[BUGFIX] Support non ASCII url slugs

Resolves: #87295
Releases: master, 9.5
Change-Id: Ib4fb1a8283c79a02cbc8cb52d91e2448ad9292ec
Reviewed-on: https://review.typo3.org/c/59796
Tested-by: Benni Mack
Tested-by: TYPO3com
Tested-by: Jürgen Venne
Tested-by: Anja Leichsenring
Reviewed-by: Benni Mack
Reviewed-by: Jürgen Venne
Reviewed-by: Anja Leichsenring

Revision 394decab (diff)
Added by Guido Schmechel 5 months ago

[BUGFIX] Support non ASCII url slugs

Resolves: #87295
Releases: master, 9.5
Change-Id: Ib4fb1a8283c79a02cbc8cb52d91e2448ad9292ec
Reviewed-on: https://review.typo3.org/c/59822
Tested-by: TYPO3com
Tested-by: Anja Leichsenring
Reviewed-by: Anja Leichsenring

History

#1 Updated by Riccardo De Contardi 6 months ago

  • Category set to Link Handling, Site Handling & Routing

#2 Updated by Ricky Mathew 5 months ago

  • Priority changed from Should have to Must have
  • Target version set to Candidate for patchlevel

Any updates?

#3 Updated by Lars Peter Søndergaard 5 months ago

The rawurlencode line can probably be removed.

A few lines earlier there is the regular expression:

$slug = preg_replace('/[^\p{L}0-9\/' . preg_quote($fallbackCharacter) . ']/u', '', $slug);

That line removes anything that is not a letter (using Unicode properties), a digit (0-9), a forward slash, or the fallback character.

Letters within ASCII are only a-z and A-Z. Any other letter has a code point of U+0080 or above, outside ASCII, and those characters should not cause trouble in URLs. None of them is used as a URL or HTML delimiter, as far as I know.

Letters, digits and slashes are safe in path segments. Only the fallback character might be unsafe if it is configured to something unusual.
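
The filtering step quoted above can be tried in isolation. This is only a sketch: filterSlugCharacters() is a hypothetical helper name, and the default fallback character '-' is an assumption, not taken from the core source.

```php
<?php
// Hypothetical helper that isolates the preg_replace() call quoted above.
function filterSlugCharacters(string $slug, string $fallbackCharacter = '-'): string
{
    // Keep Unicode letters (\p{L}), ASCII digits, slashes and the fallback
    // character; remove everything else. The /u flag enables UTF-8 mode.
    $pattern = '/[^\p{L}0-9\/' . preg_quote($fallbackCharacter) . ']/u';

    // preg_replace() returns null on error (e.g. invalid UTF-8 input),
    // so cast to string to satisfy the return type.
    return (string)preg_replace($pattern, '', $slug);
}

echo filterSlugCharacters('/cn/应用'), PHP_EOL;          // /cn/应用
echo filterSlugCharacters('应用/page-1?a=b&c'), PHP_EOL; // 应用/page-1abc
```

The Chinese letters survive unchanged, while '?', '=' and '&' are dropped before any encoding would be needed.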

The only characters that could cause trouble in the path segment of a URL are '?', '&', '"' and "'" (for HTML).
Those characters, however, are properly encoded by the Symfony UrlGenerator class:

Symfony\Component\Routing\Generator\UrlGenerator::doGenerate

It uses rawurlencode when building the URL, but then decodes a predefined set of characters listed in the protected property $decodedChars.
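
The encode-then-selectively-decode approach described here can be sketched as follows. This is not Symfony's code: encodePathSketch() and DECODED_CHARS_SKETCH are hypothetical names, and the map is an illustrative subset, so the real $decodedChars list may differ.

```php
<?php
// Illustrative subset only; not the actual list used by Symfony.
const DECODED_CHARS_SKETCH = [
    '%2F' => '/',
    '%40' => '@',
    '%3A' => ':',
];

// Hypothetical helper: percent-encode everything, then restore the
// characters that are safe to leave readable in a path.
function encodePathSketch(string $path): string
{
    return strtr(rawurlencode($path), DECODED_CHARS_SKETCH);
}

echo encodePathSketch('/cn/应用/a?b'), PHP_EOL; // /cn/%E5%BA%94%E7%94%A8/a%3Fb
```

With this approach the slug itself can stay in readable Unicode, because encoding happens once, at the point where the URL is generated.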

I only followed the case where a Site configuration is available, so I wouldn't know what happens in "traditional" setups, or if those slugs are even relevant in that case.

Greetings.

#4 Updated by Ricky Mathew 5 months ago

I also think there is no need for rawurlencode(), as the preg_replace() before it handles everything perfectly. If rawurlencode() can't be avoided, I have a workaround: call rawurldecode() at the end of sanitize(), provided sanitize() is used only for URL generation.
What do others think? I think this must be considered for the next patch level, as the TYPO3 URL system isn't supporting Asian languages at all!
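
As a rough illustration of the workaround idea, the two functions are inverses of each other on an intact percent-encoded string. This sketch only shows the round trip; it does not reproduce sanitize() itself.

```php
<?php
$slug = '/cn/应用';

// Encoding followed by decoding restores the original string.
$roundTripped = rawurldecode(rawurlencode($slug));

var_dump($roundTripped === $slug); // bool(true)
```

Note that the round trip only works while the percent signs are still present; once they have been stripped, rawurldecode() can no longer recover the original characters.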

#5 Updated by Sven Juergens 5 months ago

Same problem here with the Arabic language.

#6 Updated by Gerrit Code Review 5 months ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/59796

#7 Updated by Gerrit Code Review 5 months ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/59796

#8 Updated by Gerrit Code Review 5 months ago

Patch set 1 for branch 9.5 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/59822

#9 Updated by Guido Schmechel 5 months ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100

#10 Updated by Benni Mack 3 months ago

  • Status changed from Resolved to Closed
