Bug #93883
Closed
Transliteration of German umlauts partly fails on file upload for files created on Mac
100%
Description
How to reproduce on Mac:
(i) OK
Create a file named test-ö-ä-ü.txt with touch in your favourite terminal and upload it in the file list.
Result: The file is renamed to test-oe-ae-ue.txt
The hex representation of äöü (precomposed form, one code point per umlaut) is
0000000 c3 a4 c3 b6 c3 bc
You can get the bytes with bin2hex()
(ii) Not OK
Save a file with TextEdit or simply create a new file in Finder with the name test-ö-ü-ä.txt and upload it in the file list.
Result: The file is renamed to test-o__u__a__.txt
The transliteration of äöü fails (or, let's say, is incomplete) when its representation is the decomposed form (base letter followed by U+0308 COMBINING DIAERESIS):
0000000 61 cc 88 6f cc 88 75 cc 88
German umlauts have multiple representations in UTF-8. One of them does not seem to be handled correctly by \TYPO3\CMS\Core\Resource\Driver\LocalDriver::sanitizeFileName() or in \TYPO3\CMS\Core\Charset\CharsetConverter
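The two byte sequences quoted above can be reproduced with a short script. This is only a sketch for illustration: the helper name hexBytes() is hypothetical and not part of TYPO3, and the strings are built from explicit code point escapes so that the editor's own encoding does not matter.

```php
<?php
// Two UTF-8 encodings of the same visible text "äöü".
$precomposed = "\u{00E4}\u{00F6}\u{00FC}";    // one code point per umlaut
$decomposed  = "a\u{0308}o\u{0308}u\u{0308}"; // base letter + U+0308 COMBINING DIAERESIS

/** Hypothetical helper: render a byte string as space-separated hex pairs. */
function hexBytes(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

echo hexBytes($precomposed), PHP_EOL;   // c3 a4 c3 b6 c3 bc
echo hexBytes($decomposed), PHP_EOL;    // 61 cc 88 6f cc 88 75 cc 88
var_dump($precomposed === $decomposed); // bool(false): byte-wise the strings differ
```

Because the strings differ byte-wise, a lookup table keyed on the precomposed characters will not match the decomposed ones, which would explain the underscores in case (ii).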
Updated by Christoph Lehmann almost 4 years ago
- Related to Bug #20612: scandinavian letters are transliterated wrong added
Updated by Christoph Lehmann almost 4 years ago
A very simple solution for this issue is to use
\Normalizer::normalize();
in/before
\TYPO3\CMS\Core\Charset\CharsetConverter::specCharsToASCII()
Updated by Martin Kutschker almost 4 years ago
Will fail if "intl" is not enabled, but that can be checked. Better to use it when it's available than not to use it at all.
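A minimal sketch of that idea, assuming the normalization runs on the file name before it is handed to the transliteration step. The function name toNfcIfAvailable() is hypothetical and not TYPO3 API; only \Normalizer::normalize() and \Normalizer::FORM_C come from PHP's intl extension.

```php
<?php
/**
 * Hypothetical helper (not TYPO3 API): convert a string to NFC when the
 * intl Normalizer class is available, otherwise return it unchanged.
 */
function toNfcIfAvailable(string $name): string
{
    if (!class_exists(\Normalizer::class)) {
        // intl is not loaded: fall back to the current behaviour.
        return $name;
    }
    $normalized = \Normalizer::normalize($name, \Normalizer::FORM_C);
    // normalize() returns false on failure, e.g. for invalid UTF-8 input.
    return $normalized === false ? $name : $normalized;
}

// A decomposed name as produced by Finder would become precomposed here
// (when intl is present) and could then be transliterated as in case (i).
$name = toNfcIfAvailable("test-o\u{0308}.txt");
```

Without intl the input passes through untouched, so the result is never worse than today's behaviour.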
Updated by Martin Kutschker over 3 years ago
Ah, there is a polyfill:
https://github.com/symfony/polyfill-intl-normalizer
Updated by Martin Kutschker over 3 years ago
A brute force removal of ALL nonspacing marks:
Transliterator::createFromRules('any-NFD; [\p{Mn}] Remove; any-NFC')->transliterate($subject);
https://www.compart.com/en/unicode/category/Mn
The character class should probably be changed to list only Latin combining marks.
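For comparison, a sketch of the same brute-force approach written with Transliterator::create() and a compound transform ID instead of createFromRules(). This is a variant for illustration, not the snippet from the comment above, and the function name stripNonspacingMarks() is hypothetical. Note that it removes the marks entirely, so a decomposed ö becomes o rather than oe.

```php
<?php
/**
 * Hypothetical helper (not TYPO3 API): decompose, drop all nonspacing
 * marks (general category Mn), then recompose what is left.
 */
function stripNonspacingMarks(string $subject): string
{
    if (!class_exists(\Transliterator::class)) {
        // intl is not loaded: leave the input untouched.
        return $subject;
    }
    $transliterator = \Transliterator::create('NFD; [:Nonspacing Mark:] Remove; NFC');
    if ($transliterator === null) {
        return $subject;
    }
    $result = $transliterator->transliterate($subject);
    return $result === false ? $subject : $result;
}
```

Since this affects every script that uses nonspacing marks, restricting the set to the Latin combining marks, as suggested above, would be the more conservative choice.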
Updated by Martin Kutschker over 3 years ago
- Related to Bug #93764: SlugHelper can create bad urls added
Updated by Gerrit Code Review over 3 years ago
- Status changed from New to Under Review
Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/69144
Updated by Gerrit Code Review over 3 years ago
Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/69144
Updated by Gerrit Code Review over 3 years ago
Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/69144
Updated by Gerrit Code Review over 3 years ago
Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/69144
Updated by Anonymous over 3 years ago
- Status changed from Under Review to Resolved
- % Done changed from 0 to 100
Applied in changeset 2196c85e644a849096ae17249f0ab27d3883adec.
Updated by Gerrit Code Review over 3 years ago
- Status changed from Resolved to Under Review
Patch set 1 for branch 10.4 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/70255
Updated by Gerrit Code Review over 3 years ago
Patch set 2 for branch 10.4 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/c/Packages/TYPO3.CMS/+/70255
Updated by Riccardo De Contardi about 2 years ago
- Status changed from Under Review to Closed
- Target version deleted (Candidate for patchlevel)
Closed as requested by the reporter;
If you think that this is the wrong decision, please reopen it or open a new issue with a reference to this one.
Thank you.
Updated by Benni Mack over 1 year ago
- Related to Feature #57695: Implement unicode normalization in TYPO3 Core's charset conversion routines, especially for filepaths in TYPO3 FAL's LocalDriver. added