Bug #15690
Closed
Incorrect conversion of non-ascii (Latvian) characters in URL
Description
(switch your browser to utf-8 to see this report properly!)
Non-ASCII characters in URLs are not properly converted to their ASCII equivalents. For example, the Latvian characters ā, ē, ī and ū are converted to aa, ee, ii and uu. This is wrong. According to Latvian tradition they should be a, e, i and u. The doubled forms aa, ee, ii and uu are used in text, but in URLs only a single letter is used, not a double one. Improper conversion of characters prevents Google from finding keywords in URLs and lowers the site's position in search results.
The attached patch fixes the problem with Latvian characters.
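For illustration only, the requested behaviour can be sketched in Python. This is not the attached patch and not TYPO3 code: the mapping table and the `url_segment` helper are hypothetical names, and only the four long vowels named in this report are covered (other Latvian letters are left untouched).

```python
# Hypothetical mapping for illustration; not taken from the TYPO3 patch.
# Each Latvian long vowel maps to a single ASCII letter, not a doubled one.
LATVIAN_TO_ASCII = {
    "ā": "a", "ē": "e", "ī": "i", "ū": "u",
    "Ā": "A", "Ē": "E", "Ī": "I", "Ū": "U",
}
_TABLE = str.maketrans(LATVIAN_TO_ASCII)


def url_segment(title: str) -> str:
    """Transliterate the long vowels, lowercase, and join words with underscores."""
    ascii_title = title.translate(_TABLE).lower()
    return "_".join(ascii_title.split())


print(url_segment("Valdības deklarācijas izpilde"))  # valdibas_deklaracijas_izpilde
```

With the doubled mapping the same title would come out as `valdiibas_deklaraacijas_izpilde`, which is the form the report objects to.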
(issue imported from #M2654)
Files
Updated by Dmitry Dulepov over 18 years ago
Example Latvian sites with proper URLs:
www.zemesgramata.lv - Zemes grāmata (government property registration agency)
www.berni.lv - Bērni (about childhood)
www.calis.lv - Cālis (also about childhood)
http://www.bm.gov.lv/lat/valdibas_deklaracijas_izpilde/ - "Valdības deklarācijas izpilde" page on a government site
Updated by Miroslav Monkevic over 18 years ago
Issue also applies to Lithuanian characters 016A and 016A
Updated by Dmitry Dulepov over 18 years ago
Miroslav, my patch should help you too. You need to apply it and then remove all generated files in the ./typo3temp/cs directory.
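The cleanup step could be scripted roughly as follows. This is a minimal sketch, assuming the generated files sit directly inside the directory and that deleting all of them is safe; `clear_generated_files` is a hypothetical helper, not part of TYPO3.

```python
from pathlib import Path


def clear_generated_files(cache_dir: str) -> int:
    """Delete regular files directly inside cache_dir and return how many were removed."""
    directory = Path(cache_dir)
    removed = 0
    if not directory.is_dir():
        # Nothing to clean up if the directory does not exist.
        return removed
    for entry in directory.iterdir():
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed
```

Calling `clear_generated_files("./typo3temp/cs")` from the site root after applying the patch would remove the files; subdirectories, if any, are deliberately left alone.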
Updated by Karsten Dambekalns over 18 years ago
Dmitry, feel free to handle this yourself, I'll give you a +1 on the core list.
A note about the patch: the "description" should match the replacements, so UU should be U and so on...
Updated by Dmitry Dulepov over 18 years ago
Corresponding lines commented (suggested by Martin Kutschker)