Story #66138
Closed
"Migrate all file links of RTE-enabled fields to FAL" fails with UTF-8-filenames
Description
On nearly every upgrade I've done from TYPO3 4.5 to 6.2 so far, I've had to manually correct UTF-8 filenames (like `communiqué-geändert.pdf`) and their corresponding links. The server can handle UTF-8 filenames without issues, `[SYS][UTF8filesystem]` is set, as well as `[SYS][systemLocale] = de_DE.UTF-8`. Also, new files that are uploaded via the BE don't cause any problems.
But the Upgrade wizard "Migrate all file links of RTE-enabled fields to FAL" keeps failing on all filenames that have special characters.
Once I rename the file and adjust the reference in tt_content to match, the wizard goes through.
I am not sure whether this is a bug or the result of some missing setting (PHP, Linux?), so I filed the ticket as a "Story".
I found https://forge.typo3.org/issues/65776, but that's about length in sys_refindex.
Updated by Markus Klein over 9 years ago
What is the content of the refindex / tt_content in the 4.5 instance?
Updated by Urs Braem over 9 years ago
Hi Markus
In the most recent case (where I can get the data quickly), there was another component, DAM.
In the DAM table, the data was saved as such:
tx_dam.file_dl_name: qualité_zulassungskriterien.pdf
tx_dam.file_name: qualité_zulassungskriterien.pdf
and then sys_refindex: fileadmin/user_upload/path/to/qualité_zulassungskriterien.pdf (filename anonymized, original string length 115 chars)
I can also look for a site without DAM if that is a big difference.
Cheers
Urs
Updated by Markus Klein over 9 years ago
I never used DAM, so I can't help you here. Sorry.
Updated by Urs Braem over 9 years ago
Another site, without DAM:
tt_content.bodytext:
<link fileadmin/redaktion/dateien/Pr%C3%A4sentation/Stiftung_Juli2013_-_d__Kompatibilit%C3%A4tsmodus_.pdf _blank download>Präsentation</link>
or
<link fileadmin/redaktion/dateien/QS-Zertifikate/Syst%C3%A8mes_dassurance_qualit%C3%A9_nationaux_f.pdf _blank download "Startet den Datei-Download">Assurance qualité nationale</link>
hmm that looks quite different!
Updated by Markus Klein over 9 years ago
Can you please also check what is in the refindex?
The wizard actually takes care of the URL encoding itself when it builds its search pattern:
$regularExpression = '$<((link|LINK) ' . str_replace('%2F', '/', rawurlencode($reference['ref_string'])) . ').*>$';
Updated by Urs Braem over 9 years ago
Aha!
sys_refindex.ref_string :
fileadmin/redaktion/dateien/Präsentation/Stiftung_Juli2013_-_d__Kompatibilitätsmodus_.pdf
fileadmin/redaktion/dateien/QS-Zertifikate/Systèmes_dassurance_qualité_nationaux_f.pdf
No URL-Encoding here
Updated by Markus Klein over 9 years ago
That is ok, because the code posted above does the url encoding for finding the links.
Updated by Markus Klein over 9 years ago
So am I right that you only have those upgrade issues with DAM installations?
Updated by Urs Braem over 9 years ago
No, the second example is from a normal site (without DAM)
Updated by Markus Klein over 9 years ago
I fear you have to get the debugger started, because the code and your DB content look ok.
fileadmin/redaktion/dateien/Präsentation/Stiftung_Juli2013_-_d__Kompatibilitätsmodus_.pdf
will be converted by
str_replace('%2F', '/', rawurlencode($reference['ref_string']))
to
fileadmin/redaktion/dateien/Pr%C3%A4sentation/Stiftung_Juli2013_-_d__Kompatibilit%C3%A4tsmodus_.pdf
So the resulting regexp should match.
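The conversion described here can be reproduced outside TYPO3. The sketch below is a Python analogue of the quoted PHP expression, not the wizard's actual code: `quote(..., safe='/')` stands in for `str_replace('%2F', '/', rawurlencode(...))`, and the function name `wizard_pattern` is made up for this example. Like the quoted line, it does not regex-escape the encoded path.

```python
import re
import unicodedata
from urllib.parse import quote

def wizard_pattern(ref_string):
    """Build a pattern shaped like the one quoted from the wizard."""
    # Percent-encode everything except unreserved characters and '/'.
    encoded = quote(ref_string, safe='/')
    return re.compile('<((link|LINK) ' + encoded + ').*>')

# ref_string from sys_refindex, with precomposed (NFC) characters.
ref_string = unicodedata.normalize(
    'NFC',
    'fileadmin/redaktion/dateien/Präsentation/'
    'Stiftung_Juli2013_-_d__Kompatibilitätsmodus_.pdf',
)
bodytext = (
    '<link fileadmin/redaktion/dateien/Pr%C3%A4sentation/'
    'Stiftung_Juli2013_-_d__Kompatibilit%C3%A4tsmodus_.pdf '
    '_blank download>Präsentation</link>'
)

print(quote(ref_string, safe='/'))
print(wizard_pattern(ref_string).search(bodytext) is not None)
```

With the values posted in this thread the pattern matches, which is why the database content alone does not explain the failure.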
Updated by Urs Braem over 9 years ago
"I fear you have to get the debugger started"
I'm bad at such things, what do you mean? :-)
Updated by Markus Klein over 9 years ago
I mean I would need to debug the upgrade process, otherwise I've no clue what's going wrong.
Updated by Urs Braem over 9 years ago
For the next run (more sites coming up), can you give me a hint how to get debug information from the update wizard?
And, while we're at it: is there a trick to re-run an update wizard explicitly, a second time?
Updated by Markus Klein over 9 years ago
You will not get debug information, but you would have to connect a debugger to the site and walk through the code step by step.
Are you able to debug that instance while doing the upgrade? If so, I can tell you where to set the breakpoints to gather the information.
There is no switch to enable the wizard again. It automatically shows up if it finds links that need to be converted in the database.
So the only proper way would be to repeat the upgrade so that the database contains the old links again.
Updated by Urs Braem over 9 years ago
I can do it on MAMP and thus try xdebug (never used it before)
Updated by Markus Klein over 9 years ago
great.
Suggested breakpoints for data inspection:
typo3/sysext/install/Classes/Updates/RteFileLinksUpdateWizard.php:170
Check $reference and $record there
Maybe also check
typo3/sysext/install/Classes/Updates/RteFileLinksUpdateWizard.php:199
for $content and $regularExpression
Updated by Urs Braem over 9 years ago
ok, thanks! this will take a while until I do it
Updated by Urs Braem over 9 years ago
I've found out that some of my UTF-8 named files were mangled when transferring them via SFTP (with Coda/Transmit) to and from a Mac. I think this is connected. http://wiki.typo3.org/Exception/CMS/1319455097
I'll use scp from now on. Markus, I think you can close this.
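The thread does not say exactly how the filenames were mangled. One common cause when files pass through a Mac is Unicode normalization: the same visible name can be stored precomposed (NFC) or decomposed (NFD), and the two forms percent-encode differently. Assuming that is what happened here, the Python sketch below shows why a link written with one form would not match a filename stored in the other. The helper `same_name` is hypothetical.

```python
import unicodedata
from urllib.parse import quote

name = 'qualité_zulassungskriterien.pdf'
nfc = unicodedata.normalize('NFC', name)  # 'é' as one code point, U+00E9
nfd = unicodedata.normalize('NFD', name)  # 'e' followed by U+0301

print(quote(nfc))  # qualit%C3%A9_zulassungskriterien.pdf
print(quote(nfd))  # qualite%CC%81_zulassungskriterien.pdf

def same_name(a, b):
    """Compare two filenames while ignoring the normalization form."""
    return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)

print(nfc == nfd)           # False: different code point sequences
print(same_name(nfc, nfd))  # True: same name once normalized
```

Under this assumption, normalizing filenames to one form after a transfer (or using a transfer tool that leaves them untouched) would keep the on-disk name and the stored links in agreement.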