Bug #21569

closed

Wrong character encoding in cache tables breaks frontend rendering

Added by Steffen Kamper over 14 years ago. Updated over 13 years ago.

Status: Closed
Priority: Must have
Assignee:
Category: -
Target version: -
Start date: 2009-11-16
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 4.3
PHP Version: 5.3
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

Cached data causes problems in several situations, e.g. if renderCharset != metaCharset. If the cached data then contains a special character such as an umlaut, the returned data is corrupted.

The reason is that the cache tables use TEXT fields for serialized arrays, so MySQL applies the column charset to the data.

Solution: use BLOB instead.

(issue imported from #M12613)


Files

12613.diff (2.34 KB), Administrator Admin, 2009-11-16 17:01
0012613_v2.patch (1.41 KB), Administrator Admin, 2009-11-25 12:16

Related issues: 5 (0 open, 5 closed)

Related to TYPO3 Core - Feature #21525: No typoscript template found - Addon (Closed, 2009-11-10)
Related to TYPO3 Core - Bug #17091: "No template found" after update from 4.0.4 to 4.1 (Closed, Steffen Kamper, 2007-03-07)
Related to TYPO3 Core - Bug #17437: When accessing pages form cache "No Temlpate found!" appears (Closed, Rupert Germann, 2007-07-21)
Related to TYPO3 Core - Bug #20092: Typo3 FE crashs with single-char umlauts in typoscript (Closed, 2009-02-25)
Related to TYPO3 Core - Bug #21421: slow t3lib_TSparser::parseSub (Closed, Bernhard Kraft, 2009-11-01)

Actions #1

Updated by Martin Kutschker over 14 years ago

In cache_pages the HTML field stores the complete page. This must also be a BLOB, since MySQL (and other DBs) object if the data sent is not in the charset of the column.

To be precise: if you have the DB in UTF-8, the content will be truncated at the first byte that is invalid in UTF-8.
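
The truncation described above can be illustrated without a database. The following is only a minimal sketch in Python that simulates the effect; it does not talk to MySQL, and the helper name `longest_valid_utf8_prefix` and the sample page content are made up for illustration. It assumes the page was rendered in latin1 and is written into a UTF-8 column that keeps only the bytes before the first invalid one.

```python
def longest_valid_utf8_prefix(data: bytes) -> bytes:
    """Return the leading part of `data` that decodes cleanly as UTF-8.

    Hypothetical helper: it mimics a store that cuts the value off at the
    first byte that is not valid UTF-8.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError as err:
        # err.start is the index of the first offending byte.
        return data[:err.start]


# A page fragment rendered in latin1: the umlaut is the single byte 0xFC.
page = "<p>Grüße aus Köln</p>".encode("latin-1")

stored = longest_valid_utf8_prefix(page)
print(stored)  # b'<p>Gr'
```

Everything from the first umlaut onwards is lost in this simulation, which is why a cached page or a serialized array would come back unusable.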

Actions #2

Updated by Bernhard Kraft over 14 years ago

What exactly do you mean by having the DB in utf-8? Today I tried setting the collation of a database to utf-8, and also the collation of the tables, but for some reason I could not reproduce this case. I know it happens, but I would like to know under which circumstances.

Which database settings do I need to make this case happen?

Actions #3

Updated by Bernhard Kraft over 14 years ago

Ok. Just tested the description of #17053 which seems to be the same problem.

Back then (2007), for some reason, the "content" field of cache_hash was changed from "mediumblob" to "mediumtext", which seems to have introduced this error.

I could reproduce the error using Michael's bug note 0013133 in bug #17053.

Changing the field "content" in table "cache_hash" back from mediumtext to mediumblob solved the problem for me.

Actions #4

Updated by Stefan Geith over 14 years ago

I applied your patch and it works!
But one note:

  CREATE TABLE cache_pagesection (
    page_id int(11) unsigned DEFAULT '0' NOT NULL,
    mpvar_hash int(11) unsigned DEFAULT '0' NOT NULL,
-   content text,
+   content mediumblob,   <--- shouldn't this be blob, not mediumblob?
    tstamp int(11) unsigned DEFAULT '0' NOT NULL,
    PRIMARY KEY (page_id,mpvar_hash)
  ) ENGINE=InnoDB;

Actions #5

Updated by Oliver Hader over 14 years ago

That was the situation when the caching framework was introduced:
http://forge.typo3.org/repositories/diff/typo3v4-core?rev=4336

The tables of the caching framework can stay as they are (with "TEXT") since the caching framework performs an additional serialize() before writing to the database.
The only tables that have to be changed are cache_hash, cache_pages and cache_pagesection.

Actions #6

Updated by Steffen Kamper over 14 years ago

Committed to trunk, rev 6525.

Actions #7

Updated by Martin Kutschker over 14 years ago

Serializing does not help if you write iso-8859-1/latin1 (or any other charset) into a UTF-8 field. The data will be TRUNCATED (!) at the first character that is not valid in UTF-8.

This is similar to iconv (maybe MySQL uses it). iconv stops any operation when it encounters invalid input.

Actions #8

Updated by Oliver Hader over 14 years ago

Masi, why did you reopen this issue again?

Actions #9

Updated by Oliver Hader over 14 years ago

I could not reproduce this with the caching framework, since the data gets serialized twice there...
I tested it with regular caching (I could reproduce the bad behaviour) and with the caching framework (I could not reproduce it). We are now back to the situation we had before the caching tables were modified for the caching framework and then changed back again, at which point some database types were forgotten...

If there is still something to optimize, please open a new issue.

Actions #10

Updated by Martin Kutschker over 14 years ago

[Reopening just to add this comment]

I reopened it because your comment about serializing is - sorry - nonsense. I explained what happens and why data gets corrupted under certain conditions. Maybe they do not apply in all situations (I did not check), but your comment tells me that you (and sadly many more Core devs) simply do not grasp charset handling.
