Bug #36754

SQL warning about collation in the install tool, Upgrade wizard

Added by Oliver Klee over 7 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
Should have
Assignee:
-
Category:
Install Tool
Target version:
Start date:
2012-05-01
Due date:
% Done:

100%

TYPO3 Version:
6.0
PHP Version:
Tags:
Complexity:
Is Regression:
Yes
Sprint Focus:

Description

My TYPO3 installation is running today's master. All my DB tables are using the utf8_unicode_ci collation.

When I log in to the Install Tool and click on "Upgrade Wizard", I get the following SQL warning:

exec_SELECTquery
caller t3lib_DB::exec_SELECTquery
ERROR Illegal mix of collations (utf8_general_ci,IMPLICIT) and (utf8_unicode_ci,IMPLICIT) for operation '<>'
lastBuiltQuery SELECT COUNT FROM tt_content WHERE media <> '' AND CAST AS CHAR) <> media OR (CType = 'uploads' AND select_key != '')
debug_backtrace require(init.php),install/index.php#147 // tx_install->init#346 // tx_install->updateWizard#495 // tx_install->updateWizard_parts#6401 // Tx_Install_Updates_Base->shouldRenderWizard#6442 // Tx_Install_Updates_File_TtContentUploadsUpdateWizard->checkForUpdate#157 // t3lib_DB->exec_SELECTcountRows#111 // t3lib_DB->exec_SELECTquery#337 // t3lib_DB->debug#191


Related issues

Related to TYPO3 Core - Bug #41596: Content of $TYPO3_CONF_VARS['SYS']['setDBinit'] is completely overruled / ignored Closed 2012-10-04
Duplicated by TYPO3 Core - Bug #54952: Update Wizard - DB Error Closed 2014-01-13
Precedes TYPO3 Core - Bug #75932: SQL-Error in ProcessedFileChecksumUpdate Closed 2016-04-26

Associated revisions

Revision 587f8763 (diff)
Added by Lars Trebing over 5 years ago

[BUGFIX] "Illegal mix of collations" in Upgrade Wizard

When directly comparing CAST with a CHAR field (including
VARCHAR and TEXT) whose collation isn't the same as that of the database
connection, MySQL gives an error like this:

Illegal mix of collations (utf8_general_ci,IMPLICIT) and
(utf8_unicode_ci,IMPLICIT) for operation '<>'

Wrapping the right side of the comparison in CAST as well
keeps the data unchanged while ensuring that both sides have the same
collation.

Change-Id: Icaec66412d6b6441ae30065ffb92565991e9a1e4
Resolves: #36754
Releases: 6.3, 6.2
Reviewed-on: http://review.typo3.org/32117
Reviewed-by: Markus Klein <>
Tested-by: Markus Klein <>
Reviewed-by: Jigal van Hemert <>
Tested-by: Jigal van Hemert <>

Revision 85408c4b (diff)
Added by Lars Trebing over 5 years ago

[BUGFIX] "Illegal mix of collations" in Upgrade Wizard

When directly comparing CAST with a CHAR field (including
VARCHAR and TEXT) whose collation isn't the same as that of the database
connection, MySQL gives an error like this:

Illegal mix of collations (utf8_general_ci,IMPLICIT) and
(utf8_unicode_ci,IMPLICIT) for operation '<>'

Wrapping the right side of the comparison in CAST as well
keeps the data unchanged while ensuring that both sides have the same
collation.

Change-Id: Icaec66412d6b6441ae30065ffb92565991e9a1e4
Resolves: #36754
Releases: 6.3, 6.2
Reviewed-on: http://review.typo3.org/32198
Reviewed-by: Jigal van Hemert <>
Tested-by: Jigal van Hemert <>

History

#1 Updated by unkonsorten no-lastname-given over 6 years ago

Had the same problem. Some of my columns (not tables) in tt_content and pages for some reason utf8_unicode_ci. After changing everything to utf8_general_ci the error message disappeared.

#2 Updated by Alexander Schnitzler about 6 years ago

  • Is Regression set to No

I am also having this bug.

My environment looks like this:

show variables like "collation_database";

+--------------------+-----------------+
| Variable_name      | Value           |
+--------------------+-----------------+
| collation_database | utf8_unicode_ci |
+--------------------+-----------------+

show table status;

+--------------------+-----------------+
| Name               | Collation       |
+--------------------+-----------------+
| tt_content         | utf8_unicode_ci |
+--------------------+-----------------+

Create Syntax:

CREATE TABLE `tt_content` (
  `uid` int(11) NOT NULL AUTO_INCREMENT,
  `pid` int(11) NOT NULL DEFAULT '0',
  `t3ver_oid` int(11) NOT NULL DEFAULT '0',
  `t3ver_id` int(11) NOT NULL DEFAULT '0',
  `t3ver_wsid` int(11) NOT NULL DEFAULT '0',
  `t3ver_label` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `t3ver_state` tinyint(4) NOT NULL DEFAULT '0',
  `t3ver_stage` int(11) NOT NULL DEFAULT '0',
  `t3ver_count` int(11) NOT NULL DEFAULT '0',
  `t3ver_tstamp` int(11) NOT NULL DEFAULT '0',
  `t3ver_move_id` int(11) NOT NULL DEFAULT '0',
  `t3_origuid` int(11) NOT NULL DEFAULT '0',
  `tstamp` int(11) unsigned NOT NULL DEFAULT '0',
  `crdate` int(11) unsigned NOT NULL DEFAULT '0',
  `cruser_id` int(11) unsigned NOT NULL DEFAULT '0',
  `hidden` tinyint(4) unsigned NOT NULL DEFAULT '0',
  `sorting` int(11) unsigned NOT NULL DEFAULT '0',
  `CType` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `header` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `header_position` varchar(6) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `bodytext` mediumtext COLLATE utf8_unicode_ci,
  `image` text COLLATE utf8_unicode_ci,
  `imagewidth` mediumint(11) unsigned NOT NULL DEFAULT '0',
  `imageorient` tinyint(4) unsigned NOT NULL DEFAULT '0',
  `imagecaption` text COLLATE utf8_unicode_ci,
  `imagecols` tinyint(4) unsigned NOT NULL DEFAULT '0',
  `imageborder` tinyint(4) unsigned NOT NULL DEFAULT '0',
  `media` text COLLATE utf8_unicode_ci,
  `layout` int(11) unsigned NOT NULL DEFAULT '0',
  `deleted` tinyint(4) unsigned NOT NULL DEFAULT '0',
  `cols` int(11) unsigned NOT NULL DEFAULT '0',
  `records` text COLLATE utf8_unicode_ci,
  `pages` tinytext COLLATE utf8_unicode_ci,
  `starttime` int(11) unsigned NOT NULL DEFAULT '0',
  `endtime` int(11) unsigned NOT NULL DEFAULT '0',
  `colPos` int(11) unsigned NOT NULL DEFAULT '0',
  `subheader` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `spaceBefore` smallint(5) unsigned NOT NULL DEFAULT '0',
  `spaceAfter` smallint(5) unsigned NOT NULL DEFAULT '0',
  `fe_group` varchar(100) COLLATE utf8_unicode_ci NOT NULL DEFAULT '0',
  `header_link` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `imagecaption_position` varchar(6) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `image_link` text COLLATE utf8_unicode_ci,
  `image_zoom` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `image_noRows` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `image_effects` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `image_compression` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `altText` text COLLATE utf8_unicode_ci,
  `titleText` text COLLATE utf8_unicode_ci,
  `longdescURL` text COLLATE utf8_unicode_ci,
  `header_layout` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '0',
  `text_align` varchar(6) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `text_face` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `text_size` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `text_color` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `text_properties` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `menu_type` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '0',
  `list_type` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '0',
  `table_border` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `table_cellspacing` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `table_cellpadding` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `table_bgColor` int(11) unsigned NOT NULL DEFAULT '0',
  `select_key` varchar(80) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `sectionIndex` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `linkToTop` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `file_collections` text COLLATE utf8_unicode_ci,
  `filelink_size` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `filelink_sorting` tinytext COLLATE utf8_unicode_ci NOT NULL,
  `target` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `section_frame` int(11) unsigned NOT NULL DEFAULT '0',
  `date` int(10) unsigned NOT NULL DEFAULT '0',
  `multimedia` tinytext COLLATE utf8_unicode_ci,
  `image_frames` int(11) unsigned NOT NULL DEFAULT '0',
  `recursive` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `imageheight` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `rte_enabled` tinyint(4) NOT NULL DEFAULT '0',
  `sys_language_uid` int(11) NOT NULL DEFAULT '0',
  `tx_impexp_origuid` int(11) NOT NULL DEFAULT '0',
  `pi_flexform` mediumtext COLLATE utf8_unicode_ci,
  `accessibility_title` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `accessibility_bypass` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `accessibility_bypass_text` varchar(30) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `l18n_parent` int(11) NOT NULL DEFAULT '0',
  `l18n_diffsource` mediumblob,
  `selected_categories` text COLLATE utf8_unicode_ci,
  `category_field` varchar(64) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `tx_phpunit_is_dummy_record` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `categories` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`uid`),
  KEY `t3ver_oid` (`t3ver_oid`,`t3ver_wsid`),
  KEY `parent` (`pid`,`sorting`),
  KEY `language` (`l18n_parent`,`sys_language_uid`),
  KEY `phpunit_dummy` (`tx_phpunit_is_dummy_record`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

As you can see there is absolutely no utf8_general_ci in use but my global mysql settings are utf8_general_ci.
So I guess the global settings are used for string comparison. Therefore the used Query has to set the collaction explicitely as such:

SELECT COUNT(uid) FROM tt_content WHERE media <> '' COLLATE utf8_unicode_ci AND CAST(CAST(media AS DECIMAL) AS CHAR) COLLATE utf8_unicode_ci <> media OR (CType = 'uploads' AND select_key != '')

#3 Updated by Kamen Stanev about 6 years ago

Actually I would suggest that the check if the field is only numeric is made outside of the SQL statement, since that way it will be database portable. The collation and collate statement is supported by MySQL but not by PostgreSQL for example. I understand that the other databases are getting little to no attention, but they will never get any if there are such specific statements in the core.

Also I needed like 2 days to investigate why the upload/media content is not migrated, and there was no clear error message in the upgrade wizard what is happening. For a Typo3 newbie it was very frustrating and I am surprised to see that almost nobody had similar problem. Also, this bug is year and a half old, and I think that it should be fixed properly if you want more people to upgrade to the latest version.

#4 Updated by Alexander Opitz about 6 years ago

  • Target version set to 6.2.0
  • Is Regression changed from No to Yes

Mark this as regression as it hinders some users from upgrade to new TYPO3 versions.

#5 Updated by Thomas Skierlo about 6 years ago

This error even is triggered if all collations are set to utf8_unicode_ci (global, connection, db, table, fields) AND it is triggered even on an empty tt_content table of a fresh install.

It is extremely annoying since no sane person would go one step further with a product if having collation errors in the very beginning.

#6 Updated by Gerrit Code Review almost 6 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/28259

#7 Updated by Jigal van Hemert almost 6 years ago

The problem is in the collation of the database/table/column combined with the connection settings.

The MySQL manual mentions:
If you use CAST without specifying CHARACTER SET, the resulting character set and collation are defined by the character_set_connection and collation_connection system variables. http://dev.mysql.com/doc/refman/5.5/en/charset-convert.html

In the query media is first cast to DECIMAL and then back to CHAR. Because the decimal value has no character set and collation it will use the settings from the connection.
The connection has the setting UTF-8 and the default collation for the utf8 character set is utf8_general_ci. http://dev.mysql.com/doc/refman/5.5/en/charset-charsets.html

If we can set a collation while setting the character set of the connection in mysqli initialization, the problem can be solved in a nice way.

#8 Updated by Jigal van Hemert almost 6 years ago

Further investigation has given me a bit more insight in UTF-8 handling.

utf8_general_ci is a bit faster (ca. 10%) than utf8_unicode_ci. utf8_unicode_ci is more accurate in sorting and comparing (for non-western language, but also handling 'ß' and 'ss', 'Œ' and 'OE', etcetera).

The mysqli functions only allow to set the character set and not the collation, because the C API doesn't have functionality for this and it has nothing to do with the actions the client has to perform.

The utf8 character set in MySQL doesn't support all characters. Since MySQL 5.5.3 there is a utf8mb4 character set which supports a larger part of the unicode set.

By default TYPO3 uses utf8_general_ci. Any other collation on the server can be handled by using ['SYS']['setDBinit']. For example:
['SYS']['setDBinit'] = 'SET NAMES utf8 COLLATE utf8_unicode_ci;'

We'll have to see how we can add some form of support for this in the Install Tool.

#9 Updated by Markus Klein almost 6 years ago

So if I see that correctly, it is not unlikely we are forced to introduce the handing of different character sets again for utf8mb4. (at least towards the DB)

For the collation we'd need an additional field in the Step-Installer in step 1.

#10 Updated by Gerrit Code Review over 5 years ago

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/32117

#11 Updated by Gerrit Code Review over 5 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/32117

#12 Updated by Gerrit Code Review over 5 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/32117

#13 Updated by Gerrit Code Review over 5 years ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/32117

#14 Updated by Gerrit Code Review over 5 years ago

Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/32117

#15 Updated by Gerrit Code Review over 5 years ago

Patch set 1 for branch TYPO3_6-2 of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at http://review.typo3.org/32198

#16 Updated by Anonymous over 5 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100

#17 Updated by Riccardo De Contardi about 2 years ago

  • Status changed from Resolved to Closed

Also available in: Atom PDF