Feature #80398
Closed: Make default charset and collation for new tables configurable
100%
Description
To be able to store 4-byte Unicode characters, we need to set the database to utf8mb4. Since TYPO3 8 there is a configuration parameter for that, but it seems that it is not taken into account.
LocalConfiguration.php
'DB' => [
    'Connections' => [
        'Default' => [
            'charset' => 'utf8mb4',
            'dbname' => '--dbname--',
            'driver' => 'mysqli',
            'host' => '127.0.0.1',
            'password' => '--mypassword--',
            'port' => 3306,
            'user' => '--myuser--',
        ],
    ],
],
The create table statements do have a fallback, but they do not read from the configuration:
private function buildTableOptions(array $options)
{
    if (isset($options['table_options'])) {
        return $options['table_options'];
    }

    $tableOptions = array();

    // Charset
    if ( ! isset($options['charset'])) {
        $options['charset'] = 'utf8';
    }
    ....
}
The DatabaseConnection class does not read the charset configuration either; it takes utf8 as a default.
$connection = \Doctrine\DBAL\DriverManager::getConnection([
    'driver' => 'mysqli',
    'wrapperClass' => Connection::class,
    'host' => $host,
    'port' => (int)$this->databasePort,
    'unix_socket' => $this->databaseSocket,
    'user' => $this->databaseUsername,
    'password' => $this->databaseUserPassword,
    'charset' => $this->connectionCharset,
]);
It was stated that this would be fixed in CMS 8:
https://forge.typo3.org/issues/71454
Is this on the roadmap? Before LTS?
Updated by Marco von Arx over 7 years ago
The issue is not the DatabaseConnection class; the charset is read properly from the configuration there.
It seems that
TYPO3\CMS\Core\Database\Schema\ConnectionMigrator
or TYPO3\CMS\Core\Database\Schema\SchemaMigrator
does not read that configuration parameter.
I was able to work around it by adding the following
in TYPO3\CMS\Core\Database\Schema\ConnectionMigrator, line 1211:
$tableOptions = $table->getOptions();
$connectionParams = $connection->getParams();
if (isset($connectionParams['charset'])) {
    $tableOptions['charset'] = $connectionParams['charset'];
}
if (isset($connectionParams['collate'])) {
    $tableOptions['collate'] = $connectionParams['collate'];
}
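For experimenting outside TYPO3, the workaround above can be sketched as a standalone function. This is only an illustration: `applyConnectionDefaults()` is a hypothetical helper, not part of TYPO3 or Doctrine, and plain arrays stand in for the results of `$table->getOptions()` and `$connection->getParams()`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not TYPO3 or Doctrine API): copy 'charset' and
 * 'collate' from the connection parameters into the table options,
 * mirroring the workaround above.
 */
function applyConnectionDefaults(array $tableOptions, array $connectionParams): array
{
    foreach (['charset', 'collate'] as $key) {
        if (isset($connectionParams[$key])) {
            // A value set on the connection overrides the table's own option.
            $tableOptions[$key] = $connectionParams[$key];
        }
    }

    return $tableOptions;
}

// Example input: a table that would otherwise be created as utf8.
$options = applyConnectionDefaults(
    ['engine' => 'InnoDB', 'charset' => 'utf8'],
    ['charset' => 'utf8mb4', 'collate' => 'utf8mb4_unicode_ci']
);
```

Options that the connection does not define, such as `engine` here, are left untouched.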
Updated by Morton Jonuschat over 7 years ago
- Status changed from New to Needs Feedback
Hi!
I think you are mixing two concepts here. Also, I think the buildTableOptions() code example is from Doctrine, which is a third-party library and has no idea about the TYPO3 configuration.
1. Connection Charset
This defines the character set the client will use to send SQL statements to the server. It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)
2. Storage character set
This defines how the database stores data on disk and in memory. It is controlled by server, database, table, and column options, not by the connection charset.
If I understand your report correctly, you are looking for a way to tell TYPO3 to override the utf8 default character set (and collation?) for created tables?
Updated by Marco von Arx over 7 years ago
Hi Morton,
We need to store 4-byte Unicode characters such as emoji (see http://apps.timwhitlock.info/emoji/tables/unicode).
The default utf8 only allows storing characters of up to 3 bytes, so most emoji cannot be stored in utf8.
That's why I need the connection to be utf8mb4 and the database to create tables with charset utf8mb4 and collation utf8mb4_unicode_ci.
The first part does indeed work: TYPO3 connects with charset utf8mb4 if I set it in LocalConfiguration.php.
But how can I ensure that tables are created with the correct charset and collation during setup?
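To make the 3-byte versus 4-byte distinction concrete, here is a small sketch that checks whether a string contains characters outside the Basic Multilingual Plane, which are the ones that take 4 bytes in UTF-8. `needsUtf8mb4()` is a hypothetical helper name chosen for this illustration, not a TYPO3 function.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true if the UTF-8 string contains at least one
 * code point above U+FFFF, which is encoded as 4 bytes in UTF-8 and
 * therefore does not fit into MySQL's 3-byte utf8 charset.
 */
function needsUtf8mb4(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$euro = "\u{20AC}";      // EURO SIGN, inside the Basic Multilingual Plane
$grinning = "\u{1F600}"; // GRINNING FACE emoji, outside of it

// strlen() counts bytes, not characters.
$euroBytes = strlen($euro);
$grinningBytes = strlen($grinning);
```

Here `$euroBytes` is 3 and `$grinningBytes` is 4, so only the emoji needs utf8mb4.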
Updated by Morton Jonuschat over 7 years ago
- Status changed from Needs Feedback to New
- Priority changed from Must have to Should have
- Target version set to Candidate for Major Version
This is a new feature which could be implemented for TYPO3 9.0. Doing it via the connection parameters is not the preferred way, as the connection and the tablespace are two different things.
Also, this needs to be supported across multiple database connections and database engines.
Updated by Morton Jonuschat over 7 years ago
- Tracker changed from Bug to Feature
- Subject changed from connection charset ignored to Make default charset and collation for new tables configurable
Updated by Marco von Arx over 7 years ago
Symfony has a separate config parameter for table options (http://symfony.com/doc/current/doctrine.html):
doctrine:
    dbal:
        charset: utf8mb4
        default_table_options:
            charset: utf8mb4
            collate: utf8mb4_unicode_ci
As a suggestion for TYPO3:
'DB' => [
    'Connections' => [
        'Default' => [
            'charset' => 'utf8mb4',
            'dbname' => '--dbname--',
            'driver' => 'mysqli',
            'host' => '127.0.0.1',
            'password' => '--mypassword--',
            'port' => 3306,
            'user' => '--myuser--',
            'tableoptions' => [
                'charset' => 'utf8mb4',
                'collate' => 'utf8mb4_unicode_ci'
            ]
        ],
    ],
],
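A rough sketch of how such a per-connection `tableoptions` key could be consumed follows. Both functions are hypothetical and written only for this illustration; they are not the code TYPO3 or Doctrine actually uses. The sketch assumes a fallback to utf8 when nothing is configured, and renders the result as MySQL table options.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: read the suggested 'tableoptions' key for one
 * connection, falling back to the utf8 default when it is absent.
 */
function resolveTableOptions(array $dbConfig, string $connectionName = 'Default'): array
{
    $defaults = ['charset' => 'utf8'];
    $configured = $dbConfig['Connections'][$connectionName]['tableoptions'] ?? [];

    // Configured values win over the defaults.
    return array_merge($defaults, $configured);
}

/**
 * Hypothetical helper: render the options as the table option clause
 * of a MySQL CREATE TABLE statement.
 */
function renderMySqlTableOptions(array $options): string
{
    $sql = 'DEFAULT CHARACTER SET ' . $options['charset'];
    if (isset($options['collate'])) {
        $sql .= ' COLLATE ' . $options['collate'];
    }

    return $sql;
}

$dbConfig = [
    'Connections' => [
        'Default' => [
            'charset' => 'utf8mb4',
            'tableoptions' => [
                'charset' => 'utf8mb4',
                'collate' => 'utf8mb4_unicode_ci',
            ],
        ],
    ],
];

$clause = renderMySqlTableOptions(resolveTableOptions($dbConfig));
```

Looking the options up by connection name keeps the connection charset and the storage charset separate, and allows different defaults for each configured connection.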
Updated by Tymoteusz Motylewski almost 7 years ago
Please keep in mind that utf8mb4 uses up to 4 bytes per character, while the "standard" utf8 charset uses up to 3 bytes per character, which means that created indexes might exceed the maximum key length limit in MySQL.
E.g. by default the maximum key size is 767 bytes, which is enough for varchar(255) in utf8 (255*3 = 765), but is exceeded by varchar(255) in utf8mb4 (255*4 = 1020).
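The arithmetic can be checked with a short sketch. The function names are made up for this example, and the 767 limit is simply the figure quoted in this comment, treated as a byte count.

```php
<?php
declare(strict_types=1);

// Worst-case bytes per character for each charset.
const BYTES_PER_CHAR = ['utf8' => 3, 'utf8mb4' => 4];

// Key size limit quoted in this comment, assumed to be in bytes.
const MAX_KEY_BYTES = 767;

/** Worst-case index size in bytes for a varchar($length) column. */
function indexBytes(int $length, string $charset): int
{
    return $length * BYTES_PER_CHAR[$charset];
}

/** Whether a full-length index on varchar($length) stays within the limit. */
function fitsInKey(int $length, string $charset): bool
{
    return indexBytes($length, $charset) <= MAX_KEY_BYTES;
}

/** Longest varchar length that can be fully indexed within the limit. */
function maxIndexableLength(string $charset): int
{
    return intdiv(MAX_KEY_BYTES, BYTES_PER_CHAR[$charset]);
}
```

With these numbers, `maxIndexableLength('utf8mb4')` is 191, so indexed varchar(255) columns would need a shorter length or an index prefix to stay within the limit.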
Updated by Tymoteusz Motylewski over 6 years ago
- Related to Bug #82551: Upgrade Wizard Deadlock added
Updated by Tymoteusz Motylewski over 6 years ago
- Related to Bug #82080: Indexes too large for some tables with utf8mb4 added
Updated by Tymoteusz Motylewski over 6 years ago
FYI, MySQL 8 will come with utf8mb4 as default charset
Updated by Gerrit Code Review over 6 years ago
- Status changed from New to Under Review
Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Lienhart Woitok over 6 years ago
I have pushed a change to Gerrit which implements the config suggestion by Marco von Arx. I'm not entirely sure I found all the relevant places to change, but in my tests this worked for the database analyzer in the install tool: newly created tables are generated with utf8mb4.
Updated by David Henninger over 6 years ago
- Has duplicate Bug #85524: Charset for DB Connections in LocalConfiguration.php ignored added
Updated by Riccardo De Contardi over 6 years ago
- Related to Feature #71454: Allow setting Connection Charset added
Updated by Gerrit Code Review about 6 years ago
Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Gerrit Code Review about 6 years ago
Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Tymoteusz Motylewski about 6 years ago
- Target version changed from Candidate for Major Version to 9 LTS
Updated by Gerrit Code Review about 6 years ago
Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Gerrit Code Review about 6 years ago
Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Gerrit Code Review about 6 years ago
Patch set 6 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Gerrit Code Review about 6 years ago
Patch set 7 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Gerrit Code Review about 6 years ago
Patch set 8 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Gerrit Code Review about 6 years ago
Patch set 9 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/56440
Updated by Lienhart Woitok about 6 years ago
- Status changed from Under Review to Resolved
- % Done changed from 0 to 100
Applied in changeset ed806ef550a63d9034bf4edba8b38b92b1fd71ed.
Updated by Lienhart Woitok about 6 years ago
- File typo3-utf8mb4-0.png added
- File typo3-utf8mb4-1.png added
- File typo3-utf8mb4-2.png added
As requested by Tymoteusz Motylewski, here are some demonstration screenshots of utf8mb4 support in content (using the introduction package). For the first screenshot, with normal utf8 (utf8mb3), I added the heart again to demonstrate the failed content; it wasn't there after saving because it couldn't be written to the database.
Updated by Helmut Hummel about 6 years ago
- Related to Bug #86793: Renamed columns are not correctly detected by database schema diff added
Updated by Jeff C over 3 years ago
- Has duplicate deleted (Bug #85524: Charset for DB Connections in LocalConfiguration.php ignored)
Updated by Stefan Bürk over 2 years ago
- Related to Bug #97961: Transform `tableoptions` early to valid `doctrine/dbal` option added
Updated by Stefan Bürk about 1 month ago
- Related to Task #105289: Mitigate deprecated Doctrine DBAL connection options added
Updated by Stefan Bürk about 1 month ago
- Related to Task #105297: Deprecate `tableoptions` and `collate` connection configuration added