Bug #79397
closedSQL error in ReferenceIndexing
100%
Description
Just with a relative small amount and tables the ReferenceIndexing crashes with an SQL error.
This seems to be due to the fact that the method updateIndex uses the NOT IN () in 2 SQL statements.
There is a limit on the number of elements in a IN clause that can be exceeded and in addition a better
and more efficient way is to use the AND NOT EXISTS (SUB-QUERY)
Updated by Gerrit Code Review almost 8 years ago
- Status changed from New to Under Review
Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Gerrit Code Review almost 8 years ago
Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Gerrit Code Review almost 8 years ago
Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Morton Jonuschat almost 8 years ago
This comment is in relation to Patchset #3 in the review system. As it doesn't allow for formatting text/code I put this explanation here.
Base configuration¶
This is done with reduced tables to ease preparing data.
Your MySQL connection id is 827 Server version: 5.7.16-0ubuntu0.16.04.1 (Ubuntu) Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> DESCRIBE table1; SELECT * FROM table1; +------------+-----------+------+-----+-------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+-----------+------+-----+-------------------+----------------+ | uid | int(11) | NO | PRI | NULL | auto_increment | | created_at | timestamp | NO | | CURRENT_TIMESTAMP | | +------------+-----------+------+-----+-------------------+----------------+ 2 rows in set (0.00 sec) +-----+---------------------+ | uid | created_at | +-----+---------------------+ | 1 | 2017-01-21 07:59:28 | | 2 | 2017-01-21 07:59:33 | | 3 | 2017-01-21 07:59:35 | | 4 | 2017-01-21 09:01:40 | +-----+---------------------+ 4 rows in set (0.00 sec) mysql> DESCRIBE table2; SELECT * FROM table2; +------------+-----------+------+-----+-------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+-----------+------+-----+-------------------+----------------+ | uid | int(11) | NO | PRI | NULL | auto_increment | | created_at | timestamp | NO | | CURRENT_TIMESTAMP | | +------------+-----------+------+-----+-------------------+----------------+ 2 rows in set (0.00 sec) +-----+---------------------+ | uid | created_at | +-----+---------------------+ | 1 | 2017-01-21 07:59:47 | | 3 | 2017-01-21 07:59:51 | +-----+---------------------+ 2 rows in set (0.01 sec) mysql> DESCRIBE table3; SELECT * FROM table3; +------------+-----------+------+-----+-------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+-----------+------+-----+-------------------+----------------+ | uid | int(11) | NO | PRI | NULL | auto_increment | | created_at | timestamp | NO | | CURRENT_TIMESTAMP | | +------------+-----------+------+-----+-------------------+----------------+ 2 rows in set (0.01 sec) +-----+---------------------+ | uid | created_at | +-----+---------------------+ | 1 | 2017-01-21 07:59:28 | | 2 | 2017-01-21 07:59:33 | +-----+---------------------+ 2 rows in set (0.00 sec) mysql> DESCRIBE sys_refindex; SELECT * FROM sys_refindex; +-----------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------+--------------+------+-----+---------+----------------+ | uid | int(11) | NO | PRI | NULL | auto_increment | | recuid | int(11) | NO | | 0 | | | tablename | varchar(255) | NO | | | | +-----------+--------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec) +-----+--------+-----------+ | uid | recuid | tablename | +-----+--------+-----------+ | 1 | 1 | table1 | | 2 | 2 | table1 | | 3 | 3 | table1 | | 4 | 4 | table1 | | 5 | 2 | table2 | | 6 | 3 | table2 | | 7 | 1 | table3 | | 8 | 2 | table3 | | 9 | 3 | table3 | | 10 | 4 | table3 | +-----+--------+-----------+ 10 rows in set (0.00 sec)
Expected outcome¶
- no records from sys_refindex need to be deleted for table1
- the record with uid 5 needs to be deleted from sys_refindex for table2 as there is no record with uid 2 in table2
- the records with uid 9 and 10 need to be deleted from sys_refindex for table3 as there is no record with uid 3 or 4 in table3
Current outcome¶
Evaluation the subquery¶
First observation: It's not working at all - needs fixing in multiple places.
mysql> SELECT uid FROM table1 sub_table1 WHERE sub_table1.uid = sys_refindex.recuid; ERROR 1054 (42S22): Unknown column 'sys_refindex.recuid' in 'where clause' mysql> SELECT uid FROM table1 sub_table1,sys_refindex WHERE sub_table1.uid = sys_refindex.recuid; ERROR 1052 (23000): Column 'uid' in field list is ambiguous
Checking for "lost indexes"¶
mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table1' AND NOT EXISTS (SELECT sub_table1.uid FROM table1 sub_table1,sys_refindex WHERE sub_table1.uid = sys_refindex.recuid); +------------+ | COUNT(uid) | +------------+ | 0 | +------------+ 1 row in set (0.00 sec) mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table2' AND NOT EXISTS (SELECT sub_table2.uid FROM table2 sub_table2,sys_refindex WHERE sub_table2.uid = sys_refindex.recuid); +------------+ | COUNT(uid) | +------------+ | 0 | +------------+ 1 row in set (0.00 sec) mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table3' AND NOT EXISTS (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid); +------------+ | COUNT(uid) | +------------+ | 0 | +------------+ 1 row in set (0.00 sec)
This is a bit unexpected. There should be work to be done for table2 and table3.
Debugging the subquery¶
mysql> SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid; +-----+ | uid | +-----+ | 1 | | 2 | | 2 | | 1 | | 2 | +-----+ 5 rows in set (0.00 sec)
That's more rows than expected - restricting on the tablename is missing.
mysql> SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3'; +-----+ | uid | +-----+ | 1 | | 2 | +-----+ 2 rows in set (0.00 sec)
2 rows is what we want and those are the uids which should be part of sys_refindex.
mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table3' AND NOT EXISTS (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3'); +------------+ | COUNT(uid) | +------------+ | 0 | +------------+ 1 row in set (0.00 sec)
Wait, what? Still not finding what we are looking for!
NOT EXISTS vs NOT IN¶
According to the docs: NOT EXISTS (<subquery>) is FALSE if the subquery returns any rows at all.
Interpretation: The subquery returns two rows in the example, so the NOT EXISTS part evaluates to FALSE.
This basically reduces constraint to WHERE tablename='table3' AND FALSE.
Obviously nothing to count here.
Compare that to the NOT IN behavior which actually checks for records that are not in the result set of the subquery:
mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table3' AND recuid NOT IN (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3'); +------------+ | COUNT(uid) | +------------+ | 2 | +------------+ 1 row in set (0.00 sec)
Finally, the expected result, we need to get rid of the records with uids 9 and 10, so a count of two is what we want.
Let's validate that we are selecting those records:
mysql> SELECT * FROM sys_refindex WHERE tablename='table3' AND recuid NOT IN (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3'); +-----+--------+-----------+ | uid | recuid | tablename | +-----+--------+-----------+ | 9 | 3 | table3 | | 10 | 4 | table3 | +-----+--------+-----------+ 2 rows in set (0.00 sec)
YAY! This is what we want to delete.
Updated by Gerrit Code Review almost 8 years ago
Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Gerrit Code Review almost 8 years ago
Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Gerrit Code Review almost 8 years ago
Patch set 6 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Gerrit Code Review almost 8 years ago
Patch set 7 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370
Updated by Thomas Hohn almost 8 years ago
- Status changed from Under Review to Resolved
- % Done changed from 0 to 100
Applied in changeset 40299330d47144dc9589c024a071d0e3d97b5a54.
Updated by Riccardo De Contardi about 7 years ago
- Status changed from Resolved to Closed