Project

General

Profile

Actions

Bug #79397

closed

SQL error in ReferenceIndexing

Added by Thomas Hohn almost 8 years ago. Updated about 7 years ago.

Status:
Closed
Priority:
Must have
Assignee:
Category:
Database API (Doctrine DBAL)
Target version:
Start date:
2017-01-20
Due date:
% Done:

100%

Estimated time:
TYPO3 Version:
8
PHP Version:
7.0
Tags:
Complexity:
Is Regression:
No
Sprint Focus:

Description

Just with a relative small amount and tables the ReferenceIndexing crashes with an SQL error.
This seems to be due to the fact that the method updateIndex uses the NOT IN () in 2 SQL statements.

There is a limit on the number of elements in a IN clause that can be exceeded and in addition a better
and more efficient way is to use the AND NOT EXISTS (SUB-QUERY)

Actions #1

Updated by Gerrit Code Review almost 8 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #2

Updated by Gerrit Code Review almost 8 years ago

Patch set 2 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #3

Updated by Gerrit Code Review almost 8 years ago

Patch set 3 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #4

Updated by Morton Jonuschat almost 8 years ago

This comment is in relation to Patchset #3 in the review system. As it doesn't allow for formatting text/code I put this explanation here.

Base configuration

This is done with reduced tables to ease preparing data.

Your MySQL connection id is 827
Server version: 5.7.16-0ubuntu0.16.04.1 (Ubuntu)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> DESCRIBE table1; SELECT * FROM table1;
+------------+-----------+------+-----+-------------------+----------------+
| Field      | Type      | Null | Key | Default           | Extra          |
+------------+-----------+------+-----+-------------------+----------------+
| uid        | int(11)   | NO   | PRI | NULL              | auto_increment |
| created_at | timestamp | NO   |     | CURRENT_TIMESTAMP |                |
+------------+-----------+------+-----+-------------------+----------------+
2 rows in set (0.00 sec)

+-----+---------------------+
| uid | created_at          |
+-----+---------------------+
|   1 | 2017-01-21 07:59:28 |
|   2 | 2017-01-21 07:59:33 |
|   3 | 2017-01-21 07:59:35 |
|   4 | 2017-01-21 09:01:40 |
+-----+---------------------+
4 rows in set (0.00 sec)

mysql> DESCRIBE table2; SELECT * FROM table2;
+------------+-----------+------+-----+-------------------+----------------+
| Field      | Type      | Null | Key | Default           | Extra          |
+------------+-----------+------+-----+-------------------+----------------+
| uid        | int(11)   | NO   | PRI | NULL              | auto_increment |
| created_at | timestamp | NO   |     | CURRENT_TIMESTAMP |                |
+------------+-----------+------+-----+-------------------+----------------+
2 rows in set (0.00 sec)

+-----+---------------------+
| uid | created_at          |
+-----+---------------------+
|   1 | 2017-01-21 07:59:47 |
|   3 | 2017-01-21 07:59:51 |
+-----+---------------------+
2 rows in set (0.01 sec)

mysql> DESCRIBE table3; SELECT * FROM table3;
+------------+-----------+------+-----+-------------------+----------------+
| Field      | Type      | Null | Key | Default           | Extra          |
+------------+-----------+------+-----+-------------------+----------------+
| uid        | int(11)   | NO   | PRI | NULL              | auto_increment |
| created_at | timestamp | NO   |     | CURRENT_TIMESTAMP |                |
+------------+-----------+------+-----+-------------------+----------------+
2 rows in set (0.01 sec)

+-----+---------------------+
| uid | created_at          |
+-----+---------------------+
|   1 | 2017-01-21 07:59:28 |
|   2 | 2017-01-21 07:59:33 |
+-----+---------------------+
2 rows in set (0.00 sec)

mysql> DESCRIBE sys_refindex; SELECT * FROM sys_refindex;
+-----------+--------------+------+-----+---------+----------------+
| Field     | Type         | Null | Key | Default | Extra          |
+-----------+--------------+------+-----+---------+----------------+
| uid       | int(11)      | NO   | PRI | NULL    | auto_increment |
| recuid    | int(11)      | NO   |     | 0       |                |
| tablename | varchar(255) | NO   |     |         |                |
+-----------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

+-----+--------+-----------+
| uid | recuid | tablename |
+-----+--------+-----------+
|   1 |      1 | table1    |
|   2 |      2 | table1    |
|   3 |      3 | table1    |
|   4 |      4 | table1    |
|   5 |      2 | table2    |
|   6 |      3 | table2    |
|   7 |      1 | table3    |
|   8 |      2 | table3    |
|   9 |      3 | table3    |
|  10 |      4 | table3    |
+-----+--------+-----------+
10 rows in set (0.00 sec)

Expected outcome

  • no records from sys_refindex need to be deleted for table1
  • the record with uid 5 needs to be deleted from sys_refindex for table2 as there is no record with uid 2 in table2
  • the records with uid 9 and 10 need to be deleted from sys_refindex for table3 as there is no record with uid 3 or 4 in table3

Current outcome

Evaluation the subquery

First observation: It's not working at all - needs fixing in multiple places.

mysql> SELECT uid FROM table1 sub_table1 WHERE sub_table1.uid = sys_refindex.recuid;
ERROR 1054 (42S22): Unknown column 'sys_refindex.recuid' in 'where clause'

mysql> SELECT uid FROM table1 sub_table1,sys_refindex WHERE sub_table1.uid = sys_refindex.recuid;
ERROR 1052 (23000): Column 'uid' in field list is ambiguous

Checking for "lost indexes"

mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table1' AND NOT EXISTS (SELECT sub_table1.uid FROM table1 sub_table1,sys_refindex WHERE sub_table1.uid = sys_refindex.recuid);
+------------+
| COUNT(uid) |
+------------+
|          0 |
+------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table2' AND NOT EXISTS (SELECT sub_table2.uid FROM table2 sub_table2,sys_refindex WHERE sub_table2.uid = sys_refindex.recuid);
+------------+
| COUNT(uid) |
+------------+
|          0 |
+------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table3' AND NOT EXISTS (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid);
+------------+
| COUNT(uid) |
+------------+
|          0 |
+------------+
1 row in set (0.00 sec)

This is a bit unexpected. There should be work to be done for table2 and table3.

Debugging the subquery

mysql> SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid;
+-----+
| uid |
+-----+
|   1 |
|   2 |
|   2 |
|   1 |
|   2 |
+-----+
5 rows in set (0.00 sec)

That's more rows than expected - restricting on the tablename is missing.

mysql> SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3';
+-----+
| uid |
+-----+
|   1 |
|   2 |
+-----+
2 rows in set (0.00 sec)

2 rows is what we want and those are the uids which should be part of sys_refindex.

mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table3' AND NOT EXISTS (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3');
+------------+
| COUNT(uid) |
+------------+
|          0 |
+------------+
1 row in set (0.00 sec)

Wait, what? Still not finding what we are looking for!

NOT EXISTS vs NOT IN

According to the docs: NOT EXISTS (<subquery>) is FALSE if the subquery returns any rows at all.
Interpretation: The subquery returns two rows in the example, so the NOT EXISTS part evaluates to FALSE.
This basically reduces constraint to WHERE tablename='table3' AND FALSE.
Obviously nothing to count here.

Compare that to the NOT IN behavior which actually checks for records that are not in the result set of the subquery:

mysql> SELECT COUNT(uid) FROM sys_refindex WHERE tablename='table3' AND recuid NOT IN (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3');
+------------+
| COUNT(uid) |
+------------+
|          2 |
+------------+
1 row in set (0.00 sec)

Finally, the expected result, we need to get rid of the records with uids 9 and 10, so a count of two is what we want.
Let's validate that we are selecting those records:

mysql> SELECT * FROM sys_refindex WHERE tablename='table3' AND recuid NOT IN (SELECT sub_table3.uid FROM table3 sub_table3,sys_refindex WHERE sub_table3.uid = sys_refindex.recuid AND sys_refindex.tablename = 'table3');
+-----+--------+-----------+
| uid | recuid | tablename |
+-----+--------+-----------+
|   9 |      3 | table3    |
|  10 |      4 | table3    |
+-----+--------+-----------+
2 rows in set (0.00 sec)

YAY! This is what we want to delete.

Actions #5

Updated by Gerrit Code Review almost 8 years ago

Patch set 4 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #6

Updated by Gerrit Code Review almost 8 years ago

Patch set 5 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #7

Updated by Gerrit Code Review almost 8 years ago

Patch set 6 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #8

Updated by Gerrit Code Review almost 8 years ago

Patch set 7 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51370

Actions #9

Updated by Thomas Hohn almost 8 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100
Actions #10

Updated by Riccardo De Contardi about 7 years ago

  • Status changed from Resolved to Closed
Actions

Also available in: Atom PDF