Project

General

Profile

Actions

Task #80014

closed

Remove extbase ensureCorrectPostOrderingByCategoryTitle test

Added by Morton Jonuschat about 7 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
Must have
Category:
Tests
Target version:
Start date:
2017-02-25
Due date:
% Done:

100%

Estimated time:
TYPO3 Version:
8
PHP Version:
7.1
Tags:
Complexity:
Sprint Focus:
Stabilization Sprint

Description

The test is based on the assumption that a relational database can easily sort across a M:N relation. This is not the case, sorting across M:N relations using a series of joins is an indeterministic operation at best and not supported by any standard compliant database.

The reason for this is that GROUP BY evaluation happens before ORDER BY. Under the assumption that a Post has two categories assigned the row which should be the result of the aggregation can not reliably be determined by the database and a random record based on the joined rows would have to be chosen. In one query this could be the Post joined to the first category, in another the Post that is joined to the second category.

As this is not deterministic this will lead to failures on most databases. MySQL has an operating mode which enables this random selection (which was enabled by default until MySQL 5.7.5) and as such would execute the query without a hard error.

To exemplify this assume the following join result: (before grouping and ordering)

post.uid | post.title | category.title
--------------------------------------
4        | Post 4     | Cat 2
3        | Post 3     | Cat 2
1        | Post 1     | Cat 1
2        | Post 2     | Cat 5
4        | Post 4     | Cat 3
4        | Post 4     | Cat 1

What would the correct reduction be after the grouping step?
First rows?

post.uid | post.title | category.title
--------------------------------------
4        | Post 4     | Cat 2
3        | Post 3     | Cat 2
1        | Post 1     | Cat 1
2        | Post 2     | Cat 5

Last rows?

post.uid | post.title | category.title
--------------------------------------
3        | Post 3     | Cat 2
1        | Post 1     | Cat 1
2        | Post 2     | Cat 5
4        | Post 4     | Cat 1

A random one?

post.uid | post.title | category.title
--------------------------------------
3        | Post 3     | Cat 2
1        | Post 1     | Cat 1
2        | Post 2     | Cat 5
4        | Post 4     | Cat 3

Given the uncertainties in this step the outcome of the order by could be that Post 4 is listed with Cat 1,2 or 3 and all would be "correct" by definition as there is no rule as to how to reduce this.

As this is a limitation of the database engines/the undefined expected outcome I would remove this test entirely.

If this works on a MySQL database that has the partial group by extension activated and the user can live with possibly random results give them the results, otherwise let the query fail with an appropriate message from the database:

MySQL:

TYPO3\CMS\Extbase\Persistence\Generic\Storage\Exception\SqlErrorException: Expression #1 of ORDER BY clause is not in GROUP BY clause and contains nonaggregated column 'test_ftb66966d.sys_category.title' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

PostgreSQL:

TYPO3\CMS\Extbase\Persistence\Generic\Storage\Exception\SqlErrorException: SQLSTATE[42803]: Grouping error: 7 ERROR:  column "sys_category.title" must appear in the GROUP BY clause or be used in an aggregate function


Related issues 2 (0 open2 closed)

Related to TYPO3 Core - Bug #79988: Fix ensureCorrectPostOrderingByCategoryTitle extbase test on postgresClosed2017-02-24

Actions
Related to TYPO3 Core - Feature #80380: Missing Distinct / GroupBy option in QueryClosed2017-03-21

Actions
Actions #1

Updated by Gerrit Code Review about 7 years ago

  • Status changed from New to Under Review

Patch set 1 for branch master of project Packages/TYPO3.CMS has been pushed to the review server.
It is available at https://review.typo3.org/51844

Actions #2

Updated by Christian Kuhn about 7 years ago

  • Sprint Focus set to Stabilization Sprint
Actions #3

Updated by Morton Jonuschat about 7 years ago

  • Status changed from Under Review to Resolved
  • % Done changed from 0 to 100
Actions #4

Updated by Markus Klein about 7 years ago

For the record. The issue actually is that the doctrine migration for the MM part is again the reason and is therefore the 3rd regression to the migration patch!
The former distinct was changed to a groupby uid, which leaves the orderby out of scope and causes this problem.
The test itself is therefore correct, but the underlying extbase query parser is the problem.

Actions #5

Updated by Morton Jonuschat about 7 years ago

In this situation DISTINCT and GROUP BY are identical and the behavior after the fix is not a regression but by design of the standard. It only ever worked because ONLY_FULL_GROUP_BY was not enabled by default.

Quoting from https://dev.mysql.com/doc/refman/5.7/en/distinct-optimization.html

In most cases, a DISTINCT clause can be considered as a special case of GROUP BY. For example, the following two queries are equivalent:

SELECT DISTINCT c1, c2, c3 FROM t1
WHERE c1 > const;

SELECT c1, c2, c3 FROM t1
WHERE c1 > const GROUP BY c1, c2, c3;

Quoting from https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html

In MySQL 5.7.5 and higher, ONLY_FULL_GROUP_BY also affects handling of queries that use DISTINCT and ORDER BY. Consider the case of a table t with three columns c1, c2, and c3 that contains these rows:

c1 c2 c3
1  2  A
3  4  B
1  2  C

Suppose that we execute the following query, expecting the results to be ordered by c3:
SELECT DISTINCT c1, c2 FROM t ORDER BY c3;

To order the result, duplicates must be eliminated first. But to do so, should we keep the first row or the third? This arbitrary choice influences the retained value of c3, which in turn influences ordering and makes it arbitrary as well. To prevent this problem, a query that has DISTINCT and ORDER BY is rejected as invalid if any ORDER BY expression does not satisfy at least one of these conditions:
  • The expression is equal to one in the select list
  • All columns referenced by the expression and belonging to the query's selected tables are elements of the select list
Actions #6

Updated by Christian Kuhn almost 7 years ago

  • Related to Feature #80380: Missing Distinct / GroupBy option in Query added
Actions #7

Updated by Riccardo De Contardi over 6 years ago

  • Status changed from Resolved to Closed
Actions

Also available in: Atom PDF