Feature #7848

Support array / multiple values in $query->contains

Added by Morton Jonuschat over 7 years ago. Updated 12 months ago.

Status:
Closed
Priority:
Should have
Assignee:
-
Category:
Extbase
Target version:
-
Start date:
2010-05-21
Due date:
2010-06-04
% Done:

0%

PHP Version:
Tags:
Complexity:
medium
Sprint Focus:

Description

The contains constraint should accept an Array/ObjectStorage/Traversable as condition so that building a query like this is possible

$query->matching(
$query->contains('relation', array(1,2,3,4,5))
);

Currently the only way is to build an array of $query->contains('relation', $relation) and combine that with a logicalOr condition.

Looking at the resulting SQL this leads to an excessive and unnecessary amount of subqueries.


Related issues

Related to Core - Bug #64044: Revert fix for #63275 until concept is worked out and agreed upon Resolved 2014-12-26
Duplicated by Core - Bug #63275: Typo3DbQueryParser generates broken SQL for array in CONTAINS-comparison Closed 2014-11-23

Associated revisions

Revision e73f204d (diff)
Added by Helmut Hummel over 2 years ago

Revert "[BUGFIX] Typo3DbQueryParser: Use IN with array-operand2"

Reverts: #63275
Resolves: #64044
Related: #7848

This reverts commit e235c9ad8b32267ece98d843a43daeab430db235.

Change-Id: I71a134469b9b44f892ad13be52d828ca653c03f1
Reviewed-on: http://review.typo3.org/35648
Reviewed-by: Helmut Hummel <>
Tested-by: Helmut Hummel <>

History

#1 Updated by Felix Oertel over 7 years ago

  • Due date set to 2010-06-04
  • Status changed from New to Needs Feedback
  • Start date changed from 2010-05-20 to 2010-05-21

Hi,

what exactly do you try to achieve with your statment, or better: what should the sql output of $query->contains('relation', array(1,2,3,4,5)) be? At first glance, the sql generated by this would not differ from those built with logicalOr or did I miss a way to simplify that statement?

regards, foertel

#2 Updated by Christian Müller over 7 years ago

Felix Oertel wrote:

Hi,

what exactly do you try to achieve with your statment, or better: what should the sql output of $query->contains('relation', array(1,2,3,4,5)) be? At first glance, the sql generated by this would not differ from those built with logicalOr or did I miss a way to simplify that statement?

regards, foertel

Maybe check also $query->in() as that could be exactly what you want.

regards,
Christian

#3 Updated by Morton Jonuschat over 7 years ago

Ok, given I have two tables linked by a M:M relation the logicalOr / array of constraints results in something like this:

SELECT tx_extension_domain_model_parent.*
FROM tx_extension_domain_model_parent
WHERE (
  tx_extension_domain_model_parent.uid
  IN (
    SELECT uid_local FROM tx_extension_domain_model_parent_child_mm WHERE uid_foreign =20
  )
  OR tx_extension_domain_model_parent.uid
  IN (
    SELECT uid_local FROM tx_extension_domain_model_parent_child_mm WHERE uid_foreign =19
  )
)
AND tx_extension_domain_model_parent.deleted =0
AND tx_extension_domain_model_parent.hidden =0
AND tx_extension_domain_model_parent.pid IN ( 27 )

So this results in N+1 queries being performed by the database.

I would like something like this to be the result when I pass in an array / collection of objects:

SELECT tx_extension_domain_model_parent.*
FROM tx_extension_domain_model_parent
WHERE tx_extension_domain_model_parent.uid IN (
  SELECT uid_local FROM tx_extension_domain_model_parent_child_mm WHERE uid_foreign IN (20,19)
)
AND tx_extension_domain_model_parent.deleted =0
AND tx_extension_domain_model_parent.hidden =0
AND tx_extension_domain_model_parent.pid IN ( 27 )

$query->in() is a partial solution, but it means that I need to (manually) reduce collection of model objects ( I already have ) into an array of say UIDs and then use $query->in('relation.uid', $array_of_uids). This seems redundant and counter-inituitive. Why would I need to manually perform these steps when even the choice of the constraint name (contains) implies querying a subset. The way it currently works a better name would be "is()", but that's only semantics.

#4 Updated by Christian Müller over 7 years ago

Ok, I agree that should be handled better... Anyone else has some idea how to archive that with current code? Otherwise I would take it to make a patch for having arrays too in contains.

Morton Jonuschat wrote:

Ok, given I have two tables linked by a M:M relation the logicalOr / array of constraints results in something like this:
[...]

So this results in N+1 queries being performed by the database.

I would like something like this to be the result when I pass in an array / collection of objects:

[...]

$query->in() is a partial solution, but it means that I need to (manually) reduce collection of model objects ( I already have ) into an array of say UIDs and then use $query->in('relation.uid', $array_of_uids). This seems redundant and counter-inituitive. Why would I need to manually perform these steps when even the choice of the constraint name (contains) implies querying a subset. The way it currently works a better name would be "is()", but that's only semantics.

#5 Updated by Sebastian Kurfuerst over 7 years ago

Hi everybody,

Ok, I agree that should be handled better... Anyone else has some idea how to archive that with current code? Otherwise I would take it to make a patch for having arrays too in contains.

I think just using an array in contains() is not really possible, as one cannot know which case is meant:

Example: $query->contains('relation', array(1,2,3,4,5));

This could mean:
  • Find elements where relation is 1, or 2, or 3, ... (this is what you suggested)
  • Find elements where there exists a relation to 1, AND to 2, AND to 3. (this is also a valid possibility in an M:N relation case).

... and Extbase can not resolve this ambiguity.

$query->in() is a partial solution, but it means that I need to (manually) reduce collection of model objects ( I already have ) into an array of say UIDs and then use $query->in('relation.uid', $array_of_uids).

Are you sure? I think you can directly use the model objects like $query->in('relation', $array_of_model_objects), and Extbase should figure out the UID stuff by itself.
If that is not supported, I'd rather suggest to include support for this, as we don't have such an ambiquity in this case.

Greets,
Sebastian

#6 Updated by Sebastian Kurfuerst over 7 years ago

Ah, one more thing: When we change behavior here, we should open a ticket in the FLOW3 issue tracker, so Karsten can implement the corresponding functionality in the FLOW3 query language. Keeping these two parts in sync is crucial.

#7 Updated by Christian Müller over 7 years ago

Sebastian Kurfuerst wrote:

Ah, one more thing: When we change behavior here, we should open a ticket in the FLOW3 issue tracker, so Karsten can implement the corresponding functionality in the FLOW3 query language. Keeping these two parts in sync is crucial.

Good point about the "AND" case.
Yes, something like that with $query->in() is what I hoped for... If its not there that would be a much cleaner way to go. This would do for the "OR" case, what about the "AND" case of relations?

#8 Updated by Morton Jonuschat over 7 years ago

1)

$query->in('relation', $array_of_model_objects)
didn't work for me, it was one the first things I tried. I needed to reduce the array_of_model_objects to uids and query the relation uid.

2) I think there's a logical flaw in your "AND" case. I can't find a case where you will construct something where there is an "AND" condition on the child side of the relation with only a single attribute being queried. Any example I can come up with results in an empty result set. In my opinion the "AND" case happens on the parent side of the relation - which means it's an extra condition on the query and the current logicalAnd is the way to go. I think AND only comes into play if you are searching a parent object where there's a child with attr=value1 AND there's a child with attr=value2.

#9 Updated by Christian Müller about 7 years ago

Morton Jonuschat wrote:

1) [...] didn't work for me, it was one the first things I tried. I needed to reduce the array_of_model_objects to uids and query the relation uid.

2) I think there's a logical flaw in your "AND" case. I can't find a case where you will construct something where there is an "AND" condition on the child side of the relation with only a single attribute being queried. Any example I can come up with results in an empty result set. In my opinion the "AND" case happens on the parent side of the relation - which means it's an extra condition on the query and the current logicalAnd is the way to go. I think AND only comes into play if you are searching a parent object where there's a child with attr=value1 AND there's a child with attr=value2.

What if you search for a parent that has relation to two childs, so something like that (just modified copy&paste, not optimized):

SELECT tx_extension_domain_model_parent.*
FROM tx_extension_domain_model_parent
WHERE (
  tx_extension_domain_model_parent.uid
  IN (
    SELECT uid_local FROM tx_extension_domain_model_parent_child_mm WHERE uid_foreign =20
  )
  AND tx_extension_domain_model_parent.uid
  IN (
    SELECT uid_local FROM tx_extension_domain_model_parent_child_mm WHERE uid_foreign =19
  )
)
AND tx_extension_domain_model_parent.deleted =0
AND tx_extension_domain_model_parent.hidden =0
AND tx_extension_domain_model_parent.pid IN ( 27 )

Should find you all parents that have childs with uid 20 AND 19. But for me this could be solved with logicalAnd.

#10 Updated by Morton Jonuschat about 7 years ago

Well, even with this example I can't see any advantage to having a "AND" Operator in the contains() constraint:

  • The result is no different than using a simple logicalAnd, the programming using logicalAnd is more obvious (and better readable)
  • There is no SQL command "optimizes" the N+1 Queries, so the "AND" would be some magic that just reduces a very little amount of typing for the programmer.

Contrasting the OR case

  • The result is very different and much more optimized
  • Most other ORMs handle it this way (OK, some have "whereIn" or something like that)
  • There is an explicit SQL command for this use case.

#11 Updated by Bastian Waidelich almost 7 years ago

  • Category set to Extbase: Generic Persistence
  • Assignee set to Jochen Rau
  • Target version set to Extbase 1.3.0beta1

If I get it right, we just need to convert objects to uids in

$query->in('property', array($object1, $object2))

correct?

#12 Updated by Bastian Waidelich over 6 years ago

  • Target version changed from Extbase 1.3.0beta1 to Extbase 1.4

Any news here?

#13 Updated by Morton Jonuschat over 6 years ago

I would be all too happy to settle for the solution of converting objects to ids you proposed 6 months ago. Beats doing it by hand any time.

#14 Updated by Alexander Schnitzler almost 5 years ago

  • Assignee deleted (Jochen Rau)
  • Target version changed from Extbase 1.4 to Extbase 6.1
  • Has patch set to No

#15 Updated by Robert Weißgraeber over 4 years ago

  • Target version changed from Extbase 6.1 to Extbase 6.2

#16 Updated by Anja Leichsenring about 4 years ago

  • Target version changed from Extbase 6.2 to Extbase 6.3

#17 Updated by Markus Timtner almost 3 years ago

@All: Has there anything happened here ever since??

#18 Updated by Stefan Froemken almost 3 years ago

I don't think so. In current projects I work that way:

// create OR-Query for categories
foreach (GeneralUtility::trimExplode(',', $categories) as $category) {
    $categoryOrQuery[] = 'sys_category_record_mm.uid_local IN (\'' . (int) $category . '\')';
}

...

WHERE (' . implode(' OR ', $categoryOrQuery) . ')

Further keep in mind that Extbase does not respect col "tablenames" and "fieldname" in MM-tables (Yes I know that Frans is just working on that). So in that case it's better to use statements instead.

Stefan

#19 Updated by Alexander Opitz almost 3 years ago

  • Project changed from Extbase MVC Framework to Core
  • Category changed from Extbase: Generic Persistence to Extbase
  • Status changed from Needs Feedback to New
  • Target version changed from Extbase 6.3 to 7.0

#20 Updated by Mathias Schreiber over 2 years ago

  • Target version changed from 7.0 to 7.1 (Cleanup)

#21 Updated by Helmut Hummel over 2 years ago

  • Status changed from New to Needs Feedback
  • Complexity set to medium

Here are my thoughts on this:

$query->in('singleValueProperty', $arrayOfThings)

"In" comparison is to compare a single value property on the left side with an array of values.

Putting this in a sentence or directive (which the QOM is all about) it would be:

"Find entities which have any of these things in my single value property"

$query->contains('collectionProperty', $oneThing)

"Contains" comparison is to compare a collection property on the left with a single value.

Putting this in a sentence or directive it would be:

"Find entities which have this thing in their collection"

Making contains accept an array as comparator, we would change the directive as follows:

The directive would translate to:

"Find entities which have these things in their collection"

Which could mean:

"Find entities which have all these things in their collection"

or

"Find entities which have any of these things in their collection"

This is exactly the ambiguity Sebastian is talking about.

At this point it does not really matter whether the case with all these things (referred as AND case in previous comments) could be expressed otherwise or is the "rarer" use case, it is a valid one!

When writing:

$query->contains('tags', $arrayOfTags)

I could surely expect to find entities that has all of the given tags attached, not any of them.

Implementing a solution favoring one over the other would be wrong.

Conclusion:

This is a "won't fix"

What could be done though, is to throw a more meaningful exception in case one passes an array as comparator to contains.
Additionally we could provide a utility method, which returns an array of identifiers for a given collection of entities or value objects so one could use:

$query->in('tags.uid', SomeClass::someMeaningfulUtilityMethod($arrayOfTagEntities)

Which would result in the exact same query proposed in https://review.typo3.org/#/c/34514/6/typo3/sysext/extbase/Classes/Persistence/Generic/Storage/Typo3DbQueryParser.php and the overhead would also be the same, just the place where this transformation happens is different.

This would keep the QOM consistent, while aiding developers for common tasks a bit.

Besides that all of this could easily be adopted for Flow Doctrine Persistence (where the exact same issue exists)

#22 Updated by Helmut Hummel over 2 years ago

everybody cool with that?

#23 Updated by Morton Jonuschat over 2 years ago

I'm cool with that, making it easier for the simple cases is a good step. For the complex cases I've worked around them for 4.5 years now and fallen back to pure SQL statements whenever needed so for me that's a non-issue nowadays.

#24 Updated by Stefan Neufeind over 2 years ago

Yes, fine with me. (Sorry for the late reply. Missed one notification-email.)
We'd insert one object inbetween, in the comparison, to make clear if "any of" or "all of this" are meant.

#25 Updated by Stephan Großberndt over 2 years ago

Should a new issue be created for this utility method, which returns an array of identifiers for a given collection of entities or value objects? Has this already been done?

#26 Updated by Benni Mack about 2 years ago

  • Target version changed from 7.1 (Cleanup) to 7.4 (Backend)

#27 Updated by Susanne Moog about 2 years ago

  • Target version changed from 7.4 (Backend) to 7.5

#28 Updated by Morton Jonuschat about 2 years ago

  • Assignee set to Morton Jonuschat

#29 Updated by Benni Mack almost 2 years ago

  • Target version changed from 7.5 to 8 LTS

#30 Updated by Moritz Ahl over 1 year ago

Just a side note: Fixing this issue would also make life easier on the view level - for example, the ability to use the Default Fluid Pagebrowser Widget, which depends on queries to be built with QOM (as the widget currently cannot handle "statement" queries). And since Core Support for sys_categories has been added, it's even more important to provide an efficient way to handle "qualified" mm tables.

#31 Updated by Alexander Opitz over 1 year ago

  • Status changed from Needs Feedback to New

#32 Updated by Morton Jonuschat 12 months ago

  • Status changed from New to Closed
  • Assignee deleted (Morton Jonuschat)
  • Target version deleted (8 LTS)

The main issue is a "Won't fix", the helper method - if anybody wants to tackle it - should be done in a dedicated issue.
Closing this due to inactivity.

Also available in: Atom PDF