Project

General

Profile

Actions

Bug #82929

closed

sys_file_references filled with duplicate/multiple entries

Added by Sigfried Arnold over 6 years ago. Updated about 2 months ago.

Status:
Closed
Priority:
Should have
Assignee:
-
Category:
DataHandler aka TCEmain
Target version:
-
Start date:
2017-11-06
Due date:
% Done:

0%

Estimated time:
TYPO3 Version:
8
PHP Version:
7.0
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

On multiple TYPO3 instances running on TYPO3 8.7.8 (some of them were clean installed with different Versions of the TYPO3 8 branche since the release and now updated to the latest Version) we get multiple entries for the same file.

They are not shown in the backend but are visible in the database and are taken into account while pages are being generated -

SELECT pid, fieldname, COUNT(*) as count FROM sys_file_reference GROUP BY pid,fieldname order by count DESC;

The worst case i found so far looks like this:

50    image    214572
232    assets    352
9    assets    98
9    image    96
103    image    75
138    image    72
138    assets    70
94    assets    42
109    assets    41
235    assets    40

The result is, that the page generation for those pages takes ages (or till max_execution_time is reached in this example, where 200k images are present).

This issue also might be connected to #80804 since the blocked apache processes sometimes do not unlock the lockfiles properly and then apache gets jammed.

Currently i have not figured out yet, what causes the duplication.

Also i found an older TYPO3 7.6 Instance, which does the same.


Related issues 2 (0 open2 closed)

Related to TYPO3 Core - Bug #80804: Apache precesses overload because of typo3temp/locks/flock_Closed2017-04-11

Actions
Related to TYPO3 Core - Bug #87640: Keep language reference for associated l10n_mode=exclude childrenClosed2019-02-04

Actions
Actions #1

Updated by Frans Saris over 6 years ago

Are you using multiple languages and or workspaces?

Actions #2

Updated by Sigfried Arnold over 6 years ago

On the 7.6 instance we used workspaces and multiple languages, but now workspaces are disabled and only languages are in use

On the 8.7 instances none uses workspacese but all of them use at least one additional language, one uses 9 languages - but there is no corellation between language count and duplicates - the worst case only got 2 languages

Actions #3

Updated by Frans Saris over 6 years ago

Can you show some more column data. What fields are/are not identical

Actions #4

Updated by Sigfried Arnold over 6 years ago

identical fields are the following:

sys_language_uid, l10n_parent, l10n_diffsource, uid_local, uid_foreign, tablenames, fieldname (also title, description, alternative and so on are identical)

here is an example from a web without workspaces/versioning - so i removed all t3ver_* fields (they just contain 0)

This is an example of "bad" case - this is taken from a bigger website with currently 9 languages in total - as you can see those records are inserted the same time - sometimes only a duplicate, somtimes a huge amount of multiple entries - this page only has one image in the media field - i've also provided a grouping by language - as you can see, it inserted 7 copies on a few languages, only L=16 worked properly and L=1 & 2 really went nuts.

COUNT(*) sys_language_uid
1        0
18038    1
18046    2
7        3
7        4
7        5
7        11
1        15
1        16
uid     pid  FROM_UNIXTIME(tstamp)   FROM_UNIXTIME(crdate)  sorting   deleted        l10n_parent         uid_foreign   fieldname
                                                                           hidden           l10n_diffsource  tablenames
                                                                                sys_language_uid   uid_local
212      7    2017-05-11 16:43:40    2016-03-15 09:05:27    1031168   0    0    0    0      ...    18    7    pages    media
39902    7    2017-06-07 17:31:16    2017-06-07 17:31:16    285952    0    0    1    212    ...    18    7    pages    media
39903    7    2017-06-07 17:31:16    2017-06-07 17:31:16    512       0    0    1    212    ...    18    7    pages    media
39936    7    2017-06-20 18:40:26    2017-06-20 18:40:26    472576    0    0    1    212    ...    18    7    pages    media
39937    7    2017-06-20 18:40:26    2017-06-20 18:40:26    905728    0    0    1    212    ...    18    7    pages    media
39938    7    2017-06-20 18:40:26    2017-06-20 18:40:26    188160    0    0    1    212    ...    18    7    pages    media
39934    7    2017-06-20 18:40:26    2017-06-20 18:40:26    718080    0    0    1    212    ...    18    7    pages    media
39935    7    2017-06-20 18:40:26    2017-06-20 18:40:26    596992    0    0    1    212    ...    18    7    pages    media
39964    7    2017-07-07 08:34:50    2017-07-07 08:34:50    83456     0    0    1    212    ...    18    7    pages    media
39965    7    2017-07-07 08:34:50    2017-07-07 08:34:50    801024    0    0    1    212    ...    18    7    pages    media
39966    7    2017-07-07 08:34:50    2017-07-07 08:34:50    676608    0    0    1    212    ...    18    7    pages    media
39967    7    2017-07-07 08:34:50    2017-07-07 08:34:50    555520    0    0    1    212    ...    18    7    pages    media
39968    7    2017-07-07 08:34:50    2017-07-07 08:34:50    988672    0    0    1    212    ...    18    7    pages    media
39969    7    2017-07-07 08:34:50    2017-07-07 08:34:50    926464    0    0    1    212    ...    18    7    pages    media
39970    7    2017-07-07 08:34:50    2017-07-07 08:34:50    493312    0    0    1    212    ...    18    7    pages    media
39971    7    2017-07-07 08:34:50    2017-07-07 08:34:50    617728    0    0    1    212    ...    18    7    pages    media
39972    7    2017-07-07 08:34:50    2017-07-07 08:34:50    738816    0    0    1    212    ...    18    7    pages    media
39973    7    2017-07-07 08:34:50    2017-07-07 08:34:50    146688    0    0    1    212    ...    18    7    pages    media
39974    7    2017-07-07 08:34:50    2017-07-07 08:34:50    864256    0    0    1    212    ...    18    7    pages    media
39975    7    2017-07-07 08:34:50    2017-07-07 08:34:50    21248     0    0    1    212    ...    18    7    pages    media
39976    7    2017-07-07 08:34:50    2017-07-07 08:34:50    431104    0    0    1    212    ...    18    7    pages    media
Actions #5

Updated by Frans Saris over 6 years ago

Interesting thing is that a lot of timestamps are the same and looking at the sorting field it looks like they are somehow all linked to the same field.

Is there some custom tca added there? Can you see what action triggers the creation of all the extra rows in the DB.

Actions #6

Updated by Sigfried Arnold over 6 years ago

In most cases there was Grid Elements and DCE present, that was our fist guess - but we have a few occurrences without any additional fields.

So far we have not isolated the issue yet.

Actions #7

Updated by Markus Opahle over 6 years ago

I don't know if this is related to this issue but I found a similar issue on several 8.7 instances. The only difference here is that the duplicates marked as deleted right after creation. This happens when you add l10n_mode=exclude to an inline relation to sys_file_references. The duplicates appear when you translate a record and they depend on the number of translations exist before (so you actually have to translate the record twice to reproduce the issue).
It seems that an inline relationship to a translatable table and with l10n_mode=exclude is not handled well by \TYPO3\CMS\Core\DataHandling\Localization\DataMapProcessor which creates the duplicates in my case.
Hope this helps

Actions #8

Updated by Susanne Moog about 6 years ago

  • Category set to DataHandler aka TCEmain
Actions #9

Updated by Sigfried Arnold almost 6 years ago

Just got noticed by a customer with a TYPO3 7.6 instance with multiple languages with this issue - this time a few hundred copies. The instance runs on Azure with Ubuntu and MariaDB - and has a different architecure from our other instances i've seen so far. Others usually used CentOS with MySQL or RHEL with MySQL.

Sadly this enviroment has no transaction log in MariaDB activated, so i cannot narrow down, what created those duplicates.

Is there any new information available on this issue?

Actions #10

Updated by Riccardo De Contardi almost 6 years ago

  • Related to Bug #80804: Apache precesses overload because of typo3temp/locks/flock_ added
Actions #11

Updated by Oliver Hader about 5 years ago

Having a sequence of recreated records (e.g. 20 deleted, one visible record, all having same information) might the result of a flow in l10n_mode=exclude in combination with inline records that has been addressed in issue #87640 - maybe give it a try in order to avoid these re-synchronization problems.

Actions #12

Updated by Oliver Hader about 5 years ago

  • Status changed from New to Needs Feedback
Actions #13

Updated by Oliver Hader about 5 years ago

  • Related to Bug #87640: Keep language reference for associated l10n_mode=exclude children added
Actions #14

Updated by Sybille Peters about 5 years ago

Oliver, the patch introduced for #87640 was only merged into TYPO3 9, but this bug here was reported for 8 and 7.

Actions #15

Updated by Sybille Peters about 5 years ago

Sigfried, I don't quite see how the same fieldname and pid occuring multiple times would automatically point to a problem in the query you suggested:

SELECT pid, fieldname, COUNT(*) as count FROM sys_file_reference GROUP BY pid,fieldname order by count DESC;

But this query would with a higher probability, I think: It checks if there are duplicates with the same relation:

SELECT pid, tablenames,uid_foreign,table_local,uid_local,COUNT(*) as count FROM sys_file_reference GROUP BY pid,tablenames,uid_foreign,table_local,uid_local order by count DESC;

I do get hits for that (but not such huge amounts, all less than 10).

Actions #16

Updated by Jonas Eberle about 4 years ago

I also get hits for that in the range 2-4 in several instances.

# I added a constraint count > 1
SELECT 
  pid, tablenames, uid_foreign, table_local, uid_local, COUNT(*) as count 
FROM 
  sys_file_reference 
GROUP BY 
  pid, tablenames, uid_foreign, table_local, uid_local 
HAVING 
  count > 1 
ORDER BY 
  count DESC;

The instances are: 8.7, 9.5 (upgraded), 9.5 (fresh)

Actions #17

Updated by Sigfried Arnold about 4 years ago

I was not "haunted" by this issue since (at least not in TYPO3 9.x) - but be aware, that those queries do not account for deleted records, so sometimes multiple references are fine - if there are only a few, you should check em if they are really duplicates or if they are just deleted

Actions #18

Updated by Jonas Eberle about 4 years ago

Taking into account deleted and hidden it would be

SELECT
    pid, tablenames, uid_foreign, table_local, uid_local, COUNT(*) as count
FROM
    sys_file_reference
WHERE
    !deleted AND !hidden
GROUP BY
    pid, tablenames, uid_foreign, table_local, uid_local
HAVING
    count > 1
ORDER BY
    count DESC;

I still have duplicate records in all those instances.

Actions #19

Updated by Benni Mack about 4 years ago

Jonas Eberle wrote:

Taking into account deleted and hidden it would be

[...]

I still have duplicate records in all those instances.

Can you find out how this happens?

Actions #20

Updated by Christian Kuhn over 1 year ago

  • Status changed from Needs Feedback to Closed

Hey. I hope it's ok to close here: Keeping db integrity over multiple core version when upgrading core isn't simple and a broader task. Extension "dbdoctor" from https://github.com/lolli42/dbdoctor tries to find such inconsistencies and to fix them, but it is not in a state that could be merged into core. We'll see on how it evolves, though.

Actions #21

Updated by Michael Nußbaumer about 2 months ago

Jonas Eberle wrote in #note-18:

Taking into account deleted and hidden it would be

[...]

I still have duplicate records in all those instances.

This query is not quite right, it misses fieldname for grouping:

SELECT
    pid, tablenames, fieldname, uid_foreign, table_local, uid_local, COUNT(*) as count
FROM
    sys_file_reference
WHERE
    !deleted AND !hidden
GROUP BY
    pid, tablenames, fieldname, uid_foreign, table_local, uid_local
HAVING
    count > 1
ORDER BY
    count DESC;
Actions

Also available in: Atom PDF