Bug #82929
closedsys_file_references filled with duplicate/multiple entries
Added by Sigfried Arnold about 7 years ago. Updated 9 months ago.
0%
Description
On multiple TYPO3 instances running on TYPO3 8.7.8 (some of them were clean installed with different Versions of the TYPO3 8 branche since the release and now updated to the latest Version) we get multiple entries for the same file.
They are not shown in the backend but are visible in the database and are taken into account while pages are being generated -
SELECT pid, fieldname, COUNT(*) as count FROM sys_file_reference GROUP BY pid,fieldname order by count DESC;
The worst case i found so far looks like this:
50 image 214572 232 assets 352 9 assets 98 9 image 96 103 image 75 138 image 72 138 assets 70 94 assets 42 109 assets 41 235 assets 40
The result is, that the page generation for those pages takes ages (or till max_execution_time is reached in this example, where 200k images are present).
This issue also might be connected to #80804 since the blocked apache processes sometimes do not unlock the lockfiles properly and then apache gets jammed.
Currently i have not figured out yet, what causes the duplication.
Also i found an older TYPO3 7.6 Instance, which does the same.
Updated by Frans Saris about 7 years ago
Are you using multiple languages and or workspaces?
Updated by Sigfried Arnold about 7 years ago
On the 7.6 instance we used workspaces and multiple languages, but now workspaces are disabled and only languages are in use
On the 8.7 instances none uses workspacese but all of them use at least one additional language, one uses 9 languages - but there is no corellation between language count and duplicates - the worst case only got 2 languages
Updated by Frans Saris about 7 years ago
Can you show some more column data. What fields are/are not identical
Updated by Sigfried Arnold about 7 years ago
identical fields are the following:
sys_language_uid, l10n_parent, l10n_diffsource, uid_local, uid_foreign, tablenames, fieldname (also title, description, alternative and so on are identical)
here is an example from a web without workspaces/versioning - so i removed all t3ver_* fields (they just contain 0)
This is an example of "bad" case - this is taken from a bigger website with currently 9 languages in total - as you can see those records are inserted the same time - sometimes only a duplicate, somtimes a huge amount of multiple entries - this page only has one image in the media field - i've also provided a grouping by language - as you can see, it inserted 7 copies on a few languages, only L=16 worked properly and L=1 & 2 really went nuts.
COUNT(*) sys_language_uid 1 0 18038 1 18046 2 7 3 7 4 7 5 7 11 1 15 1 16
uid pid FROM_UNIXTIME(tstamp) FROM_UNIXTIME(crdate) sorting deleted l10n_parent uid_foreign fieldname hidden l10n_diffsource tablenames sys_language_uid uid_local 212 7 2017-05-11 16:43:40 2016-03-15 09:05:27 1031168 0 0 0 0 ... 18 7 pages media 39902 7 2017-06-07 17:31:16 2017-06-07 17:31:16 285952 0 0 1 212 ... 18 7 pages media 39903 7 2017-06-07 17:31:16 2017-06-07 17:31:16 512 0 0 1 212 ... 18 7 pages media 39936 7 2017-06-20 18:40:26 2017-06-20 18:40:26 472576 0 0 1 212 ... 18 7 pages media 39937 7 2017-06-20 18:40:26 2017-06-20 18:40:26 905728 0 0 1 212 ... 18 7 pages media 39938 7 2017-06-20 18:40:26 2017-06-20 18:40:26 188160 0 0 1 212 ... 18 7 pages media 39934 7 2017-06-20 18:40:26 2017-06-20 18:40:26 718080 0 0 1 212 ... 18 7 pages media 39935 7 2017-06-20 18:40:26 2017-06-20 18:40:26 596992 0 0 1 212 ... 18 7 pages media 39964 7 2017-07-07 08:34:50 2017-07-07 08:34:50 83456 0 0 1 212 ... 18 7 pages media 39965 7 2017-07-07 08:34:50 2017-07-07 08:34:50 801024 0 0 1 212 ... 18 7 pages media 39966 7 2017-07-07 08:34:50 2017-07-07 08:34:50 676608 0 0 1 212 ... 18 7 pages media 39967 7 2017-07-07 08:34:50 2017-07-07 08:34:50 555520 0 0 1 212 ... 18 7 pages media 39968 7 2017-07-07 08:34:50 2017-07-07 08:34:50 988672 0 0 1 212 ... 18 7 pages media 39969 7 2017-07-07 08:34:50 2017-07-07 08:34:50 926464 0 0 1 212 ... 18 7 pages media 39970 7 2017-07-07 08:34:50 2017-07-07 08:34:50 493312 0 0 1 212 ... 18 7 pages media 39971 7 2017-07-07 08:34:50 2017-07-07 08:34:50 617728 0 0 1 212 ... 18 7 pages media 39972 7 2017-07-07 08:34:50 2017-07-07 08:34:50 738816 0 0 1 212 ... 18 7 pages media 39973 7 2017-07-07 08:34:50 2017-07-07 08:34:50 146688 0 0 1 212 ... 18 7 pages media 39974 7 2017-07-07 08:34:50 2017-07-07 08:34:50 864256 0 0 1 212 ... 18 7 pages media 39975 7 2017-07-07 08:34:50 2017-07-07 08:34:50 21248 0 0 1 212 ... 18 7 pages media 39976 7 2017-07-07 08:34:50 2017-07-07 08:34:50 431104 0 0 1 212 ... 18 7 pages media
Updated by Frans Saris about 7 years ago
Interesting thing is that a lot of timestamps are the same and looking at the sorting field it looks like they are somehow all linked to the same field.
Is there some custom tca added there? Can you see what action triggers the creation of all the extra rows in the DB.
Updated by Sigfried Arnold about 7 years ago
In most cases there was Grid Elements and DCE present, that was our fist guess - but we have a few occurrences without any additional fields.
So far we have not isolated the issue yet.
Updated by Markus Opahle almost 7 years ago
I don't know if this is related to this issue but I found a similar issue on several 8.7 instances. The only difference here is that the duplicates marked as deleted right after creation. This happens when you add l10n_mode=exclude to an inline relation to sys_file_references. The duplicates appear when you translate a record and they depend on the number of translations exist before (so you actually have to translate the record twice to reproduce the issue).
It seems that an inline relationship to a translatable table and with l10n_mode=exclude is not handled well by \TYPO3\CMS\Core\DataHandling\Localization\DataMapProcessor which creates the duplicates in my case.
Hope this helps
Updated by Susanne Moog almost 7 years ago
- Category set to DataHandler aka TCEmain
Updated by Sigfried Arnold over 6 years ago
Just got noticed by a customer with a TYPO3 7.6 instance with multiple languages with this issue - this time a few hundred copies. The instance runs on Azure with Ubuntu and MariaDB - and has a different architecure from our other instances i've seen so far. Others usually used CentOS with MySQL or RHEL with MySQL.
Sadly this enviroment has no transaction log in MariaDB activated, so i cannot narrow down, what created those duplicates.
Is there any new information available on this issue?
Updated by Riccardo De Contardi over 6 years ago
- Related to Bug #80804: Apache precesses overload because of typo3temp/locks/flock_ added
Updated by Oliver Hader almost 6 years ago
Having a sequence of recreated records (e.g. 20 deleted, one visible record, all having same information) might the result of a flow in l10n_mode=exclude
in combination with inline
records that has been addressed in issue #87640 - maybe give it a try in order to avoid these re-synchronization problems.
Updated by Oliver Hader almost 6 years ago
- Status changed from New to Needs Feedback
Updated by Oliver Hader almost 6 years ago
- Related to Bug #87640: Keep language reference for associated l10n_mode=exclude children added
Updated by Sybille Peters over 5 years ago
Oliver, the patch introduced for #87640 was only merged into TYPO3 9, but this bug here was reported for 8 and 7.
Updated by Sybille Peters over 5 years ago
Sigfried, I don't quite see how the same fieldname and pid occuring multiple times would automatically point to a problem in the query you suggested:
SELECT pid, fieldname, COUNT(*) as count FROM sys_file_reference GROUP BY pid,fieldname order by count DESC;
But this query would with a higher probability, I think: It checks if there are duplicates with the same relation:
SELECT pid, tablenames,uid_foreign,table_local,uid_local,COUNT(*) as count FROM sys_file_reference GROUP BY pid,tablenames,uid_foreign,table_local,uid_local order by count DESC;
I do get hits for that (but not such huge amounts, all less than 10).
Updated by Jonas Eberle over 4 years ago
I also get hits for that in the range 2-4 in several instances.
# I added a constraint count > 1 SELECT pid, tablenames, uid_foreign, table_local, uid_local, COUNT(*) as count FROM sys_file_reference GROUP BY pid, tablenames, uid_foreign, table_local, uid_local HAVING count > 1 ORDER BY count DESC;
The instances are: 8.7, 9.5 (upgraded), 9.5 (fresh)
Updated by Sigfried Arnold over 4 years ago
I was not "haunted" by this issue since (at least not in TYPO3 9.x) - but be aware, that those queries do not account for deleted records, so sometimes multiple references are fine - if there are only a few, you should check em if they are really duplicates or if they are just deleted
Updated by Jonas Eberle over 4 years ago
Taking into account deleted and hidden it would be
SELECT pid, tablenames, uid_foreign, table_local, uid_local, COUNT(*) as count FROM sys_file_reference WHERE !deleted AND !hidden GROUP BY pid, tablenames, uid_foreign, table_local, uid_local HAVING count > 1 ORDER BY count DESC;
I still have duplicate records in all those instances.
Updated by Benni Mack over 4 years ago
Jonas Eberle wrote:
Taking into account deleted and hidden it would be
[...]
I still have duplicate records in all those instances.
Can you find out how this happens?
Updated by Christian Kuhn almost 2 years ago
- Status changed from Needs Feedback to Closed
Hey. I hope it's ok to close here: Keeping db integrity over multiple core version when upgrading core isn't simple and a broader task. Extension "dbdoctor" from https://github.com/lolli42/dbdoctor tries to find such inconsistencies and to fix them, but it is not in a state that could be merged into core. We'll see on how it evolves, though.
Updated by Michael Nußbaumer 9 months ago
Jonas Eberle wrote in #note-18:
Taking into account deleted and hidden it would be
[...]
I still have duplicate records in all those instances.
This query is not quite right, it misses fieldname for grouping:
SELECT
pid, tablenames, fieldname, uid_foreign, table_local, uid_local, COUNT(*) as count
FROM
sys_file_reference
WHERE
!deleted AND !hidden
GROUP BY
pid, tablenames, fieldname, uid_foreign, table_local, uid_local
HAVING
count > 1
ORDER BY
count DESC;