Bug #103707

open

Duplicate entries in sys_file table

Added by Ulrich Mathes 11 days ago. Updated 11 days ago.

Status: New
Priority: Should have
Assignee: -
Category: File Abstraction Layer (FAL)
Target version: -
Start date: 2024-04-23
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 11
PHP Version: 8.3
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

We have encountered an issue where there are duplicate entries in the sys_file table pointing to the same physical file. This results in the Filelist module not displaying references to those files correctly. Consequently, these files can be deleted even though there are references pointing to another sys_file entry that shares the same file in the filesystem.

To check for duplicates, you can run:

SELECT COUNT(*), `identifier` FROM `sys_file` GROUP BY `identifier` HAVING COUNT(*) > 1;
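The query above groups by identifier only. Since an identifier is relative to its storage, a variant that also groups by storage and lists the affected uids may be easier to act on. The following is only a sketch: it uses an in-memory SQLite database as a stand-in for the real MySQL table, a reduced column set, and two hypothetical rows (uids 1 and 2) to show that the same identifier in different storages is not counted as a duplicate.

```python
import sqlite3

# Path and uids 2792/2794 are taken from the report; everything else is illustrative.
DUPLICATE_PATH = "/user_upload/Bilder/Wirtschaftspruefer/Veranstaltung_2024/mbs_04.jpg"


def find_duplicates(conn):
    """Return (storage, identifier, [uids]) for each identifier stored more than once per storage."""
    rows = conn.execute(
        """
        SELECT storage, identifier, GROUP_CONCAT(uid) AS uids
        FROM sys_file
        GROUP BY storage, identifier
        HAVING COUNT(*) > 1
        ORDER BY storage, identifier
        """
    ).fetchall()
    # GROUP_CONCAT does not guarantee an order, so sort the uids explicitly.
    return [
        (storage, identifier, sorted(int(uid) for uid in uids.split(",")))
        for storage, identifier, uids in rows
    ]


# Reduced stand-in for sys_file (assumption: only the columns needed here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sys_file (uid INTEGER PRIMARY KEY, storage INTEGER, identifier TEXT)"
)
conn.executemany(
    "INSERT INTO sys_file (uid, storage, identifier) VALUES (?, ?, ?)",
    [
        (2792, 1, DUPLICATE_PATH),
        (2794, 1, DUPLICATE_PATH),
        (1, 1, "/user_upload/a.jpg"),  # hypothetical row
        (2, 2, "/user_upload/a.jpg"),  # hypothetical row: same identifier, other storage
    ],
)

duplicates = find_duplicates(conn)
print(duplicates)
```

On the real database, the SELECT inside find_duplicates should run unchanged in MySQL/MariaDB, since GROUP_CONCAT is available there as well.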

Here is an example of a duplicate entry:

SELECT * FROM `sys_file` where identifier like "/user_upload/Bilder/Wirtschaftspruefer/Veranstaltung_2024/mbs_04.jpg";
+------+-----+------------+--------------+---------+---------+------+----------+----------------------------------------------------------------------+------------------------------------------+------------------------------------------+-----------+------------+------------+------------------------------------------+--------+---------------+-------------------+
| uid  | pid | tstamp     | last_indexed | missing | storage | type | metadata | identifier                                                           | identifier_hash                          | folder_hash                              | extension | mime_type  | name       | sha1                                     | size   | creation_date | modification_date |
+------+-----+------------+--------------+---------+---------+------+----------+----------------------------------------------------------------------+------------------------------------------+------------------------------------------+-----------+------------+------------+------------------------------------------+--------+---------------+-------------------+
| 2792 |   0 | 1711288107 |   1711288107 |       0 |       1 | 2    |        0 | /user_upload/Bilder/Wirtschaftspruefer/Veranstaltung_2024/mbs_04.jpg | ed3516d8ff67f42ae9a0b4ca9443989421b21324 | 928bfe2c54bf0575ae5ad65b55cb3deee6aaa658 | jpg       | image/jpeg | mbs_04.jpg | 61782595dddae96224e259cf7170dd646fb4e3d1 | 562574 |    1711288107 |        1711288107 |
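| 2794 |   0 | 1712041049 |   1712041049 |       0 |       1 | 2    |        0 | /user_upload/Bilder/Wirtschaftspruefer/Veranstaltung_2024/mbs_04.jpg | ed3516d8ff67f42ae9a0b4ca9443989421b21324 | 928bfe2c54bf0575ae5ad65b55cb3deee6aaa658 | jpg       | image/jpeg | mbs_04.jpg | 61782595dddae96224e259cf7170dd646fb4e3d1 | 562574 |    1712041049 |        1711288107 |
+------+-----+------------+--------------+---------+---------+------+----------+----------------------------------------------------------------------+------------------------------------------+------------------------------------------+-----------+------------+------------+------------------------------------------+--------+---------------+-------------------+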
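For a pair like the one above, counting references per sys_file uid might show why a file looks unused from the point of view of one of the two records. This is a sketch under assumptions: SQLite stands in for MySQL, the tables are reduced to the columns needed, sys_file_reference.uid_local is taken to point at sys_file.uid, and the two reference rows are hypothetical (the report does not say which uid the real references point to). A real check would probably also have to skip soft-deleted references.

```python
import sqlite3

DUPLICATE_PATH = "/user_upload/Bilder/Wirtschaftspruefer/Veranstaltung_2024/mbs_04.jpg"


def reference_counts(conn, storage, identifier):
    """Return {sys_file uid: number of sys_file_reference rows} for one identifier."""
    rows = conn.execute(
        """
        SELECT f.uid, COUNT(r.uid)
        FROM sys_file AS f
        LEFT JOIN sys_file_reference AS r ON r.uid_local = f.uid
        WHERE f.storage = ? AND f.identifier = ?
        GROUP BY f.uid
        ORDER BY f.uid
        """,
        (storage, identifier),
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sys_file (uid INTEGER PRIMARY KEY, storage INTEGER, identifier TEXT)"
)
conn.execute(
    "CREATE TABLE sys_file_reference (uid INTEGER PRIMARY KEY, uid_local INTEGER)"
)
conn.executemany(
    "INSERT INTO sys_file VALUES (?, ?, ?)",
    [(2792, 1, DUPLICATE_PATH), (2794, 1, DUPLICATE_PATH)],
)
# Hypothetical references: both point at the older record.
conn.executemany(
    "INSERT INTO sys_file_reference VALUES (?, ?)",
    [(1, 2792), (2, 2792)],
)

counts = reference_counts(conn, 1, DUPLICATE_PATH)
print(counts)
```

With this sample data, uid 2794 has no references at all, although the physical file is still referenced through uid 2792.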

In another instance, there are multiple sys_file records pointing to files outside the fileadmin directory. This might indicate that the problem is not limited to the Filelist module and user uploads, but could also be rooted in page rendering.

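+------+-----+------------+--------------+---------+---------+------+----------+---------------------------------------------------------------------+------------------------------------------+------------------------------------------+-----------+---------------+-------------+------------------------------------------+------+---------------+-------------------+
| uid  | pid | tstamp     | last_indexed | missing | storage | type | metadata | identifier                                                          | identifier_hash                          | folder_hash                              | extension | mime_type     | name        | sha1                                     | size | creation_date | modification_date |
+------+-----+------------+--------------+---------+---------+------+----------+---------------------------------------------------------------------+------------------------------------------+------------------------------------------+-----------+---------------+-------------+------------------------------------------+------+---------------+-------------------+
| 2754 |   0 | 1680786128 |            0 |       0 |       0 | 2    |        0 | /typo3conf/ext/sitepackage/Resources/Public/Icons/Flags/flag_at.svg | aa8d82349adeaa66fa834c0bab77a507d74bbd3a | 15b08915cf0ec6b4ffef65022ba6219d4111ee18 | svg       | image/svg+xml | flag_at.svg | 87fcd7ee2c2fe0e53935bd952efdb97dde1b3c66 |  226 |    1680786117 |        1680786007 |
| 2755 |   0 | 1680786128 |            0 |       0 |       0 | 2    |        0 | /typo3conf/ext/sitepackage/Resources/Public/Icons/Flags/flag_at.svg | aa8d82349adeaa66fa834c0bab77a507d74bbd3a | 15b08915cf0ec6b4ffef65022ba6219d4111ee18 | svg       | image/svg+xml | flag_at.svg | 87fcd7ee2c2fe0e53935bd952efdb97dde1b3c66 |  226 |    1680786117 |        1680786007 |
+------+-----+------------+--------------+---------+---------+------+----------+---------------------------------------------------------------------+------------------------------------------+------------------------------------------+-----------+---------------+-------------+------------------------------------------+------+---------------+-------------------+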

This issue appears to be at least four years old: I found the extension https://extensions.typo3.org/package/elementareteilchen/unduplicator, which was created to find and fix such duplicates and dates back to 2020.

We found these problems in many other projects. However, we do not have a TYPO3 v12 or v13 instance that is not an upgrade from TYPO3 v11, so we are currently unsure if this issue affects TYPO3 v12 or v13 as well.

#1

Updated by Ulrich Mathes 11 days ago

  • Description updated

#2

Updated by Christian Kuhn 11 days ago · Edited

Thanks Ulrich :)

Great report!

I'll try to run that query on one or two b13 projects as well, to see whether we experience similar issues.

A few things come to mind:
  • Fixing the currently broken DB state could be a job for dbdoctor, though I'm not sure about the details at the moment.
  • If the issue persists, we need to find out which action in the core creates those dupes.
  • We might think about adding a 'unique' key - maybe on the combination of the two hash fields? Once that is in place, we should probably see bug reports (with backtraces) when core tries to insert a dupe, which will help us to trace the issue.
  • I wonder why this hasn't been reported yet?!
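The unique key idea from the list above could look roughly like the sketch below. It is not a proposed core patch: SQLite stands in for the real database, the table is reduced to a few columns, the index name uniq_file and the helper insert_file are hypothetical, and including storage in the key is an assumption on top of the "two hash fields" suggestion (identifiers, and therefore their hashes, can repeat across storages). The hash values are the ones from the report.

```python
import sqlite3
import traceback

IDENTIFIER_HASH = "ed3516d8ff67f42ae9a0b4ca9443989421b21324"
FOLDER_HASH = "928bfe2c54bf0575ae5ad65b55cb3deee6aaa658"

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sys_file (
        uid INTEGER PRIMARY KEY,
        storage INTEGER,
        identifier_hash TEXT,
        folder_hash TEXT
    )
    """
)
# Hypothetical unique key; 'storage' is an added assumption, see lead-in.
conn.execute(
    "CREATE UNIQUE INDEX uniq_file ON sys_file (storage, identifier_hash, folder_hash)"
)


def insert_file(conn, uid, storage, identifier_hash, folder_hash):
    """Insert a row; return None on success or the backtrace text if the key rejects it."""
    try:
        conn.execute(
            "INSERT INTO sys_file VALUES (?, ?, ?, ?)",
            (uid, storage, identifier_hash, folder_hash),
        )
        return None
    except sqlite3.IntegrityError:
        # This backtrace is what would point at the code path creating the dupe.
        return traceback.format_exc()


first = insert_file(conn, 2792, 1, IDENTIFIER_HASH, FOLDER_HASH)
second = insert_file(conn, 2794, 1, IDENTIFIER_HASH, FOLDER_HASH)
row_count = conn.execute("SELECT COUNT(*) FROM sys_file").fetchone()[0]
print(first is None, second is not None, row_count)
```

Note that a unique index can only be created once the existing duplicates have been cleaned up, so the cleanup step would have to come first.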