FAL Migration should consolidate duplicated files
TYPO3 versions prior to FAL copied a file into /uploads/ every time it was used in a record. Since these copies are numbered and FAL calculates a SHA1 hash over the file contents, it should be possible to consolidate these files when they are copied to the /fileadmin/_migrated/ folder.
Otherwise an upgraded installation has new files with meaningful FAL reference counts, and migrated files with lots of duplicates that each have a reference count of 1. With an "intelligent" FAL migration, upgraded installations could take full advantage (for all content/files) of FAL features such as updating a file in one central place and saving disk space.
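To illustrate the idea: copies with identical contents share the same SHA1 hash, so grouping files by hash finds the duplicates. The following is only a rough sketch in Python, not how TYPO3 does it internally; the function names `sha1_of_file` and `group_duplicates` are made up for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(root):
    """Map each SHA-1 hash to the files sharing it, keeping only real duplicates."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[sha1_of_file(path)].append(path)
    # A hash with a single file is not a duplicate, so drop it.
    return {sha1: paths for sha1, paths in by_hash.items() if len(paths) > 1}
```

Calling `group_duplicates("fileadmin/_migrated")` on a local copy would list the candidates for consolidation, one group per hash.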
Updated by Dirk Klimpel over 3 years ago
This is an old thread, but I had the same problem / request.
I have built a solution.
# table with the sha1 hashes of all files
# and how often each hash occurs
CREATE TEMPORARY TABLE IF NOT EXISTS temp_table_sha1 ( INDEX(sha1) ) ENGINE=MyISAM AS (
  SELECT sys_file.sha1, COUNT(sys_file.sha1) AS anz
  FROM sys_file
  GROUP BY sys_file.sha1
);

# table of all files to migrate (source)
# when the file exists more than once
# and the file is saved in folder "_migrated"
CREATE TEMPORARY TABLE IF NOT EXISTS temp_table_src ( INDEX(uid), KEY(sha1) ) ENGINE=MyISAM AS (
  SELECT sys_file.uid, sys_file.sha1
  FROM sys_file
  INNER JOIN temp_table_sha1 ON sys_file.sha1 = temp_table_sha1.sha1
  WHERE temp_table_sha1.anz > 1
    AND sys_file.identifier LIKE '/_migrated/%'
  ORDER BY sys_file.uid
);

# table of all original files (destination)
# when the file exists more than once, is not missing
# and the file is not saved in folder "_migrated" or "uploads" or "templates"
# MIN() picks one deterministic uid per hash and keeps the query valid under ONLY_FULL_GROUP_BY
CREATE TEMPORARY TABLE IF NOT EXISTS temp_table_dst ( INDEX(sha1) ) ENGINE=MyISAM AS (
  SELECT MIN(sys_file.uid) AS uid, sys_file.sha1
  FROM sys_file
  INNER JOIN temp_table_sha1 ON sys_file.sha1 = temp_table_sha1.sha1
  WHERE temp_table_sha1.anz > 1
    AND sys_file.identifier NOT RLIKE '/_migrated/.*|/uploads/.*|/templates/.*'
    AND sys_file.missing = 0
  GROUP BY sys_file.sha1
);

# create backup
CREATE TABLE sys_file_reference_bak LIKE sys_file_reference;
INSERT INTO sys_file_reference_bak SELECT * FROM sys_file_reference;

# update reference table
# join sys_file_reference.uid_local -> temp_table_src.uid - temp_table_src.sha1 -> temp_table_dst.sha1 - sys_file_reference.uid_local
# replace uid of old files (temp_table_src) with uid of new files (temp_table_dst)
# matching with same sha1 hash
UPDATE sys_file_reference
  INNER JOIN temp_table_src ON temp_table_src.uid = sys_file_reference.uid_local
  INNER JOIN temp_table_dst ON temp_table_dst.sha1 = temp_table_src.sha1
SET sys_file_reference.uid_local = temp_table_dst.uid
WHERE sys_file_reference.table_local = 'sys_file';

# show the changes with the help of the backup table
SELECT *
FROM sys_file_reference
INNER JOIN sys_file_reference_bak ON sys_file_reference_bak.uid = sys_file_reference.uid
WHERE sys_file_reference_bak.uid_local <> sys_file_reference.uid_local;
After that you have to check / update the reference index.
Now you can use the FAL Explorer to delete the old files in the "_migrated" folder that no longer have any references.
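If you want a list of deletion candidates before clicking through the folder, the check can be sketched roughly like this in Python. It assumes you have exported the relevant rows of sys_file (uid, identifier) and sys_file_reference (uid_local) as lists of dicts; the function name `unreferenced_migrated` is made up, and the sketch ignores soft-deleted references and any other table that might point at a file, so treat the result as a hint only.

```python
def unreferenced_migrated(files, references, prefix="/_migrated/"):
    """Return the uids of files under `prefix` that no reference row points to.

    files:      rows from sys_file, each with "uid" and "identifier"
    references: rows from sys_file_reference, each with "uid_local"
    """
    referenced = {ref["uid_local"] for ref in references}
    return [
        row["uid"]
        for row in files
        if row["identifier"].startswith(prefix) and row["uid"] not in referenced
    ]


# Example with hand-written rows (not real data):
files = [
    {"uid": 1, "identifier": "/user_upload/logo.png"},
    {"uid": 2, "identifier": "/_migrated/pics/logo_01.png"},
    {"uid": 3, "identifier": "/_migrated/pics/photo.jpg"},
]
references = [{"uid_local": 1}, {"uid_local": 3}]
candidates = unreferenced_migrated(files, references)
```

Here only uid 2 is a candidate: it lives in "_migrated" and nothing references it anymore, while uid 3 is still in use.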