Task #102164
Epic #101608 (closed): File Abstraction Layer Changes for v13
Have the task object clean up the configuration
100%
Description
Within the FileProcessingService class, various cleanups of the configuration array were made in order to avoid duplicate processed files.
However, as noted in the existing code, these are actually special cases for specific tasks (ensuring that width and height are integers and stay within maximum boundaries in ImagePreviewTask).
There is a catch when moving this to the actual Task objects, though (and that might be the reason why it was implemented the way it was):
First, there is the ProcessedFile object, which then has a ->getTask() call, which then builds the Task object. The Task object thus needs a ProcessedFile object.
But the ProcessedFile object is created AFTER the DB has been checked for an existing entry based on the given configuration (which needs to be sanitized first to avoid duplicates),
so this is a classic chicken-and-egg problem.
In an ideal world, the Task object would not hold state anymore, but write everything (e.g. the configuration) back to the ProcessedFile, as this data is still duplicated in the current state.
So, to reduce this entanglement, the AbstractTask object now has a new method called sanitizeConfiguration(), which should later become part of the interface (a breaking change) and then also work directly with ProcessedFile->getProcessingConfiguration() to reduce the duplication of data in memory.
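A minimal sketch of what such a per-task sanitizeConfiguration() could look like. The classes SketchAbstractTask and SketchImagePreviewTask, the clamp() helper and the limits of 1000 pixels are assumptions made for illustration; they are not the TYPO3 classes or their actual values.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for AbstractTask; not the TYPO3 class.
abstract class SketchAbstractTask
{
    public function __construct(protected array $configuration)
    {
    }

    // Default: tasks without special cases leave the configuration untouched.
    public function sanitizeConfiguration(): void
    {
    }

    public function getConfiguration(): array
    {
        return $this->configuration;
    }
}

// Hypothetical image preview task carrying its own special cases.
final class SketchImagePreviewTask extends SketchAbstractTask
{
    // Assumed limits, chosen only for this example.
    private const MAX_WIDTH = 1000;
    private const MAX_HEIGHT = 1000;

    public function sanitizeConfiguration(): void
    {
        $this->configuration['width'] = $this->clamp($this->configuration['width'] ?? 0, self::MAX_WIDTH);
        $this->configuration['height'] = $this->clamp($this->configuration['height'] ?? 0, self::MAX_HEIGHT);
    }

    // Cast to integer and keep the value between 0 and $max.
    private function clamp(mixed $value, int $max): int
    {
        return max(0, min((int)$value, $max));
    }
}

$task = new SketchImagePreviewTask(['width' => '200', 'height' => 5000]);
$task->sanitizeConfiguration();
$sanitized = $task->getConfiguration();
```

With the special cases living in the task, the calling service no longer needs to know which task types require integer casting or boundary checks.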
For the time being, an intermediate (empty) ProcessedFile
is created, a Task object is instantiated and the configuration
is sanitized (in the ProcessedFileRepository),
when checking in the DB if a DB entry is available.
The final ProcessedFile object is re-created after the DB
query, which contains the sanitized configuration array.
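
The intermediate-object flow described above can be sketched roughly as follows. All class and method names (DemoProcessedFile, DemoTask, DemoProcessedFileRepository, findOrRegister) are hypothetical, the "DB" is an in-memory array, and the sanitizing is reduced to integer casting; the real ProcessedFileRepository may differ in every detail.

```php
<?php
declare(strict_types=1);

// All names below are hypothetical stand-ins, not TYPO3 classes.
final class DemoProcessedFile
{
    public function __construct(
        public readonly string $taskType,
        public readonly array $configuration,
        public readonly ?int $uid = null,
    ) {
    }

    // The task is built from the processed file, as in the description.
    public function getTask(): DemoTask
    {
        return new DemoTask($this);
    }
}

final class DemoTask
{
    private array $configuration;

    public function __construct(DemoProcessedFile $processedFile)
    {
        $this->configuration = $processedFile->configuration;
    }

    // Simplified sanitizing: cast width and height to integers.
    public function sanitizeConfiguration(): void
    {
        foreach (['width', 'height'] as $key) {
            if (isset($this->configuration[$key])) {
                $this->configuration[$key] = (int)$this->configuration[$key];
            }
        }
    }

    public function getConfiguration(): array
    {
        return $this->configuration;
    }
}

final class DemoProcessedFileRepository
{
    /** @var array<string, int> checksum => uid, standing in for the DB table */
    private array $rows = [];

    public function findOrRegister(string $taskType, array $configuration): DemoProcessedFile
    {
        // 1. Intermediate processed file, only needed to obtain a task object.
        $intermediate = new DemoProcessedFile($taskType, $configuration);

        // 2. Let the task sanitize the configuration.
        $task = $intermediate->getTask();
        $task->sanitizeConfiguration();
        $sanitized = $task->getConfiguration();

        // 3. Check the "DB" using the sanitized configuration. In this sketch
        //    a missing row is registered right away so later calls find it.
        $checksum = sha1($taskType . serialize($sanitized));
        $uid = $this->rows[$checksum] ??= count($this->rows) + 1;

        // 4. Re-create the final object with the sanitized configuration.
        return new DemoProcessedFile($taskType, $sanitized, $uid);
    }
}

$repository = new DemoProcessedFileRepository();
$first = $repository->findOrRegister('Image.Preview', ['width' => '64', 'height' => '64']);
$second = $repository->findOrRegister('Image.Preview', ['width' => 64, 'height' => 64]);
```

Because the lookup key is computed from the sanitized array, the string-typed and integer-typed configurations resolve to the same entry instead of producing a duplicate.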