Bug #91768

Race condition while caching data using SimpleFileBackend

Added by Michael Stucki over 1 year ago. Updated 10 months ago.

Status: New
Priority: Should have
Assignee: -
Category: Caching
Target version: -
Start date: 2020-07-08
Due date:
% Done: 0%
Estimated time:
TYPO3 Version: 10
PHP Version: 7.2
Tags:
Complexity:
Is Regression:
Sprint Focus:

Description

When two requests run at the same time:
- request A clears a cache (e.g. cache_core)
- request B tries to write into the same cache

In this situation, request B may fail because the parent folder is gone:

[06-Jul-2020 17:54:02] WARNING: [pool www] child 4978 said into stderr: "NOTICE: PHP message: https://example.host/ - core: Core: Error handler (FE): PHP Warning: file_put_contents(/var/www/html/html/typo3temp/var/Cache/Data/l10n/5f03491aac1d8038441429.temp): failed to open stream: No such file or directory in /var/www/html/vendor/typo3/cms/typo3/sysext/core/Classes/Cache/Backend/SimpleFileBackend.php line 236" 
[06-Jul-2020 17:54:02] WARNING: [pool www] child 4978 said into stderr: "NOTICE: PHP message: https://example.host/ - Core: Exception handler (WEB): Uncaught TYPO3 Exception: #1334756737: The temporary cache file "/var/www/html/html/typo3temp/var/Cache/Data/l10n/5f03491aac1d8038441429.temp" could not be written. | TYPO3\CMS\Core\Cache\Exception thrown in file /var/www/html/vendor/typo3/cms/typo3/sysext/core/Classes/Cache/Backend/SimpleFileBackend.php in line 239. Requested URL: https://example.host/home/" 

This seems to happen more often on non-local filesystems because they are slower. However, it could also happen when using a local temp folder.
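
The failure can be simulated deterministically in a single process. The following is only a minimal sketch, not TYPO3 code: the directory name is a hypothetical stand-in for the cache directory, and the two "requests" are simply run one after the other in the order that triggers the problem.

```php
<?php

// Hypothetical stand-in for the cache directory; not a real TYPO3 path.
$cacheDirectory = sys_get_temp_dir() . '/race-demo-' . uniqid('', true) . '/';
mkdir($cacheDirectory, 0777, true);

// "Request B" picks its temporary file name while the directory still exists.
$temporaryFile = $cacheDirectory . uniqid('', true) . '.temp';

// "Request A" flushes the cache by removing the whole directory.
rmdir($cacheDirectory);

// "Request B" now writes: the parent directory is gone, so the write fails.
// The @ operator only hides the PHP warning in this demo.
$result = @file_put_contents($temporaryFile, 'cached data');

var_dump($result); // bool(false)
```

Without the `@`, PHP emits the same "failed to open stream: No such file or directory" warning as in the log above.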


Related issues

Related to TYPO3 Core - Bug #87174: .... typo3temp/var/cache/code/cache_core/site-configuration.php): Access is denied (New, 2018-12-16)
Related to TYPO3 Core - Task #88927: The temporary cache file ... could not be written (Needs Feedback, 2019-08-06)

#1

Updated by Michael Stucki over 1 year ago

I spent a lot of time analyzing this problem, and my conclusion is that the error should be ignored by TYPO3:

If one request clears the cache while another request tries to write to it, the failed write should simply be ignored. The result is then not cached, but the page can still be generated. One of the following requests will fill the cache as soon as the cache folder exists again.

This is my current proposal to solve / ignore this error:

diff --git a/typo3/sysext/core/Classes/Cache/Backend/SimpleFileBackend.php b/typo3/sysext/core/Classes/Cache/Backend/SimpleFileBackend.php
index d2dbb371fb..09e5f15f25 100644
--- a/typo3/sysext/core/Classes/Cache/Backend/SimpleFileBackend.php
+++ b/typo3/sysext/core/Classes/Cache/Backend/SimpleFileBackend.php
@@ -226,13 +226,19 @@ class SimpleFileBackend extends AbstractBackend implements PhpCapableBackendInte
             throw new \InvalidArgumentException('The specified entry identifier must not be empty.', 1334756736);
         }
         $temporaryCacheEntryPathAndFilename = $this->cacheDirectory . StringUtility::getUniqueId() . '.temp';
-        $result = file_put_contents($temporaryCacheEntryPathAndFilename, $data);
+        $result = @file_put_contents($temporaryCacheEntryPathAndFilename, $data);
         GeneralUtility::fixPermissions($temporaryCacheEntryPathAndFilename);
         if ($result === false) {
-            throw new Exception('The temporary cache file "' . $temporaryCacheEntryPathAndFilename . '" could not be written.', 1334756737);
+            // This operation may fail when another request is clearing the cache (by removing and re-creating $this->cacheDirectory) in the same moment.
+            // Ignore this error and return without storing the result. A future request will come back here and try again...
+            return;
         }
         $cacheEntryPathAndFilename = $this->cacheDirectory . $entryIdentifier . $this->cacheEntryFileExtension;
-        rename($temporaryCacheEntryPathAndFilename, $cacheEntryPathAndFilename);
+        $result = @rename($temporaryCacheEntryPathAndFilename, $cacheEntryPathAndFilename);
+        if ($result === false) {
+            // This may fail for the same reason as above.
+            return;
+        }
         if ($this->cacheEntryFileExtension === '.php') {
             GeneralUtility::makeInstance(OpcodeCacheService::class)->clearAllActive($cacheEntryPathAndFilename);
         }

I'm not 100% happy with this approach, but what are the alternatives?

  • Wait some milliseconds and try again?
  • Use locking to gain exclusive access to the SimpleFileBackend (keep in mind that this should work over multiple hosts)
  • Stop clearing the cache by removing the whole folder
  • ...
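
Two of these alternatives (retrying after a short wait, and flushing without removing the folder) could be sketched roughly as follows. This is only an illustration under assumptions: `writeEntryWithRetry()` and `flushEntries()` are hypothetical names and not part of the TYPO3 API, the attempt count and delay are arbitrary, and nothing here addresses locking across multiple hosts.

```php
<?php

/**
 * Hypothetical helper: write $data into $directory via a temp file and
 * rename, re-creating the directory and retrying if a flush removed it.
 */
function writeEntryWithRetry(
    string $directory,
    string $identifier,
    string $data,
    int $maxAttempts = 3,
    int $delayMicroseconds = 10000
): bool {
    $directory = rtrim($directory, '/') . '/';
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        // Re-create the directory if it is missing. The second is_dir()
        // covers the case where another process created it first.
        if (!is_dir($directory) && !@mkdir($directory, 0777, true) && !is_dir($directory)) {
            usleep($delayMicroseconds);
            continue;
        }
        $temporaryFile = $directory . uniqid('', true) . '.temp';
        if (@file_put_contents($temporaryFile, $data) !== false
            && @rename($temporaryFile, $directory . $identifier)
        ) {
            return true;
        }
        // Remove a temp file that was written but could not be renamed.
        if (is_file($temporaryFile)) {
            @unlink($temporaryFile);
        }
        usleep($delayMicroseconds);
    }
    // The caller should treat this as "not cached", not as a fatal error.
    return false;
}

/**
 * Hypothetical helper: flush by deleting the entries but keeping the
 * directory, so concurrent writers never lose the parent folder.
 */
function flushEntries(string $directory): void
{
    foreach (glob(rtrim($directory, '/') . '/*') ?: [] as $file) {
        if (is_file($file)) {
            @unlink($file);
        }
    }
}

// Usage with a throwaway directory (hypothetical path).
$demoDirectory = sys_get_temp_dir() . '/retry-demo-' . uniqid('', true);
var_dump(writeEntryWithRetry($demoDirectory, 'entry', 'payload'));
flushEntries($demoDirectory);
var_dump(is_dir($demoDirectory));
```

Note that `flushEntries()` may still delete another request's `.temp` file between its write and its rename. In that case the rename fails and the retry loop writes the entry again, so the two helpers are meant to be used together.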

Let me know what you think!

#2

Updated by Michael Stucki over 1 year ago

  • Related to Bug #87174: .... typo3temp/var/cache/code/cache_core/site-configuration.php): Access is denied added
#3

Updated by Mathias Brodala 12 months ago

  • Related to Task #88927: The temporary cache file ... could not be written added
#4

Updated by Mathias Brodala 12 months ago

As mentioned in Slack, the change suggested by Michael here was the only thing that allowed me to complete my deployment (switch from TYPO3v8 to TYPO3v9). Nothing else I tried before (manually creating the cache directory, creating the cache directory in the SimpleFileBackend before writing the file, switching to FileBackend) helped.

#5

Updated by ondro no-lastname-given 11 months ago

We struggled with the same problem with TYPO3 hosted on a Kubernetes cluster (with multiple pods) or on multiple servers (active/active behind a load balancer) that share cache files via a persistent volume (in the case of k8s) or NFS/Ceph file systems. It happens regardless of the TYPO3 version (8, 9, or 10).

We partially solved it by moving caches to Redis, but the 'core' cache cannot be configured to use Redis ... :(

Have you found a solution for that?
thx

#6

Updated by Michael Stucki 11 months ago

Did you try my patch from above? Thanks to this, my websites run fine in Kubernetes with multiple pods. Feel free to ping me on Slack if you need more info.

#7

Updated by ondro no-lastname-given 11 months ago

Michael Stucki wrote in #note-6:

Did you try my patch from above? Thanks to this my websites run fine in Kubernetes with multiple pods. Feel free to ping me on Slack if you need more infos.

Hi Michael, we will try your patch for sure, although it's quite a dirty workaround and there should be a nicer way to do that ...

#8

Updated by Michael Stucki 11 months ago

Feel free to add suggestions on how this could be improved.

#9

Updated by ondro no-lastname-given 10 months ago

Michael Stucki wrote in #note-6:

Did you try my patch from above? Thanks to this my websites run fine in Kubernetes with multiple pods. Feel free to ping me on Slack if you need more infos.

I've tried your patch, but unfortunately it doesn't help :(
I've also tried other workarounds (ignoring errors etc.), but nothing helps here ...
