Bug #68134
closed
sem_acquire while loop can lead to 100% CPU load
Added by Stefan Hekele over 9 years ago.
Updated over 8 years ago.
Description
Faulty function:
typo3/sysext/core/Classes/Locking/Locker.php->acquire
Environment:
RedHat 6.5
Apache 2.4.12
PHP FPM 5.5.6 / 5.5.21 (tested before and after update)
TYPO3 6.2.14 with semaphore locking
Encountered behavior:
After a cache clear via admin interface, a siege run with more than 3-4 users leads to at least one timed out response. The FPM process serving that requests uses 100% CPU till PHP timeout. Using the recommended timeout setting of 240, this quickly leads to an unresponsive server at just medium traffic. A lower timeout setting mitigates the problem.
Expected behavior:
Just like before 6.2, and like 7.3+, the lock acquiration should not be attempted in a loop. After a failed lock, an exception should be thrown, same as the other locking mechanisms.
Proposed change:
https://github.com/plan2net/TYPO3.CMS/commit/055b4f853f1fe44db023da3c2b60c7ef988ef817
I observed the same behaviour (CPU maxes out) in another installation.
The Locking method LOCKING_METHOD_SEMAPHORE simply is broken in 6.2.* and apparently also in the 6.2-compliancy code still living in 7.
Stefan's patch fixes the broken "while" perfect.
- Assignee set to Georg Ringer
Ahoi GeorgR,
Can you please review Stefan's patch and bring it into the gerrit process?
Guess it would be a cool fix for a real stupid bug which is a Blocker, that is still lives in both, the 6.2 + future LTS versions.
Thanks G
- Status changed from New to Needs Feedback
- Assignee changed from Georg Ringer to Markus Klein
Where are you using this?
Using acquireExclusiveLock() would be the right way.
TYPO3\CMS\Frontend\Controller\TypoScriptFrontendController->acquirePageGenerationLock calls TYPO3\CMS\Core\Locking\Locker->acquire
I dug into that again and can explain that now.
The reason for FE still using the old acquire()
method is that it relies on the (crazy) implementation of that function.
It is not feasible to replace the implementation in 6.2 anymore. I invested a fortune of time in 7 to get that whole API out of the way and the FE implementation correct.
For 6.2 I highly recommend to NOT use semaphore locking at all.
Some other reason: The implementation is not safe an uses different identifiers for the semaphores which causes the system to be flooded with semaphores. You need a root on the server to actually get rid of them. Number of semaphores allocated might be O(number of pages)
If it doesn't get fixed in 6.2 then you should at least disable the locking-mechanism semaphore in 6.2 with the next release if it's broken beyond repair.
It's an absurdity to speak of Long Term Support and then not fixing such inherently broken features.
I understand that:
- 6.2 LTS has a bug, which can bring down a webserver in seconds.
- The bug is well known, understood and can be fixed.
- The bug is also shipped with the 7.2 future LTS version (same code, same bug)
But "It is not feasible to replace the implementation in 6.2 anymore".
I fail to understand WHY.
Given Peter's Markus Klein's Warning:
"highly recommend to NOT use semaphore locking at all"
I suggest to:
- Change the LOCKING_METHOD_SEMAPHORE to use the filelocking Method (as is working)
- Remove the broken semaphore code.
Q: Shall we provide a patch implementing this strategy?
PS: Funny to read that the 6.2 Core uses a methode which is deprecated since 6.2, see:
TYPO3\CMS\Core\Locking\Locker->acquire
@deprecated since 6.2 - will be removed two versions later; use new API instead
Edit: "Peter" = MarkusK
Markus, (!=Peter).
- Sorry for the name mismatch. Is corrected.
Thanks for your answer to my question:
Q: Shall we provide a patch implementing this strategy?
A: If you get that fixed, I'm happy to review a patch. But I can of course only speak for myself.
We'll work on that one.
- Status changed from Needs Feedback to New
- Assignee deleted (
Markus Klein)
- Category set to Performance
- Status changed from New to Needs Feedback
Still waiting for the patch by Georg.
- Status changed from Needs Feedback to Closed
Closed due to missing feedback.
Also available in: Atom
PDF