http://forge.typo3.org/http://forge.typo3.org/themes/typo3_forge/favicon/favicon.png?17058661692010-12-21T16:42:21ZTYPO3 ForgeTYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594712010-12-21T16:42:21ZMyroslav Holyakvbhjckfd@gmail.com
<ul></ul><p>According to the PHP manual, only strings and integers can be saved "as is"; everything else will be serialized. <a class="external" href="http://www.php.net/manual/en/memcache.set.php">http://www.php.net/manual/en/memcache.set.php</a></p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594722010-12-21T23:03:01ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>Mmmh.</p>
<p>Situation:<br />- StringFrontend throws an exception if given data is not a string. It's a string frontend, we probably shouldn't just remove the exception.<br />- VariableFrontend always serializes the given data.</p>
<p>Assumption:<br />I doubt that PHP-based serialization is much slower than serialization done in memcache (I did not benchmark!). Even if it is slower, there are PHP modules which speed up serialization a lot (like igbinary). And if serialization is that slow in PHP, this should probably be handled in PHP upstream.</p>
<p>Possible solutions:<br />1) Make sure that incoming data to the VariableFrontend is not already serialized (so no double serialization is done), use the StringFrontend if data is already serialized -> core v4 task.<br />2) Use igbinary as a drop-in replacement for PHP serialization with a php.ini setup. Make sure the selected backend handles this -> local setup<br />3) Add igbinary as a new caching framework frontend, make sure all backends successfully handle binary data -> FLOW3 commit -> backport v4 core<br />4) Hack some 'do-not-change-content-whatever-comes-in' frontend which could be used with the memcache backend -> FLOW3 commit -> backport v4 core.</p>
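<p>To make the double serialization behind solution 1 concrete, here is a minimal sketch in plain PHP. It does not use the caching framework classes; the variable names and sample data are made up for illustration.</p>

```php
<?php
// Illustration only: plain PHP, not TYPO3 caching framework code.
$data = array('title' => 'Home', 'uid' => 1);

$once  = serialize($data);  // what a caller might already hand in
$twice = serialize($once);  // what an always-serializing frontend would then store

// Reading the entry back needs two unserialize() calls to recover the array.
$restored = unserialize(unserialize($twice));

printf("once: %d bytes, twice: %d bytes\n", strlen($once), strlen($twice));
```

<p>The second pass only wraps the existing string in another string envelope, so the extra cost is mostly copying; whether that matters for long strings is what the benchmark proposed in this thread would have to show.</p>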
<p>At the moment I'm unsure which solution is best.</p>
<p>Links:<br />typo3-performance hint about igbinary: <a class="external" href="http://lists.typo3.org/pipermail/typo3-performance/2010-October/000383.html">http://lists.typo3.org/pipermail/typo3-performance/2010-October/000383.html</a><br />igbinary frontend on forge: <a class="external" href="http://forge.typo3.org/projects/extension-igbinary/">http://forge.typo3.org/projects/extension-igbinary/</a></p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594732010-12-21T23:20:27ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>Here are some things we need to know to find acceptable solutions:</p>
<p>1) Read core code and locate positions where the caching framework with VariableFrontend is fed already serialized data -> fix it or switch to string frontend in default setup to reduce double serialization. Benchmark if serializing a string again is really slow (for longer strings).<br />2) Test if igbinary does what it claims if used as a drop-in replacement for serialize()<br />3) Test if <strong>all</strong> backends can handle binary data produced by igbinary()<br />4) Benchmark igbinary in real-world solutions<br />5) Test a 'do-not-change-data' frontend together with memcache serialization and compare to igbinary data.<br />6) See if other backends can handle non-serialized data natively, too (objects, arrays); this could be done with unit tests</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594742010-12-21T23:30:33ZRalf Strobelralf-strobel@web.de
<ul></ul><p>Christian's option 3 (igbinary as new frontend) sounds like the most solid solution to me.</p>
<p>The function igbinary_serialize() seems to do just what serialize() does. So the rewriting should be quite minimal as well. Some testing should be done of course.</p>
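<p>A rough sketch of how small such a switch could be, assuming igbinary is used only when the extension is loaded. The helper names <code>cacheSerialize()</code> and <code>cacheUnserialize()</code> are hypothetical and not part of TYPO3 or FLOW3.</p>

```php
<?php
// Hypothetical helpers (illustrative names, not TYPO3 API).
// They fall back to PHP's native serializer when igbinary is not loaded.
function cacheSerialize($value)
{
    return function_exists('igbinary_serialize')
        ? igbinary_serialize($value)
        : serialize($value);
}

function cacheUnserialize($blob)
{
    return function_exists('igbinary_unserialize')
        ? igbinary_unserialize($blob)
        : unserialize($blob);
}

// Round trip with whichever serializer is available on this machine.
$value = array('a' => array(1, 2), 'b' => 'x');
$copy  = cacheUnserialize(cacheSerialize($value));
```

<p>Entries written by one serializer cannot be read by the other, so a cache would need to be flushed whenever the extension is enabled or disabled.</p>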
<p>This way, the backend class could continue to demand that its input be submitted as strings.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594752010-12-21T23:41:58ZRalf Strobelralf-strobel@web.de
<ul></ul><p>On a related note:</p>
<p>It might also be a good idea to make backend_MemcachedBackend compatible with "memcached" as well (currently only supports "memcache").</p>
<p>Checking which one is installed could be done simply with function_exists().</p>
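<p>A possible shape for such a check, as a sketch only. It uses <code>class_exists()</code> rather than function_exists(), on the assumption that the "memcached" extension is reached through its <code>Memcached</code> class and "memcache" through <code>Memcache</code>; the helper name is hypothetical.</p>

```php
<?php
// Hypothetical helper: report which memcache client extension is available.
// Assumption: "memcached" exposes the Memcached class, "memcache" the Memcache class.
function detectMemcacheClient()
{
    if (class_exists('Memcached', false)) {
        return 'memcached';
    }
    if (class_exists('Memcache', false)) {
        return 'memcache';
    }
    return null; // neither extension is loaded
}

$client = detectMemcacheClient();
echo 'Detected client: ' . ($client === null ? 'none' : $client) . "\n";
```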
<p>Some people may want to use memcached with igbinary as default serializer. This way it would also affect serialization of session data.</p>
<p>I might start a separate issue for this...</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594762010-12-22T00:23:17ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>@Ralf:</p>
<p>True, the backend should somehow work with both memcache and memcached ... patches for this should be done in FLOW3 first. From my point of view there are currently more important tasks: We must implement the garbage collection for this backend asap ...</p>
<p>BTW: Currently php-memcache is broken for me in Debian squeeze because delete fails due to a misleading second parameter, so I'm currently unable to do much work for this backend without much hassle in my setup.<br />Please also keep in mind that memcache doesn't really fit the "structure" the caching framework puts into it; there are backends which handle this much more smartly (like the new Redis backend in 4.5, if you want a NoSQL solution).</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594772010-12-22T07:09:57ZMyroslav Holyakvbhjckfd@gmail.com
<ul></ul><p>If you want to replace all serialize() calls with igbinary_serialize(), it would probably be necessary to create something like t3lib_div::serialize, where the system chooses which method to use according to the loaded PHP extensions etc.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594782010-12-22T17:00:59ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>FYI: igbinary support was already added to the VariableFrontend in FLOW3:</p>
<p><a class="external" href="http://forge.typo3.org/issues/11443">http://forge.typo3.org/issues/11443</a></p>
<p>I'll hopefully find some time to backport this to 4.5 before stable ...</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594792010-12-22T19:13:36ZRalf Strobelralf-strobel@web.de
<ul></ul><p>That would be very nice. Just installed igbinary on my servers.</p>
<p>In case someone is looking for installation instructions:<br /><a class="external" href="http://blogs.vinuthomas.com/2009/11/24/compress-your-serialize-output-using-igbinary/">http://blogs.vinuthomas.com/2009/11/24/compress-your-serialize-output-using-igbinary/</a></p>
<p>Hopefully there will eventually be a debian package as well.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594802010-12-23T00:11:56ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>The igbinary serializer in the variableFrontend will be backported from FLOW3 with issue <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: [Caching framework] Update to latest FLOW3 version (Closed)" href="http://forge.typo3.org/issues/24400">#24400</a></p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594812010-12-23T08:11:29ZRalf Strobelralf-strobel@web.de
<ul></ul><p>I just noticed there already is an ApcBackend.</p>
<p>That of course takes me right back to where this issue started:<br />Unlike Memcached, APC really can store and retrieve variables without serialization. Serializing anyway is quite a waste of time.</p>
<p>Maybe, analogous to "phpcapablebackend", there should also be an interface "nonserializedbackend". I'm sure there will be other backend storage methods that can also handle unserialized data.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594822010-12-23T11:34:40ZMyroslav Holyakvbhjckfd@gmail.com
<ul></ul><p>Are you sure APC can store objects? Can you prove that? I ask because in this bug discussion <a class="external" href="http://pecl.php.net/bugs/bug.php?id=8118">http://pecl.php.net/bugs/bug.php?id=8118</a> I have read that non-scalar values (objects, arrays) are passed through internal serialization. E.g., try searching for "[2006-07-04 23:17 UTC] rasmus at php dot net"</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594832010-12-23T13:26:46ZRalf Strobelralf-strobel@web.de
<ul></ul><p>I'm going to run some tests myself over the next days. It's true that there seem to have been some issues in the past....<br /><a class="external" href="http://www.php.net/manual/en/function.apc-store.php">http://www.php.net/manual/en/function.apc-store.php</a></p>
<p>There it says: "It might be interesting to note that storing an object in the cache does not serialize the object".</p>
<p>But also: "It should be noted that apc_store appears to only store one level deep. So if you have an array of arrays, (...) it will only have the top level row of keys with nulls as the values of each key."</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594842010-12-23T14:20:02ZMyroslav Holyakvbhjckfd@gmail.com
<ul></ul><p>Such unexpected array storing is a bug, and it was resolved in summer 2010 (the same link as above) <a class="external" href="http://pecl.php.net/bugs/bug.php?id=8118">http://pecl.php.net/bugs/bug.php?id=8118</a>.</p>
<p>And if we want to know the truth about possible serialization of objects, there is no better way than asking the APC developers or digging through CVS/SVN.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594852010-12-23T16:47:41ZRalf Strobelralf-strobel@web.de
<ul></ul><p>You're right. Asking one of the developers is probably the only trustworthy source.</p>
<p>If you haven't found other solid information so far (I haven't) I will go ahead and contact one of them.</p>
<p>Meanwhile, I can at least confirm that storing and retrieving cascaded arrays/objects works fine in the current version.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594862010-12-29T22:13:21ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>The variable frontend now supports the igbinary serializer and another double serialization was fixed with <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: caching of pagesections uses superfluos serialize() call (Closed)" href="http://forge.typo3.org/issues/20582">#20582</a>.</p>
<p>I don't expect any serializer in memcache or apc to be more reliable or even quicker than the current solution.</p>
<p>Thus, I do not think we need to take any more action on this topic, especially as every solution using backend capabilities would force us to create another frontend class, which doesn't seem very useful at the moment. We should only do this if we can prove that it gives a real performance benefit. So, unless one of you wants to test, benchmark and hack up some solution, I tend to close this issue within the next few days.</p>
<p>If there is still some need for a 'pass-through' frontend together with a self-serializing backend, this should go to the issue tracker of FLOW3 anyway.</p>
<p>BTW: The APC backend has some serious problems which render it unusable for most 'real-life' caches of serious size. See <a class="external" href="http://wiki.typo3.org/Caching_framework">http://wiki.typo3.org/Caching_framework</a> for details on this topic.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594872010-12-29T22:53:29ZRalf Strobelralf-strobel@web.de
<ul></ul><p>I'm still waiting for replies from the APC developers. If I could still post those here, even if the issue is closed, then I have nothing against that. I also think the solution based on igbinary sounds pretty solid.</p>
<p>A question I can already answer, however, comes from the wiki page you linked: "its currently unknown what exactly happens if APC can not store additional data"</p>
<p>What happens is you get a PHP Warning "unable to allocate memory" and nothing gets stored. I had that a lot before upping memory size in the configuration. Now, after assigning 256 MB, I'm still far away from the limit even with several hundred pages cached. Not that I would mind a garbage collector becoming available.</p>
<p>Can't confirm serious memory leaks. Usage seems quite steady after a while. I'm using the newer squeeze or dotdeb packages.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594882010-12-29T22:58:12ZRalf Strobelralf-strobel@web.de
<ul></ul><p>Another possible solution of using apc I tried out was PhpFrontend + FileBackend.</p>
<p>I can only say that for me it didn't work at all. It just resulted in a lot of error messages. When I looked into the files, I didn't even find valid php code, but instead just serialized variables, wrapped in <?php ?>-Tags.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594892010-12-29T23:29:27ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>@Ralf:<br />Thanks for the feedback on the APC backend. If a warning is raised by PHP, it should probably be caught and handled in the backend. This is actually a bug in this backend which should be tackled. We should report this to FLOW3 and see if we could come up with a unit test for this case.</p>
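<p>One way the warning could be handled, as a sketch under assumptions: the helper <code>storeOrFail()</code> is hypothetical and is not the caching framework's API. It turns any PHP warning raised during a store call into an exception and also treats a <code>false</code> return value as a failure.</p>

```php
<?php
/**
 * Hypothetical helper: run a store operation and turn any PHP warning
 * it raises into an exception that the backend can handle.
 */
function storeOrFail(callable $store)
{
    // Temporarily convert warnings into ErrorException instances.
    set_error_handler(function ($severity, $message, $file, $line) {
        throw new ErrorException($message, 0, $severity, $file, $line);
    }, E_WARNING | E_USER_WARNING);

    try {
        $stored = $store();
    } finally {
        // Always restore the previous handler, even if an exception was thrown.
        restore_error_handler();
    }

    if ($stored === false) {
        throw new RuntimeException('Cache backend refused to store the entry.');
    }
    return $stored;
}
```

<p>A backend could wrap its write in this helper, for example <code>storeOrFail(function () use ($key, $value) { return apc_store($key, $value); })</code>, and a unit test could simulate the failure with a closure that calls <code>trigger_error()</code>.</p>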
<p>It would be great if you could document your findings about the APC backend in the caching framework documentation. The documentation was just created by me and will hopefully find its way into the official documentation once all parts have been reviewed. It's a wiki page, so it would be great if you could improve the current statement.</p>
<p>For the memory leaks: I was able to reproduce them with native debian lenny php packages (no dotdeb) with my enetcacheanalytics extension (it has a performance suite for cache backends, check out from forge if interested).</p>
<p>For the fileBackend:<br />Do not use the PhpFrontend with the fileBackend if you are <em>not</em> storing PHP files. If you are caching "usual" data like strings, arrays or objects, you should combine the fileBackend with the Variable or String frontend. The PhpFrontend must be used only if storing PHP files. I have improved the documentation a bit to make a clear statement about this.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594902010-12-30T10:03:08ZRalf Strobelralf-strobel@web.de
<ul></ul><p>The Warning I got is discussed here: <a class="external" href="http://pecl.php.net/bugs/bug.php?id=16966">http://pecl.php.net/bugs/bug.php?id=16966</a></p>
<p>It's probably not the final behavior. They mention fixing it by having apc clear the oldest cache entries when not enough space is available, which seems pretty reasonable.</p>
<p>Also, if you set the ttl configuration to zero (disabled, the current default), the cache is supposed to be purged entirely once it is full. I haven't tested this yet, however.</p>
<p>I updated the framework wiki documentation. Take a look and see if it fits.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594912010-12-30T12:11:56ZRalf Strobelralf-strobel@web.de
<ul></ul><p>There was still no response from the apc developer I emailed, so I went and had a look at the sourcecode myself...</p>
<p>The interesting function is "my_copy_zval", located here:<br /><a class="external" href="http://svn.php.net/viewvc/pecl/apc/trunk/apc_compile.c?view=markup">http://svn.php.net/viewvc/pecl/apc/trunk/apc_compile.c?view=markup</a></p>
<p>As it looks, apc does serialize objects, using php_var_serialize (which I guess results in the standard serialization).</p>
<p>However, any other datatype (numbers, strings, even arrays) is directly memcopied from the running PHP instance. So, as long as you are not mostly handling objects, this should be the fastest conceivable way of caching.</p>
<p>For arrays, this could really mean a significant edge over igbinary when loading from cache. APC seems to store the actual hash table of an associative array, meaning keys will not have to be re-hashed when rebuilding the content.</p>
<p>If I find the time, I will try to do a benchmark between igbinary+apc and just apc.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594922010-12-30T19:46:36ZRalf Strobelralf-strobel@web.de
<ul></ul><p>Well, you got to love it when test results completely disprove what you had anticipated...</p>
<p>I benchmarked using a multidimensional associative array of random data (integers, strings), running two different dataset sizes (8 KB, 8 MB). Results were consistent over several runs in both cases.</p>
<p>----------- 8 kb ----------------</p>
<p>Loading data from uncached php file: 0.274 ms<br />Loading data from cached php file: 0.051 ms</p>
<p>serialize() : 0.060 ms.<br />unserialize() : 0.061 ms.</p>
<p>igbinary_serialize() : 0.093 ms.<br />igbinary_unserialize() : 0.043 ms.</p>
<p>apc_store() : 0.056 ms.<br />apc_fetch() : 0.047 ms.</p>
<p>apc_store(serialize()) : 0.049 ms.<br />unserialize(apc_fetch()) : 0.046 ms.</p>
<p>apc_store(igbinary_serialize()) : 0.087 ms.<br />igbinary_unserialize(apc_fetch()) : 0.037 ms.</p>
<p>----------- 8 mb ----------------</p>
<p>Loading data from uncached php file: 187 ms</p>
<p>serialize() : 106 ms<br />unserialize() : 109 ms</p>
<p>igbinary_serialize() : 221 ms<br />igbinary_unserialize() : 72 ms</p>
<p>apc_store() : 36007 ms<br />apc_fetch() : 216 ms</p>
<p>apc_store(serialize()) : 110 ms<br />unserialize(apc_fetch()) : 108 ms</p>
<p>apc_store(igbinary_serialize()) : 224 ms<br />igbinary_unserialize(apc_fetch()) : 74 ms</p>
<hr />
<p>Ok, so the most obvious lesson is that apc_store cannot be recommended for large datasets, probably due to memory allocation overhead.</p>
<p>The second surprise for me was that igbinary_serialize is actually slower than serialize. Since unserialization is faster, however, I think that justifies its use in most caching environments, where reads occur more frequently than writes. Quite disappointed, though, that the difference is this small.</p>
<p>Maybe most importantly: Looking at the absolute numbers, I now even doubt my original premise that serialization is a main bottleneck of caching. If 8 megabytes of complex data can be processed in roughly 0.1 seconds on a relatively weak modern server (Intel i3), then this can't really take up a lot of overall execution time, can it?</p>
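<p>For anyone who wants to repeat this kind of measurement, the following is a minimal sketch and not the script used for the numbers above. The dataset shape, the helper names <code>buildDataset()</code> and <code>timeMs()</code>, and the run count are all assumptions.</p>

```php
<?php
// Build a nested associative array of random integers and strings (assumed shape).
function buildDataset($rows, $cols)
{
    $data = array();
    for ($i = 0; $i < $rows; $i++) {
        for ($j = 0; $j < $cols; $j++) {
            $data['row' . $i]['col' . $j] = ($j % 2 === 0) ? mt_rand() : md5((string) mt_rand());
        }
    }
    return $data;
}

// Average wall-clock time of $fn in milliseconds over $runs calls.
function timeMs(callable $fn, $runs = 100)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) * 1000 / $runs;
}

$data = buildDataset(50, 10);
$blob = serialize($data);

printf("serialize():   %.3f ms\n", timeMs(function () use ($data) { serialize($data); }));
printf("unserialize(): %.3f ms\n", timeMs(function () use ($blob) { unserialize($blob); }));

// Only measured when the igbinary extension is loaded.
if (function_exists('igbinary_serialize')) {
    $bin = igbinary_serialize($data);
    printf("igbinary_serialize():   %.3f ms\n", timeMs(function () use ($data) { igbinary_serialize($data); }));
    printf("igbinary_unserialize(): %.3f ms\n", timeMs(function () use ($bin) { igbinary_unserialize($bin); }));
}
```

<p>Averaging over many runs smooths out timer noise for small datasets; for very large datasets a single run per call is probably enough.</p>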
<p>------------ EDIT ------------------</p>
<p>Turns out the slower serialization speed of igbinary is caused by its string compacting method (saves a bit more space in some scenarios). It can be disabled by "igbinary.compact_strings=0" in php.ini.</p>
<p>Now the timings for 8kb are as follows:</p>
<p>igbinary_serialize() : 0.030 milliseconds.<br />igbinary_unserialize() : 0.044 milliseconds.</p>
<p>apc_store(igbinary_serialize()) : 0.025 ms.<br />igbinary_unserialize(apc_fetch()) : 0.036 ms.</p>
<p>It is also worth mentioning that even without compacting, the output of igbinary was always around 25% smaller.</p>
<p>Note: Setting compact_strings=0 in igbinary 1.0.2 gave me errors in scripts that are trying to store entire objects. I emailed the developers about it and they said it is already fixed in an upcoming version.</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594932011-01-18T23:11:17ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>Thanks for benching Ralf!</p>
<p>To sum up, APC-based serialization doesn't seem to give us any real benefit that the VariableFrontend can't provide as well (especially since we integrated the igbinary serializer).</p>
<p>I'd like to close this issue for now, it doesn't really seem to lead to anything for now. Still, all measurements and conclusions are valid. Is this ok for you Ralf? We could still open another issue if things change ...</p> TYPO3 Core - Bug #24318: Unnessessary serializing for memcached with variablefrontendhttp://forge.typo3.org/issues/24318?journal_id=594942011-01-19T00:02:41ZChristian Kuhnlolli@schwarzbu.ch
<ul></ul><p>Ok, actually closing here for now. Ralf, please reopen if you have further suggestions which fit to current class logic.</p>