Idea: save/restore active objects on DB close/create #163

Open
jimfulton opened this issue May 14, 2017 · 2 comments
Comments

@jimfulton
Member

Restarting a ZODB application can hurt due to the need for cache warming.

Persistent storage caches can help, but because they are secondary caches, they often don't have the most important objects. (Even the recent enhancements to the RelStorage local cache don't seem to help that much based on my experience with a RelStorage app running 2.1a2.)

Idea: add a DB option to save, on close, a list of object ids (oids) for all active (non-ghost) objects; on startup, these oids could be prefetched to at least warm the storage caches. (If RelStorage grew a prefetch method that could prefetch multiple oids at once, this would be a win even though it would be synchronous.)
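Roughly, the round trip could look something like the sketch below, assuming the pickle cache's lru_items() and Connection.prefetch() methods are available (newer persistent/ZODB provide them); _connectionMap is an internal DB helper, used here only for illustration:

```python
# Sketch only; not an existing DB option. Save the oids of all non-ghost
# objects from the pooled connections' pickle caches at shutdown, and
# prefetch them on the next startup to warm the storage cache.
OID_SIZE = 8  # ZODB oids are 8-byte strings

def save_active_oids(db, path):
    oids = set()
    # _connectionMap is an internal DB helper that visits pooled connections.
    db._connectionMap(
        lambda conn: oids.update(oid for oid, _ in conn._cache.lru_items()))
    with open(path, 'wb') as f:
        f.write(b''.join(sorted(oids)))

def prefetch_saved_oids(db, path):
    with open(path, 'rb') as f:
        data = f.read()
    oids = [data[i:i + OID_SIZE] for i in range(0, len(data), OID_SIZE)]
    with db.transaction() as conn:
        # Degrades to a no-op if the storage has no prefetch support.
        conn.prefetch(oids)
```

save_active_oids would be called just before db.close(), and prefetch_saved_oids right after opening the DB on the next start.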

@jamadden
Copy link
Member

jamadden commented Jun 28, 2019

> Even the recent enhancements to the RelStorage local cache don't seem to help that much based on my experience with a RelStorage app running 2.1a2.

It's gotten much better in the 3.0 series (details below).

> If RelStorage grew a prefetch method that could prefetch multiple oids at once, this would be a win even though it would be synchronous.

It did, and it is. In a big way.

Below are benchmarks from zodbshootout 0.8. The first is for MySQL with and without a persistent cache, and the second is for PostgreSQL with and without a persistent cache. 'add' is adding 1000 objects in a transaction; 'cold' is reading 1000 objects after having emptied all caches (in that case, the storage's persistent cache automatically reloads itself from disk). (I ran these quickly, without many iterations, so there's some variability in the numbers; the variation in the 'add' times suggests it's up to 15% or so.)

| Benchmark | mysqlclient_hf_1000 | mysqlclient-hf-pcache_1000 |
| --- | --- | --- |
| add | 110 ms | 93.9 ms: 1.17x faster (-14%) |
| cold | 194 ms | 26.5 ms: 7.32x faster (-86%) |
| prefetch_cold | 47.1 ms | 31.9 ms: 1.48x faster (-32%) |
| Benchmark | psycopg2_hf_1000 | psycopg2_hf_pcache_1000 |
| --- | --- | --- |
| add | 73.9 ms | 75.7 ms: 1.02x slower (+2%) |
| cold | 113 ms | 26.6 ms: 4.24x faster (-76%) |
| prefetch_cold | 45.7 ms | 32.1 ms: 1.42x faster (-30%) |

In both cases, adding a persistent cache made 'cold' substantially faster --- the predictable read pattern here has a 100% hit rate from the persistent cache. Even doing a bunch of other writes to the database and then coming back to this cold read still gets great hit rates.

The interesting thing is that simply prefetching the data gets you almost as much of a benefit as a persistent cache. (And note how the persistent cache actually slowed down in prefetch just a bit: that's the overhead of iterating the ghosts and determining that there's actually no need to talk to the database at all, they're already cached.) Of course, this is a best-case scenario for prefetch: we know exactly what we need to read, and we have a fast local database connection.

As noted, persistent caches, unless quite large, don't necessarily have the most important objects. Very important but rarely changed objects may only live in Connection caches. (More about that in a minute.) RelStorage already has the infrastructure to store a set of OIDs, and the ability to prefetch them at storage opening time (or even ensure they're persisted into the cache). What it can't do is get that list of OIDs. They're frustratingly close: I think they're (approximately) the union of the OIDs stored in each Connection's pickle cache in the DB's connection pool at the time of close(). There's that registerDB method that would let the storage know about the DB and get access to those Connections, except that's not what that method does; also, the connections are all closed and their caches cleared before the storage knows it's being closed, so that's too late.

Maybe the DB.close() method could let its storage know that it's about to close (whether by a method or by an event)? Then actually getting that list of OIDs, persisting it, and prefetching it on open would be up to the storage implementation. One could imagine a storage wrapper that does that for any storage if it were a method, or for any storage in any database if it were an event.
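As a purely hypothetical sketch of that handoff (persistActiveOids is an imagined hook, not an existing storage method; the oid collection again leans on the pickle cache's lru_items() and the internal _connectionMap helper):

```python
# Hypothetical: DB.close() does not notify the storage today. A DB subclass
# could collect the pooled connections' non-ghost oids *before* the
# connections are closed and their caches cleared (the "too late" problem
# above), then hand them to the storage.
import ZODB

class NotifyingDB(ZODB.DB):

    def close(self):
        oids = set()
        self._connectionMap(
            lambda conn: oids.update(oid for oid, _ in conn._cache.lru_items()))
        # Imagined hook; a storage wrapper could implement it by writing the
        # oids next to its cache files and prefetching them on the next open.
        persist = getattr(self.storage, 'persistActiveOids', None)
        if persist is not None:
            persist(oids)
        super(NotifyingDB, self).close()
```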

OK, that "more" for knowing important objects: It's quite possible that the set of important objects changes over time. The objects needed at startup might be very different from those in the steady-state of the application. This can be partly mitigated by using a larger persistent cache, especially if writes are relatively rare. But one might like to capture the working set at particular points in the application lifecycle and persist it for pre-fetching later (e.g., grab just before the first request, and also just after the last request; prefetch the first at startup, prefetch the second just after startup). That seems to suggest that perhaps one necessary, generic, primitive that could be provided by ZODB is simply "get the working set[1]" Other policies could take it from there (including the storage wrapper or event or just plain storage method call on close() mentioned above) without having to know the details about connection pools and pickle caches.

[1] Right now we can only provide LRU information from pickle caches, but other policies (like zopefoundation/persistent#45 if I ever get around to it) could let us provide a better picture of the true working set.
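A sketch of that primitive, as a free function for now (hypothetical; the whole point is that ZODB would own these internals rather than policy code reaching for _connectionMap and lru_items()):

```python
def get_working_set(db):
    """Return the oids of all non-ghost objects in the DB's pooled connections.

    Hypothetical primitive: this is (approximately) the LRU-based working
    set; better eviction policies could sharpen it, per the footnote above.
    """
    oids = set()
    db._connectionMap(
        lambda conn: oids.update(oid for oid, _ in conn._cache.lru_items()))
    return oids
```

A policy could call it just before the first request and again at shutdown, persist both sets, and prefetch each at the appropriate point on the next start.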

@jamadden
Member

> It's gotten much better in the 3.0 series (details below).

Right, those details. So 2.1's persistent cache turned out to do OK if you only had one process writing cache files, or if all processes writing cache files had essentially the exact same workload. The more the workloads differed, though, the worse the cache performance got. With just one process, it gets almost a 100% hit ratio again. But look what happens when I add a second process with a different workload (-c 2):

| Benchmark | 2.1.1 persistent cache | 3.0 master persistent cache |
| --- | --- | --- |
| psycopg2_hf_pcache: add 1000 objects | 104 ms | 81.0 ms: 1.29x faster (-22%) |
| psycopg2_hf_pcache: read 1000 cold objects | 144 ms | 28.9 ms: 4.97x faster (-80%) |
| psycopg2_hf_pcache: read 1000 cold prefetched objects | 128 ms | 33.6 ms: 3.80x faster (-74%) |
| psycopg2_hf: add 1000 objects | 99.7 ms | 82.3 ms: 1.21x faster (-17%) |
| psycopg2_hf: read 1000 cold objects | 127 ms | 118 ms: 1.08x faster (-7%) |
| psycopg2_hf: read 1000 cold prefetched objects | 130 ms | 47.0 ms: 2.77x faster (-64%) |

You might as well not have a persistent cache in 2.1, but 3.0 handles it just fine.
