Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Note: Eviction anomaly #289

Closed
jandam opened this issue Dec 7, 2018 · 5 comments
Closed

Note: Eviction anomaly #289

jandam opened this issue Dec 7, 2018 · 5 comments

Comments

@jandam
Copy link

jandam commented Dec 7, 2018

I'm experiencing poor performance with loading cache due to early value eviction.
Problem is that values are evicted after several gets from cache (in my case ~10). Sometimes value is evicted immediately after load. Problem occurs only with full cache and after several minutes of usage. All code runs on single thread.

Cache size: 256MB
Weighter: range: 10 bytes .. 3MB

In my case 10 values has about 12.5MB (5% of cache size)

I tracked down problem to 2 places.
*) eden queue have only 1% of total cache size. So in my case only 1 or 2 values remains in eden space. So very early are moved to accessOrderProbationQueue.

*) BoundedLocalCache::evictFromMain

  • chooses between first 'victim' and last 'candidate' from accessOrderProbationQueue. Last 'candidate' is fresh node evicted from eden space.
  • next BoundedLocalCache::admit is called to choose which candidate to evict.
    -- here is first problem that frequencySketch returns same value for both keys
    -- then random is used to choose which node should be evicted. I suppose that there is some repeating pattern

Proposed improvement:
*) allow to set eden space size (currently hard coded 1% of cache size/weight)

  • it will be useful when object size close to eden space size.

*) maybe there are some issues with frequency sketch that returns same values for different keys

  • it is probabilistic structure. Some support for user defined hash functions can help.

I need to further investigate this issue.

Cache builder:

        return Caffeine.<Key, Value>newBuilder()
                .recordStats()
                .executor(Runnable::run)
                .maximumWeight(maxWeight)
                .weigher((Weigher<Key, Value>) (key, value) -> value.getSizeInBytes())
                .build(this::createValue);

PS sorry for my English

@ben-manes
Copy link
Owner

PR #263 adds the setting option. Unfortunately I haven’t been able to work on this project but hopefully can catch up over the holidays.

There is a new research paper that corrects this problem by automatically tuning the window size based on the workload. I can email it privately if interested. I want to explore some further ideas along those lines.

@jandam
Copy link
Author

jandam commented Dec 7, 2018 via email

@ben-manes
Copy link
Owner

FYI, the research paper with our strategy to fix this automatically is now freely available if you follow the README's link (see Adaptive Software Cache Management). This uses ACM's Authorizer service to let you bypass the paywall.

I would like to explore adaptive moment estimation (adam) as an enhancement to the naive hill climber shown in the paper. The paper demonstrates that the techniques work, but leaves deeper optimizations to the community. In the case of hill climbing (aka gradient descent), the ML community has very good work on improvements. I am hoping to find some time over the holidays to implement their ideas.

The time/space overhead to the cache was shown to be minimal and the code complexity is low. After I have satisfied myself on the exact algorithm to adopt, I hope to find the time to implement it. This should then correct for this problem without requiring a user-visible configuration.

@ben-manes
Copy link
Owner

@jandam can you provide an access trace to verify my fix with?

Below shows the results for an extremely recency-biased trace. The eden queue (admission window) is increased from 1% to 80%, the maximum allowed in this prototype. That corrects the hit rate from 0.6% to the optimal 33.33%.

╔═════════════════════════════════════════════════════════════════════════╤══════════╤═════════╤═══════════╤═══════════╗
║ Policy                                                                  │ Hit rate │ Hits    │ Misses    │ Requests  ║
╠═════════════════════════════════════════════════════════════════════════╪══════════╪═════════╪═══════════╪═══════════╣
║ linked.Lru                                                              │ 33.33 %  │ 624,106 │ 1,248,216 │ 1,872,322 ║
╟─────────────────────────────────────────────────────────────────────────┼──────────┼─────────┼───────────┼───────────╢
║ opt.Clairvoyant                                                         │ 33.33 %  │ 624,108 │ 1,248,214 │ 1,872,322 ║
╟─────────────────────────────────────────────────────────────────────────┼──────────┼─────────┼───────────┼───────────╢
║ sketch.HillClimberWindowTinyLfu (adam 1% -> 80%)                        │ 33.33 %  │ 624,070 │ 1,248,252 │ 1,872,322 ║
╟─────────────────────────────────────────────────────────────────────────┼──────────┼─────────┼───────────┼───────────╢
║ sketch.HillClimberWindowTinyLfu (amsgrad 1% -> 80%)                     │ 33.33 %  │ 624,071 │ 1,248,251 │ 1,872,322 ║
╟─────────────────────────────────────────────────────────────────────────┼──────────┼─────────┼───────────┼───────────╢
║ sketch.HillClimberWindowTinyLfu (stochastic_gradient_descent 1% -> 80%) │ 33.16 %  │ 620,863 │ 1,251,459 │ 1,872,322 ║
╟─────────────────────────────────────────────────────────────────────────┼──────────┼─────────┼───────────┼───────────╢
║ sketch.WindowTinyLfu (1%)                                               │ 0.60 %   │ 11,169  │ 1,861,153 │ 1,872,322 ║
╚═════════════════════════════════════════════════════════════════════════╧══════════╧═════════╧═══════════╧═══════════╝

@ben-manes
Copy link
Owner

The adaptive improvements are now merged. I think this should solve your problem, but I don't have a weight-based recency trace to verify with. There could still be some fine tuning around weight eviction from the admission window. I'm closing due to lack of response, but you are welcome to reopen anytime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants