Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve LRU cache behavior #4210

Closed
wants to merge 6 commits into from
Closed

Conversation

chrisvoll
Copy link
Contributor

@chrisvoll chrisvoll commented Feb 4, 2017

Background

A few days ago, we noticed some performance issues with raster tiles on our maps. We were seeing some cases where tiles were reloading when they should have already been cached. For instance, in this jsbin, a ring of tiles reloads pretty consistently when zooming in and then back out:

image

Problem

We dove into the code and noticed that the LRU tile cache was discarding tiles when it wasn't supposed to. Based on the commit history, it looks like it was once an MRU cache, and still has some of the behavior of an MRU cache, where get operations delete the key and its data.

Normally this wouldn't have any significant consequences, but the cache is accessed more than once: here in findLoadedParent, and here in addTile. The first call in findLoadedParent purges the tile from the cache, so the call in addTile came up empty-handed and led to the tile being reloaded.

Solution

My PR changes the behavior of the cache to function like an LRU cache: get operations will now move the key to the top of the order stack instead of deleting it.

This works pretty well, but it uncovers an unrelated issue that gives us a beautiful artistic impression of the map we want: (my incorrect cache changes in earlier commits contributed to this, but my analysis below about the cache size still seems accurate)

image

What I've found is that the LRU cache size isn't always set to the correct value for raster tiles. In SourceCache, updateCacheSize is sometimes given misleading data. In the jsbin above, for instance, transform has tileSize: 512 (the dimensions of the retina-sized image tiles), but they're rendered at 256px. It may think that there is a 4x4 grid of tiles (16 total), for example, when it's actually 8x8 (64 total). Then you get a cache size of 80 instead of 320.

The cache runs into its limit pretty quickly, and starts discarding tiles that are still present in SourceCache._tiles. Their textures are then recycled so newer tiles can use them. The textures are overwritten, but the older SourceCache._tiles tiles still reference them, leading to the screenshot above.

My PR doesn't fix the underlying issue (the incorrect cache size—I'm not sure how to approach this without potentially breaking something else), but adds a change that prevents textures from being recycled if a tile is purged from the cache but remains in SourceCache._tiles (I removed this in subsequent changes). Now the tiles are happily cached and zooming is smoother.

Launch Checklist

This was my first big dive into the mapbox-gl-js code, so hopefully everything I mentioned here is accurate and makes sense! I'm more than happy to answer any questions or make any changes as needed.

  • briefly describe the changes in this PR
  • write tests for all new functionality
  • document any changes to public APIs - no API changes
  • post benchmark scores
  • manually test the debug page

Benchmarks

benchmark master 7b2b879 cv-lru-cache 545a88d
map-load 158 ms 104 ms
style-load 149 ms 142 ms
buffer 894 ms 923 ms
fps 60 fps 60 fps
frame-duration 3.6 ms, 0% > 16ms 3.6 ms, 0% > 16ms
query-point 0.90 ms 1.00 ms
query-box 64.17 ms 66.08 ms
geojson-setdata-small 8 ms 7 ms
geojson-setdata-large 104 ms 108 ms

@chrisvoll
Copy link
Contributor Author

Actually I've thought about this a bit more and I think my line of reasoning is partially flawed. The current cache behavior is correct, considering its intent is to only cache tiles that are no longer rendered on the map. The part that's now incorrectly missing from my PR is removing tiles from the cache once they're rendered on the map again.

In that case, an alternate proposal is to ensure that the get call in findLoadedParent doesn't purge tiles from the cache.

I'll update this PR today :)

@chrisvoll
Copy link
Contributor Author

I've updated this PR with the changes I mentioned!

My thinking behind this code is that the LRU cache's API should be explicit and predictable. Having the get operation delete data runs contrary to this, IMO, so I added an explicit remove call instead, with the option to not fire the onRemove callback.

@mourner
Copy link
Member

mourner commented Feb 8, 2017

@chrisvoll thanks a lot for the PR Chris! We'll check this out.

@chrisvoll
Copy link
Contributor Author

@mourner Just wanted to quickly follow up on this — I'm happy to help out in any way. Thanks! :)

@lucaswoj
Copy link
Contributor

@chrisvoll I'm nervous about merging this PR. The tile retention logic in SourceCache is complicated and delicate. We very often introduce more bugs by trying to fix another. We are overdue for a rewrite & simplification of SourceCache #2291.

@chrisvoll
Copy link
Contributor Author

@lucaswoj Thanks for the response! I definitely understand your concern, and want to be mindful of any potential side effects my change would have.

What if I simplified this PR to just ensure that this line doesn't remove a tile from the cache?

@lucaswoj
Copy link
Contributor

That is a good compromise @chrisvoll. I would be much more comfortable accepting that PR 😄

@chrisvoll
Copy link
Contributor Author

thanks for the feedback! I'll close this in favor of #4311

@chrisvoll chrisvoll closed this Feb 21, 2017
@chrisvoll chrisvoll deleted the cv-lru-cache branch February 21, 2017 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants