Implement persistent caching in SourceMgr #431
I can work on this issue. I've started sketching out a Bolt …

Looking at …
AMAZING!! 🎉 🎉 🎉 🎊 - this is something that would make a huge difference for users, and i've got nothing even close to the time required to work on it.
Unfortunately, while that's adequate for a written file manifest/lock's purposes (by design), it likely won't be for gps' internal purposes. There are two issues I can think of off the top of my head:
And that's just with the versions. I generally tried not to allow such gotchas, but it wasn't a design goal that I explicitly kept track of, so I can't say with certainty that there aren't more such issues. Notwithstanding all that, I think it's still the right call to prototype this with strings, where it works. Layering in an encoding system later is not difficult. And, more generally, I'd strongly prefer to keep this proceeding in smaller chunks with smaller PRs (even if they have no immediate effect), rather than building up something huge and unreviewable.

@sdboyer Have you given any thought to how the cache timeout should be configured/specified? My working branch is currently using a …
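One plausible instance of the string-encoding gotcha mentioned above is that typed versions collapse ambiguously when flattened to bare strings: a branch named "v1.0.0" and a semver tag "v1.0.0" become the same key. A minimal sketch of the "layered-in encoding" idea, using a type-tag prefix; the names here are illustrative, not gps' actual API:

```go
package main

import "fmt"

// versionType tags the flavor of a version so that, e.g., a branch named
// "v1.0.0" does not collide with a semver tag "v1.0.0" in the cache.
// (These names are invented for this sketch, not gps' real types.)
type versionType byte

const (
	branchVersion versionType = 'b' // a VCS branch name
	semverVersion versionType = 's' // a semver-parseable tag
	plainVersion  versionType = 'p' // a non-semver tag
)

// encode produces a cache key that preserves the version's type.
func encode(t versionType, v string) string {
	return string(t) + ":" + v
}

func main() {
	branch := encode(branchVersion, "v1.0.0")
	tag := encode(semverVersion, "v1.0.0")
	fmt.Println(branch, tag, branch != tag) // distinct keys despite equal strings
}
```

With a scheme like this, decoding is a single prefix check, so layering it in after a string-based prototype stays cheap.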
@jmank88 i'm avoiding thinking about TTLs thus far, as the information i'd like to see in the cache for the first pass should, i think, all be permanently cacheable by revision. only stuff like the version list lookups would be...oh wait, ugh. i guess it's unavoidable that we have to do something with that now. for now, let's just do something quick and dirty, like a 60s cache. initially, at least, i'd rather we not expose a choice around this to users. is it possible that we get some incremental PRs in, and do some branching by abstraction/feature flags, even if they're internal ones?
I thought users would at least need/want some way of running w/o the cache, or to force a clean cache, but one minute is a lot shorter than I was imagining.
I've been trying to get incremental PRs in as fast as I can. Currently blocked by #1024 and #1025.
I'm seeing a speedup in every area of 'solver wall time', except for ….

@sdboyer Can …
@jmank88 it's not strictly necessary to reach out and ping - we should be able to operate solely on information already in the cache (or none at all). this is basically a deficiency in the design of the FSM governing source interactions, but i opted for it initially just to simplify the model. there are two major aspects of that simplification.

first, right now, the only way to know which of the possible … but, this messes with a bunch of other things. for one, the … i think there's also a number of other assumptions woven into there that i can't even conjure in the abstract - it'd take me digging in to really see them, and i'd no doubt discover more along the way. the assurance of there being exactly one object in memory responsible for managing each unique location on disk seems like a major one.

the second big simplification is that always requiring us to hit the net kept us in the comparatively-simpler-to-think-about world where we were really only ever operating on the most current version of upstream's reality. purely local information mostly just doesn't matter in dep today. this allowed me to defer on a bunch of trickier modeling questions that, while quite crucial in the long run (if we can't trust local reality, then dep is still crippled by left-pad events, even when code is committed to …), …. prior to the introduction of persistent caching, the benefits of relying on local realities were also far fewer, and didn't outweigh the costs. now that you're working on persistent caching, though, the balance there has changed, and it's worth looking at this seriously.

side note: another benefit to revisiting the FSM and not requiring network activity (or even a local repo) will be that we can teach …
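To make the FSM discussion concrete, here is a hedged, illustrative-only sketch of what modeling a source's local state explicitly might look like, with the network touched only when a query actually demands it. The state names and transition rule are invented for this sketch and do not match gps' real code:

```go
package main

import "fmt"

// sourceState models how much of a source exists locally. These states are
// invented for illustration; gps' actual source FSM differs.
type sourceState int

const (
	sourceNonexistent sourceState = iota // no local data at all
	sourceCached                         // persistent cache data only, no repo
	sourceHasRepo                        // local clone exists, possibly stale
	sourceSynced                         // local clone freshly synced upstream
)

// needsNetwork reports whether answering a query from the given state would
// require network activity. Only a fully absent source forces it; a cached or
// cloned source can answer most queries from purely local information, unless
// the caller explicitly demands upstream freshness (e.g. a version-list TTL
// expiry).
func needsNetwork(s sourceState, wantFresh bool) bool {
	switch {
	case s == sourceNonexistent:
		return true
	case wantFresh && s != sourceSynced:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(needsNetwork(sourceCached, false)) // serve from cache, no net
	fmt.Println(needsNetwork(sourceCached, true))  // freshness forces a sync
}
```

The point of an explicit table like this is that "operate solely on information already in the cache" becomes a checkable property of each transition, rather than an implicit consequence of always hitting the net first.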
I did some preliminary refactoring and I'm seeing much better results now. Here are some example solver times (with the insignificant items omitted) from running ….

Without cache:
…

With cache:
…
I still haven't fully solved the case where the mapped URL changes. I'll try to isolate some of the other changes into separate PRs first.
a bit weird that there'd be so much of a drop in ….

tbh, i'm a bit surprised we're not seeing numbers in the <100ms, or even <10ms range. not a huge concern for right now, though - we're still in the run/right phases of "make it run, make it right, make it fast" within the scope of this particular feature. for now, i'll take <1s and be happy 😄
Indeed, after more tinkering I'm seeing total runtime around …
huh. redundant network calls in the currently-active logic, or in part of the new layer you're introducing? (what did you change when tinkering?)
I'm working on isolating it now to find out for sure, but I have only been messing with the gateways, not the cache itself. |
that's (tentatively) great news. I would love nothing more than to have written some bugs that result in unnecessary net requests - that'd mean across-the-board speedups for everybody when you find them! the huge gain here will be obviating the need for any network calls on git projects in Gopkg.lock and already in the cachedir by deferring the initial network call until sometime after the …
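Deferring the initial network call fits a lazy-sync pattern: the source gateway answers from the persistent cache when it can, and only syncs upstream (at most once) when an operation genuinely needs the network. A sketch under those assumptions; `gateway`, `requireSync`, and `listVersions` are illustrative names, not dep's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// gateway sketches a source whose upstream sync is deferred until some
// operation actually requires the network, instead of happening eagerly
// at construction time.
type gateway struct {
	syncOnce sync.Once
	synced   bool
}

// requireSync performs the (expensive) upstream sync at most once, no matter
// how many cache misses trigger it.
func (g *gateway) requireSync() {
	g.syncOnce.Do(func() {
		// real code would clone or fetch the upstream repo here
		g.synced = true
	})
}

// listVersions serves from the persistent cache when an entry exists; only a
// miss forces the deferred network sync.
func (g *gateway) listVersions(cached []string) []string {
	if cached != nil {
		return cached // cache hit: no network activity at all
	}
	g.requireSync() // cache miss: now we must touch the network
	return nil
}

func main() {
	g := &gateway{}
	g.listVersions([]string{"v1.0.0"})
	fmt.Println(g.synced) // cache hit never synced
	g.listVersions(nil)
	fmt.Println(g.synced) // miss forced exactly one sync
}
```

For projects already pinned in Gopkg.lock and present in the cachedir, every query takes the cache-hit path, so the network is never touched at all.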
From @sdboyer on December 30, 2016 11:46
With additional testing being done as part of the new tool, it's increasingly clear that we really, really need to get persistent, disk-backed caching into place. There are four basic things we need to cache:
- …
- the `go-get` metadata query

The biggest immediate question is how to actually store it on disk. For the first three, at least, discrete files would probably work. However, I'm inclined to store everything together using bolt. That'll cover the last case more easily, especially because it is likely to require prefix queries; it'll also probably make it a bit cleaner to store TTLs.
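The prefix-query point is what makes bolt attractive: bolt keeps keys in byte-sorted order, so a prefix query is a cursor `Seek(prefix)` followed by iteration while the key still carries the prefix. The stdlib-only sketch below mimics that cursor behavior over a sorted slice, with a hypothetical `<import-path>\x00<field>` key layout for the go-get metadata (the layout is an assumption, not dep's actual scheme):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixScan mimics a bolt cursor's prefix query: position at the first key
// >= prefix (like Cursor.Seek), then collect keys in order until the prefix
// no longer matches. Keys are laid out as "<import-path>\x00<field>", so all
// metadata for one import path shares a prefix.
func prefixScan(keys []string, prefix string) []string {
	sort.Strings(keys) // bolt keeps keys byte-sorted; emulate that here
	i := sort.SearchStrings(keys, prefix)
	var out []string
	for ; i < len(keys) && strings.HasPrefix(keys[i], prefix); i++ {
		out = append(out, keys[i])
	}
	return out
}

func main() {
	keys := []string{
		"github.com/foo/bar\x00root",
		"github.com/foo/bar\x00vcs",
		"github.com/foo/baz\x00root",
	}
	// fetch all cached go-get metadata fields for one import path
	fmt.Println(prefixScan(keys, "github.com/foo/bar\x00"))
}
```

Discrete files would force either one file per key or ad hoc indexing to get the same "everything under this import path" lookup, which is the case that tips the balance toward a single bolt database.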
(supersedes #8)
Copied from original issue: sdboyer/gps#130