lazily decode cache files for checking invalidation #7516

gbaz · 2021-08-06T05:06:13Z

This yields a significant (15% ?) speedup on rebuilding build plans for projects with lots of individual cabal packages. (As in https://github.com/peterbecich/cabal-resolver-issue which is a repro for #7466). In such cases the cache files can grow quite large (up to approx 30M).

The way the old filemonitor / caching system worked, it would first read and fully parse each cache file, and then check it to see if it was invalidated (an operation that only involved the header portion at the beginning of the file). When files are small, the full parse isn't noticeable. But as files grow large, this parse can become quite expensive.

This changes the deserialization to go in two steps -- first parse the header info, and check for invalidation. Then, only if the cache is valid, proceed to deserialize the remaining (potentially large) serialized value. Testing reveals that this removes deserialization as a noticeable cost center immediately -- dropping it to about 3% of total time.

(note: the writing out of the cache files is still about 10% of total time, but that seems fairly unavoidable)
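For readers who want to see the two-step idea in isolation, here is a minimal sketch. It is not the code from this PR: the `Header` type, `writeCache`, `splitHeader`, and `readCache` are hypothetical names, and the header is reduced to a single `Word64` fingerprint. The only library call it relies on is `runGetOrFail` from `Data.Binary.Get`, which returns the unconsumed input alongside the parsed value.

```haskell
module Main (main) where

import Data.Binary (Binary, encode, get, put)
import Data.Binary.Get (runGetOrFail)
import Data.Binary.Put (runPut)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word64)

-- Hypothetical header: a single fingerprint standing in for the
-- invalidation info stored at the start of a cache file.
type Header = Word64

-- Serialize the header first, then the (potentially large) payload.
writeCache :: Binary a => Header -> a -> BL.ByteString
writeCache hdr payload = BL.append (runPut (put hdr)) (encode payload)

-- Step 1: parse only the header. The rest of the lazy ByteString is
-- handed back undecoded.
splitHeader :: BL.ByteString -> Either String (Header, BL.ByteString)
splitHeader bytes =
  case runGetOrFail get bytes of
    Left (_, _, err)     -> Left err
    Right (rest, _, hdr) -> Right (hdr, rest)

-- Step 2: decode the payload only if the header says the cache is valid.
readCache :: Binary a => Header -> BL.ByteString -> Either String a
readCache expected bytes = do
  (hdr, rest) <- splitHeader bytes
  if hdr /= expected
    then Left "cache invalidated"
    else case runGetOrFail get rest of
      Left (_, _, err) -> Left err
      Right (_, _, v)  -> Right v

main :: IO ()
main = do
  let bytes = writeCache 42 ([1, 2, 3] :: [Int])
  print (readCache 42 bytes :: Either String [Int]) -- Right [1,2,3]
  print (readCache 7 bytes :: Either String [Int])  -- Left "cache invalidated"
```

In the invalidated case the payload bytes are never parsed, which is where the saving on large cache files would come from.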

Cabal/src/Distribution/Utils/Structured.hs

fgaz

Thanks!

gbaz · 2021-08-07T04:23:50Z

The failure is because the version of binary that GHC 7.6.3 ships with doesn't export runGetOrFail or any other incremental method that I can see :-/

cf: https://downloads.haskell.org/~ghc/7.6.3/docs/html/libraries/binary-0.5.1.1/Data-Binary-Get.html

Any suggestions on the best way to deal with this?

gbaz · 2021-08-07T04:31:07Z

Oh, hrm. It has an incremental function runGetState but just no functions that don't throw on error... I suppose I can try to make do with that.

fgaz · 2021-08-08T06:15:31Z

You could move structuredDecodeTriple to cabal-install, which does not support ghc 7.6

gbaz · 2021-08-08T18:42:40Z

Great suggestion @fgaz, done!

Mikolaj · 2021-08-09T07:30:18Z

I wonder if writing out of the cache files can be done on a separate (low priority) thread to speed things up.

Mikolaj

I've bitched about whitespace, but otherwise it looks great.

cabal-install/src/Distribution/Client/FileMonitor.hs

gbaz · 2021-08-09T17:25:55Z

> I wonder if writing out of the cache files can be done on a separate (low priority) thread to speed things up.

I thought about that too. We don't really use separate threads for IO ops elsewhere in cabal afaik, so I didn't know about setting a precedent. I'd think we'd need some slightly careful architecture to make sure that reading and writing didn't step on one another, etc., so not sure about the tradeoff between complexity and a constant win in some large cases.
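To make the tradeoff concrete, here is a rough sketch of what a background cache write could look like. It is purely illustrative and not something cabal does: `writeCacheAsync` and `readCacheLocked` are hypothetical names, a single `MVar ()` stands in for whatever locking a real design would need, and GHC threads carry no priority, so "low priority" is not modelled.

```haskell
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (MVar, newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Exception (finally)
import System.Directory (getTemporaryDirectory, removeFile)
import System.FilePath ((</>))

-- Hypothetical: write the cache on a separate thread while holding a lock.
-- The returned MVar is filled once the write has finished (or failed).
writeCacheAsync :: MVar () -> FilePath -> String -> IO (MVar ())
writeCacheAsync lock path contents = do
  done <- newEmptyMVar
  _ <- forkIO $
    withMVar lock (\_ -> writeFile path contents) `finally` putMVar done ()
  pure done

-- Readers take the same lock so they never see a half-written file.
readCacheLocked :: MVar () -> FilePath -> IO String
readCacheLocked lock path = withMVar lock $ \_ -> do
  s <- readFile path
  length s `seq` pure s -- force the lazy read before releasing the lock

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  let path = tmp </> "cache-sketch.bin"
  lock <- newMVar ()
  done <- writeCacheAsync lock path "plan"
  -- Other work could overlap with the write here.
  takeMVar done -- the program must not exit before the write completes
  s <- readCacheLocked lock path
  putStrLn s
  removeFile path
```

Even in this toy version, the caller has to keep the `done` handle around and wait on it before exiting, which is the kind of extra bookkeeping being weighed here.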

Mikolaj · 2021-08-09T17:38:32Z

Yes, also, a user trying to exit before writing is finished, while otherwise cabal seems to have done its job and is just hanging, would deserve a warning and a confirmation, which again adds to the complexity of the solution.

If cabal always keeps its caches in memory and so never reads caches that it writes in the same session, that would simplify things. OTOH, this makes cabal use more memory than it would otherwise.

BTW, the CI broke on some timeout, so I'd ignore it. (edit: and merge)

emilypi · 2021-08-12T15:21:13Z

@Mergifyio backport 3.6

mergify · 2021-08-12T15:21:56Z

Command backport 3.6: success

Backports have been created

#7537 lazily decode cache files for checking invalidation (backport #7516) has been created for branch 3.6

…7537)

* lazily decode cache files for checking invalidation (cherry picked from commit 3dcfe27)
* Update Structured.hs (cherry picked from commit 5a4290c)
* move structuredDecodeTriple to cabal-install (cherry picked from commit c1d5d4f)
* fix type signatures (cherry picked from commit 7e30fd9)

# Conflicts:
# cabal-install/src/Distribution/Client/FileMonitor.hs

* fix whitespace (cherry picked from commit f8bdd7f)
* Update FileMonitor.hs

Co-authored-by: Gershom Bazerman <[email protected]>
Co-authored-by: gbaz <[email protected]>

lazily decode cache files for checking invalidation

3dcfe27

fgaz reviewed Aug 6, 2021

Cabal/src/Distribution/Utils/Structured.hs (outdated, resolved)

Update Structured.hs

5a4290c

fgaz approved these changes Aug 6, 2021

gbaz added the type: performance label Aug 7, 2021

move structuredDecodeTriple to cabal-install

c1d5d4f

gbaz and others added 2 commits August 8, 2021 15:03

Merge branch 'master' into gb/speed-cache-reading

3015bbd

fix type signatures

7e30fd9

Mikolaj self-requested a review August 9, 2021 06:46

Mikolaj approved these changes Aug 9, 2021

fix whitespace

f8bdd7f

gbaz merged commit 2bd0588 into master Aug 9, 2021

gbaz deleted the gb/speed-cache-reading branch August 9, 2021 22:20

mergify bot mentioned this pull request Aug 12, 2021

lazily decode cache files for checking invalidation (backport #7516) #7537

Merged

This was referenced Aug 30, 2021

Release prep for Cabal-3.6.1.0 and cabal-install-3.6.0.0 #7601

Merged

add Cabal 3.6.1.0 entry changelog #7600

Merged

gbaz mentioned this pull request Sep 27, 2021

Cabal takes 3 minutes to resolve dependencies #7466

Closed
