
chunked: use mmap to load cache files #1857

Merged

Conversation

giuseppe
Member

giuseppe commented Mar 5, 2024

Reduce memory usage for the process by no longer loading the layers' cache files entirely into memory.

The memory-mapped files can be shared among multiple instances of Podman and do not need to be fully loaded into memory.

openshift-ci bot added the approved label Mar 5, 2024
giuseppe force-pushed the improve-chunked-cache-loading branch 3 times, most recently from 7460370 to 36da862 on March 6, 2024 08:31
@rhatdan
Member

rhatdan commented Mar 6, 2024

Is this something that will need to be backported to Podman 5.0, or can it wait for 5.1?

@giuseppe
Member Author

giuseppe commented Mar 6, 2024

It is fine for 5.1.

giuseppe force-pushed the improve-chunked-cache-loading branch 2 times, most recently from 07be00f to ec1aba0 on March 6, 2024 13:53
@rhatdan
Member

rhatdan commented Mar 6, 2024

LGTM
@mtrmac PTAL

Collaborator

mtrmac left a comment

For now just a note on the public API, I didn’t read the actual cache part.

It’s possible that these things are not much of a concern for the cache; I didn’t check.

giuseppe force-pushed the improve-chunked-cache-loading branch 2 times, most recently from 43b4236 to 99049ca on March 7, 2024 08:19
@giuseppe
Member Author

giuseppe commented Mar 7, 2024

I've pushed a new version that doesn't require any API change; it tries to cast the output from LayerBigData to an *os.File.

@mtrmac
Collaborator

mtrmac commented Mar 7, 2024

Oh, that’s clever. Maybe worth a comment in the layers.go implementation, pointing at this assumption, so that it isn’t randomly broken by some wrapping helper.

@giuseppe
Member Author

giuseppe commented Mar 8, 2024

@kolyshkin PTAL

giuseppe force-pushed the improve-chunked-cache-loading branch from 99049ca to fde18e6 on March 12, 2024 08:23
@giuseppe
Member Author

rebased, please take a look

@kolyshkin
Contributor

I'm still looking at it, hope to finish tomorrow.

Contributor

kolyshkin left a comment

I will continue tomorrow. For now, I've split this into two patches for easier review:

  1. rename manifest to cacheFile
  2. the rest of your changes

I might split it further as there are some code cleanups in here that may benefit from being in a separate commit.


var lcd chunkedLayerData
return buf, buf, err
Contributor

nit: for readability, I'd write this as return buf, buf, nil since err is always nil here.

giuseppe force-pushed the improve-chunked-cache-loading branch 2 times, most recently from f911427 to d2801ec on March 14, 2024 09:42
Contributor

kolyshkin left a comment

Added some more comments.

cacheFile, err := readCacheFileFromReader(bigData)
if err == nil {
c.addLayer(r.ID, cacheFile)
// the layer is present in the store and it is already loaded. Attempt to use
Contributor

Suggested change
// the layer is present in the store and it is already loaded. Attempt to use
// The layer is present in the store and it is already loaded. Attempt to

Member Author

fixed

giuseppe force-pushed the improve-chunked-cache-loading branch from d2801ec to 5259c9b on March 15, 2024 11:56
// The layer is present in the store and it is already loaded. Attempt to
// re-use it if mmap'ed.
if l, found := loadedLayers[r.ID]; found {
if l.mmapBuffer != nil {
Contributor

Does that mean if a layer is loaded via io.ReadAll (rather than Mmap), we are re-reading it?

Member Author

This was meant to be an optimization used only when the file is first created, since we already have its content in memory and want to avoid reloading it. I've fixed it so the cache is not reloaded when the file was initially loaded using io.ReadAll.

Applied the following fixup patch and pushed the new version:

diff --git a/pkg/chunked/cache_linux.go b/pkg/chunked/cache_linux.go
index 386bda515..01bfc8d92 100644
--- a/pkg/chunked/cache_linux.go
+++ b/pkg/chunked/cache_linux.go
@@ -46,6 +46,12 @@ type layer struct {
 	// mmapBuffer is nil when the cache file is fully loaded in memory.
 	// Otherwise it points to a mmap'ed buffer that is referenced by cacheFile.vdata.
 	mmapBuffer []byte
+
+	// reloadWithMmap is set when the current process generates the cache file,
+	// and cacheFile reuses the memory buffer used by the generation function.
+	// Next time the layer cache is used, attempt to reload the file using
+	// mmap.
+	reloadWithMmap bool
 }
 
 type layersCache struct {
@@ -201,7 +207,12 @@ func (c *layersCache) createCacheFileFromTOC(layerID string) (*layer, error) {
 	if err != nil {
 		return nil, err
 	}
-	return c.createLayer(layerID, cacheFile, nil)
+	l, err := c.createLayer(layerID, cacheFile, nil)
+	if err != nil {
+		return nil, err
+	}
+	l.reloadWithMmap = true
+	return l, nil
 }
 
 func (c *layersCache) load() error {
@@ -222,8 +233,8 @@ func (c *layersCache) load() error {
 		// The layer is present in the store and it is already loaded.  Attempt to
 		// re-use it if mmap'ed.
 		if l, found := loadedLayers[r.ID]; found {
-			if l.mmapBuffer != nil {
-				// It is loaded through mmap.  Re-use it.
+			// If the layer is not marked for re-load, move it to newLayers.
+			if !l.reloadWithMmap {
 				delete(loadedLayers, r.ID)
 				newLayers = append(newLayers, l)
 				continue

giuseppe force-pushed the improve-chunked-cache-loading branch from 5259c9b to af646a8 on March 15, 2024 20:59
@giuseppe
Member Author

@kolyshkin are you fine with the last version?

giuseppe force-pushed the improve-chunked-cache-loading branch from af646a8 to 2a4e4b3 on March 20, 2024 16:47
@giuseppe
Member Author

rebased

Collaborator

mtrmac left a comment

Please add a comment around layerStore.BigData noting that pkg/chunked relies on the returned type being exactly *os.File.

LGTM otherwise.

Reduce memory usage for the process by no longer loading the layers' cache files entirely into memory.

The memory-mapped files can be shared among multiple instances of Podman and do not need to be fully loaded into memory.

Signed-off-by: Giuseppe Scrivano <[email protected]>
giuseppe force-pushed the improve-chunked-cache-loading branch from 2a4e4b3 to 080dbaf on March 20, 2024 19:52
@giuseppe
Member Author

@mtrmac addressed your comments and pushed a new version

Collaborator

mtrmac left a comment

Thanks! LGTM. Up to @kolyshkin now.

@giuseppe
Member Author

@kolyshkin @rhatdan PTAL

Contributor

kolyshkin left a comment

lgtm

Contributor

openshift-ci bot commented Mar 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, kolyshkin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@giuseppe
Member Author

/lgtm

Contributor

openshift-ci bot commented Mar 27, 2024

@giuseppe: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

rhatdan added the lgtm label Mar 27, 2024
@rhatdan
Member

rhatdan commented Mar 27, 2024

/lgtm

openshift-merge-bot merged commit 22f7c28 into containers:main Mar 27, 2024
18 checks passed