-
Notifications
You must be signed in to change notification settings - Fork 43
[RFC] qemu: set maxmem to host memory size #336
Conversation
CI failed because I didn't vendor Can you guys please show me how to vendor a new go project using Or should I write a new function (look up /proc/meminfo) to get host memory instead import a new pkg? |
Hi @wcwxyz, thanks again for this new patch :) |
I've updated the patch. I noticed coverage drop down a bit. That's because some error case is not tested (but should be ok). |
qemu.go
Outdated
return ciaoQemu.Memory{}, fmt.Errorf("Unable to read memory info: %s", err) | ||
} | ||
// add 1G memory space for nvdimm device (vm guest image) | ||
memMax := fmt.Sprintf("%dM", int(float64(hostMem)/1024/1024)+1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the spirit of @jodh-intel , it might be nice if the variable names reflected the units they were holding, maybe like memMaxMegs
or similar. not a blocker :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emm... the code itself is clear enough. Since memMax
is a string, I think I'm gonna change hostMem
to hostMemBytes
to be clearer.
lgtm |
qemu.go
Outdated
return memory, nil | ||
} | ||
|
||
func getHostMemorySize() (uint64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer this was called getHostMemorySizeBytes()
to avoid any confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
qemu.go
Outdated
} | ||
|
||
func getHostMemorySize() (uint64, error) { | ||
f, err := os.Open("/proc/meminfo") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function really need a test so I'd either pass this filename as an argument, or use a global that the tests can modify. For an almost identical example (feel free to copy):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
qemu.go
Outdated
return 0, err | ||
} | ||
|
||
return 0, fmt.Errorf("unable to get MemTotal in /proc/meminfo") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this message can reference the variable for /proc/meminfo
.
qemu_test.go
Outdated
@@ -395,10 +395,16 @@ func TestQemuSetMemoryResources(t *testing.T) { | |||
|
|||
q := &qemu{} | |||
|
|||
hostMem, err := getHostMemorySize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/hostMem/hostMemBytes/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Thanks.
qemu_test.go
Outdated
if err != nil { | ||
t.Fatal(err) | ||
} | ||
memMax := fmt.Sprintf("%dM", int(float64(hostMem)/1024/1024)+1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a paranoid check to ensure that hostMemBytes > 0
before the division. It would also be clearer potentially to define variables for the division + addition with comments explaining why those values were chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you're right. Will do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but I'd like @grahamwhaley to review as well.
qemu_test.go
Outdated
file := filepath.Join(dir, "meminfo") | ||
// file doesn't exist | ||
if _, err := getHostMemorySizeBytes(file); err == nil { | ||
t.Fatal() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You could remove the comment above and make this:
t.Fatalf("expected failure as file %q does not exist", file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, sure.
qemu.go
Outdated
@@ -106,6 +108,11 @@ const ( | |||
maxDevIDSize = 31 | |||
) | |||
|
|||
const ( | |||
// NVDIMM device needs memory space (1024MB is suggested by @grahamwhaley) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for the comment but please can you add a link to an issue/PR rather than mentioning individuals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something like this? clearcontainers/runtime#380 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the full url, github seems automatically format that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks correct to me :-) (yes, github nicely hides the full URL where it can :-) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @jodh-intel, you should not mention @grahamwhaley directly into the source code. And please refer only the isssue, not the very specific comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
@jodh-intel thanks for review. I'll update the code after @grahamwhaley's review. |
I take @grahamwhaley is ok with the rest of the patch. |
I'll review tomorrow, please don't merge until I have reviewed this one. |
adding DNM label until @sboeuf gets a chance to review etc. |
qemu.go
Outdated
@@ -106,6 +108,11 @@ const ( | |||
maxDevIDSize = 31 | |||
) | |||
|
|||
const ( | |||
// NVDIMM device needs memory space (1024MB is suggested by @grahamwhaley) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @jodh-intel, you should not mention @grahamwhaley directly into the source code. And please refer only the isssue, not the very specific comment.
qemu.go
Outdated
const ( | ||
// NVDIMM device needs memory space (1024MB is suggested by @grahamwhaley) | ||
// See https://github.com/clearcontainers/runtime/issues/380#issuecomment-321216520 | ||
defaultNvdimmMemory = 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call this constant maxMemoryOffset
because that's basically what it is, and we don't need to specify the reason in the variable name. As long as we have your comment, a generic name sounds better to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
qemu.go
Outdated
@@ -468,13 +476,20 @@ func (q *qemu) setCPUResources(podConfig PodConfig) ciaoQemu.SMP { | |||
return smp | |||
} | |||
|
|||
func (q *qemu) setMemoryResources(podConfig PodConfig) ciaoQemu.Memory { | |||
func (q *qemu) setMemoryResources(podConfig PodConfig) (ciaoQemu.Memory, error) { | |||
hostMemBytes, err := getHostMemorySizeBytes("/proc/meminfo") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/proc/meminfo
is a very well known path for Linux. Please define this as a const and don't provide it through this call, instead use it directly from the function getHostMemorySizeBytes()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
qemu.go
Outdated
} | ||
|
||
// add 1G memory space for nvdimm device (vm guest image) | ||
memMax := fmt.Sprintf("%dM", int(float64(hostMemBytes)/1024/1024)+defaultNvdimmMemory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong to do that, isn't it ? I mean you're clearly asking to run a VM with a memory max greater than what the system has to offer ? Or am I reading the code wrong ?
Instead you should check that the system can allow at least the expected memory to be set and this would be a nice error to throw. IMO, we don't want to continue if our host cannot provide the amount of memory requested, because the user could end up in not being able to run the workload he expects.
And I think that you should validate that the expected amount of memory + the offset can fit in the host memory. But please don't consider this case as throwing an error, just log a warning, because the offset might be too much in some cases that could end up functioning very well. Indeed, in case you have the NVDIMM using 200MiB and you ask 1.5GiB on a 2GiB host, you really need only 1.7 and this will work great. But if we check with our offset of 1GiB, we re gonna fail, that's why we don't want to generate an error in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. maxmem is used to define upper bounds for memory hotplug. You request memory via qemu -m MEMORY
.
Say run qemu -m 2048M,slots=2,maxmem=30000M
, you got 2048M for on your VM and you can hot plug in 2 rom device (pc-dimm) into VM, the total memory can not exceed 30000M. In case your host memory is less than 30000M, hot plug will fail when your ask for memory when host doesn't have available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly what you're saying, in case the host memory is less than 30G, the hotplug will fail. That's why I think we should try to set a maxmem higher than the total RAM available on the host.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I understand what you're trying to say.
You're wrong about "offset". It is not consuming real host memory. Nvdimm needs "offset" to work on qemu. It takes memory space, not the real memory. (I'm not familiar with nvdimm device, but a quick test suggests that).
For maxmem, this is just the memory upper bounds for qemu. Setting maxmem to host_total_memory+offset
is not asking qemu to run a VM with that much memory. Qemu will accept or reject to proceed any memory hotplug requests based on the maxmem. Whether hot plug works or not, depends kernel memory management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For 'offset', allocating that extra space and the extra slot does not take any host memory at startup. Then later we plug the NVDIMM into that space. I think we are probably safe not calculating that into the 'total host memory' equation.
For 'mem_request > host_memory' - whether the host allows or denies a VM (CC-container) to be created if it requests more memory than the host has either available or system total I believe actually comes down to how the host kernel is configured for over-commit. If the system owner has knowingly configured the system to allow VM overcommit then we should probably not error. I think we could be more informative though with one or both of the following options maybe:
- warn if the requested memory is > the current system free memory (the default kernel non-overcommit failure case I believe)
- warn if the requested memory is > the system memory total (because you are likely going to run into performance issues if your container uses all that memory)
For more information on kernel overcommit settings etc. see:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/Documentation/sysctl/vm.txt
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/Documentation/vm/overcommit-accounting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @grahamwhaley for elaborating on memory overcommit.
I don't think we should do anything about user requested memory. Runtime should not log error/warn solely based on overcommit or free memory. When QEMU fails to allocate memory, it errors out. There's enough information for user to see.
Kernel memory management is tricky. I really don't think we should make assumptions about memory is not enough for VM under any condition. Because we can only know memory actually runs out when there's allocation failure at the moment.
@grahamwhaley @sboeuf What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with that theory - we cannot tell when a VM launch will fail or not, as we don't have the intimate knowledge of the system.
Later if we find that the qemu failure is non-obvious from the logs, then we can maybe work on improving how the error/logs are gathered and pushed up through the runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I believe this part of code is ok, if @sboeuf doesn't have any objections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok you convinced me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review @sboeuf @grahamwhaley. I'll post v2.
qemu.go
Outdated
if err != nil { | ||
continue | ||
} | ||
bytes := uint64(size) * 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we converting to bytes ? I mean we're basically multiplying by 1024 here to divide twice from the caller. I would not bother doing this and instead name the function getHostMemorySizeKb
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change to KB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this v2, the code itself looks good to me, but I would like to see those generic functions/variables as part of hypervisor.go
so that they could be reused for other hypervisor implementations in the future.
I am looking forward to the v3 ;)
qemu.go
Outdated
maxMemoryOffset = 1024 | ||
) | ||
|
||
const procMemInfo = "/proc/meminfo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this to hypervisor.go
since it can be reused across different hypervisor implementations.
qemu.go
Outdated
return memory, nil | ||
} | ||
|
||
func getHostMemorySizeKb(memInfoPath string) (uint64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this to hypervisor.go
since it can be reused across different hypervisor implementations.
Setting maxmem to host memory size wouldn't change memory footprint. It only defines the upper bounds for memory hotplug. Currently we set maxmem to mem*1.5, which is not enough for running a small VM (eg. 128M) considering nvdimm device consumes memory address space. This will also allow us to do memory hotplug (maybe in the future). Fixes #337 Signed-off-by: WANG Chao <[email protected]>
Here comes v3. Thanks. |
Setting maxmem to host memory size wouldn't change memory footprint. It
only defines the upper bounds for memory hotplug.
Currently we set maxmem to mem*1.5, which is not enough for running a
small VM (eg. 128M) considering nvdimm device consumes memory address
space.
This will also allow us to do memory hotplug (maybe in the future).
Signed-off-by: WANG Chao [email protected]