Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mmap/reorg #235

Closed
wants to merge 3 commits into from
Closed

Mmap/reorg #235

wants to merge 3 commits into from

Conversation

vireshk
Copy link
Contributor

@vireshk vireshk commented May 29, 2023

This reorganizes some code in the vm-memory crate, which will later be used by:

#231

Copy link
Contributor

@Ablu Ablu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some minor comments. People with deeper code knowledge will have to comment on the architecture :).

src/mmap_unix.rs Show resolved Hide resolved
src/volatile_memory.rs Show resolved Hide resolved
src/volatile_memory.rs Outdated Show resolved Hide resolved
src/mmap_unix.rs Show resolved Hide resolved
src/mmap_unix.rs Show resolved Hide resolved
src/mmap_unix.rs Show resolved Hide resolved
src/mmap_unix.rs Outdated Show resolved Hide resolved
src/mmap_unix.rs Outdated Show resolved Hide resolved
src/mmap_windows.rs Outdated Show resolved Hide resolved
src/mmap_windows.rs Outdated Show resolved Hide resolved
@vireshk vireshk force-pushed the mmap/reorg branch 4 times, most recently from d585fe1 to cebaa62 Compare May 30, 2023 10:35
Ablu
Ablu previously approved these changes May 30, 2023
src/atomic.rs Show resolved Hide resolved
src/atomic.rs Outdated Show resolved Hide resolved
src/guest_memory.rs Outdated Show resolved Hide resolved
src/mmap_windows.rs Show resolved Hide resolved
src/mmap.rs Outdated Show resolved Hide resolved
src/volatile_memory.rs Outdated Show resolved Hide resolved
src/volatile_memory.rs Outdated Show resolved Hide resolved
src/volatile_memory.rs Outdated Show resolved Hide resolved
src/volatile_memory.rs Outdated Show resolved Hide resolved
src/mmap_unix.rs Show resolved Hide resolved
src/volatile_memory.rs Outdated Show resolved Hide resolved
jiangliu
jiangliu previously approved these changes Jun 9, 2023
@JonathanWoollett-Light
Copy link
Contributor

JonathanWoollett-Light commented Jun 12, 2023

On volatile_memory: Remove redundant unsafe blocks while the lint name doesn't spring to my mind immediately I beleive there is a lint for checking for unsafe blocks within unsafe blocks. Are we not currently warning on this? Would it make sense to add this in this commit?

@vireshk
Copy link
Contributor Author

vireshk commented Jun 13, 2023

On volatile_memory: Remove redundant unsafe blocks while the lint name doesn't spring to my mind immediately I beleive there is a lint for checking for unsafe blocks within unsafe blocks. Are we not currently warning on this? Would it make sense to add this in this commit?

@epilys did mention about another lint which can be of use here: rust-lang/rust#71668

I couldn't find a lint which will warn if unsafe block is added to a unsafe fn though.

@epilys
Copy link
Member

epilys commented Jun 13, 2023

@JonathanWoollett-Light @vireshk the lint for nested unsafe blocks is clippy::multiple_unsafe_ops_per_block the lint for missing unsafe blocks in unsafe functions does not seem to be stable yet: unsafe_block_in_unsafe_fn so we can ignore it for now

@epilys
Copy link
Member

epilys commented Jun 13, 2023

I couldn't find a lint which will warn if unsafe block is added to a unsafe fn though.

The opposite is what rustc will require in the feature (unsafe actions in unsafe fn bodies will require unsafe blocks)

I mixed up the RFC name with the lint name, it appears the lint is actually stable since 1.52:

https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html#unsafe-op-in-unsafe-fn

Copy link
Collaborator

@roypat roypat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've gone through all the commits. 075e37c is fine, but for the others I have some high level questions/comments.

  1. 57693f1: As others have already pointed out, there is https://rust-lang.github.io/rfcs/2585-unsafe-block-in-unsafe-fn.html, which is an accepted rust-lang RFC and means that going forwards, not having the unsafe blocks will be at the very least a warning. We should maybe consider enabling this lint ahead of time for this crate, but that's out of scope here. Having the unsafe blocks even inside unsafe functions is quite valuable imo, because it immediately tells you what parts of a function is actually performing unsafe operations, and also forces you to accurately document the safety invariants (by means of #![deny(clippy::undocumented_unsafe_blocks)]). In that sense, I don't think this commit is a change we want to do.
  2. 4d73f17: The call stack of these methods is starting to be very deep, which makes understanding the code pretty complex (this is already a big problem in this crate, but I'd like to not make it worse). In this PR all the new methods have exactly one call-site. I think the context for these changes is that in the follow up, you'll want to call these from different cfg'd contexts. However, as Andreea noted on that PR, you can mark individual lines of code with #[cfg(...)] instead of whole methods. With that, the number of call-sites should stay at 1 I think, and these functions can be re-inlined, so this commit isn't necessary.
  3. f3a9fc5: The changes in this commit are not motivated in the commit message. I'm not sure why this is required, and it adds cognitive similar complexity as the previous commit. I see that in the follow up PR, it is this structure that VolatileSlice gains a reference to, but why can't we just reuse MmapRegion there? The type parameter shouldn't cause problems, since VolatileSlice has it, too.
  4. 6cfe417: I understand that this is based on a concern Andreea had on the follow up PR. My main concern is the same as the previous two commits, that we start introducing too many layers of abstraction here. The structures added here are a new layer on top of MmapRegionBuilder. The diff of this commit touches mostly on test code, so I think what happened at some point in the past is that someone did not want to use the builder in test code, so they added the tuple-based construction methods as shortcuts. Any actual new information that is used during construction of MmapRegions should be integrated with the MmapRegionBuilder (e.g. the two new Xen fields from the follow up PR should become #[cfg(xen)] fields in the build, with associated setters). Outside of test code, the builder should always be used to construct mmap regions. In that sense, I think the implementation how it was previous to this commit is fine (it should probabl be moved into a test module, but I'd like to keep breaking changes to the public API to a minimum). However, this problem will resurface when it comes to testing the Xen code, as using builders is probably overly verbose. I propose we revisit when it comes to that point, and then add a subset of the changes introduced in this commit (having a struct that encapsules the fields of the tuple, and a utility method that uses the build to construct a GuestMmapRegion from the tuple-struct) in the Xen test module. The existing methods can be then marked as #[cfg(not(xen))].

Sorry if this is a lot (or if I got some things wrong - cross-referencing with the other PR was difficult at time). The main themes in my feedback are keeping hierarchies flat, and keeping the changes to the public API for non-xen users minimal. Before I do another of this PR/the follow up, could you try addressing @andreeaflorescu's comment in the other PR about minimizing the scope of cfg blocks? I think that will make review a lot easier, and should bring down the size of the diffs quite a bit too. Thanks!

src/volatile_memory.rs Outdated Show resolved Hide resolved
Add separate helpers to copy from and to the volatile slice.

This allows the real logic to be present at a single place, which can be
easily updated for special cases later on. For example extra memory
mapping for systems where the regions can't be mapped in advance, like
in case of Xen's Grant mappings.

Signed-off-by: Viresh Kumar <[email protected]>
Add a separate structure to represent the memory mapped with mmap(),
which can be freed automatically from drop().

Also move MmapRegion creation code to MmapRegionBuilder::build_region().

Signed-off-by: Viresh Kumar <[email protected]>
@vireshk
Copy link
Contributor Author

vireshk commented Jun 16, 2023

Thanks for the review Patrick, really appreciate it !

  1. 57693f1: I don't think this commit is a change we want to do.

Dropped.

  1. 4d73f17: With that, the number of call-sites should stay at 1 I think, and these functions can be re-inlined, so this commit isn't necessary.

It may look messy without that, but I am okay with dropping it. Dropped.

  1. f3a9fc5: The changes in this commit are not motivated in the commit message.

I am not sure I understood that.

I'm not sure why this is required.

There are a couple of reasons for that:

  • One is that this struct Mmap is used by mmap_xen.rs file (renamed as MmapUnix there) as well, for all the xen variants.
  • Also the way it is currently implemented doesn't look very clean or Rust-ish to me (though I am still a newbie). The object (mapped address) here is allocated from MmapRegionBuilder structure and freed by MmapRegion 's drop() method, conditionally. A cleaner way (from my understanding of Rust) normally is to allocate objects is in a way that the resource get freed automatically by the drop method, normally unconditionally. It is okay to pass the object from builder to region, like instance of Mmap now, but the mapped raw address is not really an end object which should be used like this.
  • And yes VoltaileSlice could have used MmapRegion as well, but MmapInfo, which is Mmap for non-xen cases, is used in mmap_unix.rs as well to write some common code and so I reused the same type in VolatileSlice too.
  1. 6cfe417: I understand that this is based on a concern Andreea had on the follow up PR. My main concern is the same as the previous two commits, that we start introducing too many layers of abstraction here. The structures added here are a new layer on top of MmapRegionBuilder. The diff of this commit touches mostly on test code, so I think what happened at some point in the past is that someone did not want to use the builder in test code, so they added the tuple-based construction methods as shortcuts. Any actual new information that is used during construction of MmapRegions should be integrated with the MmapRegionBuilder (e.g. the two new Xen fields from the follow up PR should become #[cfg(xen)] fields in the build, with associated setters).

Yeah, this is exactly what I have done for MmapRegionBuilder.

Outside of test code, the builder should always be used to construct mmap regions.

Hmm, not really I guess. MmapRegionBuilder is a Unix only concept. The common interfaces provided by this crate (in mmap.rs) are GuestRegionMmap and GuestMemoryMmap instead and all the end users should use them. For example MmapRegionBuilder isn't there for Windows based implementation. If you see the vhost crate too, it uses these two interfaces only and not the builder. And so we required new structures for the parameters of these two structures, which is what was suggested by Andreea. Which eventually gets used in mmap_unix.rs as well.

Sorry if this is a lot (or if I got some things wrong - cross-referencing with the other PR was difficult at time).

I am fine with whatever changes I need to make to get over with this work. I just want to get over this soon :)

The main themes in my feedback are keeping hierarchies flat, and keeping the changes to the public API for non-xen users minimal.

Sure.

Before I do another of this PR/the follow up, could you try addressing @andreeaflorescu's comment in the other PR about minimizing the scope of cfg blocks? I think that will make review a lot easier, and should bring down the size of the diffs quite a bit too. Thanks!

Hmm, I already did a lot of that and the diff is really small now comparatively (from where I started).

Since there is a lot of code to review here, I would really prefer to get this pull request merged before we start reviewing the other one, which introduces Xen based stuff. I understand that you had to look at the other pull requests too, to get an insight on how things are getting used eventually and it is not super easy.

My idea behind this separate pull request was to try and make the review process easier and shorter and merge code that is already reviewed, instead of waiting for the rest of it to be ready as well.

I have repushed this branch now after removing the two commits and making the bitmap change you suggested.

Thanks.

The list of (similar) arguments for various structures used for memory
mapping is getting long and that leads to a wide variety of functions
with different signatures.

Simplify the same with the help of two new structures: MmapRange and
GuestMmapRange.

This also allows the structures to be incrementally updated for
different memory mapping models later on, like Xen, without updating the
signature of the methods anymore.

Signed-off-by: Viresh Kumar <[email protected]>
@vireshk
Copy link
Contributor Author

vireshk commented Jun 16, 2023

  1. 4d73f17: With that, the number of call-sites should stay at 1 I think, and these functions can be re-inlined, so this commit isn't necessary.

It may look messy without that, but I am okay with dropping it. Dropped.

Now that I have rebased xen/mmapv2 over the new changes (and pushed), it doesn't look that messy afterall. :)

@roypat
Copy link
Collaborator

roypat commented Jun 20, 2023

  1. f3a9fc5: The changes in this commit are not motivated in the commit message.

I am not sure I understood that.

I meant that the commit message only explains what the commit is doing, but not why. Got it now though, thanks for the explanation!

I'm not sure why this is required.

There are a couple of reasons for that:

  • One is that this struct Mmap is used by mmap_xen.rs file (renamed as MmapUnix there) as well, for all the xen variants.

  • Also the way it is currently implemented doesn't look very clean or Rust-ish to me (though I am still a newbie). The object (mapped address) here is allocated from MmapRegionBuilder structure and freed by MmapRegion 's drop() method, conditionally. A cleaner way (from my understanding of Rust) normally is to allocate objects is in a way that the resource get freed automatically by the drop method, normally unconditionally. It is okay to pass the object from builder to region, like instance of Mmap now, but the mapped raw address is not really an end object which should be used like this.

Ah, I think this is actually a pretty common pattern in rust. All the structures that take ownership over raw file descriptors (e.g. File::from_raw_fd) come to mind :)

  • And yes VoltaileSlice could have used MmapRegion as well, but MmapInfo, which is Mmap for non-xen cases, is used in mmap_unix.rs as well to write some common code and so I reused the same type in VolatileSlice too.

Okay, but for unix, we don't need anything extra in VolatileSlice - the additional field should be #[cfg(xen)], and just contain the xen specific type. And then at this point, since only the Xen specific type uses this, it's fine to inline its few fields, I think.

  1. 6cfe417: I understand that this is based on a concern Andreea had on the follow up PR. My main concern is the same as the previous two commits, that we start introducing too many layers of abstraction here. The structures added here are a new layer on top of MmapRegionBuilder. The diff of this commit touches mostly on test code, so I think what happened at some point in the past is that someone did not want to use the builder in test code, so they added the tuple-based construction methods as shortcuts. Any actual new information that is used during construction of MmapRegions should be integrated with the MmapRegionBuilder (e.g. the two new Xen fields from the follow up PR should become #[cfg(xen)] fields in the build, with associated setters).

Yeah, this is exactly what I have done for MmapRegionBuilder.

Outside of test code, the builder should always be used to construct mmap regions.

Hmm, not really I guess. MmapRegionBuilder is a Unix only concept. The common interfaces provided by this crate (in mmap.rs) are GuestRegionMmap and GuestMemoryMmap instead and all the end users should use them. For example MmapRegionBuilder isn't there for Windows based implementation. If you see the vhost crate too, it uses these two interfaces only and not the builder. And so we required new structures for the parameters of these two structures, which is what was suggested by Andreea. Which eventually gets used in mmap_unix.rs as well.

Mh, yeah, right now it is, but it doesnt have to stay that way. Instead of adding shared abstraction on top of it, we can just extend the builder to cover both xen and unix, with setters/fields being appropriately cfg'd and then maybe different .build methods. Because right now, what used to be MmapRegionBuilder::build() -> libc::map() turned into MmapRegionBuilder::build() -> MmapRegionBuilder::mmap() -> MmapInfo::with_checks() -> Mmap::new() -> libc::map() on unix, and MmapRegionBuilder::build() -> MmapRegionBuilder::mmap() [but a different one] -> MmapXen::mmap() -> MmapXenTrait::mmap() -> MmapXenGrant::mmap_range() -> Mmap::new() -> libc::mmap() on xen. The cognitive complexity of this call stack is too much, especially given how similarly called everything is :(. And then MmapRange adds another layer on top of all of this (and the on-the-fly mapping in VolatileSlice does not even use this specific call stack, so I'm struggling a bit to see where this is used - if we don't use this for the mapping in the VolatileSlice, then why are we making this change?). Intuitively, I'd say 1 intermediary max to get from build to libc::mmap is something we should strive for here. If this means a little bit of code will be duplicated then that's fine.

Sorry if this is a lot (or if I got some things wrong - cross-referencing with the other PR was difficult at time).

I am fine with whatever changes I need to make to get over with this work. I just want to get over this soon :)

The main themes in my feedback are keeping hierarchies flat, and keeping the changes to the public API for non-xen users minimal.

Sure.

Before I do another of this PR/the follow up, could you try addressing @andreeaflorescu's comment in the other PR about minimizing the scope of cfg blocks? I think that will make review a lot easier, and should bring down the size of the diffs quite a bit too. Thanks!

Hmm, I already did a lot of that and the diff is really small now comparatively (from where I started).

Since there is a lot of code to review here, I would really prefer to get this pull request merged before we start reviewing the other one, which introduces Xen based stuff. I understand that you had to look at the other pull requests too, to get an insight on how things are getting used eventually and it is not super easy.

Right, yeah, this was my main problem - I needed to cross-ref between two git checkouts, which was a bit tedious at times xP

My idea behind this separate pull request was to try and make the review process easier and shorter and merge code that is already reviewed, instead of waiting for the rest of it to be ready as well.

I have repushed this branch now after removing the two commits and making the bitmap change you suggested.

Thanks.

@vireshk
Copy link
Contributor Author

vireshk commented Jun 20, 2023

  • And yes VoltaileSlice could have used MmapRegion as well, but MmapInfo, which is Mmap for non-xen cases, is used in mmap_unix.rs as well to write some common code and so I reused the same type in VolatileSlice too.

Okay, but for unix, we don't need anything extra in VolatileSlice - the additional field should be #[cfg(xen)], and just contain the xen specific type. And then at this point, since only the Xen specific type uses this, it's fine to inline its few fields, I think.

For unix we used to have PhantomData<&'a T>, which we can avoid now with a single generic field here. Also since this is set as Option, it is set to None for the non-xen case.

For xen, I need a new parameter to all the ::with_bitmap() methods. Making this field available for just Xen, would mean a lot of duplicated code for these functions, unless I am missing something here.

@vireshk
Copy link
Contributor Author

vireshk commented Jun 20, 2023

Hmm, not really I guess. MmapRegionBuilder is a Unix only concept. The common interfaces provided by this crate (in mmap.rs) are GuestRegionMmap and GuestMemoryMmap instead and all the end users should use them.

Mh, yeah, right now it is, but it doesnt have to stay that way. Instead of adding shared abstraction on top of it, we can just extend the builder to cover both xen and unix, with setters/fields being appropriately cfg'd and then maybe different .build methods.

This is already done in my other patchset. We have a method named with_mmap_info(), which is used to do that exactly.

The problem is that we need something for GuestRegionMmap and GuestMemoryMmap as well and that's where the Range based new structures come in as the parameters list became really big, which was suggested to me by Andreea and @Ablu to reduce the Xen specific code btw, which eventually looks great to me at least.

Because right now, what used to be MmapRegionBuilder::build() -> libc::map() turned into MmapRegionBuilder::build() -> MmapRegionBuilder::mmap() -> MmapInfo::with_checks() -> Mmap::new() -> libc::map() on unix,

I can merge the Mmapinfo::with_checks() and Mmap::new(), they are kept separate to make the code efficient and skip an if comparison though.

The new structure Mmap came out of earlier reviews only, and dropping that would mean going back to duplicating code now.

and MmapRegionBuilder::build() -> MmapRegionBuilder::mmap() [but a different one] -> MmapXen::mmap() -> MmapXenTrait::mmap() -> MmapXenGrant::mmap_range() -> Mmap::new() -> libc::mmap() on xen.

Right.

The cognitive complexity of this call stack is too much, especially given how similarly called everything is :(.

This is bound to happen when the code is required to do more and handle more complex cases, I don't how this can be avoided. I am sure that initially we won't have wanted to keep separate files like mmap.rs and mmap_unix.rs, but we ended up doing that as we wanted to support different operating systems, etc. These abstractions will come in if want to support more with less code.

And then MmapRange adds another layer on top of all of this (and the on-the-fly mapping in VolatileSlice does not even use this specific call stack, so I'm struggling a bit to see where this is used -

This call stack provides extra information, i.e. mmap_flags and mmap_data, which is used by Xen mapping routines, while the mapping is made. For foreign mapping, it is done right from this call stack as MmapRegionBuilder::build() ends up mapping the memory. For grant mapping though, the information is saved in the structure and used later when the real mapping happens via VolatileSlice. This information is also used to know which mapping type Xen supports, foreign or grant, and so it is used at build() all the time.

if we don't use this for the mapping in the VolatileSlice, then why are we making this change?).

We do use it :)

Intuitively, I'd say 1 intermediary max to get from build to libc::mmap is something we should strive for here. If this means a little bit of code will be duplicated then that's fine.

This new strucuture is really useful to free the resources (munmap()) when the instance goes out of scope. This is required for each Xen mapping model as well as Unix. I can remove this strucure if you like and start duplicating the same code, but that would be going backwards and then I will have to implement a drop() trait for many structures which is avoided right now with this.

@vireshk
Copy link
Contributor Author

vireshk commented Jun 26, 2023

Replaced by: #241

@vireshk vireshk closed this Jun 26, 2023
@vireshk vireshk deleted the mmap/reorg branch June 26, 2023 08:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants