Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move dont_need into MVP? #384

Closed
lukewagner opened this issue Oct 2, 2015 · 28 comments
Closed

Move dont_need into MVP? #384

lukewagner opened this issue Oct 2, 2015 · 28 comments

Comments

@lukewagner
Copy link
Member

Weighing costs/benefits, I was thinking there's little reason not to bump dont_need into the MVP: it's trivial to implement (if you don't want to do the implied sycall, you can just memzero), it'd be immediately used by the toolchain's malloc, and it would be immediately useful on mobile for reducing RSS and thus probability of oom-kill.

@ghost
Copy link

ghost commented Oct 2, 2015

+1

@kg
Copy link
Contributor

kg commented Oct 2, 2015

sgtm

@ghost
Copy link

ghost commented Oct 2, 2015

One challenge for current asm.js implementations will be to ensure that the array buffer elements start aligned to a page boundary. I don't think any other alignment is useful and that app developers should be able to depend on this alignment. This was practical on Odin last time I looked, but I have not looking in this for v8 and jsc implementations?

Could this alignment requirement be added to the wasm spec.

@lukewagner
Copy link
Member Author

Yes, this is vaguely implied in the current text by "The addr and length parameters above would be required to be multiples of page_size." and would be made more explicit.

@titzer
Copy link

titzer commented Oct 2, 2015

I can see this being harmless, but it seems like it's a member of a class
of memory-control operations that should be designed to work together, like
memcpy, memove, memset, etc. Maybe it makes sense to decide for all of them
at once and do a chunked approach as opposed to a one-by-one approach?

On Fri, Oct 2, 2015 at 5:45 PM, Luke Wagner [email protected]
wrote:

Yes, this is vaguely implied in the current text
https://github.com/WebAssembly/design/blob/master/FutureFeatures.md#finer-grained-control-over-memory
by "The addr and length parameters above would be required to be
multiples of page_size." and would be made more explicit.


Reply to this email directly or view it on GitHub
#384 (comment).

@lukewagner
Copy link
Member Author

I think dont_need is in a fundamentally different category from those ops: it's not about performance, it's about exposing the raw ability to release physical pages associated with virtual memory (thereby decreasing RSS, etc).

Now I was thinking that maybe memcpy allows exposing the underlying OS functionality to remap pages (e.g., mremap), but if it has plain C semantics, I don't think we'll be able to use mremap or anything on Windows (but maybe there's an angle I'm missing). Furthermore, with SIMD.wasm, I would think we'd be able to get pretty good performance for memcpy/memmove/memset from pure wasm (sure native can do AVX 512-bit copies, but I wonder if we're not just memory-bound by that point and so SIMD.wasm's 128-bit is sufficient). Lastly, dont_need can be used to optimize memset(0) since semantically it leaves the pages 0. So while I think we definitely want dont_need, I'm not sure we need builtin memset/memcpy/memmove.

@titzer
Copy link

titzer commented Oct 2, 2015

If this boils down to a memset(0) with an additional performance hint, why
wouldn't we also want to add memset(0) without the performance hint?

Is this really a critical thing for MVP? Or, does shipping an MVP without
it lack something fundamental?

On Fri, Oct 2, 2015 at 9:20 PM, Luke Wagner [email protected]
wrote:

I think dont_need is in a fundamentally different category from those
ops: it's not about performance, it's about exposing the raw ability to
release physical pages associated with virtual memory (thereby decreasing
RSS, etc).

Now I was thinking that maybe memcpy allows exposing the underlying OS
functionality to remap pages (e.g., mremap), but if it has plain C
semantics, I don't think we'll be able to use mremap or anything on
Windows (but maybe there's an angle I'm missing). Furthermore, with
SIMD.wasm, I would think we'd be able to get pretty good performance for
memcpy/memmove/memset from pure wasm (sure native can do AVX 512-bit
copies, but I wonder if we're not just memory-bound by that point and so
SIMD.wasm's 128-bit is sufficient). Lastly, dont_need can be used to
optimize memset(0) since semantically it leaves the pages 0. So while I
think we definitely want dont_need, I'm not sure we need builtin
memset/memcpy/memmove.


Reply to this email directly or view it on GitHub
#384 (comment).

@jfbastien
Copy link
Member

I think it's pretty critical to allow developers to reduce their memory footprint. In fact, it's unfortunate that array buffers are guaranteed to be committed physical pages, especially since many applications optimistically allocate too much. For memory constrained devices I think it's very useful to be able to avoid physical reservation.

Agreed it's not fundamental (apps can work without it).

It also seems easy to support, so I like this proposal.

@lukewagner
Copy link
Member Author

@titzer We may not be able to specify the effects of this operation on RSS, but I think it's more than a "hint": the implied implementation is nothing more than call to madvise(DONTNEED) on POSIX and VirtualFree(MEM_DECOMMIT);VirtualAlloc(MEM_COMMIT) on Windows which means immediately releasing resident pages. That, if you read these ranges again, you get zeros is just a (quite fortunate, since we need determinism) byproduct. Heck, for certain parameters or on certain OSes, dont_need might even be slower than walking through memory writing zeros.

@titzer
Copy link

titzer commented Oct 2, 2015

So I might be in the minority here, but I think we should avoid some
feature creep into the MVP, just so it's easier to ship it sooner.

We also want to have a nice train of releases and updates, so maybe this
would make sense as a little cluster of memory-advice related operations
that could follow on right after MVP? That would test our plan to have
WebAssembly be updated regularly.

On Fri, Oct 2, 2015 at 10:15 PM, JF Bastien [email protected]
wrote:

I think it's pretty critical to allow developers to reduce their memory
footprint. In fact, it's unfortunate that array buffers are guaranteed to
be committed physical pages, especially since many applications
optimistically allocate too much. For memory constrained devices I think
it's very useful to be able to avoid physical reservation.

Agreed it's not fundamental (apps can work without it).

It also seems easy to support, so I like this proposal.


Reply to this email directly or view it on GitHub
#384 (comment).

@lukewagner
Copy link
Member Author

@titzer Definitely agreed on the goal of minimality of MVP, but this seemed like that it was in the sweet spot of being both trivial to implement/specify and known necessary/useful. Trivial or not, I know at some point we need to just arbitrarily draw the line and say "catch the next train"; but it wasn't clear to me that we're at that point yet. I don't have a strong opinion on this matter, though.

@ghost
Copy link

ghost commented Oct 2, 2015

@lukewagner The requirement for the addr and length parameter to be a multiple of the page size does not help if the implementation does not align the start of the buffer elements to a physics page. This does make a real difference for the application, and could this requirement also be added to the spec.

@lukewagner
Copy link
Member Author

@JSStats An implementation would align its buffer at some page boundary and, even if it didn't, an implementation could still do a good job by memset(0)ing the leading and trailing slop and madvise()ing the page-aligned middle. And none of this (esp. absolute start address) is relevant to the application; it just needs to worry about making the start and length arguments to dont_need be page_size-aligned.

@ghost
Copy link

ghost commented Oct 2, 2015

@lukewagner It does affect the application in practice. If this alignment could not be guaranteed then an application could not depend on using dont_need on a page-by-page basis, rather would have to try and merge requests hoping that the runtime can free some of them. Not specifying this would have unintended negative consequences.

Also not including dont_need in the MVP might have the unintended consequence of implementations trying to free pages in memset calls. Someone might just have to add support for v8 and jsc too to get this in if they are not particularly interested.

@ghost
Copy link

ghost commented Oct 2, 2015

@lukewagner Just curious: is a VirtualFree(MEM_DECOMMIT);VirtualAlloc(MEM_COMMIT) sequence as effective as leaving the page un-committed and requiring it to be committed before being used?

@lukewagner
Copy link
Member Author

@JSStats Again, implementations are just going to page-align; with wasm, the linear memory is created when the module is loaded as part of the instance so it's much easier than the situation with ArrayBuffers.

Just curious: is a VirtualFree(MEM_DECOMMIT);VirtualAlloc(MEM_COMMIT) sequence as
effective as leaving the page un-committed and requiring it to be committed before being used?

Yes; iiuc, MEM_COMMIT doesn't make pages resident, it just means that if you touch the page it doesn't segfault in user space (a zero page will be made resident on demand). The terminology "commit" is a little confusing here, with Windows having a very particular meaning.

@MikeHolman
Copy link
Member

It sounds like on Windows there no real benefit to the dont_need over memset(0) or am I misunderstanding?

@lukewagner
Copy link
Member Author

@MikeHolman The benefit is removing physical pages from the working set while keeping the virtual address range committed. See the description of MEM_RESET in VirtualAlloc which specifically mentions the decommit+commit pattern. As a concrete example, see pages_purge in jemalloc (jemalloc doesn't require well-defined zeros, hence it can use MEM_RESET).

@AndrewScheidecker
Copy link

Yes; iiuc, MEM_COMMIT doesn't make pages resident, it just means that if you touch the page it doesn't segfault in user space (a zero page will be made resident on demand). The terminology "commit" is a little confusing here, with Windows having a very particular meaning.

You're right that it doesn't make the pages resident, but committed pages consume "commit charge" (i.e. page file space) even if they don't consume physical pages. I'm not sure if there's an analogous state to "reserved but not committed" on POSIX systems. I think Windows also allocates page tables for any committed pages, but not for pages that are only reserved.

@ghost
Copy link

ghost commented Oct 2, 2015

Another issue might be that VirtualFree(MEM_DECOMMIT);VirtualAlloc(MEM_COMMIT) is not atomic. In a multi-threaded implementation a thread might see the decommitted page if the implementation does not hide this. It might be enough to retry in a fault handler, spinning. But perhaps it would be decided to avoid a fused dont_need operation in wasm to avoid this implementation complexity.

@jfbastien
Copy link
Member

Lack of atomicity is indeed a huge issue. We need to make sure this can be done atomically on Windows. It seems like bzero + MEM_RESET does what we want?

Alignment is an unrelated issue IMO, we should discuss elsewhere.

Page table entries are necessarily consumed, but that's not a true issue: it doesn't consume physical pages (which is what matters on low-memory devices), and the TLB hides the cost of traversing the PTEs.

Anyhow, I'm OK will making this:

  • An MVP feature, if we resolve implementability issues (as @lukewagner says it doesn't seem that big of a feature).
  • Adding this to MVP, but just doing bzero for now.
  • Punting it to just after MVP.

We definitely want this, but I'm OK if we get it soon enough.

@AndrewScheidecker
Copy link

I think exposing this as a page-sized memset(0) is fine for the MVP. Post-MVP it seems like there needs to be a way to decommit pages within the instance's linear address-space (i.e. call VirtualFree(MEM_DECOMMIT) on Windows).

@lukewagner
Copy link
Member Author

@jfbastien Yeah, even if this is pretty trivial to add (no atomicity issue before threads), I'm fine waiting too. I was even thinking that having a batch of low-hanging post-v.1 fruit (much easier than "dynamic linking" and "threads") might help ease the transition into post-v.1 iteration.

@AndrewScheidecker Sounds good to me; I guess POSIX would simulate that with mprotect(NONE);madvise(DONTNEED). I think the major river to cross with that and the related mprotect ops is that both require the engine to use signal handling which 1 or 2 engines have raised issues with.

@titzer
Copy link

titzer commented Oct 5, 2015

On Mon, Oct 5, 2015 at 4:43 PM, Luke Wagner [email protected]
wrote:

@jfbastien https://github.com/jfbastien Yeah, even if this is pretty
trivial to add (no atomicity issue before threads), I'm fine waiting too. I
was even thinking that having a batch of low-hanging post-v.1 fruit (much
easier than "dynamic linking" and "threads") might help ease the transition
into post-v.1 iteration.

That sounds good. We can have a batch of "bulk memory ops" that fit
together as a unit.

How about renaming "dont_need" to be more "verby", since it zeroes the
memory and also advises it?

@AndrewScheidecker https://github.com/AndrewScheidecker Sounds good to
me; I guess POSIX would simulate that with
mprotect(NONE);madvise(DONTNEED). I think the major river to cross with
that and the related mprotect ops is that both require the engine to use
signal handling which 1 or 2 engines have raised issues with.


Reply to this email directly or view it on GitHub
#384 (comment).

@kg
Copy link
Contributor

kg commented Oct 5, 2015

How about renaming "dont_need" to be more "verby", since it zeroes the memory and also advises it?

Not to veer into bikeshed, but does the verb 'discard' accurately capture what it implies? discard the contents of the page (zero it out) and discard the page/its backing store (decommit it if wanted)

@lukewagner
Copy link
Member Author

sgtm on both points

@titzer
Copy link

titzer commented Oct 5, 2015

Discard sgtm

On Mon, Oct 5, 2015 at 8:46 PM, Luke Wagner [email protected]
wrote:

sgtm on both points


Reply to this email directly or view it on GitHub
#384 (comment).

@lukewagner
Copy link
Member Author

Right, closing with the answer "no" and the two small changes captured in #388.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants