
Atomic memory allocator #521

Closed · wants to merge 2 commits into from

Conversation

@nidin (Contributor) commented Feb 28, 2019

Atomic allocator for shared memory

nidin and others added 2 commits (February 28, 2019 15:21): Atomic allocator for shared memory
@dcodeIO (Member) commented Feb 28, 2019

Would it be feasible, as an alternative, to create a common wrapper around any existing memory allocator, given that the interface is always the same? That is, guarantee through a lock that at most one thread at a time executes an allocation or free operation whenever such an operation is attempted?

@nidin (Contributor, Author) commented Feb 28, 2019

We need a shared global variable, either in the global namespace or in the memory namespace. If memory.shared is true, we can use atomic.cmpxchg to allocate; otherwise we allocate the normal way.
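
A minimal sketch of that idea, assuming a hypothetical compile-time memory.shared constant and invented allocate_atomic/allocate_unsynchronized helpers standing in for the two code paths:

function memory_allocate(size: usize): usize {
  if (memory.shared) {
    // shared memory: take the atomic.cmpxchg-guarded path
    return allocate_atomic(size);
  }
  // single-threaded memory: no synchronization needed
  return allocate_unsynchronized(size);
}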

@dcodeIO (Member) commented Feb 28, 2019

My thought process was that, if we had a common wrapper, we could provide for example allocator/tlsf.atomic, allocator/buddy.atomic and allocator/arena.atomic, each using the respective memory manager plus the common atomic wrapper that sits between the user and the memory manager. I'm not sure about the locking overhead, though; it might be significant when locking each attempt to allocate/free. Not sure.

@nidin (Contributor, Author) commented Feb 28, 2019 via email

@dcodeIO (Member) commented Feb 28, 2019

So, what if we'd designate, let's say, memory offset 8 to hold a value indicating whether any thread is currently within either allocate or free? Something like

function memory_allocate(size: usize): usize {
  while (atomic.cmpxchg<i32>(8, 0, 1)) {} // spin until the word at offset 8 flips 0 -> 1
  var ret = original_memory_allocate(size);
  atomic.store<i32>(8, 0); // release the lock (a plain store would not synchronize)
  return ret;
}

Wouldn't that work with any allocator if all threads then used the wrapper?

@nidin (Contributor, Author) commented Feb 28, 2019 via email

@MaxGraey (Member) commented Feb 28, 2019

Maybe it'd be better to use a futex for this long lock section? I mean atomic.wait/notify.

@nidin (Contributor, Author) commented Feb 28, 2019 via email

@dcodeIO (Member) commented Mar 1, 2019

So, according to the example on the threads spec, a working mechanism here could be

function lock(addr: usize): void {
  // try to flip the lock word from 0 (unlocked) to 1 (locked);
  // if it is already locked, sleep until another thread notifies us
  while (atomic.cmpxchg<i32>(addr, 0, 1)) {
    atomic.wait<i32>(addr, 1, -1); // wait while the value is still 1, no timeout
  }
}

function unlock(addr: usize): void {
  atomic.store<i32>(addr, 0);  // release the lock
  atomic.notify<i32>(addr, 1); // wake at most one waiter
}

const MM_LOCK: usize = 8;

function memory_allocate(size: usize): usize {
  lock(MM_LOCK);
  var ret = original_allocate(size);
  unlock(MM_LOCK);
  return ret;
}

function memory_free(addr: usize): void {
  lock(MM_LOCK);
  original_free(addr);
  unlock(MM_LOCK);
}

Now, if shared memory is enabled with the respective compiler flag, the compiler would automatically inject MM_LOCK at offset 8 in static memory, similar to what it does with HEAP_BASE, ensuring that all threads are operating on the same assumptions.

Special care must be taken when initializing a memory allocator, of course. The main thread will usually set it up while worker threads inherit its state; this can be done conditionally, based on isDefined(MM_LOCK), with another lock/unlock step to determine whether to initialize or inherit.
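
A rough sketch of that initialization step, under the assumptions above (MM_LOCK injected by the compiler) plus a hypothetical MM_INIT flag word and original_memory_init function invented for illustration:

const MM_INIT: usize = 12; // hypothetical "already initialized" flag in shared memory

function memory_init(): void {
  if (isDefined(MM_LOCK)) { // compiled with shared memory enabled
    lock(MM_LOCK);
    if (!atomic.load<i32>(MM_INIT)) {
      original_memory_init();        // first thread in (usually main) initializes
      atomic.store<i32>(MM_INIT, 1); // later threads see the flag and inherit
    }
    unlock(MM_LOCK);
  } else {
    original_memory_init(); // single-threaded: initialize directly
  }
}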

@MaxGraey (Member) commented Mar 1, 2019

A more advanced mutex implementation with spin locks:

const SPIN_LOCK_ITER_LIMIT: i32 = 128;

// lock word states: 0 = unlocked, 1 = locked, 2 = locked with (possible) waiters
function mutexLock(addr: usize): void {
  // fast path: spin for a bounded number of iterations
  var stat = 0;
  for (let i = 0; i < SPIN_LOCK_ITER_LIMIT; i++) {
    stat = atomic.cmpxchg<i32>(addr, 0, 1);
    if (!stat) return; // acquired uncontended
  }
  // slow path: mark the lock as contended and sleep on the futex
  if (stat == 1) {
    stat = atomic.xchg<i32>(addr, 2);
  }
  while (stat) {
    atomic.wait<i32>(addr, 2, -1); // sleep while the value is 2, no timeout
    stat = atomic.xchg<i32>(addr, 2);
  }
}

function mutexUnlock(addr: usize): void {
  // if the lock was contended (2), wake one waiter; otherwise a plain release suffices
  if (atomic.xchg<i32>(addr, 0) == 2) {
    atomic.notify<i32>(addr, 1);
  }
}
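
Usage would mirror the earlier wrapper, e.g. mutexLock(MM_LOCK) before calling into the underlying allocator and mutexUnlock(MM_LOCK) afterwards.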

@dcodeIO (Member) commented Mar 1, 2019

What does it do? Spinlock first and if that doesn't work, wait, in order to reduce context switches?

@MaxGraey (Member) commented Mar 1, 2019

It just does a spin lock limited to 128 iterations and then falls back to the atomic.wait/notify (futex) approach after the iteration limit. So it can lock/unlock faster when possible.

@MaxGraey (Member) commented Mar 1, 2019

Ideally, after each spin iteration we should signal the CPU to yield (sleep(0)), but this is not supported on wasm.

@dcodeIO (Member) commented Mar 1, 2019

I see, that's the usual tradeoff between wasting cycles and switching context then. I wonder how that would compare to a naive wait/notify approach, though: without a way to signal sleep(0), spinning is guaranteed to waste more cycles than on a non-wasm platform, which might, or might not, be more costly than the context switch (in some scenarios). Feels like something to benchmark eventually :)

@MaxGraey (Member) commented Mar 1, 2019

Yeah, we definitely need a benchmark.

@dcodeIO (Member) commented Jun 19, 2019

One more remaining building block for a shared memory manager/GC, apart from locking, appears to be dealing with the fact that the current implementations use globals to store some of their state. If I'm not mistaken, global state (except immutable globals like __heap_base) is not shared and its values differ between threads, so the information stored there must be synchronized somehow, for example by storing it inside the MM/GC control structure in memory.
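
For example, a root pointer that currently lives in a mutable global would have to move to a designated slot in shared linear memory; a sketch with invented names (the MM_ROOT_PTR offset is chosen arbitrarily here):

// before: var ROOT: usize = 0; -- per-thread, invisible to other instances
const MM_ROOT_PTR: usize = 16; // hypothetical fixed offset inside the mm control structure

function getRoot(): usize {
  return atomic.load<usize>(MM_ROOT_PTR);
}

function setRoot(root: usize): void {
  atomic.store<usize>(MM_ROOT_PTR, root);
}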

@nidin (Contributor, Author) commented Jun 20, 2019

Are mutable globals thread-safe?
Or can we use atomic operations on globals?

@jtenner (Contributor) commented Jun 22, 2019

Globals are not shared between instances; there is no way to use them for cross-thread state, whether readonly or mutable, since each instance gets its own copy.

We need to store all shared data in memory shared between the instances.

@dcodeIO (Member) commented May 25, 2020

Closing this PR as part of the 2020 vacuum, as it appears to be outdated. In general, there are still some open questions regarding a thread-safe allocator, in particular whether we should rather think about a more JS-y approach like workers and postMessage, keeping allocation local to each thread.

@dcodeIO closed this May 25, 2020