
Why restrict immutable types to only containing other immutable types? #31

Closed
Jamesernator opened this issue Jun 25, 2019 · 108 comments

@Jamesernator

This restriction doesn't feel particularly necessary and seems inconsistent with the rest of the language allowing reference types on objects and in arrays.

Given that the newer syntax is proposed to be #{/#[ rather than const, it's not really an issue for conciseness either.

@littledan
Member

This restriction is central to the idea that tuples and records are primitives, not objects. They have a guarantee of deep immutability and a lack of identity. When @keithamus presented shallowly immutable objects and arrays to TC39, there was a lot of interest expressed in deep immutability, which we get from this restriction.

Do you have a use case that would require mixing records/tuples with mutable objects, which would not be well-served by Object.freeze?

@dead-claudia
Contributor

BTW, I just wanted to drop in and state that deeply immutable objects could be sent via postMessage and similar without the need to copy and still retain shared-nothing semantics, but immutable objects with mutable values, even deeply nested within them, cannot.

So even if mixing is allowed, it would be necessary to tell them apart quickly, so browsers could elide the copy where possible.

@Jamesernator
Author

Do you have a use case that would require mixing records/tuples with mutable objects, which would not be well-served by Object.freeze?

Well just creating keys from any reference types. e.g. Storing edges between DOM elements:

const edges = new Set()

// On click or something add edge between nodes
edges.add(#{ source: nodeElement1, target: nodeElement2 })

@Jamesernator
Author

Jamesernator commented Jun 28, 2019

so browsers could elide the copy where possible.

I think that browsers should still be able to avoid a copy of objects with reference types just by poking holes into the value types.

e.g. Consider an object like:

const o = #{
  a: 10,
  b: reference1,
  c: #{ 
    d: 'foo',
    e: reference2,
  },
}

In order to copy o you could store it in memory as two values:

{
  // This would be an immutable value type so copying can be elided
  value: {
    a: 10,
    b: refHole(0),
    c: {
      d: 'foo',
      e: refHole(1),
    },
  },
  // Only this needs to be cloned
  refMap: {
    0: reference1,
    1: reference2,
  },
}

Then when doing something like postMessage you don't need to copy the .value because it's immutable. You only need to clone things in the .refMap.
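
A rough userland sketch of this split (names like split, __isValue, and the @@refHole: marker are purely illustrative, not part of any proposal):

```javascript
// Hypothetical sketch: separate a nested "value type" into a frozen
// skeleton (primitives plus hole markers) and a refMap of mutable references.
const HOLE = "@@refHole:"; // illustrative marker; an engine would use a tagged slot

function split(value, refMap = []) {
  if (value === null || typeof value !== "object") {
    return { skeleton: value, refMap };
  }
  if (value.__isValue) {
    // A nested value type: recurse, sharing the same refMap.
    const skeleton = {};
    for (const [key, child] of Object.entries(value)) {
      if (key === "__isValue") continue;
      skeleton[key] = split(child, refMap).skeleton;
    }
    return { skeleton: Object.freeze(skeleton), refMap };
  }
  // A mutable reference: poke a hole and record it in the refMap.
  refMap.push(value);
  return { skeleton: HOLE + (refMap.length - 1), refMap };
}
```

With this representation, only the refMap would need cloning for postMessage; the frozen skeleton could be shared as-is.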

@rickbutton
Member

rickbutton commented Jun 28, 2019

@Jamesernator to your first example, that will work. Under the current proposal it is perfectly valid to create a Set or Map that contains const objects or const arrays, only the inverse (a const object or const array that contains an object or array) is invalid.

Edit: see this section in the overview: https://github.com/rricard/proposal-const-value-types#usage-in-mapsetweakmap

@Jamesernator
Author

to your first example, that will work.

No it doesn't because #{ source: nodeElement1, target: nodeElement2 } contains references (nodeElement1 and nodeElement2).

And sure you can hack around it by doing something like:

class IdMap {
  #map = new WeakMap();

  get(object) {
    if (!this.#map.has(object)) {
      this.#map.set(object, Symbol())
    }
    return this.#map.get(object)
  }
}

const edges = new Set()
const idMap = new IdMap()

edges.add(#{ source: idMap.get(nodeElement1), target: idMap.get(nodeElement2) })

But now the object is kinda pointless as you can't get the nodeElement1 and nodeElement2 back from the const object only these transparent keys.

To repair this you then have to add a reverse mapping:

class IdMap {
  #map = new WeakMap();
  #reverseMap = new Map();

  get(object) {
    if (!this.#map.has(object)) {
      const idKey = Symbol()
      this.#map.set(object, idKey)
      this.#reverseMap.set(idKey, new WeakRef(object))
    }
    return this.#map.get(object)
  }

  lookup(id) {
    const ref = this.#reverseMap.get(id)
    return ref && ref.deref()
  }
}

const edges = new Set()
const idMap = new IdMap()

edges.add(#{ source: idMap.get(nodeElement1), target: idMap.get(nodeElement2) })

But this is quite a bit of boilerplate to do something so simple.

@rickbutton
Member

Ah, yes, apologies. I didn't pay enough attention to the example. You are correct, that will not work as you have described it.

@zeorin

zeorin commented Jul 2, 2019

If an immutable object is not always deeply immutable, you cannot in practice treat it as immutable, since then anything nested in it could have changed in between defining it and accessing it.
The use case of requiring deep immutability as a guarantee is IMO more common than requiring shallow immutability, which is in any case easily achieved using Object.freeze().
LambdaCast's most recent episode on structuring data goes into some of the reasons deep immutability is very, very useful.

@ljharb
Member

ljharb commented Jul 2, 2019

The trick is that “mutable” isn’t clearly defined - frozen objects are mutable if they have internal slots, and private fields on a frozen instance are mutable. An immutable function on a frozen object can still return changing values over time, as can getters.

It gets even more fuzzy if you take into account side channel metadata, like via weak refs/collections, that are arguably effectively part of the object even if they’re not accessed directly off of it.
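
A minimal demonstration of the private-field case (the Counter class here is just an illustration):

```javascript
// A frozen object can still carry mutable state: private fields live in
// internal slots, not properties, so Object.freeze does not lock them.
class Counter {
  #count = 0;
  tick() { this.#count += 1; } // works even when the instance is frozen
  get count() { return this.#count; }
}

const c = Object.freeze(new Counter());
c.tick();
c.tick();
console.log(c.count); // 2, despite Object.isFrozen(c) being true
```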

@dead-claudia
Contributor

@Jamesernator

I think that browsers should still be able to avoid a copy of objects with reference types just by poking holes into the value types.

You'd still have to replace the entire root structure if those are ever read. The sender has one reference to it while the child has another, and since reference types cannot retain identity across the message passing border, they have to be cloned. But in order to clone them, the identity must change. You can't do this in-place or you'd be mutating the immutable structure, something that's obviously unsound. And of course, this also carries on recursively to the root of the immutable structure itself. And this is why I say you have to clone the entire immutable structure in a structured clone algorithm if any mutable reference exists anywhere.

@Jamesernator
Author

Jamesernator commented Jul 3, 2019

You'd still have to replace the entire root structure if those are ever read.

No you don't need to clone the entire structure, only the ref map. Basically a "value type" under the refHole thing consists of two things:

  • An entirely immutable object that contains some number of holes where references are stored.
    • This object doesn't need to be cloned when passing cross thread because it can't change
  • The ref map which maps ref holes to references
    • This needs to be cloned recursively; that's just the nature of refs. It might be empty, in which case it doesn't need to be cloned, and an empty refMap means this is necessarily a value type with no references

My example above wasn't great, as the deep object shared the same ref counter with the root refMap; a better example/implementation would look more like this:

# -> is a reference to an object elsewhere
# Anything else is just by value

valueType:
    immutableObject:
        a: 10,
        b: refHole(0),
        c:
            d: 'foo'
            # ref counter is scoped to the object not to the root unlike
            # my previous example
            e: refHole(0)
            f: refHole(1)
        g:
          m: 20
        h: refHole(1)
        i:
          j: 'blah'
          k: 'blaz'
    refMap:
        subMaps:
            # This is just a reference to c's refMap
            c: ->
                ownMap:
                    0: -> someObjectOnC0
                    1: -> someObjectOnC1
            # g has no references so needs no refMap
            g: -> NULL
        ownMap:
            0: -> someObjectOnImmutableObject0
            1: -> someObjectOnImmutableObject1

So equality of two value types would be:

memCompare(valueType1.immutableObject, valueType2.immutableObject)
   && refMapsEqual(valueType1.refMap, valueType2.refMap)

In the case where no references are contained, refMapsEqual is basically just NULL == NULL, which is dirt cheap and probably skippable with some cleverness. There are probably tricks I haven't thought of that can make comparing the refMaps fast too, given that the refMap is still immutable (though it contains pointers, so memCompare won't work).

For structured clone, yeah the refMap gets cloned via structured clone as per usual which might be deep. This is just the nature of reference types, deep references implies deep cloning; but it's still a pay for what you use scheme. If you don't have deep references you don't pay for them.
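
The comparison described here might be sketched in plain JS like so (a hedged sketch: skeletonsEqual stands in for the memCompare of the skeletons, and all names are illustrative):

```javascript
// Two-part equality: structural comparison of the immutable skeletons plus
// identity comparison of the refMaps (never a deep walk into the objects).
function skeletonsEqual(a, b) {
  if (a === b) return true; // primitives, or interned skeletons (the memCompare analogue)
  if (typeof a !== "object" || a === null || typeof b !== "object" || b === null) return false;
  const keysA = Object.keys(a), keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every((k) => skeletonsEqual(a[k], b[k]));
}

function refMapsEqual(a, b) {
  if (a === b) return true; // both empty/shared: the dirt-cheap case
  if (!a || !b || a.length !== b.length) return false;
  return a.every((ref, i) => ref === b[i]); // reference identity only
}

function valueTypesEqual(v1, v2) {
  return skeletonsEqual(v1.skeleton, v2.skeleton) && refMapsEqual(v1.refMap, v2.refMap);
}
```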

@Jamesernator
Author

which is in any case easily achieved using Object.freeze().

I dunno why people keep pointing out Object.freeze as a substitute. It still doesn't solve the case of set.add(#{ value: someReference }) or #{ value: someReference } === #{ value: someReference } which is the whole point of creating value types in the first place.

And sure there's techniques like compositeKey/rekey, or exposing a function .equals. But this is a lot of boilerplate for something that is trivial with primitives and creates an arbitrary disparity between primitives and objects.

The use case of requiring deep immutability as a guarantee is IMO more common than requiring shallow immutability

I'm not really compelled that "deep immutability as a guarantee" is useful. JavaScript is not a typed language; anyone can pass anything to any function, and if you expect a specific shape you have to check that your data is the shape you expect.

Also as I said in my previous comment, this is still just a pay for what you use. If you don't use references in the value type you don't pay for them.

@dead-claudia
Contributor

@Jamesernator What happens if you attempt to access one of those "ref holes"?

@Jamesernator
Author

You get the value pointed to by the refMap.

e.g.:

o = 
  immutableObject:
    a: refHole(0),
    b: 10
  refMap:
    ownMap:
      0: -> someObject
const o = #{ a: someObject, b: 10 } 

o.a === someObject // true
o.b === 10 // true

Similarly, if you access an object with nested refHoles you just get the inner value type.

e.g.:

o =
  immutableObject:
    a: refHole(0)
    b: 10
    c:
      d: refHole(0)
      e: 12
  refMap:
    ownMap:
      0: -> someObject
    subMaps:
      # This is a pointer to the map held by c
      c: ->
        ownMap:
          0: -> someObject2

inner =
  # The immutableObject can just be a memory slice of o
  # it doesn't actually need to copy this
  immutableObject:
    d: refHole(0)
    e: 12
  # This refMap doesn't need to be copied either as it's already been
  # created, just deref the pointer
  refMap:
    ownMap:
      0: -> someObject2
const o = #{ a: someObject, b: 10, c: { d: someObject2, e: 12 } }

# No copies or deep reads are needed to do this
const inner = o.c;

@dead-claudia
Contributor

@Jamesernator I'm referring to after sending it through something like postMessage, not immediately and synchronously.

@Jamesernator
Author

I'm referring to after sending it through something like postMessage, not immediately and synchronously.

It would have a new refMap, all objects in the new refMap would've been structurally serialized as per the normal structured serialization algorithm.

e.g.:

const o = #{ o: someObject, alsoO: someObject }

const messageChannel = new MessageChannel()

messageChannel.port1.onmessage = ({ data }) => {
  // o was cloned as per the usual algorithm
  // data has a different refMap than o but the immutableObject
  // can still be backed by the same memory
  data.o === data.alsoO // true
  data.o === o.o // false
}

messageChannel.port2.postMessage(o)

@dead-claudia
Contributor

dead-claudia commented Jul 3, 2019

@Jamesernator Consider the scenario of multiple worker IPC. Given entry script main.js and a worker script worker.js, would result === initial in main and would result.o === ref?

// main.js
var worker = new Worker("worker.js")
var ref = {}
var initial = const {o: ref}
worker.onmessage = e => {
	var result = e.data
	if (result !== initial) throw new Error("not equal")
}
worker.postMessage(initial)

// worker.js
self.onmessage = e => {
	self.postMessage(e.data)
}

If result === initial and result.o === ref don't evaluate to the same value, I'd love to see a potential implementation of it. 😉 Edit: Particularly if the first evaluates to true and the second evaluates to false. The inverse isn't exactly useful to anyone.

@zeorin

zeorin commented Jul 3, 2019

I'm not really compelled that "deep immutability as a guarantee" is useful. JavaScript is not a typed language; anyone can pass anything to any function, and if you expect a specific shape you have to check that your data is the shape you expect.

For me that would mean in practice there's not really any difference between how I'd use an immutable object or a regular one. Because the immutable one could have mutated, just like a regular one. Immutability is not about types, e.g. Clojure is untyped but has immutable data.

Anyway. Why not both? ##{}/##[] for deep immutability, #{}/#[] for shallow?

@Jamesernator
Author

Both result === initial and result.o === ref would be false. This would be the same as if initial was simply a plain object.

Surviving the round trip would require cross-heap garbage collection, which I don't think is feasible.

@Jamesernator
Author

For me that would mean in practice there's not really any difference between how I'd use an immutable object or a regular one

I still don't get the point: the immutable one could have mutated† only if it contains a reference type. Why would you have data that you don't know what it is?

This is why I say this is a typing problem, if you don't know what data you even have how can you possibly reason about it? Nothing is forcing you to use references if you don't need them.

† (But not really: the value itself didn't mutate, just one of the objects it referenced. The value still maintains the same identity, because it still points, and will always point, to the same references.)


Also I would like to point out that frozen objects should already meet the motivation points in the proposal.

E.g. Consider a frozen object:

const o = Object.freeze({ __proto__: null, x: 10, y: 20 })

Its members don't change, its key order doesn't change, in fact nothing about it changes. This is basically the definition of a value type.

About the only difference is that the engine needs to be able to store it as a key in a WeakMap. But otherwise there's nothing preventing its representation being identical to a value type.

And if the representation is identical there should be no reason that engines couldn't apply all of the points in the overview to frozen objects.

About the only thing this proposal does that couldn't be done on frozen objects is object identity, which allows use of the common operators like === or set.add, etc. But that's ultimately just sugar; engines could just as well expose an Object.deepEqual(o1, o2) that is especially fast for frozen objects (as they can be backed by value types).
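
A naive userland version of such a deepEqual (illustrative only; the point in the comment is that an engine could make this fast for frozen objects, e.g. by interning or hashing them):

```javascript
// Structural equality for plain data: recurse through own enumerable
// string keys, comparing primitives with Object.is.
function deepEqual(a, b) {
  if (Object.is(a, b)) return true;
  if (typeof a !== "object" || a === null || typeof b !== "object" || b === null) {
    return false;
  }
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every((k) => deepEqual(a[k], b[k]));
}

const p1 = Object.freeze({ __proto__: null, x: 10, y: 20 });
const p2 = Object.freeze({ __proto__: null, x: 10, y: 20 });
console.log(p1 === p2);         // false: distinct identities
console.log(deepEqual(p1, p2)); // true: same structure
```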

@devsnek
Member

devsnek commented Jul 11, 2019

Engines already optimize even non-frozen objects into structs if possible. Without interior mutability this proposal is left standing on equality and the with operator. Those are useful, but if I can't use const objects anyway because they don't hold references it's kind of a moot point.

@dead-claudia
Contributor

TC39 people: I've been privately working on a proposal that would address the mutability issue in multithreaded code, so that's something that could be worked around via a concept of agent "ownership" of objects. Not public yet, but I've been looking into it and I'll bring it up once I'm ready. So I see no need to restrict immutable types to just immutable values.

For the short term, the HTML spec could "clone" them by returning an updated immutable record/array with the relevant descendants cloned. An engine could expose whether an immutable object is deeply immutable, and let HTML implementations detect that and skip the clone if the engine supports it.

@mheiber
Contributor

mheiber commented Aug 21, 2019

@Jamesernator the deep freeze idea is interesting, but I don't think it gets us 100% immutability, as the prototype chain will still be mutable: no way to avoid that afaik cc @littledan

@benjamn
Member

benjamn commented Oct 1, 2019

I'm very excited about this proposal, but I think it's important to tolerate mutable child references, and I have at least one use case to support this suggestion.

The Apollo Client library has a method called readQuery, which caches results according to its input arguments (as well as tracking dependencies used during the computation, so that cached results can be invalidated later).

The input arguments include things like the query document object and a store object. These references are currently combined into a single canonical object reference, which can be used as a Map key to store the final readQuery result. Forcing the query to become immutable might be possible, but converting the store object to an immutable record seems impossible—for one thing, the store object often contains reference cycles, which makes canonicalization tricky/impossible.

To be most useful for Apollo's purposes, this proposal would need to tolerate references that have not been recursively converted to immutable records. Instead, the reference identity of those objects should be used, much like Map keys or Set elements. If you embrace this idea, then I think questions about mutable prototype chains become moot, because it's no longer essential to convert every property of every object/array to be immutable.

It may still be worthwhile for the developer to convert some/all child objects to canonical records/tuples, since that will improve the chances that a higher-level data structure will be === to another structurally equivalent data structure. How much effort the developer is required to expend in this direction is very much an application-level concern, and not something I believe the language should mandate.

Where this line of thinking leads, I believe, is a version of this proposal that does shallow rather than deep comparisons. If shallow array elements (or object keys/values) are === to each other, it would be convenient to have an easy way of producing === immutable tuples/records, even when some of the child objects have not been made immutable. To that end, I like @zeorin's idea of ##{...} + ##[...] for deeply immutable structures, and #{...} + #[...] for shallow canonical structures that can contain mutable elements.
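
One way to prototype the "shallow canonical structure" idea today is a trie of Maps keyed by element identity, similar in spirit to the compositeKey idea mentioned elsewhere in this thread. This is a sketch under stated assumptions: a real version would need WeakMaps to avoid leaking the mutable elements, and canonicalTuple is an invented name.

```javascript
// Shallow canonical tuples: the same sequence of (possibly mutable)
// elements always yields the same frozen array, so === works on it.
const root = new Map();
const LEAF = Symbol("leaf");

function canonicalTuple(...elements) {
  let node = root;
  for (const el of elements) {
    // Walk/extend the trie by element identity, never by structure.
    if (!node.has(el)) node.set(el, new Map());
    node = node.get(el);
  }
  if (!node.has(LEAF)) node.set(LEAF, Object.freeze([...elements]));
  return node.get(LEAF);
}
```

Two calls with the same elements in the same order return the identical frozen array, even though the elements themselves stay mutable.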

@dead-claudia
Contributor

Although, wouldn't nested #{...}/#[...] structures inherently be deeply immutable, assuming they don't contain any mutable references? Also, for optimization purposes, engines may want to track structures that are known to be deeply immutable anyway, even if they aren't syntactically marked as such: they could take a fast path with zero proxies and host objects for serialization, something very useful for web worker communication, Node IPC, and such.

@bmeck
Member

bmeck commented Jan 10, 2020

If we avoid allowing non-primitive/value type references inside records and tuples it seems a GC wouldn't need to traverse them. I think this could be a very valuable property and might play into this discussion more as that would alleviate some need for allocation pools and allow very large numbers of these types without affecting GC.

@devsnek
Member

devsnek commented Jan 10, 2020

That can be an internal flag on immutable types that happen to not hold any objects; the language doesn't have to enforce it.

@lilactown

Let me try and describe the conversations that I see after reading through this issue a couple times. I think that there are three (intertwined) problems that are being used to motivate disallowing non-value types in records and tuples:

  1. How do we map deep equality to JavaScript's current equality semantics?
  2. How do we efficiently construct and compare nested data structures?
  3. How do we unlock the ability to efficiently move data between realms, e.g. via postMessage

Disallowing non-value types is perhaps the simpler way to solve these 3 problems. However, I think that there are massive tradeoffs that haven't been addressed, such as what @benjamn posted here: #31 (comment)

The key insight is that immutable structures that restrict what can be put inside them become "infectious," which reduces their applicability in places where we would really like immutable behavior, e.g. structural sharing and fast equality checks.

A library or framework cannot choose to place application data inside of a record or tuple because it would be coercive, either forcing a deep conversion at each point of communication with the framework (slow) or forcing the application dev to program with immutable data throughout (restrictive). Something like apollo's cache or React's VDOM tree is exactly the kind of thing that could benefit A LOT from ready-made immutable data structures integrated at the VM level, but it would be impossible for them to adopt without allowing mutable data within them for the reasons that @benjamn laid out.

@dead-claudia
Contributor

Something I left implied in the thread that should've probably been made explicit: you can tell if records and tuples only carry immutable data with a single bit. It's relatively easy to track:

  • Start each record or tuple with the deeply immutable bit set.
  • When appending a record or tuple value to a parent record or tuple, check if the entry has the bit clear. If it's clear, clear the parent's deeply immutable bit.
  • When appending an object value to a parent record or tuple, clear the parent's deeply immutable bit.
  • In all other cases, leave the bit as-is.

When cloning, an implementation could check this bit and if it's set, reuse the value rather than creating a full copy.
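
A userland model of that bookkeeping (the DEEP symbol and makeRecord wrapper are illustrative; an engine would keep the bit in the object header):

```javascript
// Track a single "deeply immutable" bit per record: set initially, cleared
// whenever a plain object, or a child whose own bit is clear, is appended.
const DEEP = Symbol("deeplyImmutable");

function makeRecord(props) {
  let deeplyImmutable = true;
  for (const value of Object.values(props)) {
    if (value !== null && typeof value === "object") {
      // A child record keeps the bit only if its own bit is set;
      // any ordinary object clears it.
      if (!(DEEP in value) || !value[DEEP]) deeplyImmutable = false;
    }
  }
  return Object.freeze({ ...props, [DEEP]: deeplyImmutable });
}

function canElideCopy(record) {
  // A structured-clone implementation could reuse the value when this is true.
  return record[DEEP] === true;
}
```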

@noppa

noppa commented Jun 11, 2020

@Lokeh Note that there's a separate proposal Symbols as WeakMap keys that would make it possible, albeit not very ergonomic, to indirectly reference mutable objects using a symbol that is also used as a weak reference to an object. The usage could be something like

const container = #{
    foo: 1,
    mutableThing: myCollectionOfReferences.add({
        bar: 2,
    }),
};
// Change the mutable state
myCollectionOfReferences.get(container.mutableThing).bar = 3
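
The myCollectionOfReferences helper above is left undefined; one possible userland shape (an assumption, not from the proposal) hands out a fresh symbol per object. With the Symbols-as-WeakMap-keys proposal, the Map below could become a WeakMap so entries can be collected:

```javascript
// Symbol-keyed indirection: add() returns a primitive (a symbol) that can
// legally live inside a record, and get() resolves it back to the object.
const myCollectionOfReferences = {
  refs: new Map(), // symbol -> object

  add(obj) {
    const key = Symbol("ref");
    this.refs.set(key, obj);
    return key;
  },

  get(key) {
    return this.refs.get(key);
  },
};
```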

@awto

awto commented Jun 11, 2020

@noppa how does this solve the quick-copy-into-another-context feature?

This looks even slower, as we'd have to copy the whole collection, or still traverse the entire structure to get the references. It is no different from just storing the real references. And these get/set operations don't look possible to implement efficiently, unlike real references, where access is just a memory read at an address.

If the goal is to have memcpy into another context, the implementation can maintain a separate list of the references in the structure. Since the structure is immutable, the list is easy to maintain, and it can be serialized using the standard means for objects.

@awto

awto commented Jun 11, 2020

@Lokeh

  1. How do we map deep equality to JavaScript's current equality semantics?

An object reference is a primitive type (an address), or anything else that identifies the object instance. It can be compared just like numbers, the same way the === operator does.

  2. How do we efficiently construct and compare nested data structures?

Same as 1.: references are primitive types and are treated as primitive types. There is no need to traverse into objects; you just compare the references, which are primitive and easy to compare.

  3. How do we unlock the ability to efficiently move data between realms, e.g. via postMessage?

These are implementation details, but there are certainly many solutions more efficient than keeping separate object-to-reference representation maps. E.g. storing a list of the references and serializing them separately, while the rest is just copied with memcpy; if there are no references, it is all just memcpy. But there is a choice.

So I really don't see any trade-off at all: we don't lose anything by allowing objects (but gain a lot), and we don't gain anything by denying them.

@rricard
Member

rricard commented Jun 11, 2020

@awto I encourage you to read the Symbol as WeakMap Keys proposal, as it explains why, for instance, a reference primitive type is not possible: we attempted to create a Box primitive, but it has implications for membranes that would be a non-starter for people working on Secure ECMAScript (SES).

Our goal is to enable you, via a combination of Symbol as WeakMap Keys and a userland solution, to effectively put objects in your record and tuple structures. But we do not want that to be possible by default, both because it is technically complicated (for the reasons I just mentioned) and because we think the default should prevent you from having objects in record and tuple structures without being explicit about it. Here is a crude example of that (I would not recommend changing Symbol.prototype, but that is a possibility if you really want such ergonomics):

reftracker.js

const refs = new WeakMap();

export function ref(obj) {
    const key = Symbol();
    refs.set(key, obj);
    return key;
}

export function addDerefToSymbol() {
    Symbol.prototype.deref = function () {
        return refs.get(this);
    };
}

refusage.js

import { ref, addDerefToSymbol } from "./reftracker.js";

addDerefToSymbol();

const container = #{
    foo: 1,
    mutableThing: ref({
        bar: 2,
    }),
};

container.mutableThing.deref().bar = 3;

@awto

awto commented Jun 11, 2020

@rricard yes, the Symbol as WeakMap Keys proposal is cool and useful on its own, but I still don't see any reason for the limitation there, except:

This limitation exists because one of the key goals of the Record & Tuple Proposal is to have deep immutability guarantees and structural equality by default.

But having an object doesn't violate the deep immutability guarantees or structural equality: the reference is a primitive atomic type and it never changes. It is no different from having a number that refers to an object in some map (well, easier with WeakMap Symbol keys). When I change the object that the number refers to, I don't violate the deep immutability. You just prevent making this fast and clean in the code. And I thought one of the main reasons for the proposal was performance.

Regarding the SES reason, I don't understand why I need to suffer from the limitation because of some other feature I won't use soon. SES will presumably impose restrictions on other already released language features, so there will be two modes anyway. Why not implement this restriction as part of SES mode?

I was very enthusiastic about this proposal because of compound keys, and because I could use it for guaranteed constant-cost field access without relying on inline-cache heuristics. But with the restriction, I cannot use it.

Most of my maps have object keys; a primitive compound key can be made by just converting the parts to a string, but there is no easy polyfill for objects as keys.

There are no performance improvements either: even once the WeakMap improvement lands, accessing an object will require a WeakMap lookup, which is not as easy to optimize.

@rricard
Member

rricard commented Jun 11, 2020

It feels like you're advocating for a sort of automatic reference type which would let me type:

const container = #{
    foo: 1,
    sub: {
        bar: 2,
    },
};
const container2 = #{
    foo: 1,
    sub: {
        bar: 2,
    },
};

container.sub.bar = 3; // ok

container === container2; // => false because the two object literals have different identities

We deem this form error-prone, in the sense that we do not signal where that reference type is inserted. It is really easy to introduce an object into the structure without realizing it. Additionally that means that we have a primitive type that gets created implicitly and dereferencing from it is also done implicitly. This has tons of strong implications for this proposal, mostly in terms of complexity, which would make the proposal less likely to ship at all.

If you are arguing for a boxing type as explained in the Symbol as WeakMap Keys proposal I invite you to open an issue there as this is something that we have considered but left it aside because of membranes. I'm sure we can discuss that a bit more but there is also value in having the possibility to refer to objects inside of SES and Symbols as Weakmap keys is truly the solution to that.

Finally, a word about performance guarantees: Record and Tuple does not offer guarantees in-spec, and we do not make choices based on guarantees before we have even attempted to implement it in an engine.

Given all of that, I'm curious about what you think about the example from my earlier comment, it does not seem too much of a burden to add objects in records and tuples that way but maybe I'm not seeing an essential use case here.

@awto

awto commented Jun 11, 2020

@rricard

It feels like you're advocating for a sort of automatic reference type which would let me type:

Yes, exactly, and I see no problem with it; it's the same as if I wrongly expected the following to add just one object to the Set:

   set.add({bar:2});
   set.add({bar:2})

JS still prevents much more painful errors (possible in, say, C++/Java) where, after adding an object to a collection, I can change the object and so corrupt the collection. With records/tuples that is impossible, and it is awesome.

Usages like the one in your example can be prevented by TypeScript, if (and only if) needed. And the issues caused by such usages don't look very hard to discover.

Additionally that means that we have a primitive type that gets created implicitly and dereferencing from it is also done implicitly.

I think dereferencing maybe shouldn't be part of the proposal: the access operator just returns the reference value, and what is done with that reference afterwards doesn't look like this proposal's responsibility. The same way we just use variable names.

If you are arguing for a boxing type as explained in the Symbol as WeakMap Keys proposal I invite you to open an issue there as this is something that we have considered but left it aside because of membranes.

I need to study the proposal for more detail first.

Finally, a word about performance guarantees: Record and Tuple does not offer guarantees in-spec, and we do not make choices based on guarantees before we have even attempted to implement it in an engine.

Yes, not guarantees of course, but leaving the door open for engines to implement it, and maybe they will someday. I suppose they will add optimizations only if the feature becomes popular enough, which is another reason not to limit it or make it overly complex.

Given all of that, I'm curious what you think about the example from my earlier comment; it does not seem like too much of a burden to add objects to records and tuples that way, but maybe I'm not seeing an essential use case here.

Yes, the code-complexity burden in your example isn't heavy enough to worry about. I worry only about performance, and there is still some chance the WeakMap Symbol keys proposal won't ship.

@devsnek
Copy link
Member

devsnek commented Jun 11, 2020

@erights can you explain the ses stuff? I'm still not sure why a record holding an object presents a different challenge to a membrane compared to an object holding an object.

@littledan
Copy link
Member

tl;dr You can wrap an object in a membrane in constant time, so that property access on it will wrap the inner things. But you can't process a Record in a membrane similarly; both available choices are unacceptable:

  • iterating over all of it (not constant time)
  • putting a Proxy around its object wrapper (breaks ===)
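To make the constant-time point concrete, here is a minimal, hypothetical membrane sketch (real membranes also trap `set`/`apply` and maintain identity maps): wrapping an object is O(1) because nested values are only wrapped lazily, on access. A deeply-immutable Record could not be a Proxy target under this proposal, so a membrane would have to rewrite it eagerly, which is O(n) in its size.

```javascript
// Minimal lazy-wrapping membrane sketch (illustrative, not a full membrane).
function wrap(target) {
  if (target === null || typeof target !== "object") return target;
  return new Proxy(target, {
    get(t, prop, receiver) {
      // wrap lazily: inner values only get wrapped when they are read
      return wrap(Reflect.get(t, prop, receiver));
    },
  });
}

const inner = { a: { b: 1 } };
const wrapped = wrap(inner);   // constant time: no traversal happened yet
console.log(wrapped.a.b);      // 1
console.log(wrapped.a !== inner.a); // true: nested access returns a wrapper
```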

@devsnek
Copy link
Member

devsnek commented Jun 11, 2020

Making SES walk over the structure seems more than acceptable to me. Additionally, couldn't SES also just deny the theoretical Box() global? We seem to really be limiting ourselves based on a very niche use case.

@littledan
Copy link
Member

You wouldn't have to deny the whole Box global, just the deref method, and you're good. But, this comes down to two value judgements (whether it's OK for membranes to take linear time, and whether it's OK to add things in TC39 that SES would have to deny) which we could discuss further. I can understand the arguments on both sides for how much these constraints should be weighed.

Would folks be OK with moving this discussion about SES and Box to the repo https://github.com/tc39/proposal-symbols-as-weakmap-keys ? I still like to think about Box as part of the design space we're going for there. Also, maybe we could talk it over in an upcoming SES call?

@devsnek
Copy link
Member

devsnek commented Jun 11, 2020

I think this is the right place to discuss this; weakmap symbol keys have uses regardless of the direction of this proposal.

whether it's OK to add things in TC39 that SES would have to deny

IMO yes. If you do things with JS that you know won't be easily forward compatible, you take that responsibility on yourself. We should be aware of it of course, and try to avoid it, but I don't think that SES deniability is in itself a good reason to discount something.

Also, maybe we could talk it over in an upcoming SES call?

Sure, is that "SES Strategy" in the tc39 calendar?

@littledan
Copy link
Member

Yes, that meeting.

@lilactown
Copy link

Thanks for replying @rricard. Here are some of my thoughts:

Our goal is to enable you, via a combination of Symbol as WeakMap Keys and a userland solution, to effectively put objects in your record and tuple structures, but we do not want that to be possible by default: it is both technically complicated (because of the reasons I just mentioned) and we do think that the default should prevent you from having objects in record and tuple structures without being explicit about it.

I have read the Symbol + WeakMap proposal before, but this is like forcing a developer to program with oven mitts: you still have the "infectious" problem where your options are coercing or defensively putting data into a ref type even when that data might be immutable, i.e. the ref might change every time the underlying data changes. This is poor performance and ergonomics for the very basic case of placing a function, date, or user-constructed object into an immutable structure. Again, not good when living at the center of a framework like Apollo's cache or React's VDOM.

This limitation exists because one of the key goals of the Record & Tuple Proposal is to have deep immutability guarantees and structural equality by default.

When I've used immutable data in Clojure(Script), OCaml, and Erlang/Elixir, deep structural equality was guaranteed for immutable data. This is typically achieved by equality being polymorphic: mutable types use reference identity, while immutable data checks reference identity first and, if not equal, recurses. This allows a "fast yes" when comparing two immutable structures while still allowing two different references in memory to be structurally equal.

Adopting this for JS's immutable structures would match the expectations of developers experienced with immutable data, and JS pedagogy could draw on the wealth of experience in teaching this in languages like OCaml, Clojure, Elixir, etc. to help inexperienced developers understand the behavior.
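The polymorphic-equality scheme described above could be sketched in userland like this (`structEqual` is a hypothetical helper, not part of any proposal):

```javascript
// Polymorphic structural equality sketch: reference identity gives the
// "fast yes"; otherwise recurse through own enumerable keys.
// (Functions and exotic objects fall back to identity only.)
function structEqual(a, b) {
  if (a === b) return true; // fast path: same reference (or same primitive)
  if (
    typeof a !== "object" || a === null ||
    typeof b !== "object" || b === null
  ) {
    return false;
  }
  const ka = Object.keys(a);
  const kb = Object.keys(b);
  if (ka.length !== kb.length) return false;
  return ka.every((k) => structEqual(a[k], b[k])); // recurse on mismatch
}

console.log(structEqual({ x: { y: 1 } }, { x: { y: 1 } })); // true
console.log(structEqual({ x: 1 }, { x: 2 }));               // false
```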

Like I said in my post yesterday, there are many different solutions to this and it's not clear to me how the authors of the proposal landed on the proposed solution when looking at the pros/cons. As someone who would be a primary consumer of the proposal I am not convinced that creating a different solution from the status quo is worthwhile.

@rricard
Copy link
Member

rricard commented Jun 11, 2020

@Lokeh

The main goal of this proposal is to offer immutable data structures within the JavaScript language. While we appreciate that other functional languages let you do such things, we want a feature that is correctly integrated into JS, and we chose to create these "compound primitives" instead of creating a new kind of object (you can use a userland library for that). The thing with compound primitives is that they don't have an identity, so it becomes hard to reconcile them with the dual identity/structural equality notion you're referring to.

We have proposed plenty of solutions for getting back those properties in userland (combined with Symbol as WM Keys) in this thread and others. One approach is to wrap your record or tuple in an object alongside its external references (represented as array indices); that way you can store the structure with slots that can change, and compare structure to structure or slot set to slot set. We are interested in knowing what those solutions lack apart from ergonomics. Ergonomics are important, but the cases you cited come mostly from libraries, and most users won't have to write library code.
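A minimal sketch of that wrapping approach might look like the following (all names are hypothetical, and `JSON.stringify` stands in for record structural equality since `#{}` syntax isn't runnable today):

```javascript
// Keep the immutable "shape" (which would be a Record) separate from the
// mutable object slots it refers to by index.
function makeKeyed(structure, slots) {
  return { structure: Object.freeze(structure), slots };
}

// Compare shape-to-shape and slot-set-to-slot-set, as described above.
function sameKey(a, b) {
  return (
    // with real Records this comparison would just be `===`
    JSON.stringify(a.structure) === JSON.stringify(b.structure) &&
    a.slots.length === b.slots.length &&
    a.slots.every((s, i) => s === b.slots[i])
  );
}

const el = { node: "div" };
const k1 = makeKeyed({ kind: "edge", ref: 0 }, [el]);
const k2 = makeKeyed({ kind: "edge", ref: 0 }, [el]);
console.log(sameKey(k1, k2)); // true: same shape, same slot objects
```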

Like I said in my post yesterday, there are many different solutions to this and it's not clear to me how the authors of the proposal landed on the proposed solution when looking at the pros/cons.

We are always open to other solutions and explored quite a lot of them already (as seen in symbols as wm keys), the main problem is that the cons list usually contains breaking something in JS or another advanced proposal.


EDIT: I realize that this answer might have been blunt. Essentially, I'm interested in links/examples of those cases you're talking about so we can see how we can make them work with Record and Tuple. Maybe that will lead us to explore other solutions that we did not think about yet.

@milankinen
Copy link

milankinen commented Aug 2, 2020

First of all - thanks for this proposal, I'm very excited about it! This primitive-only restriction also got my attention so I'm throwing my cents here.

In my opinion, the deep equality of tuples/records is an essential feature of this proposal and my concern is that the userland approaches can't guarantee it without sacrificing correctness/performance.

--

If I've understood correctly, one suggested approach was index-based refs plus a companion array. I'm worried that this approach would potentially produce both false positives and false negatives.

Let's assume we have a (non-recursive) conversion function from an arbitrary object to a record that puts references into a companion array and uses per-object indexing. A (simplified) implementation might look something like this:

function recordify(obj) {
    const structure = {}
    const arr = []
    Object.keys(obj).forEach(key => {
        const val = obj[key]
        if (isPrimitive(val)) {
            structure[key] = val
        } else {
            structure[key] = arr.length
            arr.push(val)
        }
    })
    return [Record(structure), arr]
}

It should be trivial to see that equality via this function produces a lot of false positives, because the function does not distinguish different refs, e.g.

recordify({foo: 1, bar: {a: 1}})[0] === recordify({foo: 1, bar: {b: 2}})[0]  // false positive!

This, of course, can be avoided by comparing the companion array as well. But then we lose the ability to use the record e.g. in sets and as map keys, and memoization at the usage site becomes a lot more complicated than it would be by just comparing the records for equality.

And of course there is the false negative case. Let's take an example:

const msg = {text: "tsers"}
const sender = {nick: "milankinen"}
const mail1 = {id: 1, msg, sender}
const partialMail = {id: 1, sender}
const mail2 = {...partialMail, msg}

Since the order of an object's keys is not guaranteed, the recordified versions of mail1 and mail2 are not (necessarily) equal, although they should be. This, luckily, can be fixed by always sorting the keys before assigning lookup refs, sacrificing some performance for correctness.
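For illustration, a sorted-keys variant of the earlier `recordify` sketch (hypothetical; plain objects stand in for `Record(...)`) assigns companion indices in key order, which avoids this false negative:

```javascript
// Like recordify above, but indices are assigned in sorted-key order,
// so insertion order of the source object no longer matters.
function recordifySorted(obj) {
  const structure = {};
  const arr = [];
  for (const key of Object.keys(obj).sort()) {
    const val = obj[key];
    if (val === null || (typeof val !== "object" && typeof val !== "function")) {
      structure[key] = val;        // primitive: store inline
    } else {
      structure[key] = arr.length; // reference: store companion-array index
      arr.push(val);
    }
  }
  return [structure, arr]; // would be [Record(structure), arr] with records
}

const m = { x: 1 };
const [s1] = recordifySorted({ a: m, b: 2 });
const [s2] = recordifySorted({ b: 2, a: m }); // different insertion order
console.log(JSON.stringify(s1) === JSON.stringify(s2)); // true
```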

--

Symbol-based refs + weakmaps, as suggested in Symbols as Weakmap keys proposal, would indeed solve some of those index ref problems. However, what worries me is that all given examples omit the most crucial part w.r.t. equality - the reverse lookup from object to symbol.

Because Symbol.for requires a string as its key, it's out of the question. The only option that popped into my mind is to use a second WeakMap for the reverse lookup. Applying it to @rricard's solution results in the following implementation:

const refs = new WeakMap();
const keys = new WeakMap();

export function ref(obj) {
    const existingKey = keys.get(obj) 
    if (existingKey) {   
        return existingKey
    } else {
        const key = Symbol();
        refs.set(key, obj);
        keys.set(obj, key);   
        return key;
    }
}

This poses one challenge: in a WeakMap, only keys are weakly referenced. In order to make the symbols stable, however, we need to use them also as values in the reverse-lookup map. I must admit I haven't dug into JS engine internals deeply enough to say how sophisticated their GCs are with this kind of "weak cross-referencing". That's why I tried to explore the implications with the following code:

const refs = new WeakMap();
const keys = new WeakMap();
(function loop() {
    requestAnimationFrame(() => {
        for (let i = 0; i < 10000; i++) {
            const ref = {};
            const key = {};
            refs.set(key, ref);
            keys.set(ref, key);
        }
        loop();
    });
})();

Running this on my MBP 2019, Chrome 84.0.4147.105 yields the following stats: 144MB heap usage and ~460ms/frame, which is more than an order of magnitude worse than the "baseline", i.e. the same code with only one WeakMap (10MB heap, 3ms/frame). Heap usage stabilizes in the end, so at least V8's GC can collect stale refs at some point, but the overhead is significant.

Another, more controversial choice would be using a hidden property for the key inside the ref object. This actually yields slightly "better" results (200MB heap, 130ms/frame) but would require hidden mutation of the ref object and wouldn't work with frozen objects, so it's not a viable solution either.
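For illustration, the hidden-property variant might look like this (a hypothetical sketch; note how it fails on frozen objects, as mentioned above):

```javascript
// Stash the per-object key on the object itself via a non-enumerable
// symbol property: fast, but it mutates the target object.
const HIDDEN = Symbol("refKey");

function ref(obj) {
  if (!obj[HIDDEN]) {
    Object.defineProperty(obj, HIDDEN, { value: Symbol(), enumerable: false });
  }
  return obj[HIDDEN];
}

const o = {};
console.log(ref(o) === ref(o)); // true: stable per-object key

// ...but defineProperty throws on non-extensible (frozen) objects:
let threw = false;
try {
  ref(Object.freeze({}));
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // true
```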

That's why I'd like to know whether there are any other solutions that resolve the equality issue while maintaining performance and without causing potential memory leaks.

--

On the other hand, I understand @rricard's point about accidentally introducing references to the nested structure and totally agree with it. Having #{message: {text: "tsers"}} !== #{message: {text: "tsers"}} is something that I'd definitely fall into by accident at some point!

I wouldn't be so worried about the mutation because that's usually (at least hopefully) intentional, and like @awto said, reference equality has well-known semantics in JS and most people (or at least I) would still expect it to hold, even after mutation. And given the nature of JS, is it truly possible to enforce immutability for the whole structure (e.g. aRecord.text.__proto__.lol = "bal")?

The box primitive in the Symbols as WeakMap keys proposal was definitely the cleanest solution, but unfortunately it seems it would have to be denied by SES. Is there any way to revive/revisit it somehow? I'm absolutely clueless about the membranes and SES stuff, but if I understood correctly, the reason for ditching box was that object sharing via Box.prototype.deref couldn't be disabled beforehand in frozen realms? In that case, did you consider defining the isolation at the box's construction time (e.g. Box(ref, {isolate: false})) to allow/disallow sharing? Or maybe I just missed the point?

--

I'm really hoping that this non-primitive restriction gets solved somehow (either in the language or by a userland solution) because in its current form it greatly limits the potential applications of this proposal.

@devsnek
Copy link
Member

devsnek commented Aug 2, 2020

I think SES can also handle boxes by treating records/tuples like objects instead of like primitives, but that has a performance cost. Personally I think that having boxes more than makes up for that performance cost.

@dead-claudia
Copy link
Contributor

@milankinen I'd highly recommend filing a bug against V8 over that GC brokenness. (~22k iterations/sec on a high-end laptop is way too slow for something that simple, and that short-term memory growth also seems suspicious.)

@littledan
Copy link
Member

@milankinen It's true that the reverse lookup for the WeakMap would be important for the Symbol as WeakMap key approach. We left it out of the code samples to keep them simple. I am surprised that the performance impact is not more like a doubling of the cost.

@devsnek I guess the unfortunate part here is how you can't put a Proxy over a Record/Tuple for a membrane (or, you can put one around a wrapper, but equality won't work). This causes an additional asymptotic cost, making it difficult to take this kind of tradeoff.

@devsnek
Copy link
Member

devsnek commented Aug 9, 2020

@littledan we could allow them to be proxy targets, not like proxies aren't already weird

@littledan
Copy link
Member

@devsnek Sure, the question is more, how do we define equality on Proxies then? I think it'd be best if we kept how === is a reliable operation and doesn't invoke Proxy traps.

@Fenzland
Copy link

Fenzland commented Jan 27, 2021

That restriction is unnecessary and limits usage. For example:

const map= new Map();

const foo= document.querySelector('.foo:checked');
const bar= document.querySelector('.bar:checked');

if (map.has(#[foo, bar])) {...} else {...}
// or
if (map.has(#{
    selectedFoo: foo,
    selecetdBar: bar,
})) {...} else {...}

As the example shows, the most important feature of Tuple and Record for me is not immutability but the ability to pack things together as a key of a Map or Set. If this proposal ends up keeping this limit, a door will be closed.
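For comparison, this Map-key use case can today only be approximated in userland, e.g. with nested Maps keyed by the object references themselves (`PairKeyMap` is a hypothetical helper, not a proposed API):

```javascript
// Composite key of two object references via nested Maps: what #[foo, bar]
// as a Map key would give us directly if records could hold objects.
class PairKeyMap {
  #outer = new Map();

  set(a, b, value) {
    let inner = this.#outer.get(a);
    if (!inner) this.#outer.set(a, (inner = new Map()));
    inner.set(b, value);
  }

  has(a, b) {
    return this.#outer.get(a)?.has(b) ?? false;
  }
}

const m = new PairKeyMap();
const foo = {}; // stand-ins for the DOM elements in the example above
const bar = {};
m.set(foo, bar, "selected");
console.log(m.has(foo, bar)); // true
console.log(m.has(bar, foo)); // false: order matters, as with a tuple
```

This works, but it does not generalize cleanly to records with named fields or keys of varying arity, which is why a language-level solution is attractive.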

If we want to prevent misunderstandings like this, we could forbid only Tuple and Record literals that contain literal arrays, objects, functions, or classes:

const container = #{
    foo: 1, // fine
    bar: #{ // fine
        bar: 2,
    },
    baz: #[0,1,2], // fine
    sub: { // SyntaxError
        bar: 2,
    },
    array: [0,1,2], // SyntaxError
    method: function(){}, // SyntaxError
    arrow: ()=>{}, // SyntaxError
    class: class {} // SyntaxError
};

But when we write like this:

const sub= {bar: 2};

const container0 = #{
    foo: 1,
    sub: sub,
};

const container1 = #{
    foo: 1,
    sub,
};

or this

const sub2= {bar: 2};
const sub3= {bar: 2};

const container2 = #{
    foo: 1,
    sub: sub2,
};

const container3 = #{
    foo: 1,
    sub: sub3,
};

We surely know what we want and what to expect:

container0.sub.bar = 3;
assert(container0 === container1);
assert(container1.sub.bar === 3);

container2.sub.bar = 3;
assert(container2 !== container3);
assert(container3.sub.bar === 2);

On the other hand, this proposal allows expressions in Tuple and Record literals; if they evaluate to objects, the limit will cause lots of meaningless runtime exceptions.

const record = #{
    foo: functionMayReturnPrimitiveOrObject(),
};

@rricard
Copy link
Member

rricard commented Feb 25, 2021

We intend to add Box to this proposal which would give a way of including objects in Record/Tuple. Follow on discussion on the subject should go into #200.
