Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boxes: unwrap when calling JSON.stringify? #232

Closed
rricard opened this issue Jun 21, 2021 · 35 comments · Fixed by #241
Closed

Boxes: unwrap when calling JSON.stringify? #232

rricard opened this issue Jun 21, 2021 · 35 comments · Fixed by #241
Labels
boxes All the discussions related to built-in object placeholders take in account for next draft
Milestone

Comments

@rricard
Copy link
Member

rricard commented Jun 21, 2021

As part of #197 it was proposed to unwrap boxes when JSON.stringifying.

Do we want to make that change? if yes, this ticket will track the change...

@rricard
Copy link
Member Author

rricard commented Jul 12, 2021

We are likely to do it, ignoring it would be footgunny and the alternative is to throw.

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Jul 13, 2021

I'm trying to write the spec text for this, but it's really hard to decide on a good behavior.

  1. Should .unbox() be called before calling .toJSON?
  2. Should .unbox() be called before calling the replacer function? (the replacer is called after .toJSON)
  3. Should .unbox() be called on the result of the replacer function?

Replying "yes" to (1) lets us respect the .toJSON method of objects wrapped inside boxes (which seems like a behavior we would want), but makes it impossible (1 implies 2) to pass the box to the replacer (which might be needed to serialize boxes differently so that they can then be unserialized).

@acutmore
Copy link
Collaborator

I thought Boxes were not objects, so it shouldn't be trying to call .toJSON on it?

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Jul 13, 2021

No, but the algorithm will either first check if it's an object with .toJSON, or if it's a box.

JSON.stringify(Box({ toJSON() { return { x: 1 } } }));
JSON.stringify({ toJSON() { return Box({ x: 1 }) } });

I think it's only possible to make only one of those to return '{"x":1}'.

@acutmore
Copy link
Collaborator

acutmore commented Jul 13, 2021

Could it not recurse? If the value is a box type, then it is unboxed and the algo starts again on the unboxed value? Sorry if I'm missing something!


Though I like the idea of the replacer being called before unboxing a box, i.e. passing the Box directly to the replacer. That way you could create a bi-directional replacer and reviver pair that does recreate the box on JSON.parse. Maybe something like this:

function replacer(key, value) {
  if (Box.isBox(value)) {
    return {  __box: true, value: value.unbox() };
  }
  ...
}

function reviver(key, value) {
  if (value?.__box) {
    return Box(reviver(key, value.value));
  }
  ...
}

@nicolo-ribaudo
Copy link
Member

Could it not recurse? If the value is a box type, then it is unboxed and the algo starts again on the unboxed value? Sorry if I'm missing something!

Potentially yes, and that was the first approach I tried. However, in this case it only calls toJSON once:

JSON.stringify({
  toJSON() {
    return {
      toJSON() {
        return { x: 1 };
      }
    };
  }
});

so this should probably call toJSON once:

JSON.stringify({
  toJSON() {
    return Box({
      toJSON() {
        return { x: 1 };
      }
    });
  }
});

@acutmore
Copy link
Collaborator

Potentially yes, and that was the first approach I tried. However, in this case it only calls toJSON once:

JSON.stringify({
  toJSON() {
    return {
      toJSON() {
        return { x: 1 };
      }
    };
  }
});

so this should probably call toJSON once:

JSON.stringify({
  toJSON() {
    return Box({
      toJSON() {
        return { x: 1 };
      }
    });
  }
});

Thanks for that example, I can see the issue now :)

@nicolo-ribaudo
Copy link
Member

If we can't find a solution, we should remember that there is already another primitive that throws: bigints. Users can either define BigInt.prototype.toJSON, or pass a custom replacer.

@ljharb
Copy link
Member

ljharb commented Jul 13, 2021

I really hope we don’t follow that example; it drastically hinders usability.

@acutmore
Copy link
Collaborator

So I think this is solvable, similar to how the ecma-262 gramma is defined, by using a flag.

As we recurse if we store a [ignoreToJSON] flag saying that the last object we processed was returned from a call to toJSON so we wont look up toJSON for the next object level. If we then recurse again the flag is cleared. We don't clear this flag when unboxing a box.

function replacer(key, value) {return value};

JSON.stringify({ //------------ skipped (toJSON)
    prop1: 1, //--------------- skipped (toJSON)
    toJSON() { //-------------- called
      return { //-------------- passed via replacer                      {
        prop2: 2, //----------- passed via replacer                        "prop2": 2,
        toJSON() { // --------- passed via replacer, skipped (function)
          return { x: 1 };
        },
        prop3: { //------------ skipped (toJSON)                           "prop3":
            toJSON() { //------ called
                return { //---- passed via replacer                          {
                    y: 2 //---- passed via replacer                            "y": 2
                }; //                                                        }
            }
        }
      }; //                                                              }
    }
}, replacer);

JSON.stringify({ //------------ passed via replacer   {
    prop1: 1, // -------------- passed via replacer     "prop1": 1,
    prop2: { //---------------- skipped                 "prop2":
        prop3: 2, //----------- skipped
        toJSON() { //---------- called
          return { //---------- passed via replacer       {
              x: 1 //---------- passed via replacer         "x": 1
          }; //                                           }
        }
    }
}, replacer); //                                      }
Pseudo state machine

START(key, null)                  -> key: null
START(key, number)                -> key: number
START(key, string)                -> key: string
START(key, function)              -> skip
START(key, object)[ignoreToJSON]  -> OBJ_ignoreToJSON(key, object)
START(key, object)                -> OBJ(key, object)

OBJ(key, { toJSON: function }) -> START( key, call(toJSON) )[+ignoreToJSON]
OBJ(key, object)               -> key: { [p in object]: START( p, object[p] )[-ignoreToJSON] }

OBJ_ignoreToJSON(key, object) -> key: { [p in object]: START( p, object[p] )[-ignoreToJSON] }

With boxes:

JSON.stringify(Box( //--------- passed via replacer, auto-unbox
    { //----------------------- skipped (toJSON)
        toJSON() { //---------- called
            return { //-------- passed via replacer                   {
                x: 1 //-------- passed via replacer                     "x": 1
            }; //                                                     }
        }
    }
), replacer);

JSON.stringify({ //------------ skipped (toJSON)
    toJSON() { //-------------- called
        return Box( //--------- passed via replacer, auto-unbox
            { //--------------- passed via replacer                   {
                x: 1 //                                                 "x": 1
            } //                                                      }
        );
    }
}, replacer);
Additional pseudo state machine rule

START(key, box) -> START( key, box.unbox() )

@nicolo-ribaudo
Copy link
Member

That's a really interesting idea!

I'm also going to search about the history of toJSON to understand the reason behind the current behavior (why it doesn't recourse).

@acutmore
Copy link
Collaborator

acutmore commented Jul 14, 2021

Good idea to check the history of it.

Looking at the toJSON section to mdn https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#tojson_behavior
The example there returns this from the toJSON. So maybe it is to support that?

class Base {
  toJSON() {
     ... // custom JSON behavior
  }
}

class SubClass extends Base {
  /** @override */
  toJSON() {
      return this; // SubClass wants to revert to standard JSON serilisation
  }
}

@nicolo-ribaudo
Copy link
Member

For reference, I found this email discussing .toJSON (https://esdiscuss.org/topic/tojson-contract) and these discussing JSON.stringify in general (https://esdiscuss.org/topic/tojsonstring-and-other-introspections, https://esdiscuss.org/topic/more-json-spec-questions). However, they haven't been more useful than just reading the current implementation.

@mhofman
Copy link
Member

mhofman commented Oct 1, 2021

As I mentioned in #254 (comment), I believe the approach taken is trying to be too smart and too accommodating of unclear behavior.

JSON.stringify({ toJSON() { return Box({ x: 1 }) } });

IMO this should return undefined (or throw?).

The way I understand toJSON is to allow an object to provide a directly serializable representation. Box is not a directly serializable representation, it's at best an indirection to one. And since toJSON doesn't recurse other toJSON to find the final representation, we should not unbox a result from toJSON.

This is the behavior I propose: toJSON? -> replacer? -> unbox and recurse? (if 'toJSON' was not called previously)

@mhofman
Copy link
Member

mhofman commented Oct 1, 2021

It also solves the problem of the conflicting behavior with the replacer:

JSON.stringify(
  {
    box: Box({ x: 1, toJSON: () => ({ y: 2 }) }),
  },
  (k, v) => v
);

Would expectedly return {"box":{"y":2}} like it would without a replacer. We can get that result because the box would be unwrapped, and we can then recurse SerializeJSONProperty if we find an object in the box, without qualms since we know the original value was never an object itself (making sure we don't call toJSON twice on the same holder[key])

Edit: it would also not hide intermediate box from the resolver in a case of a Box(Box(Box(foo))). Not sure if that really matters, but it'd be consistent with letting the resolver decide how to handle each value.

@mhofman
Copy link
Member

mhofman commented Oct 1, 2021

Thinking a little more about the recurse, I think the approach could be to pass the box as holder and key as empty string. Here's what the full algo would look like:

          1. If Type(_holder_) is Box, then
            1. Let _value_ be ? Unbox(_holder_).
          1. Else,
            1. Let _value_ be ? Get(_holder_, _key_).
          1. Let _toJSONCalled_ be *false*.
          1. If Type(_value_) is Object or BigInt, then
            1. Let _toJSON_ be ? GetV(_value_, *"toJSON"*).
            1. If IsCallable(_toJSON_) is *true*, then
              1. Set _value_ to ? Call(_toJSON_, _value_, « _key_ »).
              1. Set _toJSONCalled_ to *true*.
          1. If _state_.[[ReplacerFunction]] is not *undefined*, then
            1. Set _value_ to ? Call(_state_.[[ReplacerFunction]], ToObject(_holder_), « _key_, _value_ »).
          1. If Type(_value_) is Object, then
            1. If _value_ has a [[NumberData]] internal slot, then
              1. Set _value_ to ? ToNumber(_value_).
            1. Else if _value_ has a [[StringData]] internal slot, then
              1. Set _value_ to ? ToString(_value_).
            1. Else if _value_ has a [[BooleanData]] internal slot, then
              1. Set _value_ to _value_.[[BooleanData]].
            1. Else if _value_ has a [[BigIntData]] internal slot, then
              1. Set _value_ to _value_.[[BigIntData]].
            1. Else if _value_ has a [[BoxData]] internal slot, then
              1. Set _value_ to _value_.[[BoxData]].
          1. If _value_ is *null*, return *"null"*.
          1. If _value_ is *true*, return *"true"*.
          1. If _value_ is *false*, return *"false"*.
          1. If Type(_value_) is Record or Type(_value_) is Tuple, set _value_ to ! ToObject(_value_).
          1. If Type(_value_) is Box,
            1. If _toJSONCalled_ is *false*, return ? SerializeJSONProperty(_state_, the empty String, _value_).
            1. Else, return *undefined*. // or throw a *TypeError* exception.
          1. If Type(_value_) is String, return QuoteJSONString(_value_).
          1. If Type(_value_) is Number, then
            1. If _value_ is finite, return ! ToString(_value_).
            1. Return *"null"*.
          1. If Type(_value_) is BigInt, throw a *TypeError* exception.
          1. If Type(_value_) is Object and IsCallable(_value_) is *false*, then
            1. If ! IsTuple(_value_) is *true*, then
              1. Let _isArrayLike_ be *true*.
            1. Else,
              1. Let _isArrayLike_ be ? IsArray(_value_).
            1. If _isArrayLike_ is *true*, return ? SerializeJSONArray(_state_, _value_).
            1. Return ? SerializeJSONObject(_state_, _value_).
          1. Return *undefined*.

@acutmore
Copy link
Collaborator

acutmore commented Oct 3, 2021

One of the reasons for calling the replacer before unboxing, is so the serialised string can retain the position where boxes were. That way the reviver can more closely reconstruct the original value.

#232 (comment)

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

Agreed, and that's still the case with my proposed algorithm.

@acutmore
Copy link
Collaborator

acutmore commented Oct 4, 2021

Got my head around the technique now, I hadn't caught that Box goes around twice, first as a value, then as a holder.

I've added the diff comments incase it helps others too.

+1. If Type(_holder_) is Box, then
+  1. Let _value_ be ? Unbox(_holder_).
+1. Else,
   1. Let _value_ be ? Get(_holder_, _key_).
 1. Let _toJSONCalled_ be *false*.
 1. If Type(_value_) is Object or BigInt, then
   1. Let _toJSON_ be ? GetV(_value_, *"toJSON"*).
   1. If IsCallable(_toJSON_) is *true*, then
     1. Set _value_ to ? Call(_toJSON_, _value_, « _key_ »).
     1. Set _toJSONCalled_ to *true*.
 1. If _state_.[[ReplacerFunction]] is not *undefined*, then
-  1. Set _value_ to ? Call(_state_.[[ReplacerFunction]], _holder_, « _key_, _value_ »).
-  1. Set _value_ to ! MaybeUnwrapBox(_value_).
-1. Else,
-    1. Set _value_ to ! MaybeUnwrapBox(_value_).
-  1. If _toJSONCalled_ is *false*, then
-    1. If Type(_value_) is Object or BigInt, then
-    1. Let _toJSON_ be ? GetV(_value_, *"toJSON"*).
-    1. If IsCallable(_toJSON_) is *true*, then
-    1. Set _value_ to ? Call(_toJSON_, _value_, « _key_ »).
-    1. Set _value_ to ! MaybeUnwrapBox(_value_).
+  1. Set _value_ to ? Call(_state_.[[ReplacerFunction]], ToObject(_holder_), « _key_, _value_ »).
 1. If Type(_value_) is Object, then
   1. If _value_ has a [[NumberData]] internal slot, then
     1. Set _value_ to ? ToNumber(_value_).
   1. Else if _value_ has a [[StringData]] internal slot, then
     1. Set _value_ to ? ToString(_value_).
   1. Else if _value_ has a [[BooleanData]] internal slot, then
     1. Set _value_ to _value_.[[BooleanData]].
   1. Else if _value_ has a [[BigIntData]] internal slot, then
     1. Set _value_ to _value_.[[BigIntData]].
+  1. Else if _value_ has a [[BoxData]] internal slot, then
+    1. Set _value_ to _value_.[[BoxData]].
 1. If _value_ is *null*, return *"null"*.
 1. If _value_ is *true*, return *"true"*.
 1. If _value_ is *false*, return *"false"*.
 1. If Type(_value_) is Record or Type(_value_) is Tuple, set _value_ to ! ToObject(_value_).
+1. If Type(_value_) is Box,
+  1. If _toJSONCalled_ is *false*, return ? SerializeJSONProperty(_state_, the empty String, _value_).
+  1. Else, return *undefined*. // or throw a *TypeError* exception.
 1. If Type(_value_) is String, return QuoteJSONString(_value_).
 1. If Type(_value_) is Number, then
 1. If _value_ is finite, return ! ToString(_value_).
 1. Return *"null"*.
 1. If Type(_value_) is BigInt, throw a *TypeError* exception.
 1. If Type(_value_) is Object and IsCallable(_value_) is *false*, then
 1. If ! IsTuple(_value_) is *true*, then
     1. Let _isArrayLike_ be *true*.
 1. Else,
     1. Let _isArrayLike_ be ? IsArray(_value_).
 1. If _isArrayLike_ is *true*, return ? SerializeJSONArray(_state_, _value_).
 1. Return ? SerializeJSONObject(_state_, _value_).
 1. Return *undefined*.

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

I've added the diff comments incase it helps others too.

👍

TIL github supports manual diffs code blocks!

@nicolo-ribaudo nicolo-ribaudo reopened this Oct 4, 2021
@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Oct 4, 2021

Q: are these outputs correct, according to your algorithm?

  1. A receiver that does nothing

    let obj = {
      box: Box({ x: 1 })
    };
    
    JSON.stringify(obj, function (key, value) {
      console.log(this, key, value);
      return value;
    });
    # this key value
    1 { "": { box: Box({ x: 1 }) } } "" { box: Box({ x: 1 }) }
    2 { box: Box({ x: 1 }) } "box" Box({ x: 1 })

    (And then it throws a TypeError, or omits the box from the result)

  2. A receiver that just unboxes boxes

    let obj = {
      box: Box({ x: 1 })
    };
    
    JSON.stringify(obj, function (key, value) {
      console.log(this, key, value);
      if (typeof value === "box") return Box.unbox(value);
      return value;
    });
    # this key value
    1 { "": { box: Box({ x: 1 }) } } "" { box: Box({ x: 1 }) }
    2 { box: Box({ x: 1 }) } "box" Box({ x: 1 })
    3 Box({ x: 1 }) "" { x: 1 }
    4 { x: 1 } "x" 1

Additionally:

  • Even if someone defines Box.prototype.toJSON, that proposed algorithm ignores it. We might want to respect it, similarly to how we respect BigInt.prototype.toJSON (and in single-compartment non-ses apps people could do Box.prototype.toJSON = Box.unbox).

Also, would this be a valid "polyfill" for Box.unbox? If so, it can be use to escape compartments' virtualization:

const box = Box({ x: 1 });
unbox(box);

function unbox(box) {
  let result;
  let i = 0;
  JSON.stringify(box, function (key, value) {
    switch (i++) {
      case 0:
	    // It receives "" as the key and Box({ x: 1 }). Return Box({ x: 1 }).
        // The algorithm then calls SerializeJSONProperty(state, "", Box({ x: 1 }))
	    return value;
      case 1:
        // 'this' is Box({ x: 1 }), and 'value' is its contents
        result = value;
        return {};
  });
  return result;
}

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

  1. A receiver that does nothing

(And then it throws a TypeError, or omits the box from the result)

I'll assume your example meant to not call Box.unbox in the first part.
Why would it throw a TypeError? It should get into

+1. If Type(_value_) is Box,
+  1. If _toJSONCalled_ is *false*, return ? SerializeJSONProperty(_state_, the empty String, _value_).

Since toJSONCalled is false.

And then proceed as for the table in the second section with manual unboxing. Aka, serializer manually unboxing or not has no impact on how the JSON serialization would happen for boxes.

The only impacted objects by the undefined or throw TypeError would be:

let obj = {
  toJSON() : function() {
    return Box({ x: 1 });
  },
};
  • Even if someone defines Box.prototype.toJSON, that proposed algorithm ignores it. We might want to respect it, similarly to how we respect BigInt.prototype.toJSON

Fair, I'll think about how to integrate it.

If so, it can be use to escape compartments' virtualization:

Edit: my virtualization is wrong, please ignore.

It wouldn't since the algorithm calls toJSON between internal unboxing and the serializer, which would perform the same virtualization (see #254 (comment))

   1. If IsCallable(_toJSON_) is *true*, then
     1. Set _value_ to ? Call(_toJSON_, _value_, « _key_ »).
function toJSON(key) {
  const target = unwrap(this);
  const targetToJSON = target.toJSON;
  if (typeof targetToJSON === 'function') {
    return targetToJSON.call(target, key);
  }
  return target;
}

function Box(target) {
  // ...
    wrappedTarget = Object.freeze({
      target,
      toString,
      toJSON,
    });
  // ...

  return OriginalBox(wrappedTarget);
}

So the serializer only has access to the original box as this (which it can unbox with a virtualized Box.unbox), or the virtualized content of the box as returned by the internal toJSON.

@nicolo-ribaudo
Copy link
Member

I'll assume your example meant to not call Box.unbox in the first part.

Yes thanks, I updated the example.

So the serializer only has access to the original box as this (which it can unbox with a virtualized Box.unbox), or the virtualized content of the box as returned by the internal toJSON.

That toJSON is Box.unbox(box).toJSON, called using Box.unbox(box) as the receiver?

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

  • Even if someone defines Box.prototype.toJSON, that proposed algorithm ignores it. We might want to respect it, similarly to how we respect BigInt.prototype.toJSON

Fair, I'll think about how to integrate it.

I believe this is the case for Tuple (and Record) as well, right? I mean I didn't make any changes in that regard. Anyway, if we'd want to support this, I think the following would work just fine:

-1. If Type(_value_) is Object or BigInt, then
+1. If Type(_value_) is Object or BigInt, or Box, then

It wouldn't break any virtualization because the box value itself is not virtualized (only possibly the content).

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

That toJSON is Box.unbox(box).toJSON, called using Box.unbox(box) as the receiver?

I wasn't thinking clearly on Friday apparently, my virtualization doesn't work. Apologies.

I do however believe my proposed change to the JSON.stringify algorithm has less gotchas than the current one.

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Oct 4, 2021

Np, I created a playground with simple JSON.stringify so that we can see how it works. It only supports boxes (not R&T) and objects (not arrays).

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

Neat!

  • This is without Box unwrapping

I believe this is the same as "current proposal", or did I miss something?

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Oct 4, 2021

No: with the current proposal JSON_stringify(Box({ x: 1 })) returns {"x":1}, while without box unwrapping it throws (or it could return undefined, but throwing is consistent with BigInt).

"current proposal" = "what we have have at https://tc39.es/proposal-record-tuple/#sec-json.stringify"

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

Ugh, I apparently opened the wrong link.

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

Nit: the Box.prototype.toJSON would need to recursively unbox to avoid the no nested toJSON pitfall:

Box.prototype.toJSON = function () { 
    let cur = this;
    while (Type(cur) === 'box') cur = Box.unbox(cur);
    return cur;
}

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Oct 4, 2021

Thanks, I updated the playground link

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

Actually, worse, it's need to call toJSON itself when it finds one.

Box.prototype.toJSON = function (key) { 
    const content = Box.unbox(this);
    if ('toJSON' in content) {
      return content.toJSON(key);
    }
    return content;
}

Anyway, this shows requiring users to manually implement a transparent toJSON is not easy if even we can't get it right the first time...

@nicolo-ribaudo
Copy link
Member

A fourth option would be to still rely on .toJSON but have a default .toJSON implementation.

@mhofman
Copy link
Member

mhofman commented Oct 4, 2021

Yeah, but as I wrote in #254 (comment), that would make locked down compartments surprising in that they wouldn't behave the same.

@nicolo-ribaudo nicolo-ribaudo added the boxes All the discussions related to built-in object placeholders label Dec 17, 2021
@nicolo-ribaudo
Copy link
Member

We removed boxes from the proposal, since it was stalled because of them. I'm closing this issue, but we'll keep track of it if we'll bring them up again as a follow on proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
boxes All the discussions related to built-in object placeholders take in account for next draft
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants