Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we null-short toString() and hashCode? #393

Closed
munificent opened this issue Jun 6, 2019 · 6 comments
Closed

Should we null-short toString() and hashCode? #393

munificent opened this issue Jun 6, 2019 · 6 comments

Comments

@munificent
Copy link
Member

Currently, we say the rest of a method chain after a null-aware operator is shorted, even calls to extension methods. That's potentially not useful if those extension methods are on nullable types (#392). If we change that to stop shorting on nullable extension methods, then it seems the principle for null-shorting is "only short operations that would fail".

Given that, what does that say about null-shorting toString() and hashCode? Those operations wouldn't fail. Should they still be shorted?

My intuition is that we do want to short them because it's likely that a user does not want to call toString() on null just because some prior operation happened to return null. Unlike an extension method declared on a nullable type, there's no point where there's a clear intention to allow the operation on things that are null.

Anyway, I'm not sure if re-opening the discussion around null-shorting extension methods has implications for these members too. Perhaps they should not be available on nullable types at all? (In other words, since Object and Null each define their own toString() we treat the union as not having that member since it isn't the same member.)

@lrhn
Copy link
Member

lrhn commented Jun 7, 2019

I agree that we should short-circuit based entirely on syntax, not on the methods used.
It's not just a good idea, it's almost necessary because of the NNBD typing.

It's a good idea because it's very confusing if x?.toString() did not skip calling toString, but why is x?.foo.toString() different? Both are guarded by x being null. There is no obvious intent to treat them differently.

Take list?.length.toString(). The type that .toString() is applied to is int. We have to do some convoluted logic to recongize that it happens inside a null-shortening continuation, and check if .toString() would also apply to int? (it does), and in that case delimit the shortening here.
That changes the type of the expression from String? to String. What if I wanted String?—what should I write to get that? I can't actually write anything, but if we continue the null-shortening, I could write (list?.length).toString() to get out of it.

Take: x?.someList.toString().split(","). Would we call toString() and everything down-chain from it?
Again, if I wanted that behavior, I could easily get it by adding a pair of parentheses, but there is no way to request further shortening.

Assume some class C overrides toString to take an optional argument of type int.
Then take: var f = something?.theC.toString; f(16);.
The type that toString is applied to is C. We then check that toString exists on C? (it does), and do the tear-off with the wrong type.
The type of f becomes String Function() instead of String Function([int])?. Again, there is no way to get to the null-shortening behavior, but there is an easy syntax for getting the non-shortening behavior.

As for the underlying model: My model for nullable extension methods, and NNBD types in general, is that there is a magical Top type which declares the "Object members" (toString, ==, hashCode, runtimeType and noSuchMethod), and which both Object and Null implements. The type is not denotable in any way, but Object and Null do implement the same toString methods. (So does dynamic, which is why the Object members are not subject to dynamic invocation).

@eernstg
Copy link
Member

eernstg commented Jun 7, 2019

I totally agree on "keep it simple and syntactic", cf. #394 (comment).

There's no need for any magic in the model, that is, in the subtype graph: The top type can be denoted by void, dynamic and Object? (and the extras like "can't use this value" don't matter wrt subtyping), and it declares 5 members, and both Null and Object are completely regular subtypes of the top type.

However, there is one more topic at play here: As part of that eternal quest to empower developers by not requiring them to waste brain cycles on accidental complexity, I think it's important to converge on an interpretation of the null object.

I think we have implicitly agreed to interpret null as a marker for absence in many contexts:

  • A nullable type represents the addition of optionality to the underlying non-null type, so null means "there is no such object". This affects member access, in that we insist on knowing that "there is such an object" (exceptions below).

  • A null-shorted chain of operations considers null to indicate that "there's nothing to do here", which makes sense if "there is no such object".

  • Null-shorting for collections, Do collection elements null short? #394, would also let [?bar] and ?[bar] evaluate to the empty list when bar is null, and the interpretation is again that bar was absent.

These rules are applicable to code which is aimed at an application domain, and they allow us to model absence quite consistently.

However, there are situations where the code needs to abstract that distinction away (between being present and being absent). In particular:

  • For data transfers we can use T? to allow passing null as well as passing a T (that is: assignments, passing actual arguments to a function/method, returning), as long as the recipient has a nullable type. So in these cases we are transferring both the T, if any, and otherwise we transfer the fact that there was no such object. This makes presence a dynamic property, or you could say that it is a 'lifted' version of the type T.

  • For highly reusable collections like List<E> we can allow passing a nullable type argument, which enables dynamic management of presence. It's technically easy to do, because the implementation of classes like List can be (and is) constrained to only do things that will work for any object. Again, organizing objects into lists is a lot like data transfer because that doesn't do anything application specific to them. In general, some pieces of code will work with any object (which may be typed by a top type or by a type variable whose bound is a top type), and they all allow for "abstracting over absence" in this sense.

  • Finally, some pieces of code should "abstract away from absence" and work on any object, because it is more low-level, focusing on the technical representation rather than application domains. Code used to perform reflective-ish operations, mocking classes, debugging etc go into this bucket, but also more fine-grained stuff like testing x != null.

We decided (#175) that we allow Object operations on expressions of a nullable type, but we've had several arguments against that or at least hesitation (including this issue, and even on #175 after it was closed). I think that decision is backed up by this second category of situations: Very general code where we wish to abstract away from nullability (so we don't wish to do anything different with null).

For the Object methods, we certainly need to support testing for null (and we have used operator == for that); hashCode is justified by the ability to have collections containing null; runtimeType and noSuchMethod may be unpopular for many reasons, but they are focused on the representation; so the only outlier is toString().

toString() is really tricky because it's used in string interpolation, and it's likely to be overridden by a very large number of classes in a way which is intended for an application domain rather than a low-level representational purpose. So it's no surprise that exactly toString comes up, with the argument that "we don't want 'null' to occur in our output, we want a static check that null can never occur here!"

We do need to support both the application domain level interpretation of null as "absence" (so null gets special treatment), and general/"representational" code that abstracts that distinction away, and the only problem seems to be that toString() has a very ambiguous status.

I suspect that this ambiguity could be biased toward the application domain, especially for string interpolations.

We could then introduce an extra constraint that "... $e ..." and "... ${e} ..." require e to have a non-null type, and low-level code (where it's OK to print null) would need to use "... ${e.toString()} ...".

This doesn't help developers who write e.toString() explicitly, but we always recommended using string interpolation, so that shouldn't be too bad.

WDYT?

@lrhn
Copy link
Member

lrhn commented Jun 11, 2019

I kind-of like the "interpolation must be non-nullable" rule. It is a "high-level" feature.
My main fear is that it will cause too much migration pain. I'll try to look at the core-library migration and see where it might affect code unexpectedly (done: https://dart-review.googlesource.com/c/sdk/+/103124/2..3).

@munificent
Copy link
Member Author

What if I wanted String?—what should I write to get that?

Wouldn't you write:

list?.length?.toString()

?

Basically, if we decide a method chain stops shorting for whatever reason, you can always opt into more null-awareness by just writing ?. again. Though that suffers from the problem now where you can no longer tell if the last ?. is because length is nullable or just because it may get shorted entirely.

Overall, @lrhn, I'm persuaded about null-shorting. I think a simple syntactic rule that users can easily remember plus an easy syntax to opt out of that behavior when it isn't what they want is a better solution than something that does the right thing more often but is harder for users to know that it does the right thing.

I like the principle of requiring non-nullable types in interpolation. It will prevent "null" accidentally appearing in user strings. But my suspicion is that it will be too onerous in practice to do. Maybe we can do an experiment to see how often this would force users to write ?? 'null'.

@munificent
Copy link
Member Author

Closing out because I agree the answer should be "yes" so I think we're in agreement now.

@lrhn
Copy link
Member

lrhn commented Jun 11, 2019

You would probably never need to write ?? 'null', although that is a valid workaround. I'd probably write "The value is ${value.toString()}". That handles the case where value may be null, and you never need to write it otherwise, so I'd expect it to become idiomatic for nullable interpolations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants