Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nullable Reference Types: Allow initialization at construction site #2109

Closed
Suchiman opened this issue Dec 27, 2018 · 28 comments
Closed

Nullable Reference Types: Allow initialization at construction site #2109

Suchiman opened this issue Dec 27, 2018 · 28 comments

Comments

@Suchiman
Copy link

When you turn on Nullable Reference Types, all fields and properties must be initialized at constructor time.
This effectively kills object initializer syntax and usual cases like EntityFramework and DTOs where you don't have constructors, only a bag of properties.
What i'd like to see would be support for full initialization at construction site.
Effectively, similiar to how you can skip calling the constructor of a struct when you explicitly assign all fields (using definitive assignment rules), i'd like to request support for following to work:

public class MyDto // no warnings here
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

public static void Main()
{
    var dto = new MyDto
    {
        LastName = "Doe",
        FirstName = "John"
    }; // no warning

    var dto = new MyDto
    {
        LastName = "Doe"
    }; // warning: FirstName is uninitialized

    var dto = new MyDto();
    dto.LastName = "Doe";
    DoSomething(dto); // warning: FirstName uninitialized

    var dto = new MyDto();
    dto.LastName = "Doe";
    dto.FirstName = "John";
    DoSomething(dto); // no warning
}
@HaloFour
Copy link
Contributor

I think you'd need a way for the compiler to know that this is a DTO type. Otherwise, from a metadata perspective, the compiler would never be able to know that those properties need to be initialized. As far as it knows a default constructor set them to some non-null default value.

Also, for the second example:

    var dto = new MyDto
    {
        LastName = "Doe"
    }; // warning: FirstName is uninitialized

I'd say that there shouldn't be a warning here because you haven't done anything with dto at that point. If you were to allow dto to escape out of the method without having first assigned FirstName to something then it could warn at that point.

@Suchiman
Copy link
Author

Suchiman commented Dec 27, 2018

@HaloFour fair enough regarding the second point.
I was thinking using having no explicit constructor as heuristic, although one could argue, why not have this by default? The compiler tags the constructor with an attribute that denotes which properties and fields are initialized by this constructor (for no constructor case, the attribute sits on the implicit generated constructor and simply says none of them).
The compiler reads the attribute for the constructor that got called to know whats left to initialize.
That would mean we never get the warning on constructors or on types but only at the call site.
I think that would be okay even for libraries that might never instantiate a type they expose, since NRT don't replace validation and if they've got NRTs on, they'll receive the warning.
The biggest downside seems to me that you now potentially get the uninitialized warning as often as you new that type and not just once per constructor / type anymore.

@john-h-k
Copy link

What about having some sort of "this field can be null, but once it's set, it will never be null again" indicator? Allows post constructor initialization but once it's been assigned it is certainly not null

@HaloFour
Copy link
Contributor

@Suchiman

I was thinking using having no explicit constructor as heuristic,

You mean that the compiler would annotate the type with this metadata as it compiles it, as long as #nullable enable was on? I could see that. As for whether or not that should be the default I'll leave on the table for now, I just wanted to point out that I think some kind of metadata needs to exist, either on the type or on the constructor, so that the compiler would be able to understand this "post-constructor initialization" requirement when consuming the type as I don't think it can tell an explicit parameterless constructor from a default parameterless constructor from metadata.

@Suchiman
Copy link
Author

@johnkellyoxford how do you analyze that? The compiler currently tracks that once you initialize a nullable property to non null, it will be non null for the rest of the method. Even costly whole program analysis couldn't reliable track this i believe.

@YairHalberstadt
Copy link
Contributor

Related: #1684

@YairHalberstadt
Copy link
Contributor

I would like to see a unified proposal that neatly solves both issues. I don't believe either of these are strong enough on their own.

@john-h-k
Copy link

@Suchiman didn't really think of the implementation details of it.

@Thaina
Copy link

Thaina commented Dec 28, 2018

Did you mean

public class MyDto
{
    public string? LastName { get; set; }
    public string? FirstName { get; set; }
}

@Suchiman
Copy link
Author

@Thaina no.

@Richiban
Copy link

This is yet another use case crying out for the presence of records. Your DTO doesn't want null in its fields but the only way to accomplish that in today's C# is to write your own constructor, because that's the only way that C# currently has of forcing you to provide data when you instantiate a class.

@HaloFour
Copy link
Contributor

This is yet another use case crying out for the presence of records.

While I agree that there is overlap here with records I don't think that records would make sense for a lot of DTOs. Having positional construction for a DTO with a lot of fields would be very cumbersome, and additions would always be a breaking change which is not currently the case.

@Richiban
Copy link

@HaloFour

Having positional construction for a DTO with a lot of fields would be very cumbersome

True that; in fact, I wouldn't recommend (in any language) using positional deconstruction with any more than two or three properties.

additions would always be a breaking change which is not currently the case.

Surely the only options for either records or DTOs is that you either allow null (or some other default value) or it is necessarily a breaking change.

@YairHalberstadt
Copy link
Contributor

@HaloFour
Optional parameters and named parameters to the rescue!

@HaloFour
Copy link
Contributor

@Richiban

True that; in fact, I wouldn't recommend (in any language) using positional deconstruction with any more than two or three properties.

Did you mean "construction" there? My tolerance for construction would probably be higher than 2-3, but I've commonly dealt with DTOs with 20+ non-null fields. That would be absurd for a constructor, even if you used named parameters.

Surely the only options for either records or DTOs is that you either allow null (or some other default value) or it is necessarily a breaking change.

The question is when you require the property to be assigned. As a DTO is traditionally a class of properties with no constructor or other business logic it doesn't fit well into the pattern expected by NRTs. If appeasing the compiler requires a lot of refactoring that would discourage people from using NRTs.

Regardless, I think that the team should really look into how NRTs play with DTO scenarios to see where they could perhaps smooth over the experience.

@YairHalberstadt

Optional parameters and named parameters to the rescue!

Optional parameters are still a breaking binary change and require all source consuming that method to be recompiled. And if they have a safe default (especially one that is supported for an optional parameter) there's little reason to add it to the constructor anyway.

@Richiban
Copy link

Richiban commented Dec 28, 2018

@HaloFour

Did you mean "construction" there?

No, I did mean deconstruction, but I still feel that we're mostly on the same page.

My point was that it might be common and entirely reasonable to deconstruct a Point into its x and y or a Person into firstname, middleName, surname, but what are the chances you're going to legitimately have an instance of some other DTO that has, say, seven properties and you want a local reference to every one of them? It seems much more likely that the right thing to do is to get a reference to the containing object and simply dot into it.

When it comes to construction though, other code style guidelines go out the window for me and I will lean on the type system as much as possible to make my code as safe as possible, readability be damned. I will write:

public class Person
{
	public Person(
		string FirstName,
		string surname,
		DateTime dob,
		string favouriteFood,
		Color eyeColor,
		IReadOnlyCollection<Person> parents,
		IReadOnlyCollection<Person> Children
	) {
		FirstName = firstName ?? throw new ArgumentNullException("firstName");
		Surname = surname ?? throw new ArgumentNullException("surname");
		Dob = dob ?? throw new ArgumentNullException("dob");
		FavouriteFood = favouriteFood ?? throw new ArgumentNullException("favouriteFood");
		EyeColor = eyeColor ?? throw new ArgumentNullException("eyeColor");
		Parents = parents ?? throw new ArgumentNullException("parents");
		Children = children ?? throw new ArgumentNullException("children");
	}

	public string FirstName { get; }
	public string surname { get; }
	public DateTime dob { get; }
	public string favouriteFood { get; }
	public Color eyeColor { get; }
	public IReadOnlyCollection<Person> parents { get; }
	public IReadOnlyCollection<Person> Children { get; }
}

Yes, I am sick to death of these bloody constructors, even I can generate them the first time using visual studio and no, I am not prepared to give them up.

@HaloFour
Copy link
Contributor

@Richiban

When it comes to construction though, other code style guidelines go out the window for me and I will lean on he type system as much as possible to make my code as safe as possible, readability be damned.

That's your prerogative and I'm certainly not arguing against the use of constructors in your code base. However, it's my experience that most DTO-generating tools do not generate constructors and even if they understand the concept of non-nullability they don't enforce it at the DTO level, and that the frameworks with which these DTOs are used have little/no support for constructors anyway. This includes Entity Framework and LINQ to SQL. EF Core looks like it's adding constructor support, but I doubt it'll ever come to Entity Framework proper, but even then I doubt most existing projects would be changed to use it and I'd expect that it wouldn't be used to force 100% assignment of non-nullable properties anyway.

I think DTOs also have additional considerations not likely considered by NRTs. First, I'd posit that it's relatively rare for code to actually create the type directly. It's expected that the ORM or serialization library creates the type most of the time. Second, I'd posit that, especially for ORM-related DTOs, that not all of the non-nullable properties are expected to be populated at the time of creation. This would be true of any server-generated fields like identities or calculated fields. But given that those properties would be correctly populated in the majority of cases I think it's appropriate that they would be marked as non-null.

@Richiban
Copy link

Can't we have a middle ground when it comes to DTOs then, and say that even with nullable reference types enabled this does not produce a warning / error...

public class Foo
{
	public string A { get; set; }
	public string B { get; set; }
}

... but this does?

var foo = new Foo
{
	A = "Hello world"
}; // Error: Non-null property `B` has not been assigned

@HaloFour
Copy link
Contributor

@Richiban

Can't we have a middle ground when it comes to DTOs then, and say that even with nullable reference types enabled this does not produce a warning / error...

That's exactly what this proposal suggests. 😄

I suggested that maybe the compiler wait to warn until foo escapes the method if it detects that not all non-null properties are assigned, either via initializer or direct property assignment, e.g.:

// 1.
var foo = new Foo
{
    A = "Hello",
    B = "World"
};
return foo; // doesn't warn

// 2.
var foo = new Foo();
foo.A = "Hello";
foo.B = "World";
return foo;  // doesn't warn

// 3.
var foo = new Foo()
{
    A = "Hello"
};
foo.B = "World";
return foo; // doesn't warn

// 4.
var foo = new Foo();
return foo; // warns

// 5.
var foo = new Foo()
{
    A = "Hello"
};
return foo; // warns

// 6.
var foo = new Foo();
foo.A = "Hello";
return foo; // warns

This is quite a bit more complicated and involves the compiler tracking the assignment state of every non-nullable property of the DTO. It also doesn't solve the issue when it comes to server-generated fields, but that might just be unavoidable.

Either way, the conversation is about enforcement of non-null-ness by the consumer of the DTO, not within the DTO itself.

@bbarry
Copy link
Contributor

bbarry commented Dec 28, 2018

I'm inclined to write that code something like this:

public class Person
{
    private readonly Builder _value;

    public struct Builder
    {
        public Builder(Person p) { this = p._value; }
        public string FirstName { get; set; }
        public string Surname { get; set; }
        public DateTime Dob { get; set; }
        public string FavouriteFood { get; set; }
        public Color EyeColor { get; set; }
        public Person Build() => new Person(this); //todo: add validation here
    }

    private Person(Builder value) { _value = value; }

    public string FirstName => _value.FirstName;
    public string Surname => _value.Surname;
    public DateTime Dob => _value.Dob;
    public string FavouriteFood => _value.FavouriteFood;
    public Color EyeColor => _value.EyeColor;
}

which does give me warnings about non-nullable properties being uninitialized though I am unsure how to fix them.

Changing the builder constructor to this works to rid the warnings:

        public Builder(Person p) { 
            this = p._value; 
            FirstName = FirstName;
            Surname = Surname;
            FavouriteFood = FavouriteFood;
         }

But that really seems quite unhelpful and it does nothing to help me know that this code is really a problem:

void M() {
    var p = new Person.Builder { FirstName = "Frank", Surname = "Smith"}.Build(); //ideally warn about food here
    var c = new Person.Builder(p) { FirstName = "Joe" }.Build(); // but not here
    Console.WriteLine("{0} {1}", p.FirstName, p.Surname); // "Frank Smith" 
    Console.WriteLine("{0} {1}", c.FirstName, c.Surname); // "Joe Smith"
    Console.WriteLine(c.FavouriteFood.Length); // NRE; expected
}

sharplab

@Richiban
Copy link

@HaloFour While that could certainly work, I'm not sure I like it due to the fact that it's now possible to write C# statements that cannot be extracted to a method, despite the fact that it would be perfectly valid to do so.

For example, I think it would be quite hard to explain users why this is allowed:

public Foo M() 
{
	var foo = new Foo();

	foo.A = "A";
	foo.B = "B"

	return foo;
}

but this is not:

public Foo M() 
{
	var foo = new Foo();

	foo.A = "A";

	N(foo);

	return foo;
}

public void N(Foo foo)
{
	foo.B = "B";
}

I think it's much more natural to have a rule that "we allow mutable DTOs to have their non-nullable properties set my the initialzing code rather than the constructor, but you must instantiate the object and set its properties in a single statement (i.e. in an object initialization expression)".

@Suchiman
Copy link
Author

Suchiman commented Dec 31, 2018

@Richiban
well we already have this right now:

works:

using System;
public class C
{
    public void M()
    {
        Test test;
        test.A = 42;
        test.B = 21;
        Console.WriteLine(test.ToString());
    }
}

public struct Test
{
    public int A;
    public int B;

    public override string ToString()
    {
        return A.ToString() + " " + B.ToString();
    }
}

error CS0165: Use of unassigned local variable 'test':

using System;
public class C
{
    public void M()
    {
        Test test;
        test.A = 42;
        N(test); // error CS0165: Use of unassigned local variable 'test'
        Console.WriteLine(test.ToString());
    }
    
    public void N(Test test)
    {
        test.B = 21;
    }
}

public struct Test
{
    public int A;
    public int B;

    public override string ToString()
    {
        return A.ToString() + " " + B.ToString();
    }
}

@Richiban
Copy link

@Suchiman

    public void M()
    {
        Test test;
        test.A = 42;
        test.B = 21;
        Console.WriteLine(test.ToString());
    }

I honestly didn't know you can do this in C#... I'm still not sure I fully understand it though.

I was exploring it in LinqPad just now and found that, never mind other methods, if you just remove the assignment to B so that you're left with:

	Test test;
	test.A = 42;
	Console.WriteLine(test.ToString());

then you get the same compiler error!

@Suchiman
Copy link
Author

@Richiban it's actually even in the documentation 😉 https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/using-structs#example-2
A struct is considered initialized if all of its fields are initialized.

@Richiban
Copy link

Well well, you really do learn something every day!

@forbjok
Copy link

forbjok commented Apr 20, 2020

I think you'd need a way for the compiler to know that this is a DTO type. Otherwise, from a metadata perspective, the compiler would never be able to know that those properties need to be initialized. As far as it knows a default constructor set them to some non-null default value.

I don't see why DTO types would need to be treated specially in any way. Warning at the construction site (and never at the declaration site) if a property is left uninitialized should probably just be the universal default behavior.

Also, for the second example:

    var dto = new MyDto
    {
        LastName = "Doe"
    }; // warning: FirstName is uninitialized

I'd say that there shouldn't be a warning here because you haven't done anything with dto at that point. If you were to allow dto to escape out of the method without having first assigned FirstName to something then it could warn at that point.

I'm not convinced that it makes sense to allow constructing the object in an incomplete state at all without producing a warning (or better yet, error). If so, then at least the compiler would have to be clever enough to produce a warning or error if you try to call a method or property on the object before it is completely initialized.

@chucker
Copy link

chucker commented May 22, 2020

This and #2328 seems related; can the two issues be merged?

And, FWIW:

This is yet another use case crying out for the presence of records.

Records don't actually fix this, because there's a subtle difference between each property being init-only/immutable (which records do guarantee), and each being required to have a value (which they don't). The other issue features some discussion over a separate notion of required init or similar: #2328 (comment)

@YairHalberstadt
Copy link
Contributor

Closing as championed in #3630

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants