Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Cosmos DB] Support a default value for non-nullable properties #21006

Open
thomaslevesque opened this issue May 21, 2020 · 23 comments
Open

[Cosmos DB] Support a default value for non-nullable properties #21006

thomaslevesque opened this issue May 21, 2020 · 23 comments

Comments

@thomaslevesque
Copy link
Member

thomaslevesque commented May 21, 2020

Suppose I have an entity like this:

public class Todo
{
    public string Id { get; set; }
    public string Title { get; set; }
    public bool IsDone { get; set; }
}

If the container has a document with no value defined for IsDone, trying to load it throws an InvalidOperationException: Nullable object must have a value.

Although I understand the reason for this, it means my code can never handle documents where IsDone is null or undefined (which can happen, for instance, if the property didn't exist in the beginning but was added later). I could make the property nullable in the entity, but then I would need to handle the null case everywhere the property is used, which isn't ideal.

Also, if I query with a filter like .Where(t => !t.IsDone), documents where the IsDone property is undefined are not returned. I understand that it's because of the Cosmos DB semantics (null or undefined is neither true nor false), but it would be nice if null or undefined could be handled as if it was a default value (false in this case). See also Azure/azure-cosmos-dotnet-v3#682.

I think this could easily be handled by supporting a default value. This value could either be specified with a [DefaultValue] attribute on the property, or with a HasDefaultValue(T value) method in the fluent model builder API. The effect would be to change the SQL from NOT(doc['IsDone']) to NOT(doc['IsDone'] ?? <defaultValue>).

I think it would be a very valuable feature to make data model evolutions easier. Currently, if I want to add a property when the container already contains documents, I have to either make it nullable (which isn't ideal if it's not supposed to be null) or update all documents to add the property.

@ajcvickers
Copy link
Member

@AndriySvyryd Duplicate of #13131, but this was split into dotnet/EntityFramework.Docs#1712, #17722, and #17746. I'm not which of these matches this particular request.

@AndriySvyryd
Copy link
Member

@ajcvickers We can keep this one separate then as none of the split ones match it.

@ajcvickers ajcvickers added this to the Backlog milestone May 22, 2020
@steveevers
Copy link

steveevers commented Apr 7, 2021

This is very much needed, however, you can get most of the way there if you don't need to query using the property by doing something like this:

public class Todo
{
    public string Id { get; set; }
    public string Title { get; set; }
    public bool? isDone { get; set; }

    [NotMapped]
    public bool IsDone
    {
        get { return this.isDone.HasValue && this.isDone.Value; }
        set { this.isDone == value; }
    }
}

That is to say, make it nullable, but make a wrapper around it that defaults the null to false when read. You get the safety of having a default value for when the document doesn't contain the value and you can still deserialize the type from cosmos without throwing exceptions.

@vyarymovych
Copy link

vyarymovych commented Jan 20, 2022

@steveevers In general it works, but there is one caveat.
It doesn't work if you want to apply query projection which is very important with regards to query performance (to query specific props rather than everything). In this case it doesn't even execute property getter and crashes with the same error ('Nullable object must have a value.'). Example: .Select(t => new TodoProjection { IsDone = t.IsDone })
Any ideas how to workaround this?

@dbalikhin
Copy link

Any update on this feature?
It's very annoying to add new non-nullable fields.

@ajcvickers
Copy link
Member

@dbalikhin This issue is in the Backlog milestone. This means that it is not planned for the next release (EF Core 7.0). We will re-assess the backlog following the this release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources. Make sure to vote (👍) for this issue if it is important to you.

@dbalikhin
Copy link

Thanks @ajcvickers.
I don't think EF Core is even usable with CosmosDB without this feature. Unless you don't need to add new fields in the future. But that's the point of unstructured data storage.
New fields defined as nullable: cannot properly use a.HasValue / a.Value in query, because of null vs undefined.
New fields defined as non-nullable: cannot properly retrieve items, because this field can be undefined.
No migrations to set the default value on field creation.
More like CosmosDB limitations of course.

@dbalikhin
Copy link

Okay, so it doesn't like nullable.HasValue and nullable.Value, but it works with "nullable.FieldName condition X".
So nullable.IsDone == true or nullable.IsDone == null.
Stick with nullable for now...

@rcmosher
Copy link

One option for providing a default value is to use a ValueConverter with convertsNulls set to true. An example for a boolean that defaults to false when undefined or null.

  public class UndefinedDefaultConverter : ValueConverter<bool, bool?>
  {
#pragma warning disable EF1001 // Internal EF Core API usage.
    public UndefinedDefaultConverter() : base(
      v => v,
      v => v ?? false,
      convertsNulls: true
    )
#pragma warning restore EF1001 // Internal EF Core API usage.
    { }
  }

Note, this doesn't help with querying since IsDefined() isn't handled by EF. I haven't found a way other than writing your own query to properly check for IS_DEFINED(), null, or the default value.

@mnguyen36
Copy link

@AndriySvyryd #13131 doesn't really explain what EF core does or should do to support this use case (supporting non-nullable values by allowing defaults)
#17722 and #17746 also don't directly enable this.

Is there not a simple way to allow EFCore to let the natural c# way of initializing defaults occur without error?
i.e.

public class Todo
{
    public string Id { get; set; }
    public string Title { get; set; }
    public bool IsDone { get; set; } = false; 
}

In the case where you need to query by these values i.e. _context.ToDo.Where(td => td.IsDone), any values who aren't there should be treated just as cosmos handles it, but itself would be a separate responsibility than the one saying that you can't have nulls. I was looking at adding the IS_DEFINED function into EF.Functions but still can't connect how this will actually enable the big problem of EFCore not handling gracefully new values

@imerzan
Copy link

imerzan commented Feb 26, 2023

Running into this problem now. Can't believe a Microsoft product doesn't support automatic property generation that's baked into the C# lang... :/

public bool IsDone { get; set; } = false;

This would be ideal. If the Document doesn't contain the property, it will be automatically populated, and later when I save changes it will be created in the document.

Otherwise I have to go through and update every document to add a new property (cumbersome).

@AndriySvyryd
Copy link
Member

@mnguyen36 #17722 would only provide a workaround for some use cases. To support a more natural use we would need to design the API to configure it. We can't use the value that the property is initialized to as it's not easily accessible when we are building the model

@leo-schick
Copy link

IMO, this issue can be handled by a migration script which runs over the existing documents and gives them their default value. In your case you should do something similar to:

UPDATE ToDo
SET IsDone = false
WHERE IsDone is null

Now you just have to put this into NoSQL and run this via a custom migration script. Maybe take a look at my Kapok.Module.MigrationEngine. But it might be confusing since it uses an abstraction layer to EF Core putting everything into data domains ...

@thomaslevesque
Copy link
Member Author

Except CosmosDB doesn't support the UPDATE statement, so you have to fetch each document and update it. Or you could write a stored procedure to do it (which is not trivial due to the callback-based async API).

Anyway, migrating the documents is fine when you have just a few documents, but when you have millions, it's not really a good option.

@leo-schick
Copy link

@thomaslevesque that is true. In that case I would suggest to look into big data tools like Azure Databricks which has a Azure Cosmos DB Connector which supports read/write operations. Combined with a collection (=index) on the IsDone property and maybe a two step migration (adding the property as nullalbe, removing nullable after the processing is done) you should get to where you want to be...

Anyways, you have to make a trade off between one time migration or bad reading performance...

I don't think that someone from the Microsoft team will implement this. It looks to be more business logic than ORM logic to me ... and EF Core is the ORM ...

The fact is, that your documents just don't have the new property ... and therefore you either should add it to them or you have to deal with that fact that the property is sometimes not there (= null).

I see this as quite error prone to implement and the reason to implement this seems to me quite weak ...

For example, how would you make sure that EF Core makes sure this works with joins or pushdowns? (that is what happens when a Linq query gets send over to the database for execution) Then you have to travel through the whole Linq expression and extend the query with an OR operator or ISNULL / COALESCE scala operation (-> which decreases the query performance)

@thomaslevesque
Copy link
Member Author

Oh, I'm not saying it would be easy to do. Just that it would be useful and would make a pretty common scenario much easier.

Anyway, I don't use Cosmos DB anymore, so it doesn't really matter to me whether it's implemented or not.

@dbalikhin
Copy link

Anyway, I don't use Cosmos DB anymore, so it doesn't really matter to me whether it's implemented or not.

This! +1

I suspect they addressed this issue by introducing CosmosDB for PostgreSQL. It can scale and it is relational.

@NickSt
Copy link

NickSt commented Aug 19, 2023

Might be better to add something like this by allowing the user to set JSON serializer options and add custom serializers. Then it would cover almost all scenarios. Its almost impossible to use EF Core and Cosmos DB together in production without the ability to use a custom serializer. There's too many compromises you must make in your schema to make it viable.

@ajcvickers
Copy link
Member

@NickSt It would be really helpful if you could outline the compromises you have had to make.

@wangengzheng
Copy link

so sad

@maschall
Copy link

maschall commented Aug 5, 2024

+1 to this. We just ran into this issue as well.

@garrynewman
Copy link

+1 what the hell

@johnkslg
Copy link

So many things wrong here:

  • The error text is wrong! The problem is that a non-nullable value is being assigned null, not that 'nullable needs value'. This alone cost me many hours.
  • It doesnt say what property is the problem. I'm querying millions of entries with hundreds of properties. I have no idea which is the issue. I dont even have a way to see what the failing entry is.

Its just flabbergasting that global internet infrastructure like this can be broken for one of the most basic use cases, for years, AND have a information-free + plain wrong error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests