DefaultValueHandling to Reduce JSON Size #534

Closed
arukris-cdw opened this issue Jul 16, 2019 · 23 comments

Comments

@arukris-cdw

arukris-cdw commented Jul 16, 2019

Hi Team,

I am looking for a way to reduce the size of the documents written to MongoDB. When inserting a new record, any element whose value is the same as its default value in the JSON schema should be dropped from the insert.

In .NET this is achieved with the DefaultValueHandling setting; the link below has some details.
https://www.newtonsoft.com/json/help/html/ReducingSerializedJSONSize.htm

Do we have a similar feature in LoopBack?

eg:

Payload:

{ 
    "title" : "Movies1", 
    "releaseDate" : "07-JUL-2019",
    "runtime" : 0
}

Model Json:

{
	"definitions": {
		
	},
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "http://example.com/root.json",
	"name": "movies",
	"type": "object",
	"title": "The movies Schema",
	"options": {
		"validate": false,
		"forceId": false
	},
	"properties": {
		"title": {
			"type": "string"
		},
		"releaseDate": {
			"type": "Date"
		},
		"runtime": {
			"type": "number",
			"default": 0
		}
	}
}

In this example, the "runtime" element has a value of 0, which is also configured as its default value in the model JSON. In this case, I need the following document to be inserted into MongoDB (without the runtime element).

Expected JSON document in mongodb:

{ 
    "title" : "Movies1", 
    "releaseDate" : "07-JUL-2019"
}
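
To make the request concrete, here is a minimal sketch of the transformation on the payload above. The `omitDefaults` helper is hypothetical and is not a LoopBack or MongoDB API; it only looks at top-level properties and compares primitive defaults with strict equality.

```javascript
// Hypothetical helper, not part of LoopBack: returns a copy of `payload`
// without the top-level properties whose value equals the schema default.
function omitDefaults(payload, properties) {
  const result = {};
  for (const [key, value] of Object.entries(payload)) {
    const definition = properties[key];
    const hasDefault = definition !== undefined && 'default' in definition;
    // Strict equality, so only primitive defaults (numbers, strings, booleans) match.
    if (hasDefault && definition.default === value) continue;
    result[key] = value;
  }
  return result;
}

// The "properties" section of the model JSON above.
const properties = {
  title: {type: 'string'},
  releaseDate: {type: 'Date'},
  runtime: {type: 'number', default: 0},
};

const payload = {title: 'Movies1', releaseDate: '07-JUL-2019', runtime: 0};
const stored = omitDefaults(payload, properties);
console.log(JSON.stringify(stored));
// prints {"title":"Movies1","releaseDate":"07-JUL-2019"}
```

Any reader of the stored document would need the same schema to put `runtime` back, which is the trade-off discussed in the comments below.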
@dhmlau
Member

dhmlau commented Jul 22, 2019

@hacksparrow, could you please take a look? Thanks.

@hacksparrow
Contributor

@arukris-cdw We don't have this feature.

It is understandable not to send the default values and to fill them in on the client-side representation of the model, but omitting them when writing to the database is not supported, since MongoDB does not support default values.

@hacksparrow
Contributor

There is a discussion about this feature at https://jira.mongodb.org/browse/SERVER-24430.

@arukris-cdw
Author

@hacksparrow
Thanks for checking. I agree that this is not supported in MongoDB, but this seems to be more of a JSON serialization feature, so it could be applied while creating the JSON itself. In the .NET and Java world this requirement is handled while creating the JSON payload (not while writing to MongoDB).

In our implementation, Loopback is enforcing all datatypes to the JSON payload based on model json file. I think that is done using the datasource juggler, can this be used to enforce the settings we are looking for?

thanks
Arun s.

@hacksparrow
Contributor

@arukris-cdw

Hi Arun, my understanding is based on this statement, "I am looking for a solution to reduce the document size being written to mongo db."

I am interpreting that as "I want MongoDB to fill in the defaults by itself". The feature cannot be implemented in LoopBack (or any framework) since MongoDB does not support default values as of now.

However, if you mean, "I want LoopBack to fill in the details for me"; it is a supported feature in LoopBack 3, it is not yet supported in LoopBack 4.

Which version of LoopBack are you using?

I think that is done using the datasource juggler, can this be used to enforce the settings we are looking for?

It is enabled at the model level in LB3, not supported in LB4, yet.

@arukris-cdw
Author

@hacksparrow

Yes, I meant "I want LoopBack to fill in the details for me".

Can you please let me know the steps to enable it in LB3 (based on below version).

C:\ProgramData\IBM\MQSI\node_modules>npm ls loopback
C:\ProgramData\IBM\MQSI
`-- [email protected]

@hacksparrow
Contributor

@arukris-cdw

This is an example of a model's JSON file (people.json) with default values:

{
  "name": "people",
  "base": "PersistedModel",
  "idInjection": true,
  "options": {
    "validateUpsert": true
  },
  "properties": {
    "name": {
      "type": "string",
      "required": true,
      "default": "Krishna"
    }
  },
  "validations": [],
  "relations": {},
  "acls": [],
  "methods": {}
}

If no name is specified in the request body JSON, it will default to "Krishna" for every instance of the model.

When you run lb model at the command line, you will be prompted for the default value and can specify it there.

@arukris-cdw
Author

@hacksparrow

My requirement is just the opposite. I want the elements that have default values in the JSON payload to be ignored/excluded.

Kindly go through the link below; this is what I want to achieve using LoopBack (the Members.Ignore capability).
https://www.newtonsoft.com/json/help/html/T_Newtonsoft_Json_DefaultValueHandling.htm

Thanks
Arun S.

@hacksparrow
Contributor

@arukris-cdw I see, so you are looking for the Include, Ignore, Populate, and IgnoreAndPopulate abilities, correct?

That will have to be implemented at the model level.

The library you are referring to looks like a client-side library, where ignoring properties makes sense. However, if we ignore properties while writing to the database, we will end up with missing data and incorrect query results.

@arukris-cdw
Author

@hacksparrow

We are using LoopBack as the client-side application that enforces all datatypes and writes data to MongoDB.

Recently Biniam applied a solution in LoopBack to handle NumberDecimal datatypes, which were not supported earlier. We give the JSON data as a string to LoopBack, and it converts it into NumberDecimal and writes it to MongoDB.

Can we have a similar solution for my requirement? For example, if a property in the model JSON file is marked as Ignore, LoopBack would drop the values equal to their defaults and insert the rest.

All applications that read the data from MongoDB will also use the same JSON schema with the defaults in it. Hence there will not be any missing data issue.

Please let me know. If required, we can have a quick call to discuss this problem (I work from Chicago, US Central time, 9am - 5pm CST).

thanks
Arun S.

@hacksparrow
Contributor

There seems to be a big gap in the understanding somewhere, let's discuss on call. Monday, 9:00 am CT is good for me. Please share the Zoom or Skype or your preferable app's details at [email protected].

@hacksparrow
Contributor

@bajtos @raymondfeng need your input here.

CDW wants the ability to omit model properties with default values from being written to the database.

While this may seem like a problematic feature, since it will affect the ability to query using those properties, it can actually be desirable if there are thousands of properties with default values and thousands of such models, which will be queried only by their IDs.

Thoughts?

@bajtos
Member

bajtos commented Jul 30, 2019

I can see how this feature could be useful to reduce the amount of data stored in MongoDB. I would envision the following implementation in juggler or LB4 repository class (behind a feature flag):

  • Allow model properties to define a default value. I believe this should already be supported in LB4 via @property({default: SOME_VALUE}).

  • When serializing data to be stored to the database, store properties that are set to their default values as undefined instead. (Replace SOME_VALUE with undefined.)

  • When loading data from the database, replace properties with no values (undefined) with the default value. (Replace undefined with SOME_VALUE).

  • When querying data, modify the conditions matching a property against its default value to use a database-specific operator to match properties that are not defined.

    Let's say we have @property({default: 'Krishna'}) name: string and the user is running the query {where: {name: 'Krishna'}}. This query needs to be converted to the following MongoDB query: {where: {name: {$exists: false}}}.
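
The save, load, and query steps in the list above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `toStored`, `fromStored`, and `rewriteWhere` are hypothetical names that do not exist in juggler or LB4, the defaults are held in a simple lookup object, and only plain equality conditions are rewritten.

```javascript
// Hypothetical sketch; none of these helpers exist in juggler or LB4.
const defaults = {name: 'Krishna'};

// Save: drop properties that are set to their default value.
function toStored(data) {
  const stored = {...data};
  for (const [key, def] of Object.entries(defaults)) {
    if (stored[key] === def) delete stored[key];
  }
  return stored;
}

// Load: fill in the default for properties missing from the stored record.
function fromStored(record) {
  const data = {...record};
  for (const [key, def] of Object.entries(defaults)) {
    if (data[key] === undefined) data[key] = def;
  }
  return data;
}

// Query: an equality match against the default has to match records
// where the property is absent.
function rewriteWhere(where) {
  const rewritten = {};
  for (const [key, cond] of Object.entries(where)) {
    const matchesDefault = key in defaults && cond === defaults[key];
    rewritten[key] = matchesDefault ? {$exists: false} : cond;
  }
  return rewritten;
}

console.log(toStored({id: 1, name: 'Krishna'})); // { id: 1 }
console.log(fromStored({id: 1})); // { id: 1, name: 'Krishna' }
console.log(rewriteWhere({name: 'Krishna'})); // { name: { '$exists': false } }
```

Operators such as neq or inq are not handled here; a real implementation would have to rewrite those as well, since a "not equal to the default" query must exclude records where the property is absent.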

I find this feature problematic too. I am concerned about handling evolution of the database schema:

  • What if a property initially has no default value, but as the project evolves, we add one? How are we going to query records created before the change?
  • What if a property does have a default value, but later we decide to remove the default? How are we going to query records created before the change?
  • What if a property has a default value X, but later we decide to change the default to Y? How are we going to query records created before the change?

We need to carefully consider how to address these situations to ensure application works as expected.

It may be enough to come up with a database migration guide and update the documentation to make it easy for users of the new feature flag to find this guide. For example, when changing the default value from X to Y, the migration can consist of two steps:

  • Update all records where the property does not exist and set the property value to X.
  • Update all records where the property is set to Y and change the property value to undefined.
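
As a rough sketch of those two steps, the hypothetical helper below builds them as plain filter/update pairs using MongoDB's `$exists`, `$set`, and `$unset` operators. The helper name and the return shape are assumptions made for illustration.

```javascript
// Hypothetical helper: builds the two migration steps for changing a
// property's default from `oldDefault` to `newDefault`.
function buildDefaultChangeMigration(property, oldDefault, newDefault) {
  return [
    // Step 1: records that omit the property relied on the old default,
    // so the old value is written out explicitly.
    {
      filter: {[property]: {$exists: false}},
      update: {$set: {[property]: oldDefault}},
    },
    // Step 2: records explicitly holding the new default can now omit it.
    {
      filter: {[property]: newDefault},
      update: {$unset: {[property]: ''}},
    },
  ];
}

const steps = buildDefaultChangeMigration('runtime', 0, 90);
console.log(JSON.stringify(steps, null, 2));
```

Each pair could be passed to something like `collection.updateMany(filter, update)` in the MongoDB Node.js driver. The order matters: if step 2 ran first, the records it unset would be picked up by step 1 and wrongly set to the old default.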

Implementation wise, I'd prefer to implement this feature as an extension (e.g. a Repository mixin on LB4 level). To enable such extension, we need DefaultCrudRepository class to expose hooks allowing mixins to change the load/save/query behavior (think of Operation Hooks in LB3, see also loopbackio/loopback-next#1919). We already have toEntity method acting as load hook. In loopbackio/loopback-next#3446, we will be introducing fromEntity method to serve as a save/persist hook, and normalizeFilter which I think can be adapted to serve as a query/access hook.

@hacksparrow
Contributor

hacksparrow commented Aug 2, 2019

@arukris-cdw in LB3 using Operation Hooks, you can programmatically remove the properties you want to omit.

In Movies.js:

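module.exports = function(Movies) {
  // Drop `runtime` before saving when it equals its default value (0).
  Movies.observe('before save', async function(ctx) {
    // ctx.instance is set for full-instance saves; partial updates carry ctx.data instead.
    if (ctx.instance && ctx.instance.runtime === 0) {
      ctx.instance.unsetAttribute('runtime');
    }
  });
};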

Similarly you can call unsetAttribute() on other attributes you want to remove based on your requirements.
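
If there are many properties, the same hook could loop over the model definition instead of naming each attribute. The sketch below is not an official LoopBack helper: `omitDefaultsOnSave` is a hypothetical name, it assumes the property definitions are available on `Model.definition.properties`, and it only covers top-level properties with primitive defaults.

```javascript
// Sketch: unset every top-level property whose value equals the default
// declared in the model definition. Assumes Model.definition.properties
// holds the property definitions (an assumption about LoopBack 3).
function omitDefaultsOnSave(Model) {
  Model.observe('before save', async function(ctx) {
    // Partial updates carry ctx.data instead of ctx.instance; skip them here.
    if (!ctx.instance) return;
    const properties = Model.definition.properties;
    for (const name of Object.keys(properties)) {
      const definition = properties[name];
      if (!('default' in definition)) continue;
      // Strict equality: only primitive defaults are detected.
      if (ctx.instance[name] === definition.default) {
        ctx.instance.unsetAttribute(name);
      }
    }
  });
}
```

In Movies.js this would be called as omitDefaultsOnSave(Movies) inside the exported function, in place of the hand-written hook.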

@arukris-cdw
Author

@hacksparrow . Thanks for getting back.

As you are aware, I have around 1800 attributes (with multiple nested JSON array attributes), so unsetting every attribute by hand is not practical. Would you be able to provide a dynamic function that checks the data for all attributes at different levels and calls unsetAttribute automatically?

thanks
Arun S.

@hacksparrow
Contributor

@arukris-cdw we are considering the possibilities.

@hacksparrow
Contributor

@arukris-cdw we have decided to go ahead and implement the ignore feature.

@arukris-cdw
Author

@hacksparrow , thanks for considering this as a new feature.

I just want to make sure we have the same understanding. Please let me know whether you agree with the explanation of this feature below.

Ignore Feature:
Ignore members where the member value is the same as the member's default value when serializing objects so that it is not written to JSON. This option will also ignore all default values (e.g. null for objects and nullable types; 0 for integers, decimals and floating point numbers; and false for booleans). The default value ignored can be changed by placing the DefaultValueAttribute on the property.

thanks
Arun S.

@hacksparrow
Contributor

This option will also ignore all default values (e.g. null for objects and nullable types; 0 for integers, decimals and floating point numbers; and false for booleans).

We'll provide the ability to ignore properties if the value is the default; however, developers will have to set their own default values. We cannot hard-code 0 as default for integers, for example.

You can do something like:

"count": {
  "type": "number",
  "default": 0,
  "defaultValueHandling": "ignore"
}

Will this solve your problem? If no, then we'll have to reconsider the decision.

The default value ignored can be changed by placing the DefaultValueAttribute on the property.

Can you elaborate on this?

@arukris-cdw
Author

arukris-cdw commented Aug 7, 2019

@hacksparrow

That should be fine. We will set the default value for individual elements.
Please include a solution to handle empty object or empty array elements, as below. To make it clearer: if all the elements under an object/array have default values and get ignored by this solution, we want the corresponding object/array to be ignored as well.
eg:
{
    "element1": {},
    "elementArray": []
}
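
One way to picture this part of the request is a recursive pass that runs after the default values have been dropped. The `pruneEmpty` helper below is hypothetical and assumes plain JSON-like data (objects, arrays, primitives, and Date values); it is not something LoopBack provides.

```javascript
// Hypothetical helper: recursively removes empty objects and empty arrays.
// Returns undefined when the value itself ends up empty.
function pruneEmpty(value) {
  if (Array.isArray(value)) {
    const items = value.map(pruneEmpty).filter((item) => item !== undefined);
    return items.length > 0 ? items : undefined;
  }
  // Dates are objects without own enumerable keys, so keep them as they are.
  if (value !== null && typeof value === 'object' && !(value instanceof Date)) {
    const result = {};
    for (const [key, child] of Object.entries(value)) {
      const pruned = pruneEmpty(child);
      if (pruned !== undefined) result[key] = pruned;
    }
    return Object.keys(result).length > 0 ? result : undefined;
  }
  return value;
}

const pruned = pruneEmpty({title: 'Movies1', element1: {}, elementArray: []});
console.log(JSON.stringify(pruned));
// prints {"title":"Movies1"}
```

Note that dropping empty items from inside an array shifts the positions of the remaining items, which may or may not be acceptable depending on how the array is used.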

The default value ignored can be changed by placing the DefaultValueAttribute on the property.
-- I was expecting the same solution that you suggested. If we want to ignore 1 for a number field, I just want to set it as "default": 1 in the model JSON.

thanks
Arun S.

@hacksparrow
Contributor

@arukris-cdw we have introduced a new property applyDefaultOnWrites in the PR loopbackio/loopback-datasource-juggler#1770, using which you can disable default values being written to the database. Look at the test file for usage example.

The PR has landed and will be available in the next minor version of loopback-datasource-juggler, which will very likely be released today.

@dhmlau
Member

dhmlau commented Aug 19, 2019

I just released new versions of juggler today: 4.12.0 and 3.33.0.

@dhmlau
Member

dhmlau commented Aug 22, 2019

Closing as done.

@dhmlau dhmlau closed this as completed Aug 22, 2019