Convert NULLBOOL type to ATOM type #3
This extension is backward compatible and allows dictionary-encoded compression in various contexts. For instance, in SSB-DB2, we could have a dictionary for object keys, a dictionary of authors' public keys, one for types, etc. That offers better compression and a faster seekPath.
I like what you are doing here with dictionaries, but I'm wondering if this isn't a case for the extended type. Converting this into an example of how the extended type could be used would be great, I think.
For the extended tag, I was planning to propose something else for less common cases. The benefit of reusing the BOOLNULL tag and converting it to an ATOM tag is that it reduces the overhead. If you think about it, with today's use for only null, true and false, we lose a lot of bits for just 3 values. #2
Example of usage for encoding with a key dict: the json equivalent is
Note: currently I have only implemented a naive KeyDict for the object field keys. In the context of SSB, larger gains in space and scan speed can be achieved by using a dict for specific fields like author, type, etc. Some simulations to compare memory usage:
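To make the key-dictionary idea concrete, here is a minimal Python sketch. It is not the implementation from this PR and says nothing about the BIPF wire format: `Atom`, `KeyDict`, `compress_keys` and `expand_keys` are hypothetical names, and atoms are modelled simply as small integer ids assigned in first-seen order.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Atom:
    """Hypothetical stand-in for an ATOM value: a small integer id."""
    id: int


@dataclass
class KeyDict:
    """Hypothetical two-way mapping between object keys and atom ids."""
    ids: dict = field(default_factory=dict)   # key string -> atom id
    keys: list = field(default_factory=list)  # atom id -> key string

    def atom_for(self, key: str) -> Atom:
        # Assign the next free id the first time a key is seen.
        if key not in self.ids:
            self.ids[key] = len(self.keys)
            self.keys.append(key)
        return Atom(self.ids[key])

    def key_for(self, atom: Atom) -> str:
        return self.keys[atom.id]


def compress_keys(value, kd: KeyDict):
    """Recursively replace every object key with its atom."""
    if isinstance(value, dict):
        return {kd.atom_for(k): compress_keys(v, kd) for k, v in value.items()}
    if isinstance(value, list):
        return [compress_keys(v, kd) for v in value]
    return value


def expand_keys(value, kd: KeyDict):
    """Inverse of compress_keys: turn atoms back into key strings."""
    if isinstance(value, dict):
        return {kd.key_for(k): expand_keys(v, kd) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_keys(v, kd) for v in value]
    return value
```

With a shared `KeyDict`, a key such as `"author"` is stored once in the dictionary and every message carries only its atom id, which is where the space saving comes from when the same keys repeat across many messages.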
Some performance metrics on encoding/decoding:
I'm a little hesitant about this change, because overloading like this feels hacky, or reduces the simplicity of BIPF. But I think it's okay to add this.
I think it will be important to highlight that when an ATOM is used for anything other than null/false/true, then it should always be application-internal semantics. So there should be a contract that application-internal ATOM values should never be sent to peers over the network. It should be considered invalid. I think it's important to mention this clearly.
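One way to enforce such a contract is a check at the network boundary. The Python sketch below is only an illustration under stated assumptions: `Atom`, `contains_internal_atom` and `check_sendable` are hypothetical names, Python's `None`, `True` and `False` stand for the three standard atoms, and any `Atom` instance stands for an application-internal one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical application-internal atom (anything but null/true/false)."""
    id: int


def contains_internal_atom(value) -> bool:
    """Return True if an application-internal atom appears anywhere in value."""
    if isinstance(value, Atom):
        return True
    if isinstance(value, dict):
        # Atoms may be used as object keys as well as values.
        return any(
            contains_internal_atom(k) or contains_internal_atom(v)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        return any(contains_internal_atom(v) for v in value)
    return False


def check_sendable(value):
    """Raise if value must not leave the application; otherwise return it."""
    if contains_internal_atom(value):
        raise ValueError("application-internal ATOM must not be sent to peers")
    return value
```

Calling `check_sendable` just before a message is handed to the transport keeps the rule in one place, instead of relying on every call site to remember it.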
This can be used in interoperability cases too, if the meaning of atoms other than null, true and false is shared in some way.

Example 1: like in Erlang OTP. For instance, between 2 peers connected via TCP (or any technology that guarantees the ordering of messages) that exchange messages in BIPF. Each peer creates a sending cache and a receiving cache (let's say an array of 2048 entries). When A sends a message to B, it encodes the message using the sending cache as a kind of dictionary for object keys. Similarly, you can have additional caches for some paths in messages where the value is repeated often during the communication between the 2 peers, like for instance the Author field.

Example 2: predefined schemas. Just use message schemas that everybody knows, with numbered fields like in Protobuf: keys are atoms carrying the field number.
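The first example can be sketched in Python as follows. The comment does not spell out the protocol, so everything here is an assumption made for illustration: the class names, the `("new", key)` and `("ref", slot)` tokens, and the policy of no longer adding keys once the cache is full. The point it shows is that, with ordered delivery, both sides assign the same slot to the same key without ever exchanging the table.

```python
CACHE_SIZE = 2048  # cache size used as an example in the comment


class SendingCache:
    """Sender-side dictionary of object keys (hypothetical sketch)."""

    def __init__(self, size: int = CACHE_SIZE):
        self.size = size
        self.index = {}  # key -> slot number

    def encode_key(self, key: str):
        """Return ("ref", slot) for a cached key, else ("new", key)."""
        slot = self.index.get(key)
        if slot is not None:
            return ("ref", slot)
        # First occurrence: send the literal and remember it if there is room.
        if len(self.index) < self.size:
            self.index[key] = len(self.index)
        return ("new", key)


class ReceivingCache:
    """Receiver-side mirror, kept in sync by the ordered message stream."""

    def __init__(self, size: int = CACHE_SIZE):
        self.size = size
        self.slots = []  # slot number -> key

    def decode_key(self, token):
        kind, payload = token
        if kind == "ref":
            return self.slots[payload]
        # Literal key: store it under the same rule the sender applied.
        if len(self.slots) < self.size:
            self.slots.append(payload)
        return payload
```

A real protocol would also need an eviction or reset policy and one cache pair per direction; this sketch only stops caching new keys when the table is full.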
Not my domain of expertise. Will defer to the bipf authors, UNLESS you want an outside perspective. In which case pull me back in.
Can someone with write access merge it?
author wants to merge, and someone else approved, so i'll just merge