-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix #690 -- blob packing/unpacking of native python bool, int, float, and complex. #709
Conversation
…t, float, and complex
datajoint/blob.py
Outdated
|
||
@staticmethod | ||
def pack_int(v): | ||
return b"\x0a" + np.array(v, dtype='int64').tobytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we did not utilize decimal packing here? Python int
are essentially boundless (memory-dependent). I believe decimal
packing would be a closer representation as the length would be encoded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified to support unbounded int
|
||
@staticmethod | ||
def pack_float(v): | ||
return b"\x0d" + np.array(v, dtype='float64').tobytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we did not utilize decimal packing here? Python float
have a precision of 53 bits which means we would be storing unnecessary additional data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like for us to consider utilizing decimal
packing so that we may store all int
bits and only the necessary bits to properly represent other new types. Also, we should be careful to add documentation that this upgrade might require to be conducted as system-wide/user-wide. Consider the following scenario:
If users are relying on DJ to infer the data types, then if a current query is inserting a list such as [1,2,3]
then previously this would be inserted as list(np.int64(1),np.int64(2),np.int64(3))
. Now with this update it would inserted as list(int(1),int(2),int(3))
. Since the update is backward compatible, all new users would be good with fetching data, however, users utilizing the previous DJ version would receive errors on a fetch using their same query as blob data now contains mixed packing. Since the error is on a previous version of DJ, the error message is somewhat vague e.g.
Unknown data structure code "
"
|
||
@staticmethod | ||
def pack_complex(v): | ||
return b"\x0c" + np.array(v, dtype='complex128').tobytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could utilize decimal
packing here for the same reasons as float
below. Python seems to capture the first 53 bits for each the real
part and the complex
part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here Python is not doing anything special and just uses the standard IEEE 754 encoding.
@dimitri-yatsenko Can you update |
Update release details
… and complex