Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #690 -- blob packing/unpacking of native python bool, int, float, and complex. #709

Merged
merged 16 commits into from
Jan 14, 2020

Conversation

dimitri-yatsenko
Copy link
Member

… and complex

@dimitri-yatsenko dimitri-yatsenko changed the title fix #690 -- blob packing/unpacking of native python bool, int, float,… fix #690 -- blob packing/unpacking of native python bool, int, float, and complex. Nov 22, 2019

@staticmethod
def pack_int(v):
return b"\x0a" + np.array(v, dtype='int64').tobytes()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we did not utilize decimal packing here? Python int are essentially boundless (memory-dependent). I believe decimal packing would be a closer representation as the length would be encoded.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified to support unbounded int


@staticmethod
def pack_float(v):
return b"\x0d" + np.array(v, dtype='float64').tobytes()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we did not utilize decimal packing here? Python float have a precision of 53 bits which means we would be storing unnecessary additional data.

Copy link
Collaborator

@guzman-raphael guzman-raphael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like for us to consider utilizing decimal packing so that we may store all int bits and only the necessary bits to properly represent other new types. Also, we should be careful to add documentation that this upgrade might require to be conducted as system-wide/user-wide. Consider the following scenario:

If users are relying on DJ to infer the data types, then if a current query is inserting a list such as [1,2,3] then previously this would be inserted as list(np.int64(1),np.int64(2),np.int64(3)). Now with this update it would inserted as list(int(1),int(2),int(3)). Since the update is backward compatible, all new users would be good with fetching data, however, users utilizing the previous DJ version would receive errors on a fetch using their same query as blob data now contains mixed packing. Since the error is on a previous version of DJ, the error message is somewhat vague e.g.

Unknown data structure code "
"


@staticmethod
def pack_complex(v):
return b"\x0c" + np.array(v, dtype='complex128').tobytes()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could utilize decimal packing here for the same reasons as float below. Python seems to capture the first 53 bits for each the real part and the complex part.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here Python is not doing anything special and just uses the standard IEEE 754 encoding.

@guzman-raphael
Copy link
Collaborator

@dimitri-yatsenko Can you update datajoint-python/docs-parts/intro/Releases_lang1.rst?

@guzman-raphael guzman-raphael merged commit a9aad89 into datajoint:master Jan 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants