fix #690 -- blob packing/unpacking of native python bool, int, float, and complex. #709
Merged
16 commits
681fb97 fix #690 -- blob packing/unpacking of native python bool, int, float,… (dimitri-yatsenko)
a4e5382 minor (dimitri-yatsenko)
e348426 reduce encoding length for native python types in blobs (dimitri-yatsenko)
9c2e419 Merge branch 'master' of https://github.com/datajoint/datajoint-python (dimitri-yatsenko)
86a2c2c ensure that np.number is encoded as a numpy scalar (dimitri-yatsenko)
231efe2 Merge branch 'master' of https://github.com/datajoint/datajoint-python (dimitri-yatsenko)
106239c add support for unbounded integers in blob serialization (dimitri-yatsenko)
eadde37 add test for unbounded integer (dimitri-yatsenko)
f1e6da6 update CHANGELOG and version for release 0.12.4 (dimitri-yatsenko)
392d56a correct computation of number of bits for unbounded integers in blobs (dimitri-yatsenko)
61362e7 fix unbounded integer encoding in blobs (dimitri-yatsenko)
4a56d42 fix bug in LNX-docker-compose.yml (dimitri-yatsenko)
876d62a improve tests for adapted attributes (dimitri-yatsenko)
8a3c9a1 update comment to use general data types rather than python-focused (dimitri-yatsenko)
92f56ab Update release details. (guzman-raphael)
9be1115 Merge pull request #7 from guzman-raphael/pr709 (dimitri-yatsenko)
@@ -1,5 +1,5 @@
 """
-(De)serialization methods for python datatypes and numpy.ndarrays with provisions for mutual
+(De)serialization methods for basic datatypes and numpy.ndarrays with provisions for mutual
 compatibility with Matlab-based serialization implemented by mYm.
 """

@@ -115,21 +115,25 @@ def read_blob(self, n_bytes=None):
                 "P": self.read_sparse_array,  # matlab sparse array -- not supported yet
                 "S": self.read_struct,  # matlab struct array
                 "C": self.read_cell_array,  # matlab cell array
-                # Python-native
-                "\xFF": self.read_none,  # None
-                "\1": self.read_tuple,  # a Sequence
-                "\2": self.read_list,  # a MutableSequence
-                "\3": self.read_set,  # a Set
-                "\4": self.read_dict,  # a Mapping
-                "\5": self.read_string,  # a UTF8-encoded string
-                "\6": self.read_bytes,  # a ByteString
-                "F": self.read_recarray,  # numpy array with fields, including recarrays
-                "d": self.read_decimal,  # a decimal
-                "t": self.read_datetime,  # date, time, or datetime
-                "u": self.read_uuid,  # UUID
+                # basic data types
+                "\xFF": self.read_none,  # None
+                "\x01": self.read_tuple,  # a Sequence (e.g. tuple)
+                "\x02": self.read_list,  # a MutableSequence (e.g. list)
+                "\x03": self.read_set,  # a Set
+                "\x04": self.read_dict,  # a Mapping (e.g. dict)
+                "\x05": self.read_string,  # a UTF8-encoded string
+                "\x06": self.read_bytes,  # a ByteString
+                "\x0a": self.read_int,  # unbounded scalar int
+                "\x0b": self.read_bool,  # scalar boolean
+                "\x0c": self.read_complex,  # scalar 128-bit complex number
+                "\x0d": self.read_float,  # scalar 64-bit float
+                "F": self.read_recarray,  # numpy array with fields, including recarrays
+                "d": self.read_decimal,  # a decimal
+                "t": self.read_datetime,  # date, time, or datetime
+                "u": self.read_uuid,  # UUID
             }[data_structure_code]
         except KeyError:
-            raise DataJointError('Unknown data structure code "%s"' % data_structure_code)
+            raise DataJointError('Unknown data structure code "%s". Upgrade datajoint.' % data_structure_code)
         v = call()
         if n_bytes is not None and self._pos - start != n_bytes:
             raise DataJointError('Blob length check failed! Invalid blob')

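The hunk above selects a reader by looking up a one-byte type code in a dictionary and reports an unknown code as a hint to upgrade. A minimal sketch of that pattern follows. It is not the datajoint implementation: `UnknownCodeError`, `READERS`, and `read_value` are hypothetical names, only two of the new codes are handled, and the blob header and compression layers are left out.

```python
import struct


class UnknownCodeError(Exception):
    """Hypothetical stand-in for DataJointError."""


def read_bool(buf, pos):
    # one byte, nonzero means True
    return bool(buf[pos])


def read_float(buf, pos):
    # 64-bit IEEE 754 float, little-endian
    return struct.unpack_from("<d", buf, pos)[0]


# Hypothetical dispatch table keyed by the one-byte type code.
READERS = {
    b"\x0b": read_bool,
    b"\x0d": read_float,
}


def read_value(buf):
    code = buf[:1]
    try:
        reader = READERS[code]
    except KeyError:
        # A newer writer may emit codes that an older reader does not know.
        raise UnknownCodeError(
            "Unknown data structure code %r. Upgrade datajoint." % code
        ) from None
    return reader(buf, 1)
```

A reader that predates the `\x0a` to `\x0d` codes would take the `KeyError` branch, which is presumably why the new message points to an upgrade.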
@@ -146,13 +150,21 @@ def pack_blob(self, obj):

         # blob types in the expanded dj0 blob format
         self.set_dj0()
+        if not isinstance(obj, (np.ndarray, np.number)):
+            # python built-in data types
+            if isinstance(obj, bool):
+                return self.pack_bool(obj)
+            if isinstance(obj, int):
+                return self.pack_int(obj)
+            if isinstance(obj, complex):
+                return self.pack_complex(obj)
+            if isinstance(obj, float):
+                return self.pack_float(obj)
         if isinstance(obj, np.ndarray) and obj.dtype.fields:
             return self.pack_recarray(np.array(obj))
         if isinstance(obj, np.number):
             return self.pack_array(np.array(obj))
-        if isinstance(obj, (bool, np.bool, np.bool_)):
-            return self.pack_array(np.array(obj))
-        if isinstance(obj, (float, int, complex)):
+        if isinstance(obj, (np.bool, np.bool_)):
             return self.pack_array(np.array(obj))
         if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
             return self.pack_datetime(obj)

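The order of the new `isinstance` checks matters because `bool` is a subclass of `int` in Python, so testing `int` first would route `True` and `False` to the integer packer. The sketch below illustrates this with a hypothetical `classify` helper that mirrors the check order in the hunk above; it is not part of the PR.

```python
def classify(obj):
    """Hypothetical helper: name the branch that would handle obj."""
    if isinstance(obj, bool):  # must precede int: bool is a subclass of int
        return "bool"
    if isinstance(obj, int):
        return "int"
    if isinstance(obj, complex):
        return "complex"
    if isinstance(obj, float):
        return "float"
    return "other"


for value in (True, 1, 1.0, 1j, "1"):
    print(repr(value), "->", classify(value))
```

The relative order of `complex` and `float` is not significant here, since neither type is a subclass of the other.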
@@ -209,7 +221,7 @@ def pack_array(self, array):
         if is_complex:
             array, imaginary = np.real(array), np.imag(array)
         type_id = (rev_class_id[array.dtype] if array.dtype.char != 'U'
-                   else rev_class_id[np.dtype('O')])
+                   else rev_class_id[np.dtype('O')])
         if dtype_list[type_id] is None:
             raise DataJointError("Type %s is ambiguous or unknown" % array.dtype)

@@ -251,6 +263,36 @@ def pack_recarray(self, array):
     def read_sparse_array(self):
         raise DataJointError('datajoint-python does not yet support sparse arrays. Issue (#590)')

+    def read_int(self):
+        return int.from_bytes(self.read_binary(self.read_value('uint16')), byteorder='little', signed=True)
+
+    @staticmethod
+    def pack_int(v):
+        n_bytes = v.bit_length() // 8 + 1
+        assert 0 < n_bytes <= 0xFFFF, 'Integers are limited to 65535 bytes'
+        return b"\x0a" + np.uint16(n_bytes).tobytes() + v.to_bytes(n_bytes, byteorder='little', signed=True)
+
+    def read_bool(self):
+        return bool(self.read_value('bool'))
+
+    @staticmethod
+    def pack_bool(v):
+        return b"\x0b" + np.array(v, dtype='bool').tobytes()
+
+    def read_complex(self):
+        return complex(self.read_value('complex128'))
+
+    @staticmethod
+    def pack_complex(v):
+        return b"\x0c" + np.array(v, dtype='complex128').tobytes()
+
+    def read_float(self):
+        return float(self.read_value('float64'))
+
+    @staticmethod
+    def pack_float(v):
+        return b"\x0d" + np.array(v, dtype='float64').tobytes()
Review comment: Is there a reason why we did not utilize decimal packing here? Python
+
     def read_decimal(self):
         return Decimal(self.read_string())

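The `pack_int` and `read_int` pair above stores an arbitrary-size integer as a type code, a 16-bit byte count, and a little-endian two's-complement payload. The following standalone sketch reproduces that layout for illustration. It assumes a little-endian length prefix and uses `struct.pack("<H", ...)` as a stand-in for `np.uint16(n_bytes).tobytes()`; `unpack_int` is a hypothetical free function, not the PR's method, and a `ValueError` replaces the `assert`.

```python
import struct


def pack_int(v):
    # bit_length() excludes the sign bit, so one extra byte always leaves room for it
    n_bytes = v.bit_length() // 8 + 1
    if not 0 < n_bytes <= 0xFFFF:
        raise ValueError("Integers are limited to 65535 bytes")
    payload = v.to_bytes(n_bytes, byteorder="little", signed=True)
    return b"\x0a" + struct.pack("<H", n_bytes) + payload


def unpack_int(blob):
    """Hypothetical inverse of pack_int for a buffer holding a single value."""
    if blob[:1] != b"\x0a":
        raise ValueError("not an integer blob")
    (n_bytes,) = struct.unpack_from("<H", blob, 1)
    return int.from_bytes(blob[3:3 + n_bytes], byteorder="little", signed=True)


for value in (0, 127, 128, -128, -129, 2 ** 100, -(2 ** 100)):
    assert unpack_int(pack_int(value)) == value
```

Because the byte count is `bit_length() // 8 + 1`, some values (for example `-128`) use one byte more than the minimum, which keeps the computation simple at a small cost in size.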
@@ -269,7 +311,7 @@ def pack_string(s):

     def read_bytes(self):
         return self.read_binary(self.read_value())

     @staticmethod
     def pack_bytes(s):
         return b"\6" + len_u64(s) + s

@@ -1,3 +1,3 @@
-__version__ = "0.12.3"
+__version__ = "0.12.4"

 assert len(__version__) <= 10  # The log table limits version to the 10 characters
Review comment: We could utilize decimal packing here for the same reasons as float below. Python seems to capture the first 53 bits for each of the real part and the complex part.

Reply: Here Python is not doing anything special and just uses the standard IEEE 754 encoding.
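The reply can be checked with a small experiment: a Python `float` is already an IEEE 754 binary64 value, so writing its 8 bytes and reading them back returns the same value, and a `complex` is a pair of such values. The sketch below uses the standard `struct` module as a stand-in for the numpy `tobytes()` calls in the PR and assumes little-endian byte order; the `unpack_*` helpers are hypothetical names.

```python
import struct
from decimal import Decimal


def pack_float(v):
    return b"\x0d" + struct.pack("<d", v)  # 8-byte binary64 payload


def unpack_float(blob):
    return struct.unpack_from("<d", blob, 1)[0]


def pack_complex(v):
    # real part followed by imaginary part, each a binary64
    return b"\x0c" + struct.pack("<dd", v.real, v.imag)


def unpack_complex(blob):
    re, im = struct.unpack_from("<dd", blob, 1)
    return complex(re, im)


x = 0.1
assert unpack_float(pack_float(x)) == x  # exact round trip
assert unpack_complex(pack_complex(1.5 - 2.25j)) == 1.5 - 2.25j

# The float 0.1 is not the decimal 0.1; the difference arises when the
# literal is parsed, before any packing happens.
assert Decimal(x) != Decimal("0.1")
```

Under these assumptions the fixed-width encoding loses nothing relative to the in-memory value, and it has a constant size of 9 bytes for a float and 17 bytes for a complex number including the type code.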