OSError: [Errno 22] Invalid argument #520

Closed
yjarosz opened this issue Jul 25, 2018 · 3 comments · Fixed by #521

yjarosz commented Jul 25, 2018

Hello,

I'm testing the asdf library at the moment, and it seems that I have hit some kind of shape limit.

Here is the snippet:

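import asdf
import numpy as np

# Uninitialized 10**6 x 1000 integer array (about 8 GB with 64-bit ints)
sequence = np.ndarray(shape=(10 ** 6, 1000), dtype=int)
tree = {
    'key': 'value',
    'name': 'Test data',
    'type': 'Numpy matrix',
    'sequence': sequence,
}

af = asdf.AsdfFile(tree)
# Store the array in a separate external file rather than inline
af.set_array_storage(sequence, 'external')
af.write_to('example.asdf', all_array_storage='external')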

My environment:

(asdf) ➜  benchmarks pip freeze
appnope==0.1.0
asdf==2.0.1
backcall==0.1.0
certifi==2018.4.16
decorator==4.3.0
ipython==6.4.0
ipython-genutils==0.2.0
jedi==0.12.1
jsonschema==2.6.0
numpy==1.15.0
parso==0.3.1
pexpect==4.6.0
pickleshare==0.7.4
prompt-toolkit==1.0.15
ptyprocess==0.6.0
Pygments==2.2.0
PyYAML==3.13
semantic-version==2.6.0
simplegeneric==0.8.1
six==1.11.0
traitlets==4.3.2
wcwidth==0.1.7

Traceback:

(asdf) ➜  benchmarks python run.py
Traceback (most recent call last):
  File "run.py", line 5, in <module>
    asdf_test.write()
  File "/Users/yj/LCSB/programmation/benchmarks/benchmarks/asdf_test/write.py", line 26, in write
    af.write_to('example.asdf', all_array_storage='external')
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/asdf.py", line 1059, in write_to
    self._serial_write(fd, pad_blocks, include_block_index)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/asdf.py", line 822, in _serial_write
    self.blocks.write_external_blocks(fd.uri, pad_blocks)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/block.py", line 353, in write_external_blocks
    asdffile.write_to(subfd, pad_blocks=pad_blocks)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/asdf.py", line 1059, in write_to
    self._serial_write(fd, pad_blocks, include_block_index)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/asdf.py", line 821, in _serial_write
    self.blocks.write_internal_blocks_serial(fd, pad_blocks)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/block.py", line 294, in write_internal_blocks_serial
    block.write(fd)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/block.py", line 1085, in write
    fd.write_array(self._data)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/generic_io.py", line 754, in write_array
    _array_tofile(self._fd, self._fd.write, arr)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/generic_io.py", line 112, in _array_tofile
    return _array_tofile_chunked(write, array, OSX_WRITE_LIMIT)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/site-packages/asdf/generic_io.py", line 101, in _array_tofile_chunked
    write(array[i:i + chunksize].data)
  File "/Users/yj/.miniconda3/envs/asdf/lib/python3.7/tempfile.py", line 481, in func_wrapper
    return func(*args, **kwargs)
OSError: [Errno 22] Invalid argument

However, if I use sequence = np.ndarray(shape=(10 ** 5, 1000), dtype=int), I have no issue at all.

Contributor

drdavella commented Jul 25, 2018

Hi @yjarosz, thanks very much for the report.

This is possibly related to https://bugs.python.org/issue24658. In your code, when you change the array shape from (10 ** 6, 1000) to (10 ** 5, 1000), you are downsizing from a roughly 8GB file to an 800MB file, which drops it below the 2GB threshold reported in the Python issue.
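
The size estimate can be checked with a few lines of arithmetic. This is only a sketch: it assumes that `dtype=int` means 8-byte integers on the reporter's platform and takes the threshold as 2 ** 31 bytes; the helper `array_nbytes` is a hypothetical name used for illustration.

```python
import math

ITEMSIZE_INT64 = 8   # assumption: dtype=int is a 64-bit integer here
TWO_GIB = 2 ** 31    # assumption: the "2GB" threshold taken as 2**31 bytes


def array_nbytes(shape, itemsize=ITEMSIZE_INT64):
    """Return the number of bytes needed for an array of the given shape."""
    return math.prod(shape) * itemsize


large = array_nbytes((10 ** 6, 1000))  # the failing case
small = array_nbytes((10 ** 5, 1000))  # the working case

print(large, large > TWO_GIB)
print(small, small > TWO_GIB)
```

Under these assumptions only the larger array exceeds the threshold, which matches the behavior in the report.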

It's possible that this can be worked around on the ASDF end by changing the size of the partial writes. I'll look into it.
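
As a rough illustration of what "partial writes" means here, the sketch below splits one large buffer into several `write` calls, each no larger than a chosen limit. It is not the ASDF implementation: `write_in_chunks` is a hypothetical helper, and the small chunk size and in-memory buffer are stand-ins chosen so the example runs quickly.

```python
import io


def write_in_chunks(write, data, chunksize):
    """Call write() repeatedly so no single call gets more than chunksize bytes."""
    view = memoryview(data).cast("B")  # flat byte view of the buffer, no copy
    for start in range(0, len(view), chunksize):
        write(view[start:start + chunksize])


buf = io.BytesIO()
payload = bytes(range(256)) * 40  # 10240 bytes standing in for a large array
write_in_chunks(buf.write, payload, chunksize=4096)

print(buf.getvalue() == payload)
```

The important detail in a sketch like this is that the limit is counted in bytes, so the buffer is viewed as bytes before slicing.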

@drdavella
Contributor

@yjarosz see #521 for the workaround solution in ASDF. This will probably be merged later today, and I can try to get a bugfix release out by tomorrow or Friday. If it's practical for you, you can install the development version of ASDF in the meantime once this is merged.

Author

yjarosz commented Jul 26, 2018

Indeed, that works perfectly :) Thanks @drdavella
