Negative size passed to PyBytes_FromStringAndSize #4

Open
elliottd opened this issue Feb 4, 2019 · 5 comments

Comments

@elliottd
Contributor

elliottd commented Feb 4, 2019

I run into the following problem when downloading the validation split with the most recent version of the code. It seems to be related to the shelve library, and it may be a known problem on my platform (OS X).

Python 3.7.1 (default, Oct 23 2018, 14:07:42)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
python download_data.py 
Opening Validation_GCC-1.1.0-Validation.tsv Data File...
Processing 15840  Images:
Generating parts... | 0/15840 [00:00<?, ?it/s]158 parts. 100 per part. Using 32 processes
Downloading: 100%|███████| 15840/15840 [07:46<00:00, 33.95it/s]

Finished Downloading.

Generating Dataframe from results...

Traceback (most recent call last):
  File "download_data.py", line 131, in <module>
    df = df_from_shelve(chunk_size=images_per_part, func=download_image, dataset_name=data_name)
  File "download_data.py", line 119, in df_from_shelve
    keylist = sorted([int(k) for k in results.keys()])
  File "download_data.py", line 119, in <listcomp>
    keylist = sorted([int(k) for k in results.keys()])
  File "/Users/lvx122/miniconda3/lib/python3.7/_collections_abc.py", line 720, in __iter__
    yield from self._mapping
  File "/Users/lvx122/miniconda3/lib/python3.7/shelve.py", line 95, in __iter__
    for k in self.dict.keys():
SystemError: Negative size passed to PyBytes_FromStringAndSize
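
Since the traceback ends inside `shelve.py` while iterating the underlying dbm object, it can help to check which dbm backend `shelve` picks on the affected machine. The following is a minimal sketch, not part of download_data.py; the helper name `probe_shelve_backend` and the throwaway file name `probe` are made up for illustration.

```python
import dbm
import os
import shelve
import tempfile


def probe_shelve_backend():
    """Create a throwaway shelf and report which dbm module backs it.

    Hypothetical helper for diagnosis, not part of download_data.py.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe")
        with shelve.open(path) as shelf:
            shelf["0"] = "ok"  # write one entry so the files exist on disk
        # dbm.whichdb inspects the files on disk and names the backend,
        # for example "dbm.ndbm", "dbm.gnu" or "dbm.dumb".
        backend = dbm.whichdb(path)
        files = sorted(os.listdir(tmp))
    return backend, files


if __name__ == "__main__":
    backend, files = probe_shelve_backend()
    print("backend:", backend)
    print("files:", files)
```

Comparing the output on OS X with the output on a machine where the download works would show whether the two are using different backends.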
@igorbrigadir
Owner

I haven't run into this one yet. I'm on Python 3.6.4 on Ubuntu 16.04.

An error at that point will prevent the report from being written at the end, but the images should still have been downloaded (although some of the resume logic relies on the shelve too).

It might work if you delete shelve and see if it recreates again without that error?

It's safe to delete the .bak, .dat, and .dir files associated with the shelve. When the download runs again, it should skip images whose files already exist on disk and recreate the shelve for the report (downloaded_validation_report.tsv.gz will be overwritten too).
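
A minimal sketch of that cleanup step, assuming the shelve files share a common base name and differ only in the suffixes listed above. The helper `remove_shelve_files` is hypothetical and not part of download_data.py.

```python
from pathlib import Path


def remove_shelve_files(base, suffixes=(".bak", ".dat", ".dir")):
    """Delete the shelve side files for `base` and return the paths removed.

    Hypothetical helper; `base` is the shelve file name without a suffix.
    """
    removed = []
    for suffix in suffixes:
        path = Path(str(base) + suffix)
        if path.is_file():  # missing files are skipped silently
            path.unlink()
            removed.append(path)
    return removed
```

For the validation split, the base name mentioned later in this thread is `validation_download_image_100_results.tmp`, so the call would be `remove_shelve_files("validation_download_image_100_results.tmp")`. On a platform where shelve writes a single `.db` file instead, `".db"` would need to be added to `suffixes`.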

@elliottd
Contributor Author

elliottd commented Feb 5, 2019

> It might work if you delete shelve and see if it recreates again without that error?

It gives the same error but it seems like this is an upstream problem.

@igorbrigadir
Owner

Ah good to know! Thanks!

@lovecambi

I had the same error on OS X, but I found a workaround.

  1. Create two empty files, validation_download_image_100_results.tmp.dat and validation_download_image_100_results.tmp.dir.
  2. Run download_data.py.
  3. Python's shelve will then generate *.dat, *.bak, and *.dir files rather than *.db.
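
A likely explanation for why this works: `shelve.open` delegates to `dbm.open`, which (when the flag does not ask for a fresh database) first calls `dbm.whichdb` on the target path, and an existing pair of empty `.dat` and `.dir` files is reported as `dbm.dumb`, the pure-Python backend. The sketch below illustrates that mechanism under the assumption that the script opens the shelf with the default flag and that no other database files (such as `.db` or `.pag`) exist for the same base name. The helper `open_shelf_with_dumb_backend` and the file name `results.tmp` are made up for illustration.

```python
import dbm
import os
import shelve
import tempfile


def open_shelf_with_dumb_backend(base):
    """Pre-create empty .dat/.dir files so shelve selects dbm.dumb.

    Hypothetical helper illustrating the workaround; it assumes no other
    database files (such as base + ".db" or ".pag") already exist.
    """
    for suffix in (".dat", ".dir"):
        # "a" mode creates the file if missing and leaves existing data alone
        with open(base + suffix, "a"):
            pass
    # Empty .dat and .dir files are reported as "dbm.dumb" by whichdb
    detected = dbm.whichdb(base)
    return shelve.open(base), detected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "results.tmp")
        shelf, detected = open_shelf_with_dumb_backend(base)
        with shelf:
            for i in range(200):
                shelf[str(i)] = {"status": 200}
            # Same key iteration that fails in the traceback above
            keys = sorted(int(k) for k in shelf.keys())
        print(detected, len(keys), sorted(os.listdir(tmp)))
```

A code-level alternative, if editing the script is acceptable, would be to build the shelf explicitly with `shelve.Shelf(dbm.dumb.open(base))`, which avoids relying on file detection. `dbm.dumb` is slower than the native backends, so this trades speed for portability.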

@KawaiiNotHawaii

KawaiiNotHawaii commented Jul 7, 2021

It is an error in shelve. My guess is that a 32-bit integer overflows, since downloading only the first 20 images worked fine.
