UnicodeDecodeError caused by system encoding #13

Closed
WEGFan opened this issue Jul 19, 2020 · 7 comments · Fixed by #17
Labels
bug something broken

Comments

@WEGFan

WEGFan commented Jul 19, 2020

When I export a figure to a static image, the following error occurs, although the image is still generated correctly.

Exception in thread Thread-4:
Traceback (most recent call last):
  File "D:\Program Files\Python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "D:\Program Files\Python36\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "D:\My Documents\GitHub\github-readme-codestats-widget\venv\lib\site-packages\kaleido\scopes\base.py", line 74, in _collect_standard_error
    val = self._proc.stderr.readline().decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb9 in position 77: invalid start byte

So I debugged the line val = self._proc.stderr.readline().decode('utf-8') and found that the output should be decoded as gb2312, because that is my system encoding.
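The failure can be reproduced without Kaleido. The following is a minimal sketch, assuming a stderr line that contains GBK-encoded bytes. The byte string is made up for illustration (only the 0xb9 byte comes from the traceback above), and it is not the actual Chromium output.

```python
# Hypothetical stderr line: ASCII text followed by two GBK-encoded bytes.
# Only the 0xb9 byte is taken from the traceback; the rest is made up.
raw_line = b"error: \xb9\xa4\n"

try:
    raw_line.decode("utf-8")
    utf8_failed = False
    utf8_reason = None
except UnicodeDecodeError as exc:
    # 0xb9 is a UTF-8 continuation byte, so it cannot start a character.
    utf8_failed = True
    utf8_reason = exc.reason

# Decoding with the encoding the bytes were written in succeeds.
decoded = raw_line.decode("gbk")
```

Here utf8_reason holds the same "invalid start byte" message as the traceback, while the GBK decode yields a normal string.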

I think this line should be changed to

import locale
os_encoding = locale.getpreferredencoding()
val = self._proc.stderr.readline().decode(os_encoding)

@jonmmease jonmmease added bug something broken language: Python labels Jul 20, 2020
@jonmmease
Collaborator

Hi @WEGFan, thanks a lot for letting us know. This section of code uses a background thread to read the standard error stream produced by Chromium. I'll need to dig into this a bit more deeply, but your suggestion makes sense.

We are also explicitly using the UTF-8 encoding when writing JSON image export requests to standard in for the Kaleido C++ executable, and when reading the resulting JSON export results from standard out. Are you running into any issues with the resulting images? Have you tried the SVG format? Thanks!
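As a rough illustration of exchanging UTF-8 encoded JSON over standard input and output, here is a minimal sketch. It does not use Kaleido itself: the child process is a stand-in Python one-liner that echoes one line, and the request fields ("format", "title") and the round_trip helper are hypothetical names chosen for the example.

```python
import json
import subprocess
import sys

# Stand-in for the Kaleido executable (an assumption for this sketch):
# a child Python process that echoes one line from stdin to stdout.
ECHO_CHILD = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.readline())"


def round_trip(request):
    """Send a JSON request as UTF-8 bytes and decode the UTF-8 reply."""
    # ensure_ascii=False keeps non-ASCII characters as real UTF-8 bytes,
    # so the explicit encoding below actually matters.
    payload = json.dumps(request, ensure_ascii=False).encode("utf-8") + b"\n"
    proc = subprocess.run(
        [sys.executable, "-c", ECHO_CHILD],
        input=payload,
        stdout=subprocess.PIPE,
        check=True,
    )
    # Decode explicitly rather than relying on the locale default.
    return json.loads(proc.stdout.decode("utf-8"))


result = round_trip({"format": "svg", "title": "图表"})
```

Because both sides agree on UTF-8 and never consult the locale, the non-ASCII title should survive the round trip even on a system whose preferred encoding is cp936.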

@WEGFan
Author

WEGFan commented Jul 21, 2020

The resulting images have no issues, and exporting to SVG format still produces the error.

@jonmmease
Collaborator

Ok, great. Thanks for the info!

@jonmmease
Collaborator

@WEGFan do you get the same encoding from sys.stderr.encoding? This is a pattern I've seen some other projects use:

import sys
import locale
try:
    encoding = sys.stderr.encoding
except Exception:
    encoding = locale.getpreferredencoding()

getpreferredencoding actually has side-effects by default that I don't fully understand the implications of (https://stackoverflow.com/questions/23743160/locale-getpreferredencoding-why-does-this-reset-string-letters).

@WEGFan
Author

WEGFan commented Jul 23, 2020

Nope.

>>> import sys
>>> import locale
>>> sys.stderr.encoding
'utf-8'
>>> locale.getpreferredencoding()
'cp936'

@WEGFan
Author

WEGFan commented Jul 23, 2020

I took a look at the Python 3 open() documentation, which says:

In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.

And the Python 2 documentation for locale.getpreferredencoding() says:

If invoking setlocale is not necessary or desired, do_setlocale should be set to False.

So maybe using locale.getpreferredencoding(False) is safe.

@jonmmease
Collaborator

Thanks, I just opened #17, which should take care of it. It delays decoding until we actually need to display the output to the user, then attempts to decode using sys.stderr.encoding and locale.getpreferredencoding(False), with exception handling.
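The approach described here could look roughly like the sketch below. This is not the code from #17: the decode_stderr helper and the final errors="replace" fallback are assumptions added for illustration.

```python
import locale
import sys


def decode_stderr(raw):
    """Decode captured stderr bytes, trying several encodings in turn."""
    candidates = []
    # sys.stderr can be None or lack an encoding in some environments.
    stderr_encoding = getattr(sys.stderr, "encoding", None)
    if stderr_encoding:
        candidates.append(stderr_encoding)
    # Passing False avoids the setlocale() side effect discussed above.
    candidates.append(locale.getpreferredencoding(False))
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    # Last resort (an assumption of this sketch): never raise, and
    # replace undecodable bytes instead.
    return raw.decode("utf-8", errors="replace")
```

Keeping the raw bytes and decoding only at display time means a bad byte can no longer crash the background reader thread; at worst the user sees a replacement character.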
