wordcounter assumes stream is bytes #76

huard · 2019-01-10T21:32:21Z

Description

In the _handler, we have

        def words(f):
            for line in f:
                for word in wordre.findall(line.decode('UTF-8')):
                    yield word

which assumes that line has a decode method, i.e. that the stream yields bytes. However, the supported_format (TEXT) does not explicitly specify that the encoding is UTF-8. So if the content is passed as an embedded string in the request, with no encoding information, line is already a str (which has no decode method in Python 3) and the process fails.
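One possible way to make the generator tolerant of both cases is sketched below. This is only an illustration, not the fix adopted in the project: the wordre pattern here is a stand-in (the real one in the process may differ), and the encoding parameter is a hypothetical addition that defaults to UTF-8 when nothing else is known.

```python
import io
import re

# Assumption: stand-in pattern; the actual wordre in the process may differ.
wordre = re.compile(r'\w+')


def words(f, encoding='UTF-8'):
    """Yield words from a stream that may produce bytes or str lines.

    `encoding` is a hypothetical parameter, used only when a line is bytes.
    """
    for line in f:
        # Decode only when the stream actually yields bytes;
        # str lines (e.g. from an embedded string) pass through unchanged.
        if isinstance(line, bytes):
            line = line.decode(encoding)
        for word in wordre.findall(line):
            yield word


# Both kinds of stream are handled the same way.
from_bytes = list(words(io.BytesIO('héllo wörld\nfoo'.encode('UTF-8'))))
from_text = list(words(io.StringIO('héllo wörld\nfoo')))
```

Checking the type per line keeps the change local to the generator, at the cost of still guessing UTF-8 for byte streams that carry no encoding information.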

cehbrecht added the bug label Jan 11, 2019
