Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BaseIOStream.write(): support typed memoryviews #2996

Merged
merged 1 commit into from
Mar 20, 2021

Conversation

madsbk
Copy link
Contributor

@madsbk madsbk commented Mar 2, 2021

This PR implements support of memoryviews with an item size greater than 1 byte.

Currently, Tornado uses len(x) to retrieve the number of bytes of x, which works fine when x is bytes or a memoryview with an item size of 1. However, with an item size greater than 1 we cannot assume that len(x) == x.nbytes.

Making sure that ``len(data) == data.nbytes`` by casting
memoryviews to bytes.
@madsbk
Copy link
Contributor Author

madsbk commented Mar 9, 2021

Sorry for the close and re-opening of the PR. We have fixed the issue in Dask/Distributed by always casting to bytes before communication: dask/distributed#4555 but I think this PR is still relevant for other downstream projects that might use typed memoryviews.

Copy link
Member

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I hadn't noticed this aspect of memoryview before (I thought it was just a python-level exposure of the buffer protocol).

It looks like we could alternatively convert the input to a memoryview (unconditionally) and use nbytes instead of len. Is there any particular reason to prefer one form over the other? (I like replacing the union with a concrete type as early as possible, although len() is the natural idiom so choosing a type whose len() is problematic is error-prone).

@madsbk
Copy link
Contributor Author

madsbk commented Mar 15, 2021

It looks like we could alternatively convert the input to a memoryview (unconditionally) and use nbytes instead of len. Is there any particular reason to prefer one form over the other? (I like replacing the union with a concrete type as early as possible, although len() is the natural idiom so choosing a type whose len() is problematic is error-prone).

I think both approaches make sense but if you are not planning to support non-contiguous buffers or fancy data compression, casting memoryviews to cast("B") seems fine than you can use nbytes and len interchangeable.

@bdarnell bdarnell merged commit 4d47f9d into tornadoweb:master Mar 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants