Can't pass binary data to command via stdin #325

Closed
polygon opened this issue Aug 12, 2016 · 2 comments

polygon commented Aug 12, 2016

The following code does not work as expected:

import sh
data = b'124343'
print(sh.cat(_in=data))

I'd expect it to pass the content of data via stdin to cat and hence see the output 124343, however:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/threading.py", line 911, in _bootstrap_inner
    self.run()
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/threading.py", line 859, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/site-packages/sh.py", line 1453, in input_thread
    done = stdin.write()
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/site-packages/sh.py", line 1799, in write
    self.log.debug("got chunk size %d: %r", len(proc_chunk),
TypeError: object of type 'int' has no len()

This is likely because determine_how_to_read_input(input_obj) does not check for the bytes type and falls back to the default iter_chunk_reader, which iterates over the input element by element. Iterating over a str yields one-character str objects, but in Python 3 iterating over bytes yields integers, which have no len(), hence the TypeError above. I think the problem can be fixed by handling bytes specially inside that function. I might add a PR for that later.
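To illustrate the root cause outside of sh, here is a minimal sketch in plain Python 3. It shows that iterating over bytes yields ints, and that slicing keeps the bytes type. The helper bytes_chunks and its chunk_size parameter are hypothetical names for illustration, not part of sh.

```python
data_str = "124343"
data_bytes = b"124343"

# Iterating over a str yields one-character str objects.
str_items = list(data_str)

# Iterating over bytes yields ints (the byte values) in Python 3.
bytes_items = list(data_bytes)

# Calling len() on one of those ints reproduces the TypeError from the traceback.
try:
    len(bytes_items[0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def bytes_chunks(data, chunk_size=1024):
    """Hypothetical helper: yield bytes slices instead of individual ints."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Slicing bytes returns bytes, so every chunk supports len().
sliced = list(bytes_chunks(data_bytes, chunk_size=4))
```

Special handling of bytes along these lines is one possible approach; the actual fix in sh may look different.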

For now, a usable workaround is to wrap the data in an io.BytesIO buffer so that file_chunk_reader is used instead.

import sh
import io

data = b'124343'
buffer = io.BytesIO(data)
print(sh.cat(_in=buffer))

This works as expected.
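A rough sketch of why the buffer helps: reading from a file-like object with read(n) returns bytes chunks rather than individual ints. The function read_in_chunks below is a hypothetical stand-in written for illustration; it is not sh's actual file_chunk_reader.

```python
import io


def read_in_chunks(file_obj, chunk_size=1024):
    """Hypothetical reader: pull fixed-size chunks until the stream is empty."""
    while True:
        chunk = file_obj.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


buffer = io.BytesIO(b"124343")
chunks = list(read_in_chunks(buffer, chunk_size=4))
```

Each chunk is a bytes object, so a call such as len(chunk) succeeds.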

Owner

amoffat commented Aug 12, 2016

Thanks for reporting, and for the workaround @polygon


amoffat commented Oct 6, 2016

will go out in the 1.2 release

0-wiz-0 added a commit to NetBSD/pkgsrc-wip that referenced this issue Dec 12, 2016
*   added `_out` and `_out_bufsize` validator [#346](amoffat/sh#346)
*   bugfix for internal stdout thread running when it shouldn't [#346](amoffat/sh#346)

*   regression bugfix on timeout [#344](amoffat/sh#344)
*   regression bugfix on `_ok_code=None`

*   further improvements on cpu usage

*   regression in cpu usage [#339](amoffat/sh#339)

*   fd leak regression and fix for flawed fd leak detection test [#337](amoffat/sh#337)

*   support for `io.StringIO` in python2

*   added support for using raw file descriptors for `_in`, `_out`, and `_err`
*   removed `.close()`ing `_out` handler if FIFO detected

*   composed commands no longer propagate `_bg`
*   better support for using `sys.stdin` and `sys.stdout` for `_in` and `_out`
*   bugfix where `which()` would not stop searching at the first valid executable found in PATH
*   added `_long_prefix` for programs whose long arguments start with something other than `--` [#278](amoffat/sh#278)
*   added `_log_msg` for advanced configuration of log message [#311](amoffat/sh#311)
*   added `sh.contrib.sudo`
*   added `_arg_preprocess` for advanced command wrapping
*   alter callable `_in` arguments to signify completion with falsy chunk
*   bugfix where pipes passed into `_out` or `_err` were not flushed on process end [#252](amoffat/sh#252)
*   deprecated `with sh.args(**kwargs)` in favor of `sh2 = sh(**kwargs)`
*   made `sh.pushd` thread safe
*   added `.kill_group()` and `.signal_group()` methods for better process control [#237](amoffat/sh#237)
*   added `new_session` special keyword argument for controlling spawned process session [#266](amoffat/sh#266)
*   bugfix better handling for EINTR on system calls [#292](amoffat/sh#292)
*   bugfix where with-contexts were not threadsafe [#247](amoffat/sh#195)
*   `_uid` new special keyword param for specifying the user id of the process [#133](amoffat/sh#133)
*   bugfix where exceptions were swallowed by processes that weren't waited on [#309](amoffat/sh#309)
*   bugfix where processes that dup'd their stdout/stderr to a long running child process would cause sh to hang [#310](amoffat/sh#310)
*   improved logging output [#323](amoffat/sh#323)
*   bugfix for python3+ where binary data was passed into a process's stdin [#325](amoffat/sh#325)
*   Introduced execution contexts which allow baking of common special keyword arguments into all commands [#269](amoffat/sh#269)
*   `Command` and `which` now can take an optional `paths` parameter which specifies the search paths [#226](amoffat/sh#226)
*   `_preexec_fn` option for executing a function after the child process forks but before it execs [#260](amoffat/sh#260)
*   `_fg` reintroduced, with limited functionality.  hurrah! [#92](amoffat/sh#92)
*   bugfix where a command would block if passed a fd for stdin that wasn't yet ready to read [#253](amoffat/sh#253)
*   `_long_sep` can now take `None` which splits the long form arguments into individual arguments [#258](amoffat/sh#258)
*   making `_piped` perform "direct" piping by default (linking fds together).  this fixes memory problems [#270](amoffat/sh#270)
*   bugfix where calling `next()` on an iterable process that has raised `StopIteration`, hangs [#273](amoffat/sh#273)
*   `sh.cd` called with no arguments now changes into the user's home directory, like native `cd` [#275](amoffat/sh#275)
*   `sh.glob` removed entirely.  the rationale is correctness over hand-holding. [#279](amoffat/sh#279)
*   added `_truncate_exc`, defaulting to `True`, which tells our exceptions to truncate output.
*   bugfix for exceptions whose messages contained unicode
*   `_done` callback no longer assumes you want your command put in the background.
*   `_done` callback is now called asynchronously in a separate thread.
*   `_done` callback is called regardless of exception, which is necessary in order to release held resources, for example a process pool