Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent channel usage not thread safe? #322

Open
kyrias opened this issue Jul 19, 2024 · 1 comment
Open

Concurrent channel usage not thread safe? #322

kyrias opened this issue Jul 19, 2024 · 1 comment

Comments

@kyrias
Copy link

kyrias commented Jul 19, 2024

While trying to determine whether bk-rs/ssh-rs#50 is a bug in async-ssh2-lite, in ssh2-rs, or in libssh2 I tried to convert my reproducing example to be sync/using ssh2 directly, and doing so I hit another issue which to me smells like it could be symptoms of the crate's usage of libssh2 not being entirely thread-safe.

I suspect that the issue here is that on drop libssh2_channel_free is called without locking the session, and which then tries to close the channel internally since it wasn't explicitly closed first, while some other thread has the lock held and fiddles with the session. Adding an explicit session.close().unwrap() seems to resolve it.

Reproducing

The setup is the same as in the linked issue except with the following code instead:

Client code
use std::{
    io::{Read, Write},
    net::TcpStream,
    path::PathBuf,
    sync::Arc,
    thread::{self, sleep},
    time::Duration,
};

use ssh2::Session;

fn main() {
    let mut session = Session::new().unwrap();
    session.set_tcp_stream(TcpStream::connect("<SERVER-IP>:22").unwrap());
    session.handshake().unwrap();

    let private_key = PathBuf::from("/whatever/path");
    session
        .userauth_pubkey_file("<SERVER-USER>", None, &private_key, None)
        .unwrap();

    let session = Arc::new(session);

    thread::scope(|scope| loop {
        println!();
        scope.spawn(|| send_request(&session, "A", 8001, "a"));
        scope.spawn(|| send_request(&session, "A", 8001, "a"));
        scope.spawn(|| send_request(&session, "B", 8002, "b"));
        sleep(Duration::from_secs(5));
    });
}

fn send_request(session: &Session, name: &'static str, port: u16, expected: &str) {
    let mut stream = session
        .channel_direct_tcpip("127.0.0.1", port, None)
        .expect("couldn't open direct-tcpip channel");

    stream
        .write_all("GET / HTTP/1.0\r\n\r\n".as_bytes())
        .unwrap();

    let mut buf = String::new();
    stream
        .read_to_string(&mut buf)
        .expect("failed to read response");
    let res = buf.split_once("\r\n\r\n").expect("got invalid response").1;
    eprintln!("{name}: {res}");
    assert_eq!(res, expected);
}

Expected

You get a stream of output where each A line has a response of a and each B line has a response of b.

Actual

A: a
B: b
thread '<unnamed>' panicked at src/bin/sync.rs:44:10:
failed to read response: Custom { kind: Other, error: "transport read" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
@ZiCog
Copy link

ZiCog commented Aug 19, 2024

I have been seeing that error and others using ssh2-rs v0.9.4. Running on Nvidea Jetson with Ubuntu. The errors I have seen are:

Assertion failed: (session), function _libssh2_channel_free, file channel.c, line 2736.
rust_ssh_test(93700,0x16c13f000) malloc: *** error for object 0x6000030e75c0: pointer being freed was not allocated
rust_ssh_test(93700,0x16c13f000) malloc: *** set a breakpoint in malloc_error_break to debug
161101506:malloc_consolidate(): invalid chunk size

My program is a bit big to present here. Basically similar. It gets a season to a server then opens four channels used by 4 sync scoped threads. It also catches SIGTERM and uses that to tell the threads to exit via a crossbeam channel message.

I have a "chaos monkey" program that runs that program and then requests it to shutdown with SIGTERM after a random delay of one to 10 seconds.

These kind of errors have shown up every few thousand runs. Always around the point where the threads are exiting and the channels dropped.

I'll try adding session.close().unwrap() to see if that helps.

Edit: Oh wait, there is no such thing as session.close(). I will try with adding:

    channel.send_eof()?;
    channel.wait_eof()?;
    channel.close()?;
    channel.wait_close()?;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants