Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the run.rs scripts of our dataflow examples using xshell #495

Draft
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

phil-opp
Copy link
Collaborator

@phil-opp phil-opp commented May 2, 2024

The xshell crate provides a more convenient way to build and run a Command, which is more similar to traditional bash scripts. Using this crate, we can simplify our run.rs scripts while still being platform-independent and not requiring any external dependencies. We can also still run the examples using cargo run --example.

Alternative to #454 .

…hell`

The `xshell` crate provides a more convenient way to build and run a `Command`, which is more similar to traditional bash scripts. Using this crate, we can simplify our `run.rs` script while still being platform-independent and not requiring any external dependencies. We can also still run the examples using `cargo run --example`.
@phil-opp phil-opp changed the base branch from main to robust-up May 2, 2024 11:37
@phil-opp phil-opp changed the title Simplify the run.rs script of the rust-dataflow example using xshell Simplify the run.rs scripts of our dataflow examples using xshell May 2, 2024
We don't modify the PATH for the current executable anymore, so we cannot use the `get_python_path`/`get_pip_path` functions anymore.
Base automatically changed from robust-up to main May 3, 2024 09:30
@phil-opp
Copy link
Collaborator Author

Do we want to proceed with this approach? Personally, I think this is a considerable improvement because it changes the previous Command builder code to something that resembles normal shell commands:

From:

let cargo = std::env::var("CARGO").unwrap();
    let mut cmd = tokio::process::Command::new(&cargo);
    cmd.arg("run");
    cmd.arg("--package").arg("dora-cli");
    cmd.arg("--").arg("build").arg(dataflow);
    if !cmd.status().await?.success() {
        bail!("failed to build dataflow");
    };

To:

let dora = prepare_dora(&sh)?;

cmd!(sh, "{dora} build dataflow.yml").run()?;

Possible alternatives:

  • Remove the run.rs completely and only explain the build instructions in the README.
    • Drawback: No more automated testing on CI.
    • Drawback: No cargo run --example <example_name>
  • Use a some cross-platform scripting language such as just or nu to create a run.sh
    • Drawback: Users need to install and learn a third-party tool to run our examples
    • Automated CI testing and cargo run --example <name> support still require a run.rs wrapper
  • Use a standard Unix tool for building, such as a Makefile or bash script.
    • Simplifies Python venv_ setup as we can use the standard activate script
    • Drawback: Windows users need to install Unix tools, which can be difficult

@haixuanTao
Copy link
Collaborator

Looks good to me !

@phil-opp
Copy link
Collaborator Author

@haixuanTao I get a ModuleNotFoundError: No module named 'dora' error again with the new python-operator-dataflow example. The venv is created before dora up, so it doesn't seem like the same issue as #491 (comment). Any idea what it could be?

@phil-opp
Copy link
Collaborator Author

We decided to wait a bit with this PR until #551 is ready. We hope that we can create a proper integration test framework then instead of using the example as integration tests. Perhaps we can find a even better solution for running the examples then (after the testing part is moved to the test framework).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants