
Option to ignore missing workspace members when building #14566

Open
jephthia opened this issue Sep 19, 2024 · 10 comments
Labels
A-workspaces Area: workspaces C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request.

Comments

@jephthia

Problem

Given the following virtual workspace:

[workspace]
resolver = "2"

members = [
  "apps/bin1",
  "apps/bin2",
  "apps/bin3",
  "libs/libA",
  "libs/libB",
  ...
]

Building bin1 with cargo build -p bin1 should be allowed even if bin2 is missing from the file system.

Use case: In Docker, when building a binary, only the relevant files are sent to the build context. For example, consider the following ignore file for building bin1:

# Exclude everything
*

# Include relevant files only
!apps/bin1
!libs
!Cargo.toml
!Cargo.lock

The binary is then built with:
cargo build -p bin1 --locked --release

This fails with error: failed to load manifest for workspace member /code/app/bin2
because the other binaries that are members of the workspace aren't present in the build context, but they are irrelevant to this build and so should not be needed.

Proposed Solution

An option to prevent a build from failing because of a missing member.
This could be done either:

through a flag:
cargo build -p bin1 --ignore-missing-members (with a better flag name)

or directly in the manifest?

[workspace]
ignore-missing-members = true

Notes

No response

@jephthia jephthia added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage. labels Sep 19, 2024
@weihanglo
Member

Every member is relevant and contributes to the workspace's Cargo.lock. Removing any of them may affect the dependency resolution.

Similar feature requests have been made:

and #6179 is the major one tracking it.

That being said, this docker use case is slightly different from the others. Could you expand a bit on your workflow? I also wonder where the ignore file came from.

@weihanglo weihanglo added S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request. A-workspaces Area: workspaces and removed S-triage Status: This issue is waiting on initial triage. labels Sep 19, 2024
@epage
Contributor

epage commented Sep 19, 2024

Most of those conditional ones are for dealing with targets, cfgs, etc. This is related to Docker caching, from what I understood of the text. Docker fingerprints the inputs to a layer and rebuilds if they change. If you have a layer that is meant for only a subset of your dependency tree, then you only want that subset fingerprinted. This is more similar to #2644, but in the workspace, rather than dealing with dependencies.

https://crates.io/crates/cargo-chef is the primary place for Docker layer caching experiments. I'd recommend seeing if it meets your needs or if it could be made to meet them.
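For context on the cargo-chef suggestion: it splits the build into a "plan" stage that computes a dependency recipe and a "cook" stage that builds those dependencies as a cacheable layer. A rough sketch adapted from the cargo-chef README, with server1 substituted from this thread's example (image tags are illustrative):

```dockerfile
# Planner stage: compute a "recipe" describing the dependency tree.
FROM lukemathwalker/cargo-chef:latest-rust-1 AS chef
WORKDIR /app

FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

# Builder stage: build dependencies from the recipe (cached layer),
# then build the actual binary.
FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
RUN cargo build --release -p server1
```

Note that both COPY . . steps still expect the full workspace sources in the build context, which is part of why it may not fit the use case in this issue as-is.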

@jephthia
Author

That being said, this docker use case is slightly different from the others. Could you expand a bit on your workflow? I also wonder where the ignore file came from.

Let's say we have a virtual workspace with just two binaries, server1 and server2:

members = [
  "apps/server1",
  "apps/server2",
]

and the following file structure:

project/
├── apps/
│   ├── server1/
│   │   ├── src/
│   │   │   └── main.rs
│   │   └── Cargo.toml
│   └── server2/
│       ├── src/
│       │   └── main.rs
│       └── Cargo.toml
├── config/
│   └── docker/
│       ├── dev/
│       └── prod/
│           ├── Server1.Dockerfile
│           ├── Server1.Dockerfile.dockerignore
│           ├── Server2.Dockerfile
│           └── Server2.Dockerfile.dockerignore
├── Cargo.toml
├── Cargo.lock
└── docker-compose.yaml

Docker is used to build and run these binaries; the first workflow is for production and the second is for local development (both suffer from the same issue).

To build for production the following is used:
File: config/docker/prod/Server1.Dockerfile

FROM rust:1.80.1-bullseye AS build

RUN --mount=type=bind,source=apps/server1,target=apps/server1,rw \
    --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
    --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
    --mount=type=cache,target=target/ \
    --mount=type=cache,target=/usr/local/cargo/registry/ \
    <<EOF
set -e
cargo build -p server1 --locked --release <--- fails: "missing server2 member"
EOF

...

File: config/docker/prod/Server1.Dockerfile.dockerignore

# Ignore everything
*

# Include relevant files only
!apps/server1
!Cargo.toml
!Cargo.lock

Attempting to build this with
docker build -t server1 -f config/docker/prod/Server1.Dockerfile .
will fail since the sources of server2 aren't included in the build context.

For the local development workflow, it's similar but with docker-compose.yaml being used instead:

  server1:
    build:
      context: .
      dockerfile: config/docker/dev/Server1.Dockerfile
    working_dir: /code
    volumes:
      - ./apps/server1:/code/apps/server1
      - ./Cargo.toml:/code/Cargo.toml
      - ./Cargo.lock:/code/Cargo.lock
      - server1_cargo_target:/code/target/
      - server1_cargo:/usr/local/cargo
    ports:
      - "8080:8080"
    command: ["cargo", "watch", "-x", "run", "-p", "server1"] <--- Fails: missing server2 member

Running docker compose up server1 fails.


This is related to docker caching

Ah, caching isn't the concern here (we've long given up on getting caching to work correctly in CI 😅, at least for now).
We can accept slower builds as long as it at least builds, but with this issue, the builds aren't even possible.

This came up because our current setup does not make use of cargo workspaces and instead treats each binary separately, with its own target/ and Cargo.lock. As the monorepo grew, this became less than ideal, so we decided to switch to cargo workspaces, but we hit this issue.

Are there issues we're not considering that would come from building a specific binary in a workspace without the other binaries present? Our thinking was that this wouldn't cause problems, since each binary is unrelated to the others.

https://crates.io/crates/cargo-chef is the primary place for Docker layer caching experiments. I'd recommend seeing if it meets your needs or if it could be made to meet them.

Since caching isn't the issue here, we're a bit hesitant to take on a dependency to solve this. As a workaround, we're currently considering a config/cargos/ directory that would store a different Cargo.toml for each binary (e.g. Cargo-server1.toml, Cargo-server2.toml); during the mount, we'd pick the specific Cargo-xyz.toml needed for the build. But this extra maintenance feels like a hack that goes against the point of a workspace.
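An alternative to maintaining several hand-written manifests could be to prune the single Cargo.toml inside the build stage before invoking cargo. This is only a sketch, not a cargo feature: prune_members is a hypothetical helper, and it assumes each member path sits on its own line in the members list as "path",. As noted earlier in the thread, changing the members list can also affect dependency resolution.

```shell
# Hypothetical helper: comment out workspace members whose directory is
# missing from the build context. Assumes one quoted member path per line.
prune_members() {
  manifest="$1"
  while IFS= read -r line; do
    # Pull out a quoted path like  "apps/bin2",  if this line has one.
    member=$(printf '%s\n' "$line" | sed -n 's/^[[:space:]]*"\([^"]*\)",*.*/\1/p')
    if [ -n "$member" ] && [ ! -d "$member" ]; then
      # Escape slashes so the path can be used inside a sed expression.
      esc=$(printf '%s\n' "$member" | sed 's/[\/&]/\\&/g')
      sed "s/^\([[:space:]]*\)\"$esc\",*/\1# pruned: $esc/" "$manifest" > "$manifest.tmp"
      mv "$manifest.tmp" "$manifest"
    fi
  done < "$manifest"
}

# Usage (e.g. in the Dockerfile RUN step, before cargo build):
#   prune_members Cargo.toml
#   cargo build -p bin1 --release
```

One caveat: the --locked flag from the original command may then fail, since removing members can require changes to Cargo.lock.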

@epage
Contributor

epage commented Sep 19, 2024

So this would also be a problem for caching, iiuc, even if that isn't your problem.

This came up because our current setup does not make use of cargo workspaces and instead treats each binary separately with their own target/ and Cargo.lock but as the monorepo grew, this became less than ideal so we decided to switch to cargo workspaces but hit this issue.

Maybe I missed it but could you expand on why you don't mount the entire source?

@jephthia
Author

So this would also be a problem for caching iiuc, even if that isn't your problem.

Hmm, yes, most likely.

Maybe I missed it but could you expand on why you don't mount the entire source?

This is a recommended Docker best practice as far as I know. We have a monorepo which contains many unrelated things: an Android app, an iOS app, non-Rust libraries, many Rust binaries, etc. Our common workflow is to simply exclude everything and only include the project being built.

From: https://docs.docker.com/build/concepts/context/#dockerignore-files

This helps avoid sending unwanted files and directories to the builder, improving build speed, especially when using a remote builder.

@Rusty-Weasel

Rusty-Weasel commented Sep 20, 2024

Why don't you use something like this?

[workspace]
resolver = "2"

members = [
    "apps/*",
    "libs/*",
]

If the folder doesn't exist, there is no member;
if the folder exists, there is a member.

@jephthia
Author

Why don't you use something like this?

We have a monorepo which contains many unrelated things: an Android app, an iOS app, non-Rust libraries, many Rust binaries, etc.

[workspace]
resolver = "2"

members = [
   "apps/*",
   "libs/*",
]

This works when you have a simple flat structure and Rust-only apps/libs.
It fails as soon as you have anything else: non-Rust directories that do not contain a Cargo.toml, nested structures, etc. You then have to find the right combination of glob patterns to exclude them, which isn't even possible right now.

It is much simpler and easier to list all members directly, which is how cargo new works by default: it automatically adds the new member to the list, so we don't even need to do this ourselves.
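As a side note on the glob-exclusion point above: the workspace table does have an exclude key (documented in the Cargo Book) that removes paths matched by member globs. The directory names below are illustrative, and this still requires every remaining glob match to contain a Cargo.toml, so it doesn't address the nested-structure case:

```toml
[workspace]
resolver = "2"

members = [
    "apps/*",
    "libs/*",
]

# Illustrative: paths matched by the globs above that are not Rust packages.
exclude = [
    "apps/android",
    "apps/ios",
]
```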

@epage
Contributor

epage commented Sep 20, 2024

non-rust directories that do not contain a Cargo.toml

I wonder if we should loosen this restriction. #4593 is the issue for this and #11405 is wider discussion on this feature.

nested structure

If we had #4593, would this then be resolved? Would anything in #11405 help?

It is much simpler and easier to list all members directly, which is how cargo new works by default: it automatically adds the new member to the list, so we don't even need to do this ourselves.

But if you use globs, then that isn't a problem anyways.

@jephthia
Author

If we had #4593, would this then be resolved? Would anything in #11405 help?

If #4593 were resolved, with the addition of #6009 to exclude cargo directories when needed, then yes, the glob approach would work. But for us to consider that approach a "clean" solution, we'd want a way to disable the default behavior of cargo new, because right now, whenever we add a new binary or library with cargo new, the member is automatically added to the list with its full path.

i.e., if we had

members = [
    "apps/*",
    "libs/*",
]

and ran cargo new apps/binA, we would end up with

members = [
    "apps/*",
    "libs/*",
    "apps/binA",
]

each time something is added. You'd then have to remember to go to the root Cargo.toml and remove the new entry, which makes this approach feel slightly "broken"; we can imagine developers coming to the project (especially if they are new to Rust) wondering why we don't just use the default approach of letting cargo new add the new member as a full path.

This is what we're used to in other ecosystems, like yarn workspaces (JavaScript), where a workspace such as:
File: package.json

{
    "private": true,
    "workspaces": [
        "apps/web_admin",
        "doesnt_exist"
    ]
}

can still be built even if a member isn't present.

May I ask what the rationale is behind refusing to build if a listed member is not present?
If the member is missing but not referenced by this build, wouldn't it just be a no-op? Whereas
if the member is missing but referenced, the build would fail appropriately once it reached the place where the member is referenced but cannot be found.

i.e., isn't the outcome the expected one in either case? Is there something we're not considering?

@Rusty-Weasel

Is there something we're not considering?

Yes, this: "We have a monorepo which contains many unrelated things, android app, ios app, non-rust libraries, many rust binaries, etc."

On the one hand, you wish to have a clean solution;
on the other hand, you expect things from cargo that it was not built for.

Cargo is for Rust and its structures; it is not there to handle unrelated things outside the scope of Rust.

When you mix the tools, package managers, and projects of different programming languages, it is always a challenge
if they are not cleanly separated and the configs properly adjusted. For example:

IDE: RustRover + IntelliJ
Repo: mono
Difficulty: 8/9 of 10 😄
Fun Level: 👎

Rust workspaces with Rust libs/bins + Flutter (Android/iOS apps) + vcpkg + Docker + GitHub workflows can work smoothly together,
if you manually take care of different points in many places.

Then you've got a monorepo with Rust + Dart + Kotlin + C/C++ and perhaps some useful Python scripts,
but none of this is a cargo issue; it's a project-structure or tool-compatibility issue.

The more different tools or languages you have in use, the higher the difficulty factor grows.

My best solution for this is to reorganize your monorepo and perhaps adjust many configs
to get a clean and compatible structure. That, in my eyes, is your real challenge.

we'd then want a way to disable the default behavior of cargo new because right now whenever we add new binaries or libraries with cargo new the member is automatically added to the list with its full path.

With RustRover the problem does not occur. Which IDE do you use? Or are you using vim/neovim?
