Modules #2121

Closed
34 changes: 34 additions & 0 deletions 0000-modules/README.md
@@ -0,0 +1,34 @@
- Feature Name: modules
- Start Date: 2017-08-07
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

This is a redesign of the Rust module system, intended to improve its
ergonomics, learnability, and locality of reasoning. Because this is a
relatively large proposal, it has been broken into multiple text files.

# Table of Contents

* **[Motivation][motivation]** - why we propose to make this change
* **[Overview][overview]** - a high level overview of what it will be like to
  use the new system as a whole
* **Detailed design** - the details of the proposal, broken into multiple
  sections:
    * **[Loading Files][loading-files]**
    * **[The `local` keyword][local]**
    * **[Use, mod, and export][use-mod-export]**
* **[Migration][migration]** - this proposal involves migrating from one system
  to another, and this section describes it in detail.

Each of the detailed design subsections contains its own description of
drawbacks and alternatives.

[motivation]: motivation.md
[overview]: overview.md
[loading-files]: detailed-design/loading-files.md
[local]: detailed-design/local.md
[use-mod-export]: detailed-design/use-mod-export.md
[migration]: migration.md
231 changes: 231 additions & 0 deletions 0000-modules/detailed-design/loading-files.md
@@ -0,0 +1,231 @@
# Loading Files

When building a Rust project, rustc will load and parse some files as Rust code
in addition to the root module. These will be used to construct a module tree.
By default, cargo will generate the list of files to load in this way for you,
though you can generate such a list yourself and specify it in your
`Cargo.toml`, or you can generate the list in another way for your non-cargo
build system.

This eliminates the need to write `mod` statements to add new files to your
project. Instead, files will be picked up automatically as a part of your
module hierarchy.

## Detailed design

### Processing the `--modules` list (rustc)

rustc takes a new argument called `--modules`, which takes a space-separated
list of files. Each file will be treated as a module, and rustc will attempt to
open and parse every file listed, reporting an error for any file it cannot
load. It will mount these files as a tree of Rust modules using rules which
mirror the current rules for looking up modules.

It will not attempt to open or parse files where:

* The file name is not a valid Rust identifier followed by `.rs`.
* The file is not in a subdirectory of the directory containing the root
module.
Contributor

Depending on how this "subdirectory of" check is implemented it could get pretty tricky:

  • Do the prefixes given to --modules have to match? If not:
  • How are paths normalised? How are soft/hard links handled?

In most cases this probably does not matter, but it may still be worth considering.

Contributor Author

These are implementation details that don't need to be decided at the RFC stage. This is just a sanity check that we can transform the directory into paths.

* Any of the subdirectories of the root module in the path to this file are not
valid Rust identifiers.
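
The three conditions above can be sketched as a small predicate. This is an illustration only, not rustc's implementation: `is_identifier` and `is_loadable` are hypothetical names, the identifier check is ASCII-only and deliberately does not reject keywords (so `mod.rs` passes), and the containment check is purely lexical, so it takes no position on how links or unnormalised prefixes should be handled.

```rust
use std::path::{Component, Path};

/// Returns true if `name` has the shape of a Rust identifier.
/// Simplification: ASCII only, and keywords are not rejected.
fn is_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c == '_' || c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    name != "_" && chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
}

/// Decides whether `file` (an entry of the `--modules` list) would be
/// loaded, given the directory `root_dir` that contains the root module.
fn is_loadable(root_dir: &Path, file: &Path) -> bool {
    // The file must live under the root module's directory (lexical check).
    let relative = match file.strip_prefix(root_dir) {
        Ok(rel) => rel,
        Err(_) => return false,
    };
    let mut names: Vec<&str> = Vec::new();
    for component in relative.components() {
        match component {
            Component::Normal(os) => match os.to_str() {
                Some(s) => names.push(s),
                None => return false,
            },
            // `..` and similar components are treated as not eligible.
            _ => return false,
        }
    }
    let (file_name, dirs) = match names.split_last() {
        Some(parts) => parts,
        None => return false,
    };
    // The file name must be an identifier followed by `.rs`.
    let stem_ok = file_name.strip_suffix(".rs").map_or(false, is_identifier);
    // Every directory between the root and the file must be an identifier.
    stem_ok && dirs.iter().all(|d| is_identifier(d))
}
```

Under these assumptions, `src/foo/bar.rs` is accepted for a root directory of `src`, while `src/foo-bar/baz.rs`, `src/notes.txt`, and `tests/a.rs` are silently skipped.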

Cargo's default system will not pass any files that would be ignored by these
conditions, but if they are passed by some other system, they are ignored
regardless. For example, in a cargo managed crate with no dependencies, this
would be a valid way to invoke rustc by hand:

```
rustc src/lib.rs --modules src/*.rs src/**/*.rs
```

Contributor

Nit: For clarity, I think this should either be `--modules "src/*.rs src/**/*.rs"` or `--modules src/*.rs --modules src/**/*.rs` (otherwise `src/**/*.rs` would be a "free" argument).

Contributor

In this example, are the arguments expanded by rustc or the shell? If they are expanded by rustc, I think the reasoning behind the following statement from the alternatives makes less sense:

> We could also put the file-lookup in rustc, instead of cargo, and have rustc perform its own directory walk. We believe this would be a bad choice of layering.

Since rustc is already doing a directory walk, why not have only rustc do the directory walk.

Contributor Author

no they are expanded by the shell

Member

Should be pointed out in the RFC IMO. I've stumbled across the same question when reading the rendered doc.

Member

@withoutboats Expansion by shell would never work since there is hard limit on the number arguments you can pass to a program. 128KB worth or 1/4 of your stack size.

Rust will mount files as modules using these rules:

* If a file is named `mod.rs`, it will be mounted as the module named after the
  directory which contains it. (The directory containing the root module cannot
  contain a `mod.rs` file; this is an error.)
* Otherwise, it will be mounted as a module with the name of the file prior to
  the `.rs`.

All modules mounted this way are visible to the entire crate, but are not (by
default) visible in the external API of this crate.

If, during parsing, a `mod` statement is encountered which would cause Rust to
load a file which was a part of the `--modules` list, this statement will be
used to control the visibility of that module. If the module was not a part of
the `--modules` list, it will be loaded in the same way that it is loaded
today.

If a module is mounted multiple times, or there are multiple possible files
which could define a module, that continues to be an error.
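
As a rough sketch of the mounting rules, the function below maps a file path (relative to the directory containing the root module) to the module path it would be mounted at. The name `module_path` and the use of `/` as the separator are assumptions made for this illustration; it is not how rustc is specified to implement the lookup.

```rust
/// Maps a file path, relative to the root module's directory, to the
/// module path it would be mounted at. Returns `None` for a `mod.rs`
/// placed directly in the root directory (an error per the rules above)
/// and for names that do not end in `.rs`.
fn module_path(relative: &str) -> Option<Vec<String>> {
    let mut segments: Vec<&str> = relative.split('/').collect();
    let file_name = segments.pop()?;
    let stem = file_name.strip_suffix(".rs")?;
    if stem == "mod" {
        // `foo/mod.rs` is the module for its directory, `foo`.
        if segments.is_empty() {
            return None;
        }
    } else {
        // `foo/bar.rs` is mounted as `foo::bar`.
        segments.push(stem);
    }
    Some(segments.into_iter().map(String::from).collect())
}
```

With this mapping, `foo.rs` and `foo/mod.rs` both resolve to the module `foo`, which is the "multiple possible files" case that remains an error.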

Another result of this design is that the naming convention becomes slightly
more flexible. Prior to this RFC, if a module file is going to have submodule
files, it must be located at `mod.rs` in the directory containing those
submodules - e.g. `src/foo/mod.rs`. As a result of this RFC, users can instead
locate it at `src/foo.rs`, but still have submodules in the `foo` directory.
Some users have requested this functionality because their tooling does not
easily support distinguishing files with the same name, such as all of their
`mod.rs` files.

In fact, in this design, it is not necessary to have a `foo.rs` or `foo/mod.rs`
in order to have modules in the `foo` directory. Without such a file, `foo`
will just have no items in it other than the automatically loaded submodules.
For example:

```
/foo
    bar.rs
    baz.rs
lib.rs
```

This mounts a submodule `foo` with two items in it: submodules `bar` and `baz`.
There is no compiler error.

#### The `#[ignore]` attribute

Additionally, modules can be annotated with the `ignore` attribute. This
attribute will be treated as a kind of unsatisfiable cfg attribute - a module
tagged `#[ignore]` will not be compiled.

The ignore attribute can take any number of attribute arguments, which are
paths. These are relative paths to items (usually modules) which should be
ignored. Without an argument, `#[ignore]` is `#[ignore(self)]`. But you could
also write:

```rust
#![ignore(foo, bar::baz)]
```

This ignores both the `foo` and `bar::baz` submodules of this module, along
with all of their submodules.

### Gathering the `--modules` list (cargo)

#### Library and binary crates

When building a crate, cargo will collect a list of paths to pass to rustc's
`--modules` argument. It will only gather files for which the file name
has the form of a valid Rust identifier, followed by the characters `.rs`.

cargo will recursively walk the directory tree, gathering all appropriate
files, beginning with the directory which contains the crate root file. It will
ignore these files and directories:

* The crate root file itself.
* Any directory with a name which is not a valid Rust identifier.
* If the crate root is in the `src` subdirectory of the Cargo manifest
directory, and there is a directory called `src/bin`, cargo will ignore that
subdirectory.

In short, cargo will include all appropriately named files inside the directory
which contains the crate root, except that it will ignore the `src/bin`
directory.
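
The walk described above might look roughly like the following. This is a sketch under stated assumptions rather than cargo's implementation: `gather_modules`, `walk`, and `is_identifier` are hypothetical names, the identifier check is ASCII-only and does not reject keywords, the directory to skip (for example `src/bin`) is supplied by the caller instead of being derived from the manifest, and symbolic links are simply followed.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Simplified identifier check: ASCII only, keywords not rejected.
fn is_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    let first_ok = matches!(chars.next(), Some(c) if c == '_' || c.is_ascii_alphabetic());
    first_ok && name != "_" && chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
}

/// Collects the default module list for a crate whose root file is
/// `root_dir/root_file`. `skip_dir` is a directory to leave out.
fn gather_modules(
    root_dir: &Path,
    root_file: &str,
    skip_dir: Option<&Path>,
) -> io::Result<Vec<PathBuf>> {
    let crate_root = root_dir.join(root_file);
    let mut found = Vec::new();
    walk(root_dir, &crate_root, skip_dir, &mut found)?;
    found.sort(); // directory iteration order is unspecified
    Ok(found)
}

fn walk(
    dir: &Path,
    crate_root: &Path,
    skip_dir: Option<&Path>,
    found: &mut Vec<PathBuf>,
) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let name = match path.file_name().and_then(|n| n.to_str()) {
            Some(n) => n.to_owned(),
            None => continue,
        };
        if path.is_dir() {
            // Skip directories that are not identifiers, and the excluded one.
            if is_identifier(&name) && Some(path.as_path()) != skip_dir {
                walk(&path, crate_root, skip_dir, found)?;
            }
        } else if path.as_path() != crate_root
            && name.strip_suffix(".rs").map_or(false, is_identifier)
        {
            // An identifier followed by `.rs`, and not the crate root itself.
            found.push(path);
        }
    }
    Ok(())
}
```

The result is sorted only to make the output deterministic; the text above does not specify an order for the list.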

Packages containing multiple crates which wish to use the default module list
will need to make sure that they do not have multiple crates rooted in the same
directory, or within a subdirectory of another crate. The most likely
problematic crates today are those which have both a `src/lib.rs` and a
`src/main.rs`. We recommend that those crates move their binary crate into the
`src/bin` directory.

While gathering the default module list, cargo will check whether any other
crate is rooted in a directory that the walk would cover. If so, cargo will
not pass a `--modules` list at all and will instead issue a warning, informing
users that they need to rearrange their crates or provide a list of modules
themselves.

(**Note:** These projects will receive a warning, but will not be broken,
because the `mod` statements they already contain will continue to pick up
files.)

#### Tests, examples, and benchmarks

Test, example, and benchmark crates follow a different set of rules. If the
crate is located in the appropriate top-level directory (`tests`, `examples`,
and so on), no `--modules` list will be collected by default. However,
subdirectories of these directories will be treated as individual binary
crates: a `main.rs` file will be treated as the root module, and all other
appropriately named files will be passed as `--modules`, using the same
rules described above.

So if you have an examples directory like this:

```
examples/
    foo.rs
    bar/
        main.rs
        baz.rs
```

This contains two examples, a `foo` example and a `bar` example, and the `bar`
crate will have `baz.rs` as a submodule.
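
One way to picture this grouping is the sketch below, which sorts paths relative to `examples/` into crates. `ExampleCrate` and `example_crates` are hypothetical names for illustration; the sketch assumes `/` separators, does not validate identifiers, and does not handle a name clash between `foo.rs` and `foo/main.rs`.

```rust
use std::collections::BTreeMap;

/// One example crate: its root file and the files passed as `--modules`.
#[derive(Debug, Default, PartialEq)]
struct ExampleCrate {
    root: Option<String>,
    modules: Vec<String>,
}

/// Groups file paths (relative to `examples/`) into crates by name.
fn example_crates(files: &[&str]) -> BTreeMap<String, ExampleCrate> {
    let mut crates: BTreeMap<String, ExampleCrate> = BTreeMap::new();
    for &file in files {
        match file.split_once('/') {
            // A top-level file is its own single-file crate, with no
            // `--modules` list.
            None => {
                if let Some(stem) = file.strip_suffix(".rs") {
                    crates.entry(stem.to_string()).or_default().root = Some(file.to_string());
                }
            }
            // Files in a subdirectory belong to the crate named after it.
            Some((dir, rest)) => {
                let entry = crates.entry(dir.to_string()).or_default();
                if rest == "main.rs" {
                    entry.root = Some(file.to_string());
                } else if rest.ends_with(".rs") {
                    entry.modules.push(file.to_string());
                }
            }
        }
    }
    crates
}
```

Applied to the directory listing above, this yields a `foo` crate rooted at `foo.rs` with no modules and a `bar` crate rooted at `bar/main.rs` with `bar/baz.rs` as its only module.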

The reason for this is that today, cargo will treat every file in `tests`,
`examples`, and `benches` as independent crates, which is a well-motivated
design. Usually, these are small enough that a single file makes sense.
However, today, cargo does not make it particularly easy to have tests,
examples, or benchmarks that are multiple files. This design will create a
pattern to enable users to do this.

#### The `load-modules` target flag

Target items in the Cargo.toml have a `load-modules` flag, which is set to true
by default. Setting it to false causes cargo not to pass a `--modules` list at
all.

For example, a crate with just a library that does not want cargo to calculate
a modules list would have a toml like this:

```toml
[package]
name = "foo"
authors = ["Without Boats <[email protected]>"]
version = "1.0.0"

[lib]
load-modules = false
```

In practice, setting this flag to false will make mod statements necessary for
loading additional files in the project.

## Drawbacks

The RFC authors believe that making mod statements unnecessary is a *net* win,
but we must acknowledge that it is not a *pure* win. There are several
advantages that mod statements bring which will not be fully replicated in the
new system.

Some workflows have benefited from the fact that statements need to be
added to the source code to add new files as modules. For example, it makes it
easier for users to leave their src directories a little bit dirty while
working, such as through an incomplete `git stash`. If users wish to comment
out a module, it can be easier to comment out the `mod` statement than to
comment out the module file. In general, it enables users to leave code which
would not compile in their src directory without explicitly commenting it out.

Some users have expressed strong concerns that by deriving the module structure
from the file system, without making additional syntactic statements, they will
not be able to as easily find the information they need to navigate and
comprehend the codebases they are reading or working on. To partly ease their
concern, the RFC allows users to explicitly specify their module lists at the
build layer, instead of the source layer. This has some disadvantages, in that
users may prefer to not have to open the build configuration either.

This will involve migrating users away from `mod` statements toward the new
system.

## Alternatives

An alternative is to do nothing, and continue to use `mod` statements.

We could also put the file-lookup in rustc, instead of cargo, and have rustc
perform its own directory walk. We believe this would be a bad choice of
layering.

During the design process, we considered other, more radical redesigns, such as
making all files "inlined" into their directory and/or using the filename to
determine the visibility of a module. We've decided not to take steps that are
this radical right now.

Member

decided not to steps