Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy: Use of Minirust/DSLs to specify Semantics #55

Open
chorman0773 opened this issue Apr 19, 2024 · 14 comments
Open

Policy: Use of Minirust/DSLs to specify Semantics #55

chorman0773 opened this issue Apr 19, 2024 · 14 comments
Labels
C-meta Category: Meta discussion about the repository itself. We should refine each use of the policy label

Comments

@chorman0773
Copy link
Contributor

I'd like for us to have a full discussion about the use of Minirust and other DSLs as a normative part of the specification - how much of the spec should be written in Minirust or another DSL vs. prose, and when it's allowed to be used in place of prose.

The Operational Semantics Team (or members thereof) should be involved in this discussion as applied to the Dynamic Semantics chapter.

@JoelMarcey JoelMarcey added the C-meta Category: Meta discussion about the repository itself. We should refine each use of the policy label label Apr 24, 2024
@mattheww
Copy link

If MiniRust ends up being the authoritative description of the operational semantics, that leaves the question of what English text will accompany it.

At one end of the scale we could be telling users of the spec that they're expected to read through MiniRust's code, and there's a little natural language commentary to guide that and supply a few non-normative notes.

At the other end of the scale there could be a complete English restatement of what MiniRust's code does, with any divergence between them treated as a bug in the spec.

@joshlf
Copy link

joshlf commented Apr 26, 2024

My 2 cents: In my experience, the English prose used in the standard library documentation and Reference has historically suffered from a lack of agreement about what terms actually mean. [1] Worse, it's often not clear whether an author has intended to use an English word in its colloquial or technical meaning. While it's in principle possible to create a complete glossary, there's no mechanism other than human review to ensure that words are always used consistent with that glossary. By contrast, code specifications are clear with respect to intent. There's no ambiguity about whether a particular term is meant in its colloquial or technical meaning since there is no colloquial meaning.

For that reason, I'd personally prefer that whatever English prose accompanies the MiniRust be non-normative. That doesn't have any bearing on how much prose there is, just that the MiniRust is always the source of truth.


[1] For example, see this thread, in which multiple people intimately familiar with discussions regarding Rust's semantics realize that they don't agree on how to use the terms "live", "active", and "invalidated".

@chorman0773
Copy link
Contributor Author

We'd need normative prose to define Minirust in the first place, and anything that we can't use Minirust to define.

@joshlf
Copy link

joshlf commented Apr 26, 2024

I take it MiniRust is not yet feature-complete in the sense that the entire spec could not be written in terms of MiniRust? Is the intention for MiniRust to eventually get to that point in the future?

@chorman0773
Copy link
Contributor Author

The entire spec can't even be written in Minirust - Minirust is a tool for expressing the operations on the abstract machine performed by a fully parameterized Rust program. It doesn't handle, for example, the whether or not an input rust program is well-formed. We also need prose to express the relationship between a given Rust program, and the corresponding Minirust program.

@RalfJung
Copy link
Member

RalfJung commented May 1, 2024

I think there's an appeal to the approach of having English text and a formal DSL (a-mir-formality / MiniRust) side-by-side. This takes some pressure off of the English text, meaning we can lean a bit more on the side of readability, while also reducing ambiguity. Divergence between the two should be treated as a spec bug.

There's also various ways to incorporate light-weight DSLs for other parts of the spec, e.g. I think that #15 could benefit quite a bit from defining terms like "representation" and "layout" in a more formal way. Even just specifying, e.g., that "representation relation" is a function of type fn(representation: List<AbstractByte>, value: Value) -> bool, e.g. a relation between (byte-level) representations and (abstract) values, already helps a lot. In particular, it forces one to define what AbstractByte and Value are. Even if the concrete representation relations are then described purely in prose, having these key concepts defined more formally is a huge step up from e.g. C and C++ where such basic terms are still the subject of disagreement and debate.

@mattheww
Copy link

mattheww commented May 1, 2024

Is MiniRust ready for serious work on that sort of English text, or does it need to cook a bit more first?

@RalfJung
Copy link
Member

RalfJung commented May 2, 2024

Unsized types are still missing, I hope to have a student working on that later this year. They might require some far-reaching changes. Other than that I think the core structure of MiniRust is pretty solid now.

I don't have any idea how to best manage the side-by-side with the English text though. MiniRust is written in "literate programming" style to support mixing it with text, but I am not sure that's the best approach -- in particular, the Ferrocene spec is extremely good at making everything linkable, and I don't know how to do the same with a mix of code and text like we may have here.

@mattheww
Copy link

Another part of the question is what to do about documenting the pseudo-Rust that MiniRust is written in.

I think that's not as lucid as it once was (for example, there's now a call to an extract() method in the encode definition for Enums that doesn't seem to be defined or described anywhere).

@chorman0773
Copy link
Contributor Author

So, FTR, I'd expect Minirust itself would be defined in prose - we'd then use the Minirust Language to explain the semantics of a Rust program.

@RalfJung
Copy link
Member

RalfJung commented May 21, 2024

I think that's not as lucid as it once was (for example, there's now a call to an extract() method in the encode definition for Enums that doesn't seem to be defined or described anywhere).

Good catch! That's a specr-transpile bug. I changed the code a bit to avoid the need for extract.

@chorman0773 I sure hope that Rust will have an authoritative machine-readable spec for its core language, to complement the prose. Prose is just too ambiguous as the sole source of truth, as experience confirms time and time again.

@digama0
Copy link

digama0 commented May 21, 2024

@chorman0773 I sure hope that Rust will have an authoritative machine-readable spec for its core language, to complement the prose. Prose is just too ambiguous as the sole source of truth, as experience confirms time and time again.

While I agree to an extent, there is a classic bootstrapping issue here: if you use a formal spec language to define the language, then you need another spec to define the formal spec language... You have to either bottom out at prose or hope that the root is "obvious enough as to not need explanation", and I'm not sure we can really get the latter. (You have the same issue with prose specs; usually this is addressed with some kind of initial "term definition" prologue but ultimately still boils down to natural language that the user is expected to know already.)

The fact that specr is so similar to rust is a downside here, as it means that specifying it may not be significantly simpler than specifying rust, so you won't have made progress toward the root of the bootstrap chain.

@chorman0773
Copy link
Contributor Author

And at some point, specifying a completely different language which is then used to specify Rust becomes more writing effort than just specifying Rust directly.

@RalfJung
Copy link
Member

specr is a small subset of Rust, in particular a subset without any of the nasty questions around evaluation order, drop, or UB. So I think it is fair to say that be expressing our intend in prose and in specr, we have made significant progress towards the bootstrap root.

I'm not saying that should replace the English text that explains what happens, but I think complementing the English prose with something less ambiguous would be extremely valuable.

So, specr doesn't need to be "obvious enough as to not need explanation", it just needs to be "obvious enough as to not need explanation when there's also English text next to the code that says what the code does".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-meta Category: Meta discussion about the repository itself. We should refine each use of the policy label
Development

No branches or pull requests

6 participants