Skip to content

Commit

Permalink
Fixes to fn pointer casts. Updated QUICKSTART.
Browse files Browse the repository at this point in the history
  • Loading branch information
FractalFir committed Aug 5, 2024
1 parent 728788e commit a91f881
Show file tree
Hide file tree
Showing 8 changed files with 94 additions and 243 deletions.
251 changes: 45 additions & 206 deletions QUICKSTART.md
Original file line number Diff line number Diff line change
@@ -1,237 +1,76 @@
This guide will show you how to:

1. Build the project
2. Use the project to build a `Hello Wolrd!` example.
3. Show how to debug/catch miscompilations
4. Explain some inner workings of the project

# Building the project
## Install dependencies
The project has only 2 external dependencies, which usually come packaged together.
1. ILASM - necessary for the project to be able to produce .NET assemblies. Both the `mono` and `dotnet` flavors are supported. Should be included with your runtime.
2. The .NET runtime - needed for running tests and assemblies. Preferably the newest version, install from [here](https://dotnet.microsoft.com/en-us/download).

The older `mono` runtime does not support 128 bit ints, so they + debug builds will not work.

## Update `rustc`

The project will ONLY work with the newest, nightly `rustc`. Run `rustup update` to get the newest `rustc` version.

## Switch to nightly rust

The codegen can only be used in nightly rust.

Run `rustup default nightly` to switch to nightly rustc.

## Download the project

Run `git clone [email protected]:FractalFir/rustc_codegen_clr.git --depth 1` to download the newest version of the project.
Using `git` makes it easier to update.

## Run `cargo test`

This will automatically build all the components of the project: the codegen and the linker. It will also enable you to quickly check if everything is working correctly. As of the writing of this guide, 62 tests should pass. This number may go up and
down as the project develops, but if more than 50 tests pass, the build process succeeded.

# Building a `Hello World!` example

## Enabling the codegen

Currently, the project must be enabled for each shell session(command prompt). This means that your Rust installation is not changed in any way: the project is enabled temporarly, and will be disabled when you close your shell(command prompt window).

To start using the project, run this command: `cargo run --bin rustflags`. This will run a small utility program, which will tell you the proper configuration for your system.

You should see a couple of helpful instructions, and after the phrase:
Clone the repo using this command:
```
You may use the following command to quickly set the required environment variables:
git clone [email protected]:FractalFir/rustc_codegen_clr.git
```
you should see a command, which when typed into your terminal(command prompt) will enable the project. From now, when you run `cargo build` or `cargo run` in that particular shell sesion(window), the codegen will be invoked, and will compile your
Rust code for .NET. This change will not affect any other shell session(command prompt window), and will be undone as soon as you close the window.

## Creating a `hello world!` example

Run

`cargo new rust_dotnet_example`

to create the example project.

Create a directory named `.cargo` inside `rust_dotnet_example`. Inside `.cargo`, create a file named `config.toml`. This will be used tell cargo to build a .NET version of the standard library.
Open `config.toml` and set its contents to the following:

```toml
[build]
# Keep it as "x86_64-unknown-linux-gnu", there is currently no .NET-specific target triple.
target = "x86_64-unknown-linux-gnu"
[unstable]
# Will build `core`,`alloc` and `std` for .NET. "panic_abort" disables unwinding.
build-std = ["core","alloc","std","panic_abort"]
## Instal the newest nightly rust
```

This example will not use the Rust/C# interop layer, so we will be using unsafe functions.

Set the contents of `main.rs` to:

```rust
extern "C" {
fn puts(msg: *const u8);
}
fn main() {
// A heap-allocated string!
let mut string = String::with_capacity(100);
string.push('H');
string.push('e');
string.push('l');
string.push('l');
string.push('o');
string.push('.');
string.push('\n');
string.push('\0');
unsafe{puts(string.as_ptr())};
// String literals work too.
unsafe{puts("Rust + NET = <3!\n\0".as_ptr())};
}
rustup install nightly
```
## Running the project

*Make sure the project is enabled in this console window.*
On Linux, just type:

`cargo run`

You should see a bunch of error/warning/debug messages appear. They are expected, and most of them serve as reminders about things that could cause issues in the future.

On other unix-es, `cargo run` should just work too. On windows, `cargo run` may(?) work, but I it probably will not. Just search for the `.exe` file within the `target` directory, and run it by typing `dotnet C:/PATH_TO_YOUR_FILE_HERE/FILE.exe`.
## Note about std and core

While a large chunk of `core` works, most of `std` does not. Using almost any std function will cause the final executable to crash.

## Interop capabilites
Run `cargo add mycorrhiza`. This should add the `mycorrhiza` interop layer to your project. You can replace `main.rs` with:

```rust
fn main() {
// C#'s String Builder!
let sb = mycorrhiza::system::text::StringBuilder::empty();
// We assemble a managed string(C# String)
sb.append_char('H');
sb.append_char('e');
sb.append_char('l');
sb.append_char('l');
sb.append_char('o');
sb.append_char(' ');
sb.append_char('W');
sb.append_char('o');
sb.append_char('r');
sb.append_char('l');
sb.append_char('d');
sb.append_char('!');
// Calls `StringBuilder.ToString()`
let mstr = sb.to_mstring();
// Calls `Console.WriteLine()`
mycorrhiza::system::console::Console::writeln_string(mstr);
}
## Check that the project is complatible with the version of Rust you installed.
```
cargo check
```
If the project does not compile due to error messages, try updating your rust version.
```
rustup update
```
After you run the project again, you should see `Hello World!` appear on the screen.

This example uses the builtin interop types, but you can call ANY C# method, and store references to ANY C# type.
## Writing your own interop code: Constructors + Calling any C# function

*TODO: Showcase custom interop - method,properties, constructors, virtual functions*

# How to debug/catch miscompilations
Quite often, the assembly crated form your code will not work.
Here are some steps which can help you pinpoint the issue.
## Kinds of errors
### A statement failed to compile
If your executable throws an exception with the message: `Tired to run a statement STATEMENT which failed to compile with error message MESSAGE`, this means that the codegen could not properly compile a MIR statement.

You can search the compilation logs to get the exact cause.
### Tried to invoke missing method
This message means that the codegen could not compile a whole function. This can be a sign of either:
1. A missing intrinsic/libc function(function has a short, clear name)
2. A Rust method which used some unsupported type/calling convention(function has a long, mangled name)
3. Call target was not properly resolved(function has a long, mangled name, and is generic)

You can search the compilation logs to get the exact cause.
### Called `panic` or `abort` or a NullReferenceException is thrown
If this occurs only in code produced by `rustc_codegen_clr`, this is a sign of a miscompiled statement(a statement compiled, but not how it should). It is hard to pinpoint the cause of this kind of issue, since it may be far from the source of the
exception.
### InvalidProgramException

The codegen has compiled a function incorrectly, and the runtime can't JIT it. Usually caused by DSTs, I am working on fixing this issue.

## Stack traces and demangling

When an exception occurs, you will get a stack trace. You can see the mangled name of the function where the exception was thrown. To demangle the name:
1. Replace all occurrences of `\_ds\_`(DolarSign) with $.
2. Replace all occurrences of `\_dd\_`(DoubleDot) with ::.
3. Use rustfilt to demangle the symbol. Example:

`rustfilt _ZN3std3sys6common14small_c_string18run_path_with_cstr17hf87a5f08beda9d2bE` -> `std::sys::common::small_c_string::run_path_with_cstr`

This should enable you to pinpoint the offending function.
If this does not solve the problem, open an issue. The project is only compatible with a narrow set of `rustc` versions, and tries to always use the newest nigthly build.
## Install the .NET runtime

## Enviroment varaibles related to debugging.
This project requires the .NET runtime to be used. You can find [instalation instructions here](https://dotnet.microsoft.com/en-us/download).

You can set the following environment variables to aid you in debugging:
After installing .NET, run ` dotnet --info` to confoirm it is installed propely.

`TRACE_STATEMENTS` - prints debug info for each executed MIR statement. Useful for diagnosing miscompilations. Also disables optimization. Set to `1` or `true` to enable.
`INSERT_MIR_DEBUG_COMMENTS` - similar to `TRACE_STATEMENTS`, but the debug info will be only present in the `.il` file.
`ALWAYS_INIT_LOCALS` - Makes all local variables be zero-initialized. Set to `1` or `true` to enable.
## Install `ilasm`

After changing the environment variables, you will need to run `cargo clean` and recompile for the changes to take effect.
### Windows

# The structure of the project - Guide for contributors
`ilasm` comes [installed with Visual Studio](https://learn.microsoft.com/en-us/dotnet/framework/tools/ilasm-exe-il-assembler). So, you propably already have it. If not, install visual stuido.

### Linux or MacOS

I will first try to explain why the project is structured the way it is. By showcasing how I came to the current design, I hope to make it easy to understand how everything fits together. This design is not final, and may be adjusted as things
change.
This tool supports both the "Core" and "Mono" flavours of ILASM. While you *can* install / build `ilasm` separeately, installing the [mono runtime](https://www.mono-project.com/download/stable/) is the easiest option.

## Functional
### Checking the dependencies

The project is designed in a functional fashion: most of the functions are pure(take immutable arguments) and will always return the same output for the same input. This is especially true for the MIR-to-CIL translator. I want this process
to be highly predictable and consistent.
After you installed `dotnet` and `ilasm`, run `cargo run --bin rustflags` to check if you installed `ilasm` and `dotnet` correctly. This program will check your enviroment, build the project, and print the flags you need to use it.

This makes local variable allocation a bit trickier(it is implemented using special variants of the CILOp enum), but helps to catch bugs.
### Running tests

An exception to this purely-functional style is `TyCache` - it is taken by a mutable reference, but only to enable caching of type conversion results. A `TyCache` will always behave the same: empty or full.
As one last step, you can run `cargo test ::stable` to check if the codegen runs as expected. This can take a minute or two, but it should ensure the project is working as expected.

## Where to start looking
# Using the project

### Understanding CIL
The project is *relatively* simple to use. You will still have to do some setup to get `core` and `std` working, but it is a straigtforward process.

Understanding the project does require having a very basic familiarity with CIL. It is stack - based, which means that all the operations occur on an imaginary stack. For example, look at this method:
At the root of your crate, create a directory named `.cargo`. In this directory, create a file named `config.toml`, with the following contents.

C#
```csharp
int Add(int a,int b){
return a + b;
}
```
CIL
```cil
.method static int32 Add(int32 a,int32 b){
// Load argument 0 onto the imaginary stack
ldarg.0
// Load argument 1 onto the imaginary stack
ldarg.1
// Add the top 2 values form the imagnary stack, and push the result.
add
// Return the top value from the stack
ret
}
[build]
target = "x86_64-unknown-linux-gnu" # Change to the host target.
[unstable]
build-std = ["core","alloc","std","panic_abort"]
```
You can take a look at [this website](https://sharplab.io/). It allows you to see how C# code translates to CIL, which should help you understand CIL better.

## Simplest case: `binop.rs`
Then, run the codegen utility `rustflags` again, by running `cargo --bin rustflags --release` **in the directory `rustc_codegen_clr`**.

A good place to start looking is the `binop.rs` file. It shows how to load a MIR operand on the stack, and how to preform an arithmetic operation on it. It also shows how the project deals with types.
It should provide you with the commands necesary for enabling the codegen. They should look something like this:

## The main MIR handling code: `statement.rs`
On Linux:
```bash
export RUSTFLAGS="-Z codegen-backend=/home/USER/rustc_codegen_clr/target/release/librustc_codegen_clr.so -C linker=/home/USER/rustc_codegen_clr/target/release/linker -C link-args=--cargo-support "
```
On Windows:
```powershell
$Env:RUSTFLAGS = '-Z codegen-backend=C:\Users\USER\rustc_codegen_clr\target\release\librustc_codegen_clr.dll -C linker=\Users\USER\rustc_codegen_clr\target\release\linker.exe -C link-args=--cargo-support '
```

`statement.rs` is the function which takes a MIR statement, and calls the relevant functions to translate MIR. `rvalue.rs` is used to evaluate the right hand side of a statement.
You can then run this command in your shell session (aka. command prompt window). This command will enable the codegen for that session(command prompt window) only!
The project makes **no pernament changes** to your instaltation, and simply closing the shell session(command prompt window) will disable it.

## Any questions?
After this, simply run `cargo run` to compile & run your app inside the .NET runtime. Other cargo commands should work to. There may be quite a few error / warning messages dispalyed, but they will not stop compilation.

If you have any questions, feel free to ask them on the github discussions page. I will do my best to try and answer your questions about the project.
NOTE: the project currently does not support any `proc-macros`, due to techincal liminations. Supporting them is possible, but it is currently more effort than it is worth.
33 changes: 20 additions & 13 deletions cilly/src/asm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -192,23 +192,30 @@ impl Assembly {
);
}

let cs = method.call_site();

self.functions.insert(cs, method);
let main_module = self.inner.main_module();
let def = crate::v2::MethodDef::from_v1(&method, &mut self.inner, main_module);
self.inner.new_method(def);
}

/// Optimizes all the methods witin the assembly.
pub fn opt(&mut self) {
self.functions.iter_mut().for_each(|method| {
let (_site, method) = method;
let mut method = method.clone();
method.opt();
//crate::opt::opt_method(&mut method, self);
});
}
/// Adds a definition of a type to the assembly.
pub fn add_typedef(&mut self, type_def: TypeDef) {
super::v2::ClassDef::from_v1(&type_def, &mut self.inner);
let dotnet_type: DotnetTypeRef =
std::convert::Into::<DotnetTypeRef>::into(type_def.clone())
.with_valuetype(type_def.extends().is_none());
let cref = super::v2::ClassRef::from_v1(&dotnet_type, &mut self.inner);

let cref = self.inner.alloc_class_ref(cref);
if !self.inner().contains_def(cref) {
let def = super::v2::ClassDef::from_v1(&type_def, &mut self.inner);
assert_eq!(
*def,
cref,
"{} != {}",
self.inner.class_ref(*def).display(self.inner()),
self.inner.class_ref(cref).display(self.inner())
);
assert!(self.inner().contains_def(cref));
}
}

/// Sets the entrypoint of the assembly to the method behind `CallSite`.
Expand Down
4 changes: 0 additions & 4 deletions cilly/src/method.rs
Original file line number Diff line number Diff line change
Expand Up @@ -550,10 +550,6 @@ impl Method {
pub fn attributes(&self) -> &[Attribute] {
&self.attributes
}

pub(crate) fn set_blocks(&mut self, blocks: impl Into<Vec<BasicBlock>>) {
self.blocks = blocks.into();
}
}

/// A wrapper around mutably borrowed [`BasicBlock`]s of a method. Prevents certain bugs.
Expand Down
19 changes: 11 additions & 8 deletions cilly/src/v2/asm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ use lazy_static::*;
use serde::{Deserialize, Serialize};
use std::any::type_name;
#[derive(Default, Serialize, Deserialize, Eq, PartialEq, Clone, Debug)]
struct IStringWrapper(IString);
pub(super) struct IStringWrapper(pub(super) IString);
impl std::hash::Hash for IStringWrapper {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
calculate_hash(&0xC0FE_BEEFu32).hash(state);
Expand Down Expand Up @@ -176,14 +176,13 @@ impl Assembly {
let cref = def.ref_to();
let cref = self.alloc_class_ref(cref);

if let Some(dup) = self.class_defs.insert(ClassDefIdx(cref), def.clone()) {
// TODO: this is costly, consider skipping in release.
// HACK: cvoid is doing *something* bizzare, but it seems harmless, so I ignore it I guess???
if !self.get_string(dup.name()).contains("core.ffi.c_void") {
assert_eq!(dup, def, "{}", self.get_string(dup.name()));
if self.class_defs.contains_key(&ClassDefIdx(cref)) {
if self.get_string(def.name()).contains("core.ffi.c_void") {
return ClassDefIdx(cref);
}
panic!()
}

self.class_defs.insert(ClassDefIdx(cref), def.clone());
ClassDefIdx(cref)
}
pub fn main_module(&mut self) -> ClassDefIdx {
Expand Down Expand Up @@ -623,7 +622,7 @@ impl Assembly {
let class_ref = self.alloc_class_ref(translated.ref_to());
match self.class_defs.entry(ClassDefIdx(class_ref)) {
std::collections::hash_map::Entry::Occupied(mut occupied) => {
occupied.get_mut().merge_defs(translated)
occupied.get_mut().merge_defs(translated, &self.strings)
}
std::collections::hash_map::Entry::Vacant(vacant) => {
vacant.insert(translated);
Expand All @@ -641,6 +640,10 @@ impl Assembly {
pub(crate) fn method_defs(&self) -> &FxHashMap<MethodDefIdx, MethodDef> {
&self.method_defs
}

pub(crate) fn contains_def(&self, cref: ClassRefIdx) -> bool {
self.class_ref_to_def(cref).is_some()
}
}
/// An initializer, which runs before everything else. By convention, it is used to initialize static / const data. Should not execute any user code
pub const CCTOR: &str = ".cctor";
Expand Down
Loading

0 comments on commit a91f881

Please sign in to comment.