Standardize #include #1

Open
stefnotch opened this issue Apr 25, 2024 · 4 comments

@stefnotch

stefnotch commented Apr 25, 2024

Would you be interested in helping with standardizing the #include syntax? (separate from the official WGSL specification)
After all, it'd be amazing if a number of tools could interoperate flawlessly! How neat would it be to use wgsl-linker at runtime in a browser while getting syntax checking from vscode-wgsl?

To that end, I think we'd best start by checking out the status quo, figuring out the concrete goals, and going from there.
e.g.
Q: Why is a simple text-inserting version of #include not satisfactory?
A: An included file could define a function with the same name as an existing one. One also runs into problems when a file is included multiple times. Therefore, we need a syntax-aware importing mechanism.
Q: Which path syntax would make sense for an importing mechanism?
A: ...?
Q: How does one unambiguously parse includes, and deal with cases such as "a #include inside a comment shouldn't be parsed"?
Q: Name mangling?
and so on.
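To make the "included multiple times" problem concrete, here is a minimal sketch of naive textual inclusion. The file names, file contents, and the helper functions `naiveInclude` and `countDefinitions` are made up for illustration and do not come from any of the tools discussed here.

```typescript
// Hypothetical in-memory files; names and contents are made up for illustration.
const files: Record<string, string> = {
  "util.wgsl": "fn helper() -> u32 { return 1; }",
  "a.wgsl": '#include "util.wgsl"\nfn a() -> u32 { return helper(); }',
  "b.wgsl": '#include "util.wgsl"\nfn b() -> u32 { return helper(); }',
  "main.wgsl": '#include "a.wgsl"\n#include "b.wgsl"',
};

// Naive textual inclusion: each directive line is replaced by the file's text.
function naiveInclude(fileName: string): string {
  const source = files[fileName] ?? "";
  return source.replace(
    /^#include "([^"]+)"$/gm,
    (_line: string, path: string) => naiveInclude(path),
  );
}

// Counts definitions of the form `fn <name>(` in the expanded text.
function countDefinitions(source: string, fnName: string): number {
  return source.split(`fn ${fnName}(`).length - 1;
}

const expanded = naiveInclude("main.wgsl");
// util.wgsl is pasted in once via a.wgsl and once via b.wgsl.
const helperDefinitions = countDefinitions(expanded, "helper");
console.log(helperDefinitions);
```

Since `helper` ends up defined twice in the expanded text, a WGSL compiler would reject the result, which is the kind of failure that include guards work around in C.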

My personal concrete goals would really be

  • Standardizing an importing syntax
    • Other preprocessor tasks would be a future endeavor.
  • Getting buy-in from multiple tools, and potentially contributing an implementation of it

I've also started a discussion at
bevyengine/naga_oil#83

@mighdoll
Owner

Yep, it'd be great to converge on syntax extensions for wgsl.

Some starter comments in response to your questions:

  • If we can find an agreement on an import syntax, would folks want #include as well? i.e. if we have the equiv of use in Rust or import in JavaScript/TypeScript, would we still want a C style #include?

  • re paths, in syntax like #import foo from module-specifier I currently support two types of specifiers: relative paths like "../util.wgsl", and module name specifiers like "MyPackage.MyModule". The two module specifier types are analogous to JavaScript import syntax. Enabling packaging modules as actual JavaScript modules would be nice too but still TBD.

    I've been experimenting with import syntax, but I find myself tacking closer to JS syntax with each rev.

    Note that there's a bit of tension in relying on path names for web wgsl, because there's not always a natural file path available. Many authors encourage embedding wgsl as strings in javascript source. And dynamic wgsl code generation is also a thing. We could ask authors to make up synthetic path names in those cases, but that's a bit icky.

  • I currently allow syntax extensions inside line comments. e.g. you can #export or // #export. That allows extension-unaware tools to ignore the directives. It's a trick, but perhaps useful as a transition strategy even for standardized extensions.

  • While I understand that linkers/bundlers need to mangle names to avoid duplicates, does the name mangling need to be standardized between different tools?
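The two specifier kinds mentioned above (relative paths and module names) could be told apart roughly as follows. This is only a sketch: the `ModuleSpecifier` type and `classifySpecifier` function are hypothetical, and the rule "starts with ./ or ../" is an assumption borrowed from JavaScript, not the documented behaviour of wgsl-linker.

```typescript
// Hypothetical result type: either a relative file path or a dotted module name.
type ModuleSpecifier =
  | { kind: "relative-path"; path: string }
  | { kind: "module-name"; segments: string[] };

// Assumption: relative specifiers start with "./" or "../", as in JavaScript.
function classifySpecifier(spec: string): ModuleSpecifier {
  // Accept both quoted ("../util.wgsl") and bare (MyPackage.MyModule) forms.
  const unquoted = spec.trim().replace(/^"(.*)"$/, "$1");
  if (unquoted.startsWith("./") || unquoted.startsWith("../")) {
    return { kind: "relative-path", path: unquoted };
  }
  return { kind: "module-name", segments: unquoted.split(".") };
}

console.log(classifySpecifier('"../util.wgsl"'));
console.log(classifySpecifier("MyPackage.MyModule"));
```

A synthetic name for WGSL that is embedded in a JavaScript string would fall into the module-name branch here, which sidesteps the "no natural file path" problem at the cost of a registration step.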

@stefnotch
Author

Yep, it'd be great to converge on syntax extensions for wgsl.

Some starter comments in response to your questions:

  • If we can find an agreement on an import syntax, would folks want #include as well? i.e. if we have the equiv of use in Rust or import in JavaScript/TypeScript, would we still want a C style #include?

Very good question. I think not, since C style #includes come with a plethora of secondary problems, like "you can't include a file twice, so you have to resort to include guards".

  • re paths, in syntax like #import foo from module-specifier I currently support two types of specifiers: relative paths like "../util.wgsl", and module name specifiers like "MyPackage.MyModule". The two module specifier types are analogous to JavaScript import syntax. Enabling packaging modules as actual JavaScript modules would be nice too but still TBD.
    I've been experimenting with import syntax, but I find myself tacking closer to JS syntax with each rev.
    Note that there's a bit of tension in relying on path names for web wgsl, because there's not always a natural file path available. Many authors encourage embedding wgsl as strings in javascript source. And dynamic wgsl code generation is also a thing. We could ask authors to make up synthetic path names in those cases, but that's a bit icky.

Ah, that's good to know. So file paths have a reasonably decent argument against them.
naga_oil currently has Rust-like imports, which are also module-based, albeit with a different syntax.

One non-obvious detail when using modules is where does a language server look?
As in, a language server cannot scan all possible locations. Instead, it needs a very clear-cut lookup rule, which ideally involves as few file reads as possible.

  • I currently allow syntax extensions inside line comments. e.g. you can #export or // #export. That allows extension-unaware tools to ignore the directives. It's a trick, but perhaps useful as a transition strategy even for standardized extensions.

I haven't even thought about #exports yet. Does a minimum viable prototype need them, or should we focus on imports with everything being public first?

And for #import, we can't really place them in comments. After all, // #import would lead to extension-unaware tools ignoring the import and then wondering why a struct or a function doesn't exist. It's an error either way.
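As a rough illustration of the parsing question, the sketch below finds directives at the start of a line, accepts the `// #export` form, and skips anything inside a block comment. `Directive` and `findDirectives` are hypothetical names; the scanner is line-based and deliberately simplified (for instance, it does not handle a `/*` that appears inside a line comment or a string).

```typescript
interface Directive {
  line: number;       // 1-based line number
  name: string;       // e.g. "import" or "export"
  rest: string;       // remaining text after the directive name
  commented: boolean; // true for the "// #export" form
}

function findDirectives(source: string): Directive[] {
  const found: Directive[] = [];
  let blockDepth = 0; // WGSL block comments may nest, so track a depth
  source.split("\n").forEach((text, index) => {
    const wasInsideBlock = blockDepth > 0;
    // Update the nesting depth from the comment delimiters on this line.
    for (const token of text.match(/\/\*|\*\//g) ?? []) {
      blockDepth = token === "/*" ? blockDepth + 1 : Math.max(0, blockDepth - 1);
    }
    if (wasInsideBlock) return; // the line started inside a block comment
    const match = /^\s*(\/\/)?\s*#(\w+)\s*(.*)$/.exec(text);
    if (match) {
      found.push({
        line: index + 1,
        name: match[2] ?? "",
        rest: (match[3] ?? "").trim(),
        commented: match[1] !== undefined,
      });
    }
  });
  return found;
}

const sample = [
  "#import foo from bar",
  "// #export",
  "/*",
  "#import hidden from nowhere",
  "*/",
  "fn f() {}",
].join("\n");
console.log(findDirectives(sample));
```

A tool could then choose to reject `commented` imports while still accepting commented exports, which matches the distinction drawn above.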

  • While I understand that linkers/bundlers need to mangle names to avoid duplicates, does the name mangling need to be standardized between different tools?

It's not a strict requirement, but it'd be very helpful to have a recommended mangling strategy. Ideally one that is

  • easy to implement
  • definitely correct, without cursed corner cases

One place where the name mangling could leak into user code is when creating a shader. When one creates a render pipeline, one needs to specify entryPoint: "name of the main function"

@mighdoll
Owner

Ah, that's good to know. So file paths have a reasonably decent argument against them.

There are arguments for paths as well. Relative paths to files make sense on the web. And relative paths make sense to me for 'inside my wgsl package' references. But forcing references to other packages to be magic paths seems awkward. e.g. importing from /node_modules/other-wgsl/util.wgsl is awkward because of the magic /node_modules path; better to import from other-wgsl-package/util.wgsl, where other-wgsl-package is a module name, not a path.

naga_oil currently has Rust-like imports.

I'll have to look more closely..

I see #define in there. The lack of support for #define in wgsl-analyzer bites pretty often. Any chance your standardization push will address #define too?

One non-obvious detail when using modules is where does a language server look? As in, a language server cannot scan all possible locations. Instead, it needs a very clear-cut lookup rule, which ideally involves as few file reads as possible.

Good question. A first step might be to just have an array of globs where the wgsl files are to be found, like https://www.typescriptlang.org/tsconfig/#include. For web projects, give that field a name, stick it in package.json, and expect that the language server would find it there.
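A sketch of what such a lookup rule could look like, assuming a made-up `wgsl.include` field in package.json. The field name, the `WgslProjectConfig` type, and the tiny glob matcher (which only understands `*` and `**/`) are all assumptions for illustration; a real tool would likely use an existing glob library.

```typescript
// Hypothetical config shape; no tool in this thread defines this field.
interface WgslProjectConfig {
  include: string[];
}

// Stand-in for a parsed package.json with the hypothetical "wgsl" field.
const packageJson = {
  name: "my-app",
  wgsl: { include: ["shaders/**/*.wgsl", "src/*.wgsl"] },
};

// Minimal glob support: "*" matches within one path segment,
// "**/" matches zero or more whole directories.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*\*\//g, "\u0000")          // placeholder for "**/"
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, "(?:.*/)?");
  return new RegExp(`^${pattern}$`);
}

function isProjectShader(config: WgslProjectConfig, path: string): boolean {
  return config.include.some((glob) => globToRegExp(glob).test(path));
}

const shaderConfig: WgslProjectConfig = packageJson.wgsl;
console.log(isProjectShader(shaderConfig, "shaders/lib/util.wgsl"));
console.log(isProjectShader(shaderConfig, "node_modules/other/util.wgsl"));
```

With a rule like this, a language server only has to read one config file and then walk the listed directories, rather than scanning every possible location.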

I haven't even thought about #exports yet. Does a minimum viable prototype need them, or should we focus on imports with everything being public first?

Hah, and I hadn't thought about not having them :-). Seems like basic functionality would work with everything exported by default..

Of course, most languages have a way to distinguish public vs private parts of the interface. I imagine we'd want that too eventually.

The wgsl-linker also has an experimental feature where you can pass arguments from imports to exports, so you can do things like #import workgroupScan(i32) from utils as scanI32. That feature relies on having an #export that takes a parameter to match up with the #import parameter.
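For illustration, the import line quoted above could be parsed and applied to a parameterised export along these lines. `parseImport` and `instantiate` are hypothetical helpers, the placeholder name `Elem` is invented, and the whole-word text substitution is an assumption about how such a feature might work, not a description of wgsl-linker's implementation.

```typescript
interface ImportDirective {
  name: string;              // imported item, e.g. "workgroupScan"
  args: string[];            // import arguments, e.g. ["i32"]
  from: string;              // module specifier, e.g. "utils"
  alias: string | undefined; // name after "as", if any
}

function parseImport(line: string): ImportDirective | null {
  const match =
    /^\s*#import\s+(\w+)(?:\(([^)]*)\))?\s+from\s+(\S+)(?:\s+as\s+(\w+))?\s*$/.exec(line);
  if (!match) return null;
  const rawArgs = match[2] ?? "";
  const args = rawArgs.split(",").map((a) => a.trim()).filter((a) => a.length > 0);
  return { name: match[1] ?? "", args, from: match[3] ?? "", alias: match[4] };
}

// Replaces each export parameter with the matching import argument (whole words only).
function instantiate(template: string, params: string[], args: string[]): string {
  return params.reduce(
    (text, param, i) => text.replace(new RegExp(`\\b${param}\\b`, "g"), args[i] ?? param),
    template,
  );
}

const scanImport = parseImport("#import workgroupScan(i32) from utils as scanI32");
// "Elem" is a made-up export parameter name.
const scanBody = instantiate("fn scan(a: Elem, b: Elem) -> Elem { return a + b; }", ["Elem"], ["i32"]);
console.log(scanImport, scanBody);
```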

I didn't see an approach to generics glancing through naga_oil. Is there something? If not, something similar to import/export parameters might prove useful there too.

// #import, that would lead to extension unaware tools ignoring the import and then wondering why a struct or a function doesn't exist. It's an error either way.

It's a bit odd to put extensions in comments for sure. The charm is that some tools e.g. wgsl code formatters can remain blissfully extension unaware. And it is constraining to try and maintain wgsl compatibility. Whether maintaining wgsl compatibility is preferable on balance is unclear.

One place where the name mangling could leak into user code is when creating a shader. When one creates a render pipeline, one needs to specify entryPoint: "name of the main function"

To avoid this particular issue, wgsl-linker doesn't mangle the names in the main module. (Generally, it tries not to mangle names unless necessary, to ease debugging.) Is that sufficient to dodge the need for standard mangling rules? I bet there are some other reasons you are thinking about mangling standards, though..
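The "only mangle when necessary" idea can be sketched as follows. This is an assumption about the general approach, not wgsl-linker's actual algorithm; `ImportedDecl`, `assignNames`, and the `module_name` / numeric-suffix fallback are invented for the example.

```typescript
interface ImportedDecl {
  module: string; // module the declaration comes from
  name: string;   // name as written in that module
}

// Names declared in the main module are never changed. Imported names are
// kept as-is unless they collide with a name that is already taken.
function assignNames(mainDecls: string[], imported: ImportedDecl[]): Map<string, string> {
  const used = new Set<string>(mainDecls);
  const result = new Map<string, string>();
  for (const decl of imported) {
    let candidate = decl.name;
    if (used.has(candidate)) candidate = `${decl.module}_${decl.name}`;
    const base = candidate;
    let counter = 0;
    while (used.has(candidate)) {
      counter += 1;
      candidate = `${base}_${counter}`; // last-resort numeric suffix
    }
    used.add(candidate);
    result.set(`${decl.module}::${decl.name}`, candidate);
  }
  return result;
}

const assigned = assignNames(
  ["main", "rand"],
  [
    { module: "bar", name: "rand" },
    { module: "foo", name: "extra_rand" },
  ],
);
console.log(assigned);
```

The trade-off is that the result depends on the full set of names in the link, so the same declaration may get different names in different programs.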

@stefnotch
Author

Ah, that's good to know. So file paths have a reasonably decent argument against them.

There are arguments for paths as well. Relative paths to files make sense on the web. And relative paths make sense to me for 'inside my wgsl package' references. But forcing references to other packages to be magic paths seems awkward. e.g. importing from /node_modules/other-wgsl/util.wgsl is awkward because of the magic /node_modules path; better to import from other-wgsl-package/util.wgsl, where other-wgsl-package is a module name, not a path.

Relative paths definitely make sense. There's even an open request for bevy/naga_oil to include a relative import syntax, like
#import super::moduleB for importing a module that lives in the same folder.

naga_oil currently has Rust-like imports.

I'll have to look more closely..

On that note, feel free to join the Bevy Discord server
https://discord.com/invite/bevy
There's a thread there with a few involved people https://discord.com/channels/691052431525675048/1234966628224077885 , including the person behind the new https://github.com/dannymcgee/vscode-wgsl/
(Not to be confused with https://github.com/PolyMeilex/vscode-wgsl or https://github.com/wgsl-analyzer/wgsl-analyzer )

I see #define in there. The lack of support for #define in wgsl-analyzer bites pretty often. Any chance your standardization push will address #define too?

That'd be a reasonable future proposal. One interesting part there is that WGSL is planning on getting some features that take care of a few #define use-cases, for example generics and function overloading (gpuweb/gpuweb#876). There's also constant evaluation built into the language.

One non-obvious detail when using modules is where does a language server look? As in, a language server cannot scan all possible locations. Instead, it needs a very clear-cut lookup rule, which ideally involves as few file reads as possible.

Good question. A first step might be to just have an array of globs where the wgsl files are to be found, like https://www.typescriptlang.org/tsconfig/#include. For web projects, give that field a name, stick it in package.json, and expect that the language server would find it there.

Sounds reasonable, that's what almost everyone seems to expect: a config file that says "look here" and can optionally also say "look at the package.json/Cargo.toml", depending on which ecosystem a tool is built for.

(For example, it currently wouldn't make sense for https://github.com/ScanMountGoat/wgsl_to_wgpu to support npm. But it would absolutely make sense for it to support the cargo package manager.)

I haven't even thought about #exports yet. Does a minimum viable prototype need them, or should we focus on imports with everything being public first?

Hah, and I hadn't thought about not having them :-). Seems like basic functionality would work with everything exported by default..

Of course, most languages have a way to distinguish public vs private parts of the interface. I imagine we'd want that too eventually.

True. And re-exporting a module would also be useful.
So it might make sense to support it from the start, though I'm not entirely certain yet. Since you've already implemented this, do you have any idea how much complexity it would add, and how much convenience it offers?

The wgsl-linker also has an experimental feature where you can pass arguments from imports to exports, so you can do things like #import workgroupScan(i32) from utils as scanI32. That feature relies on having an #export that takes a parameter to match up with the #import parameter.

I didn't see an approach to generics glancing through naga_oil. Is there something? If not, something similar to import/export parameters might prove useful there too.

I'm not entirely sure. naga_oil has #defines, and the ability to override functions. I think those might not cover the same use-cases though.

// #import, that would lead to extension unaware tools ignoring the import and then wondering why a struct or a function doesn't exist. It's an error either way.

It's a bit odd to put extensions in comments for sure. The charm is that some tools e.g. wgsl code formatters can remain blissfully extension unaware. And it is constraining to try and maintain wgsl compatibility. Whether maintaining wgsl compatibility is preferable on balance is unclear.

I suppose this would need some testing to figure out what exactly we want. If most shaders that come with imports also make use of a preprocessor, then there wouldn't be much of a reason to maintain wgsl compatibility.
Meanwhile, if a lot of pure wgsl shaders are out there, then it absolutely could make sense to have certain annotations (especially exports) be comments.

One place where the name mangling could leak into user code is when creating a shader. When one creates a render pipeline, one needs to specify entryPoint: "name of the main function"

To avoid this particular issue, wgsl-linker doesn't mangle the names in the main module. (Generally, it tries not to mangle names unless necessary, to ease debugging.) Is that sufficient to dodge the need for standard mangling rules? I bet there are some other reasons you are thinking about mangling standards, though..

Most mangling approaches are pure functions. For example, C++ mangling takes a name, and mostly just slaps the full (and unique) path of the module in front of it. This saves one the trouble of keeping a global HashMap of all the used names.

Of course with that approach, in a worst case scenario, almost everything needs to be mangled.
For example, given 3 modules main.wgsl, foo.wgsl and bar.wgsl:

// bar.wgsl

#export
fn rand() -> u32 {
  return 4; // very random, I know
}

// foo.wgsl

#import rand from bar.wgsl

fn extra_rand() -> u32 {
  return rand() + rand() - rand();
}

// main.wgsl

#import extra_rand from foo.wgsl

fn bar_rand() -> u32 {
  return 3;
}

fn main() {
  let rand = extra_rand();
}

Then to put everything into one module, we have to mangle the names.

  • rand becomes bar_rand
  • extra_rand becomes foo_extra_rand
  • bar_rand becomes main_bar_rand. It has to, otherwise it could conflict with already used names.

Although, I suppose this wouldn't be much of an issue. If the mangling scheme for the entry point is simple enough, then it shouldn't be too bad to ask the user to write entryPoint: "main_main" instead of entryPoint: "main".
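The prefixing scheme from the example above can be written as a pure function of the module and the name. This is a minimal sketch: `moduleNameOf` and `mangle` are hypothetical helpers, and deriving the prefix from the file name without its `.wgsl` extension is an assumption made to reproduce the names listed above.

```typescript
// "shaders/foo.wgsl" -> "foo" (assumption: the prefix is the bare file name)
function moduleNameOf(modulePath: string): string {
  const file = modulePath.split("/").pop() ?? modulePath;
  return file.replace(/\.wgsl$/, "");
}

// Pure function: the result depends only on the module and the name,
// so no global table of used names is needed.
function mangle(modulePath: string, declName: string): string {
  return `${moduleNameOf(modulePath)}_${declName}`;
}

console.log(mangle("bar.wgsl", "rand"));
console.log(mangle("foo.wgsl", "extra_rand"));
console.log(mangle("main.wgsl", "bar_rand"));

// The user-facing consequence: the pipeline stage has to name the mangled entry point.
const stage = { entryPoint: mangle("main.wgsl", "main") };
console.log(stage.entryPoint);
```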

Another reason why I'm proposing this is that wgsl_to_wgpu asked for a way to easily un-mangle a name. I believe it wants to look at the generated source code, and generate sensible Rust equivalents. And the generated Rust equivalents should be namespaced, hence the "unmangling" request.
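One caveat for un-mangling is that a plain `module_name` join is ambiguous: `foo_extra_rand` could be `extra_rand` in `foo` or `rand` in `foo_extra`. A length-prefixed scheme avoids that. The sketch below is one possible design, not an agreed format; the `m<length>_` prefix and the function names are invented, and it assumes module names are already valid identifier fragments.

```typescript
// Encodes as m<moduleName length>_<moduleName>_<name>, e.g. m3_foo_extra_rand.
function mangle(moduleName: string, declName: string): string {
  return `m${moduleName.length}_${moduleName}_${declName}`;
}

// Inverse of mangle; returns null for names that do not follow the scheme.
function demangle(mangled: string): { moduleName: string; declName: string } | null {
  const header = /^m(\d+)_/.exec(mangled);
  if (!header) return null;
  const length = Number(header[1]);
  const start = header[0].length;
  const moduleName = mangled.slice(start, start + length);
  if (moduleName.length !== length || mangled[start + length] !== "_") return null;
  const declName = mangled.slice(start + length + 1);
  return declName.length > 0 ? { moduleName, declName } : null;
}

console.log(mangle("foo", "extra_rand"), demangle(mangle("foo", "extra_rand")));
console.log(mangle("foo_extra", "rand"), demangle(mangle("foo_extra", "rand")));
```

A user-written identifier that happens to look like a mangled name would still need to be handled, for example by mangling every name unconditionally.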

Finally, I thought this was interesting:
There's yet another tool that implements WGSL imports with a different feature-set: https://usegpu.live/docs/guides-shaders
