Data section offsets #163
Comments
How about just continuing with the pointer-like concept?
|
Regarding the multiple memories, how about just a reference?
|
Here's a suggestion that involves changing the parser:
const hello1: i32 = 'Hello World!'@1024#1; // static string at position 1024 of memory # 1
const hello1b: i32 = 'Hello World!'@1024; // static string at position 1024 of memory # 1 as well
const array: i32[] = [1, 2, 3, 4]@4096#7; // static array/raw data region of memory # 7
Instead, maybe it's better to have offset info near the type declaration, like this:
const hello1: i32@1024#1;
hello1 = 'Hello World!'; // static string at position 1024 of memory # 1
const hello1b: i32@1024;
hello1b = 'Hello World!'; // static string at position 1024 of memory # 1 as well
const array: i32[]@4096#7;
array = [1, 2, 3, 4]; // static array/raw data region of memory # 7
|
And here's another suggestion...
const memory: Memory = Memory.allocate({ initial: 1, number: 1 });
const hello32: i32[] = memory.view({offset: 40}); // starts at position 40
const array64: i64[] = memory.view(); // starts at 0
hello32 = 'Hello!';
array64 = [1, 2, 3, 4];
|
This project takes a low-level approach, but for strings and arrays, I think the C-style practice of "we'll track the offset as an integer, you're responsible for the length" is just a tiny bit too low-level. It is going to elicit a WTF reaction among the project's intended users, who aren't familiar with the C approach. Tracking lengths manually is a burden familiar to C programmers but alien to JS. There should be sugar for this, or JS programmers are going to reinvent zero-terminated strings.
Maybe there should be five special higher-level sugary types available for use: Each of these types would be a struct holding a pair of scalars: an i64 length and an i32[], i64[], f32[], or f64[] pointer. It doesn't need to bother enforcing an index range (what would it do anyway?), so a programmer should be able to ignore the length if they want. But it should at least be made available, whether it's mutable or not, or people will get upset. So it would look like this:
|
I like it, but how about some sugar where both mean the same thing
|
The first one is a pointer to a struct having a pointer and a length. It basically maps to this:
(Instead of This is how Go does it. There's an array type in Go, but they don't want you to pay attention to it. They want you to use their slice type everywhere instead. It's similar to ArrayBuffer and TypedArrays in JS. In the second, there is no struct or length information, just a pointer. It would map to this, I think:
The ampersand might be useful if it can be prepended to things other than |
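To make the slice-versus-pointer distinction above concrete, here is a rough TypeScript sketch that models linear memory with a DataView. Everything in it is an assumption made for illustration: the header layout (an i32 length followed by an i32 data pointer, rather than the i64 length suggested earlier), the 64 KiB buffer, and the helper names writeI32Slice, sliceLength, sliceGet and rawGet. None of this is Walt syntax or Walt's actual memory layout.

```typescript
// Hypothetical slice header layout (an assumption for this sketch):
//   bytes 0..3  length, little-endian i32
//   bytes 4..7  address of the first element, little-endian i32
const memory = new DataView(new ArrayBuffer(64 * 1024)); // stands in for one wasm page

// Write the element data at dataAddr and a slice header describing it at headerAddr.
function writeI32Slice(headerAddr: number, dataAddr: number, values: number[]): void {
  memory.setInt32(headerAddr, values.length, true);
  memory.setInt32(headerAddr + 4, dataAddr, true);
  values.forEach((v, i) => memory.setInt32(dataAddr + i * 4, v, true));
}

// The length is available to the caller but not enforced on reads.
function sliceLength(headerAddr: number): number {
  return memory.getInt32(headerAddr, true);
}

// Indexing through the slice header: one extra load to find the data pointer.
function sliceGet(headerAddr: number, index: number): number {
  const dataAddr = memory.getInt32(headerAddr + 4, true);
  return memory.getInt32(dataAddr + index * 4, true);
}

// Indexing through a bare pointer: no header, no length information at all.
function rawGet(dataAddr: number, index: number): number {
  return memory.getInt32(dataAddr + index * 4, true);
}

writeI32Slice(0, 4096, [1, 2, 3, 4]);
```

With the header at address 0 and the data at 4096, sliceGet(0, 2) and rawGet(4096, 2) read the same element. Only the slice form can answer "how many elements are there?" without outside bookkeeping.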
I've been thinking about this a bunch, mostly about the fact that we would need a sugary type for strings/slices. The way things are shaping up, it seems like the existing syntax + types are not expressive enough. I don't think I want to keep adding sugary types without allowing the user to do something similar (w/o compiler changes), even though adding a new type to the compiler directly would be trivial.
I think (and this idea isn't fully baked yet) I'll have to pivot on the types a bit and expose operator overloading (among other things) to the user, so that things like In some ways it would be better (less magic in the compiler), in other ways it would be worse, since the syntax would be even farther from JS/flow, at least when dealing with types. Then again, most of this stuff can be built-in/std-lib-ed so that most users don't need to worry about it. For example, indexing into a string could be done via better primitives like so:
type String = ({ length: i32 }, i32); // can be used as an i32 or object with field .length
// syntax TBD
// " -> { " denotes a Block not a function, kind of important distinction
operator String[] = (target : String, index : i32) -> {
  // do target.length sanity check perhaps
  return i32.load(target + ((1 + index) << 2));
};
Current array logic would also be "implemented" in the same way; technically it already is, just inside the compiler, not in user-land. Along these lines.
The goal at the end of the day is to allow the user to use wasm however they want. And as much as I'd like to avoid creating a new language/type system, there seems to be no way around it without limiting usability. I'll likely open up a new discussion on the topic, as it's a bigger issue than "how to set data sections".
|
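The address arithmetic in that String indexing operator implies a length-prefixed layout: the word at the string's address holds the length, and element index lives at address + ((1 + index) << 2). A small TypeScript model of that layout follows, as a sketch only. The DataView-backed memory, the choice of one UTF-16 code unit per i32 word, and the names storeString, stringLength and stringIndex are all assumptions for illustration, not Walt's implementation.

```typescript
// Linear memory stand-in: one 64 KiB page, accessed little-endian like wasm.
const memory = new DataView(new ArrayBuffer(64 * 1024));

// Assumed layout at address t:
//   word 0      length (i32)
//   word 1 + i  i-th character, one UTF-16 code unit stored per i32 word
function storeString(t: number, text: string): void {
  memory.setInt32(t, text.length, true);
  for (let i = 0; i < text.length; i++) {
    memory.setInt32(t + ((1 + i) << 2), text.charCodeAt(i), true);
  }
}

// Equivalent of reading the .length field.
function stringLength(t: number): number {
  return memory.getInt32(t, true);
}

// Equivalent of the proposed indexing operator, including the optional sanity check.
function stringIndex(t: number, index: number): number {
  if (index < 0 || index >= stringLength(t)) {
    throw new RangeError("index " + index + " is outside the string");
  }
  return memory.getInt32(t + ((1 + index) << 2), true);
}

storeString(1024, "Hello World!");
```

Here stringLength(1024) is 12 and stringIndex(1024, 0) is 72, the character code of 'H'. The pointer can still be treated as a plain i32 address, which matches the "usable as an i32 or as an object with .length" idea.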
As you can see from my comment history, I never really cared about the "seems like JS" feature. I think C is close enough, and this is why I proposed C-like solutions. We have a C-like memory management problem here! As for your operator/block idea, you can simplify the syntax by not implementing an operator keyword, and instead predefining operators as syntax sugar that maps to a named function. For example,
But this would imply you need to implement function overloading or generics.
|
Hacky idea: preprocessor directives at the top that include type definitions:
Or you could go really nuts with a module system for the compiler that makes it like Babel:
|
Yup, the compiler already supports extensions. All of the current features are written as internal (enabled by default) language extensions and grammar. There is a reference implementation of closures as a plugin to demonstrate how a complex extension could be made and injected into the compiler. I'll probably make a package (similar to Babel presets) for all
|
Problem
There are currently two methods of encoding a Data Section entry into the binary. These are
- const hello: i32 = 'Hello World!'; (static strings)
- const array: i32[] = [1, 2, 3, 4]; (static array/raw data regions)
In both cases the data is encoded into the data section such that it'll take up the first available offset in memory.
The data sections in the WASM spec allow for an explicit offset. An example of this can be seen in the reference spec tests for data section here.
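For context on what an explicit offset means at the binary level: an active data segment carries its offset as a constant expression placed before the bytes. The TypeScript sketch below encodes one such segment, assuming memory index 0 and an i32.const offset expression. The helper names encodeULEB128, encodeSLEB128 and encodeDataSegment are invented for this example and do not correspond to Walt's emitter.

```typescript
// Unsigned LEB128, used for vector lengths.
function encodeULEB128(value: number): number[] {
  const out: number[] = [];
  do {
    let byte = value & 0x7f;
    value >>>= 7;
    if (value !== 0) byte |= 0x80; // continuation bit: more bytes follow
    out.push(byte);
  } while (value !== 0);
  return out;
}

// Signed LEB128, used for the i32.const immediate.
function encodeSLEB128(value: number): number[] {
  const out: number[] = [];
  while (true) {
    const byte = value & 0x7f;
    value >>= 7; // arithmetic shift keeps the sign
    const signBit = (byte & 0x40) !== 0;
    if ((value === 0 && !signBit) || (value === -1 && signBit)) {
      out.push(byte);
      return out;
    }
    out.push(byte | 0x80);
  }
}

// One data segment: memory index, offset expression, then the byte vector.
function encodeDataSegment(offset: number, bytes: number[]): number[] {
  return [
    0x00,                           // memory index 0
    0x41, ...encodeSLEB128(offset), // i32.const <offset>
    0x0b,                           // end of the offset expression
    ...encodeULEB128(bytes.length), // number of data bytes
    ...bytes,
  ];
}

// "Hi" placed at offset 1024.
const segment = encodeDataSegment(1024, "Hi".split("").map((c) => c.charCodeAt(0)));
// → [0x00, 0x41, 0x80, 0x08, 0x0b, 0x02, 0x48, 0x69]
```

Whatever surface syntax is chosen only needs to supply the two inputs of encodeDataSegment: the offset and the bytes. Supporting more than one memory would additionally require a way to choose the memory index.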
Goal
Define and implement a syntax for allowing an explicit offset to be defined when defining data sections in Walt.
Possible syntax:
A pseudo function call
This would require altering the grammar to allow for top-level function calls, as well as some guards on calling non-memory.data() function calls.
A static object
This would work great for strings but not so well for static arrays. Also, having an object property (offset) not be part of the object is very odd.
???
Maybe there is an additional way to define this which would make sense, but it does seem like having the memory involved in some way makes the most sense, especially since in the future it will be possible to have N > 1 memories in a single binary.