Proposal: Allow compile-time fetching of function arguments in higher-order functions #3493

Closed
SpexGuy opened this issue Oct 21, 2019 · 8 comments

@SpexGuy
Contributor

SpexGuy commented Oct 21, 2019

There are times when it's useful to generate a call to a function whose arguments are known at compile-time, but not explicit in the written type.
Said another way, I'm looking for a compile-time equivalent to JavaScript's apply(). Zig doesn't have variant type arrays, even at compile time, so a direct port is not possible. But having some analogue could be helpful.
As an example, consider the following construct:

fn getItemData(comptime T: type, index: u32) *T {
	// looks up the value of type T at the given index
	// code omitted
}

fn forEachRow(comptime func: var) void {
	const FuncType = @typeOf(func);
	const funcInfo = switch (@typeInfo(FuncType)) {
		.Fn, .BoundFn => |fnInfo| fnInfo,
		else => @compileError("parameter func must be a function"),
	};

	const numItems = ...;
	var index: u32 = 0;
	while (index < numItems) : (index += 1) {
		// invoke func(index, getItemData(@typeOf(param[1])), getItemData(@typeOf(param[2]).Child), ...)
		// no way to do this currently
	}
}

fn updateVelocity(index: u32, dt: DeltaTime, vel: *Velocity, acc: Acceleration) void {
	vel.x += acc.x * dt.seconds;
	vel.y += acc.y * dt.seconds;
	vel.z += acc.z * dt.seconds;
}

It's currently possible to use a struct to work around this, giving us:

fn forEachRow(comptime func: var) void {
	const FuncType = @typeOf(func);
	const funcInfo = switch (@typeInfo(FuncType)) {
		.Fn, .BoundFn => |fnInfo| fnInfo,
		else => @compileError("parameter func must be a function"),
	};

	// this could be more robust, but is made simple for the example.
	const StructType = funcInfo.args[1].arg_type.?;
	const structInfo = switch (@typeInfo(StructType)) {
		// `struct` is a keyword, so the capture needs a different name
		.Struct => |info| info,
		else => @compileError("parameter func must have its second argument be a struct of requisite data"),
	};

	const numItems = ...;
	var index: u32 = 0;
	while (index < numItems) : (index += 1) {
		var data: StructType = undefined;
		inline for (structInfo.fields) |field| {
			@field(data, field.name) = getItemData(field.field_type.Child, index);
		}
		func(index, data);
	}
}

fn updateVelocity(index: u32, data: struct {
	dt: *const DeltaTime,
	vel: *Velocity,
	acc: *const Acceleration,
}) void {
	data.vel.x += data.acc.x * data.dt.seconds;
	data.vel.y += data.acc.y * data.dt.seconds;
	data.vel.z += data.acc.z * data.dt.seconds;
}

This workaround has a few drawbacks:

  • The compiler cannot choose whether to pass pointers or values for the fields of the struct
  • The syntax is awkward for the user of the API

It would be nice if I could write some sort of inline for over the parameters of the function to specify their values, as I can for a struct.

One potential way to solve this would be to have function types also generate a struct type with the arguments, and supply a builtin function @apply that takes a function and this struct.
The struct type would be available as builtin.TypeInfo.Fn.params_struct_type: ?type. If the function has at least one parameter of type var, no struct can be generated.
The names of the fields in the struct are param0, param1, ...
This would allow the first example to be written as:

fn forEachRow(comptime func: var) void {
	const FuncType = @typeOf(func);
	const funcInfo = switch (@typeInfo(FuncType)) {
		.Fn, .BoundFn => |fnInfo| fnInfo,
		else => @compileError("parameter func must be a function"),
	};

	const Params: type = funcInfo.params_struct_type.?;
	const paramsInfo = switch (@typeInfo(Params)) {
		.Struct => |structInfo| structInfo,
		else => unreachable,
	};

	const numItems = ...;
	var index: u32 = 0;
	while (index < numItems) : (index += 1) {
		var params: Params = undefined;
		params.param0 = index;
		inline for (paramsInfo.fields[1..]) |field| {
			@field(params, field.name) = if (isPointer(field.field_type))
				getItemData(field.field_type.Child, index)
			else
				getItemData(field.field_type, index).*;
		}
		@apply(func, params);
	}
}

fn updateVelocity(index: u32, dt: DeltaTime, vel: *Velocity, acc: Acceleration) void {
	vel.x += acc.x * dt.seconds;
	vel.y += acc.y * dt.seconds;
	vel.z += acc.z * dt.seconds;
}

Of course, in order for this to be able to make the value->pointer optimization, there would need to be restrictions on the use of the struct. There's probably some other formulation that allows for that expression without having arbitrary limits.

@SpexGuy
Contributor Author

SpexGuy commented Oct 22, 2019

An alternate approach would be to provide a way to generate code from within comptime blocks.
That would allow for things like this:

fn forEachRow(comptime func: var) void {
    const FuncType = @typeOf(func);
    const funcInfo = switch (@typeInfo(FuncType)) {
        .Fn, .BoundFn => |fnInfo| fnInfo,
        else => @compileError("parameter func must be a function"),
    };

    const numItems = ...;
    var index: u32 = 0;
    while (index < numItems) : (index += 1) {
        comptime {
            const args = funcInfo.args[1..];

            // variables declared outside of the comptime block can be referenced by name
            builtin.generate("func(index,");

            for (args) |arg, i| {
                const argType = arg.arg_type.?;

                // variables within the comptime block can be passed like format parameters
                if (isPointer(argType)) {
                    builtin.generate("getItemData({}, index),", argType.Child);
                } else {
                    builtin.generate("getItemData({}, index).*,", argType);
                }
            }

            builtin.generate(");");

            // when the comptime block ends, the generated code is validated.
            // To keep things simple, there are a couple of rules about this usage:
            // You can't generate an unmatched set of parens or braces.
            // The generated code must be a drop-in replacement for the comptime block.
            // The comptime block must be lexically valid without generating the code.
            // This limits you to generating a statement for comptime {...}
            // and generating an expression for var x = comptime {...};
            // which makes it easier for language tooling to work well
            // and keeps the language understandable.
        }
    }
}

@JesseRMeyer

JesseRMeyer commented Oct 22, 2019

Looking forward, package maintainers will suffer a great deal in the attempt to provide reproducible builds if Zig code can generate Zig code during the compilation phase. Why don't we leverage the build phase for metaprogramming instead? Both package maintainers and Zig programmers get to have their cake and eat it too.

In terms of this proposal, your comptime block would be executed as part of the build system. How exactly requires another proposal, but a few ideas spring to mind:

A message system that allows for annotating scope blocks or functions with a unique identifier which would be captured and acted on by the build system. JAI has an example of annotating functions with #test which are compiled and executed by its build system.

A buildtime {} block that is executed at build time, by the build system, that replaces the block's contents with the results of the block (for now, assuming a string of Zig code, but we shouldn't be so restricted in our thinking in the long term). The code inside these blocks would have access to Zig source metadata, provided by the Build system, and so would not resemble blocks of surrounding code, and would not be recognized by the Zig compiler directly.

@SpexGuy
Contributor Author

SpexGuy commented Oct 22, 2019

That's a good point. Perhaps code generation from strings is more than is needed here. In a more limited sense, Zig is already capable of generating code at compile-time. Consider this example:

const std = @import("std");

fn bakeParameter(comptime func: var, comptime param: var) fn () void {
    // generate code that calls the given function with the given parameter.
    // this code is compiled as a function and has an address.  A pointer
    // to it can be used at runtime.
    const Codegen = struct {
        fn bakedFn() void {
            func(param);
        }
    };
    return Codegen.bakedFn;
}

fn printNumber(n: u32) void {
    std.debug.warn("{} ", n);
}

export fn printDigitsInRandomOrder() void {
    var functions: [10]fn () void = undefined;
    comptime {
        var i: usize = 0;
        while (i < functions.len) : (i += 1) {
            functions[i] = comptime bakeParameter(printNumber, i);
        }
    }

    var functionsSlice = functions[0..];

    var rng = std.rand.DefaultPrng.init(42);
    rng.random.shuffle(fn () void, functionsSlice);
    for (functionsSlice) |func| {
        func();
    }
}

test "random" {
    printDigitsInRandomOrder();
}
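
For readers on a newer compiler, here is a minimal sketch of the same baking technique in more recent syntax. It assumes Zig 0.11 or later, where `var` parameters are spelled `anytype` and functions stored for runtime use are `*const fn` pointers. The names `addNumber`, `total`, and `baked` are made up for illustration and are not part of the original example.

```zig
const std = @import("std");

// Returns a pointer to a freshly instantiated function that calls
// func(param). Each distinct (func, param) pair yields its own function.
fn bakeParameter(comptime func: anytype, comptime param: anytype) *const fn () void {
    const Codegen = struct {
        fn bakedFn() void {
            func(param);
        }
    };
    return &Codegen.bakedFn;
}

// Hypothetical state and callback, used only to observe the calls.
var total: u32 = 0;

fn addNumber(n: u32) void {
    total += n;
}

// Container-level initializers are evaluated at compile time, so the
// loop counter is comptime-known and can be baked into each function.
const baked = blk: {
    var fns: [10]*const fn () void = undefined;
    var i: u32 = 0;
    while (i < fns.len) : (i += 1) {
        fns[i] = bakeParameter(addNumber, i);
    }
    break :blk fns;
};

pub fn main() void {
    for (baked) |f| {
        f();
    }
    std.debug.print("total = {}\n", .{total});
}
```

Each entry of `baked` has an address and can be shuffled or stored like any other function pointer, which is the property the example above relies on.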

This technique could almost be used to solve my use case:

fn forEachRow(comptime func: var) void {
    const FuncType = @typeOf(func);
    const funcInfo = switch (@typeInfo(FuncType)) {
        .Fn, .BoundFn => |fnInfo| fnInfo,
        else => @compileError("parameter func must be a function"),
    };

    const fetchFunc = switch (funcInfo.args.len) {
        0 => @compileError("func must take index as its first parameter"),
        1 => func,
        2 => comptime fetch1(func),
        3 => comptime fetch2(func),
        // ...
        10 => comptime fetch9(func),
        else => @compileError("func takes too many parameters, limit is 10"),
    };

    const numItems = ...;
    var index: u32 = 0;
    while (index < numItems) : (index += 1) {
        @inlineCall(fetchFunc, index);
    }
}

fn fetch1(comptime func: var) fn (u32) void {
    const FuncType = @typeOf(func);
    const funcInfo = switch (@typeInfo(FuncType)) {
        .Fn, .BoundFn => |fnInfo| fnInfo,
        else => @compileError("parameter func must be a function"),
    };

    const Codegen = struct {
        fn fetchArgs(index: u32) void {
            func(
                index,
                getItemData(funcInfo.args[1].arg_type.?, index),
            );
        }
    };

    return Codegen.fetchArgs;
}

fn fetch2(comptime func: var) fn (u32) void {
    const FuncType = @typeOf(func);
    const funcInfo = switch (@typeInfo(FuncType)) {
        .Fn, .BoundFn => |fnInfo| fnInfo,
        else => @compileError("parameter func must be a function"),
    };

    const Codegen = struct {
        fn fetchArgs(index: u32) void {
            func(
                index,
                getItemData(funcInfo.args[1].arg_type.?, index),
                getItemData(funcInfo.args[2].arg_type.?, index),
            );
        }
    };

    return Codegen.fetchArgs;
}

// ...

fn fetch9(comptime func: var) fn (u32) void {
    const FuncType = @typeOf(func);
    const funcInfo = switch (@typeInfo(FuncType)) {
        .Fn, .BoundFn => |fnInfo| fnInfo,
        else => @compileError("parameter func must be a function"),
    };

    const Codegen = struct {
        fn fetchArgs(index: u32) void {
            func(
                index,
                getItemData(funcInfo.args[1].arg_type.?, index),
                getItemData(funcInfo.args[2].arg_type.?, index),
                getItemData(funcInfo.args[3].arg_type.?, index),
                getItemData(funcInfo.args[4].arg_type.?, index),
                getItemData(funcInfo.args[5].arg_type.?, index),
                getItemData(funcInfo.args[6].arg_type.?, index),
                getItemData(funcInfo.args[7].arg_type.?, index),
                getItemData(funcInfo.args[8].arg_type.?, index),
                getItemData(funcInfo.args[9].arg_type.?, index),
            );
        }
    };

    return Codegen.fetchArgs;
}

What I'm really looking for is a way to write the generalized fetchN(comptime func: var). If I wanted to do something like make the leading index parameter optional, in the current solution that would create a prohibitive explosion of cases to write. But with a generalized function that becomes one if statement.

@SpexGuy
Contributor Author

SpexGuy commented Oct 22, 2019

A similar problem exists with generic types. Perhaps there is a single solution that solves both cases.

fn Tuple1(comptime T0: type) type {
    return struct {
        // Fields
        val0: T0,
        
        // Generic Use Functions
        fn TypeAt(comptime index: usize) type {
            return switch (index) {
                0 => T0,
                else => @compileError("index is out of range for this tuple"),
            };
        }
        
        fn get(self: *@This(), comptime index: usize) *TypeAt(index) {
            return switch (index) {
                0 => &self.val0,
                else => @compileError("index is out of range for this tuple"),
            };
        }
    };
}
fn Tuple2(comptime T0: type, comptime T1: type) type {
    return struct {
        // Fields
        val0: T0,
        val1: T1,
        
        // Generic Use Functions
        fn TypeAt(comptime index: usize) type {
            return switch (index) {
                0 => T0,
                1 => T1,
                else => @compileError("index is out of range for this tuple"),
            };
        }
        
        fn get(self: *@This(), comptime index: usize) *TypeAt(index) {
            return switch (index) {
                0 => &self.val0,
                1 => &self.val1,
                else => @compileError("index is out of range for this tuple"),
            };
        }
    };
}

fn Tuple(comptime types: []const type) type {
    // can't write this easily.  The switch-on-length works if you
    // can set a maximum bound on number of arguments.
    // It's technically possible to write the general case, but you
    // need to compute all of the offsets and alignment and use a
    // lot of @ptrCast and @alignCast.  Basically you re-implement
    // the compiler's struct layout and access code, and put an aligned
    // [_]u8 in the struct.  I'm not an expert but this approach might
    // generate false aliasing since everything originates as a [*]u8.
}

(edit: s/[]/[_]/)
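
For comparison, more recent versions of the standard library ship a helper of this shape, `std.meta.Tuple`, which builds a tuple type from a list of types. The sketch below assumes Zig 0.11 or later (where struct field info exposes `.type`). `TypeAt` and `Pair` are illustrative names chosen to mirror the hand-written versions above, not standard library API.

```zig
const std = @import("std");

// Looks up the type of the field at a comptime-known index.
fn TypeAt(comptime T: type, comptime index: usize) type {
    return std.meta.fields(T)[index].type;
}

// A two-element tuple type built from a list of types.
const Pair = std.meta.Tuple(&[_]type{ u32, bool });

pub fn main() void {
    var pair: Pair = .{ 7, true };

    // Elements are accessed with a comptime-known index.
    pair[0] += 1;

    // Taking the address of an element gives a typed pointer,
    // with no manual offset or alignment arithmetic.
    const flag: *TypeAt(Pair, 1) = &pair[1];
    flag.* = false;

    std.debug.print("{} {}\n", .{ pair[0], pair[1] });
}
```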

@SpexGuy
Contributor Author

SpexGuy commented Oct 23, 2019

Ah, looks like the struct case is addressed by #383.

@SpexGuy
Contributor Author

SpexGuy commented Oct 23, 2019

This is somewhat related to #208. That issue concerns function calls where the argument types are known to the caller but not the callee, whereas this proposal concerns the case where the argument types are known to the callee but not the caller.

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Oct 23, 2019
@andrewrk andrewrk added this to the 0.7.0 milestone Oct 23, 2019
@JesseRMeyer

JesseRMeyer commented Oct 23, 2019

All sound points. Traditionally, source substitution via macros was the 'path of least resistance' to address this. Zig does not support macros, but there is a need to programmatically substitute or generate source at some level. I do not know if that means making Zig more self-aware, so that code files have the capacity to somehow modify themselves, or if that task falls on us as Zig programmers who know the context of our programs: to author a Zig program that alters our Zig source.

That is why I suggested empowering the build system -- metaprogramming becomes a coherent workflow across the board. It is the natural place to put this power, since a self-authored Zig metaprogram is essentially just a pre-processor for the already existing build system. And that is not a modern advancement from C in many respects.

@andrewrk
Member

andrewrk commented Dec 9, 2019

I believe this use case is now addressed with @call.
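
As a rough sketch of how `@call` can cover the `forEachRow` case from the top of this issue: the argument tuple is filled in with an `inline for` over its fields and then passed to `@call`. This assumes Zig 0.11 or later, where the first argument is a call modifier such as `.auto`; that argument has changed across releases, so it may need adjusting for your compiler. The component types, the storage arrays, and the body of `getItemData` are hypothetical stand-ins for the code omitted in the proposal. To keep the sketch simple, every parameter after the index is required to be a pointer, so the value-versus-pointer selection from the proposal is not shown.

```zig
const std = @import("std");

// Hypothetical component types and storage (not from the original issue).
const DeltaTime = struct { seconds: f32 };
const Velocity = struct { x: f32, y: f32, z: f32 };
const Acceleration = struct { x: f32, y: f32, z: f32 };

const num_items = 2;
var delta_times = [num_items]DeltaTime{ .{ .seconds = 0.5 }, .{ .seconds = 2.0 } };
var velocities = [num_items]Velocity{ .{ .x = 0, .y = 0, .z = 0 }, .{ .x = 1, .y = 1, .z = 1 } };
var accelerations = [num_items]Acceleration{ .{ .x = 2, .y = 4, .z = 6 }, .{ .x = 1, .y = 1, .z = 1 } };

// Stand-in for the lookup omitted in the proposal.
fn getItemData(comptime T: type, index: u32) *T {
    return switch (T) {
        DeltaTime => &delta_times[index],
        Velocity => &velocities[index],
        Acceleration => &accelerations[index],
        else => @compileError("no storage for " ++ @typeName(T)),
    };
}

fn forEachRow(comptime func: anytype) void {
    // A tuple type with one field per parameter of func.
    const Args = std.meta.ArgsTuple(@TypeOf(func));

    var index: u32 = 0;
    while (index < num_items) : (index += 1) {
        var args: Args = undefined;
        args[0] = index;
        // Fetch every remaining parameter by its pointee type.
        inline for (std.meta.fields(Args)[1..]) |field| {
            @field(args, field.name) = getItemData(std.meta.Child(field.type), index);
        }
        @call(.auto, func, args);
    }
}

fn updateVelocity(index: u32, dt: *const DeltaTime, vel: *Velocity, acc: *const Acceleration) void {
    _ = index; // unused in this example
    vel.x += acc.x * dt.seconds;
    vel.y += acc.y * dt.seconds;
    vel.z += acc.z * dt.seconds;
}

pub fn main() void {
    forEachRow(updateVelocity);
    std.debug.print("{d} {d} {d}\n", .{ velocities[0].x, velocities[0].y, velocities[0].z });
}
```

With this shape, making the leading index parameter optional would be one comptime `if` inside `forEachRow` rather than a new family of `fetchN` functions.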

@andrewrk andrewrk closed this as completed Dec 9, 2019
@andrewrk andrewrk modified the milestones: 0.7.0, 0.6.0 Feb 29, 2020