Enum Arrays #793

thejoshwolfe · 2018-02-25T17:23:51Z

This was originally proposed by @raulgrell here: #770 (comment)

What do you think of an enum array? It has a length equal to the member count of the enum and can only be indexed with an enum value.

const Axis = enum { X, Y, Z};
const vec3 = [Axis]f32 {0.0, 0.0, 0.0};
vec3[Axis.X] == 0.0;

You can get close to this with status quo Zig by specifying the tag type and casting

const Value = enum(u2) {  Zero, One, Two };
const vals = [3]i32{3, 4, 5};
vals[u2(Value.Zero)] == 3;

But then if you change the number of elements, the backing type of the enum or override the values, you need to change a lot of code. And you could still access it with arbitrary integers, so if at any point the index into the array was hardcoded, it would have to be found.

An enum array basically becomes a comptime-checked map!

It could be approximately implemented in userland with something like this if we had a memberIndex built-in or something:

fn EnumArray(comptime T: type, comptime U: type) -> type {
    return struct {
        data: [@memberCount(T)]U,

        const Self = this;

        fn get(&self: Self, tag: t) U {
            return data[@memberIndex(t)];
        }

        fn set(&self: Self, tag: t, value u) void {
            data[@memberIndex(t)] = u;
        }
    }
}

const value_map = EnumArray(Value, i32) {
    .data = []i32{3, 4, 5}
}

value_map.get(Value.Zero) == 3;
value_map.set(Value.Two, 10);

thejoshwolfe · 2018-02-25T17:30:20Z

I use enum arrays frequently in C++. Here are some examples:

https://github.com/thejoshwolfe/legend-of-swarkland/blob/46b1916af06e9d7f6829cb6ff552049e5cd38f28/src/thing.hpp#L196

https://github.com/thejoshwolfe/legend-of-swarkland/blob/46b1916af06e9d7f6829cb6ff552049e5cd38f28/src/serial.cpp#L389

The silly hacks (like check_indexed_array()) I do there to make sure I'm doing it right are begging for a language feature. I think this proposal has a lot of value.

raulgrell · 2018-02-26T00:15:55Z

Would we have a [Enum]null T too? What would you expect to get in a for? Would we expose the tag? the index? both?

const Axis = enum { X, Y, Z};
const vec3 = [Axis]f32 {0.0, 0.0, 0.0};
vec3[Axis.X] == 0.0;

for (vec3) | val, tag | println("{}: {}", val, tag);
for (vec3) | val, index | println("{}: {}", val, index);
for (vec3) | val, index, tag | println("{}: {}", val, index);

tiehuis · 2018-02-27T08:36:01Z

This would be a nice addition for me. I used this approach quite frequently for a non-dynamically allocated game which relied heavily on tables for data and looking up by id. I actually found that when trying to represent similar ideas in rust using enums it was pretty rough going and I switched to just using integer values for the cost of some compile-time safety.

Regarding iteration, would it suffice to just have the following:

for (vec3) |val, tag| {
    const value = usize(tag)
}

This also keeps the type returned between normal arrays and enum arrays consistent (item, index_type).

I would expect you would far more often than not use the tag for indexing associated data. There is the argument that you would need to know the appropriate integer size to cast to but if you wanted to do math or other things on the index, I would think you would probably have a good idea on what to pick here.

Hejsil · 2018-02-27T09:29:52Z

I would love this feature as well, but how restricted should this feature be? Is this allowed?

const E = enum { First = 1, Last = 1000 };
const arr = [E]i64 { -1, 1 };

My guess is no, but what about a more reasonable example?

const E = enum { First = 5, Second = 6, Last = 7 };
const arr = [E]i64 { -1, 0, 1 };

Here, the compiler would be able to give us an array of size 3, and when indexing, subtract with the 5 offset.

EDIT:
Just reread the proposal. I don't think this @memberIndex(t) exists (it's not in the docs) but if it does, I guess my first example could be allowed.

raulgrell · 2018-02-27T10:26:03Z

@Hejsil both examples would work, even without the memberIndex function. You'd have to specify the tag type though.

The following examples include memberIndex and memberTag builtins only for illustrative purposes.

const E = enum(u8) { First = 1, Second = 2, Third = 3, Last = 100 };
const arr = [E]i64 { -1, -2, -3, -100 };

// -1, E.First, 0, 1
// -2, E.Second, 1, 2
// -3, E.Third, 2, 3
// -100, E.Last, 3, 100
for(arr) | val, tag | {
    warn("{}, {}, {}, {}", val, tag, @memberIndex(tag), u8(tag);
}

// Would slicing an enum array result in a normal slice or would we have enum slices?
// How would we refer to other parts of the array?

// Enum slice
arr[E.First .. E.Third] == [E.First .. E.Third]i64 {-1, -2, -3};

for(arr[E.First .. E.Third]) | val, tag | {
    warn("{}, {}, {}, {}", val, tag, @memberIndex(tag), u8(tag);
   
    const next_tag = @memberTag(@memberIndex(tag) + 1);
    warn("{}, {}, {}, {}", arr[next_tag], @memberIndex(tag) + 1, next_tag, u8(next_tag);
}

// Regular slice
arr[E.First .. E.Third] == []i64 {-1, -2, -3};

for(arr[E.First .. E.Third]) | val, index | {
    const curr_tag = @memberTag(E, index);
    warn("{}, {}, {}, {}", val, index, curr_tag, u8(curr_tag);

    const next_tag = @memberTag(E, index + 1)
    warn("{}, {}, {}, {}", arr[next_tag], index + 1, next_tag, u8(next_tag);
}

I'm not sure how I feel about it. Ideally, it would provide a third binding:

// Both
for(arr) | val, index, tag | {
    warn("{}, {}, {}, {}", val, tag, index, u8(tag);
}

for(arr[E.First .. E.Third]) | val, index, tag | {
    warn("{}, {}, {}, {}", val, index, tag, u8(tag);

    const next_tag = @memberTag(E, index + 1)
    warn("{}, {}, {}, {}", arr[next_tag], index + 1, next_tag, u8(next_tag);
}

See also: #358

raulgrell · 2018-02-27T11:55:31Z

While we're at it, we should consider switch statements too. See also: #359

const E = enum(u8) { First = 1, Second = 2, Third = 3, Last = 100 };
const e = E.Third;

switch (e) {
    E.First => "Head",
    E.Second ... E.Last => "Tail"
}

Hejsil · 2018-02-27T12:39:29Z

Well, as long as:

const E = enum(u8) { A = 0, B = 1, C = 2 };
const arr = [E]i64{ -1, 0, 1 };
arr[E.A];

compiles down to the same as:

const E = enum(u8) { A = 0, B = 1, C = 2 };
const arr = [3]i64{ -1, 0, 1 };
arr[u8(E.A)];

Then I'm happy with the feature. I was just pointing out, that the compiler has to add a hidden cost to enum arrays when the enum tags do not map to array index directly.

Hejsil · 2018-02-27T12:41:23Z

As for the extra for "payload". I can't see the use case. Maybe we can add enum arrays into the language, see if any real code needs this extra payload, and then talk about it again.

raulgrell · 2018-02-27T13:25:44Z

@Hejsil Since the index of the member can be different to the value in the enum, I'd expect that to compile to:

const E = enum(u8) { A = 0, B = 1, C = 2 };
const arr = [3]i64{ -1, 0, 1 };
arr[@memberIndex(E.A)];

In your example, if you changed A to 10, you'd have an index out of bounds error, since the underlying array has length 3. You then wouldn't be able to refer to the first element of the array, since the array can only be indexed using an enum value, and it 'contains' an invalid index.

Then I'm happy with the feature. I was just pointing out, that the compiler has to add a hidden cost to enum arrays when the enum tags do not map to array index directly

All this would happen at compile time, the enum tag's @memberIndex would map directly to the array index, and we'd also guarantee that every enum array access is valid. No runtime cost with added safety!

It would be possible to do stuff like:

// Generate an array from an enum
const PowersOfTwo = enum(u32) { One = 1, Two = 2, Four = 4, Eight = 8 };
const powers_of_two: [E]u32 = undefined;
for (arr) | *val, index, tag | *val = u32(tag);

Hejsil · 2018-02-27T13:34:55Z

@raulgrell
I think you misunderstood me. I meant if the enum is a direct mapping to array indexes, then the compiler should output equivalent code and I would be happy.

Wait, so enum arrays are only useful for compile-time known indexes? What about user input?

const E = enum { A = 2, B = 100 };
var arr = [E]i64 { 0, 1 };
arr[try strToE(some_user_string)];

In this case, the compiler has to output code that can do the mapping from enum -> array index.

raulgrell · 2018-02-27T14:05:47Z

I had only really considered compile-time use, but that's a really good point.

Yeah, in this situation it's not really a zero-cost abstraction, but if you were using this kind of pattern without language support you'd have to do the mapping anyway. So yeah, either way I'm happy.

Hejsil · 2018-02-27T14:38:17Z

Hmmm. Here is another question. Is there any value in having an enum array be ordered by tag declaration order?

const A = enum { A = 2, B = 1, C = 0 };
const B = enum { A = 0, B = 1, C = 2 };

// 0 2, 1 1, 2 0, 
for ([A]u8 { 0, 1, 2 }) |i, tag| { debug.warn("{} {}, ", i, u8(tag)); }

// 0 0, 1 1, 2 2, 
for ([B]u8 { 0, 1, 2 }) |i, tag| { debug.warn("{} {}, ", i, u8(tag)); }

If so, then indexing [A]u8 is slower than indexing [B]u8. If we let enum arrays be unordered, then the compiler is free to optimize [A]u8 to have the same data layout as [B]u8.

Hejsil · 2018-02-27T14:39:29Z

Also, maybe this feature requires special initialization syntax:

[A]u8 { .A = 0, .B = 1, .C = 2 }

This ensures that if someone reorders members of A, then the program doesn't break.

raulgrell · 2018-02-27T15:35:11Z

In terms of ordering, that's not really an issue - I'm not sure that indexing [A]u8 is any different to indexing [B]u8. I'm still not sure we're on the same page. Could you make an example where the tag value has no relation to the indices, and the child type of the enum array is of a non-indexing type like f32?

Also, maybe this feature requires special initialization syntax. This ensures that if someone reorders members of A, then the program doesn't break.

This is a great suggestion. Somewhere in the documentation there is mention of both enums and packed enums.

// Say the compiler reorders the enum fields by their tag value
const E = enum(u8) { A = 12, B = 2, C = 6}; // @memberIndex(E.A) == 2
// You'd have to initialize it like @Heijsil suggested
const e = [E]f32 { .A = 0.66, .B = 22.1, .C = 42.0 };

// The compiler wouldn't do that in a packed enum
const PE = packed enum(u8) { A = 10, B =12, C =3 }; // @memberIndex(E.A) == 0
// And we could thus initialize it like an array
const pe = [PE]f32 { 0.66, 22.1, 42.0 };

// Structs
const S = struct { a: u8, b: u16, c: u8 }; // @memberIndex(S.b) == 2
const PS = packed struct { a: u8, b: u16, c: u8 }; // @memberIndex(S.b) == 1

// Is this restriction on initialization already the case with structs?
const s = S { .A = 10, .B = 257, .C = 8.};
const ps = PS {10, 257, 8  };

Hejsil · 2018-02-27T16:23:39Z

Hmmmm. Just look up the packed enums and they're just TODO'ed in the docs, so I have no idea what they are used for. Maybe they'll get endian properties like structs, discussed in #649?

I like the idea of compiler reordering of enums but not packed enums. It has this nice consistency with structs. That also gives the option for ordered vs unordered enum arrays, not that I have a real use case for ordered enum arrays.

Hejsil · 2018-03-07T14:54:42Z

Actually, order matters if you intend to ever treat your enum array as a normal array.

const A = enum { A = 2, B = 1, C = 0 };
const arra = [A]u8 { .A = 5, .B = 3, .C = 27 };

const B = packed enum { A = 2, B = 1, C = 0 };
const arrb = [B]u8 { .A = 5, .B = 3, .C = 27 };

// []u8{ 27, 3, 5 }
const bytesa = ([]u8)(arra);

// []u8{ 5, 3, 27 }
const bytesb = ([]u8)(arrb);

This is useful for shared memory and stuff like that.

nodefish · 2019-04-14T07:32:12Z

@memberIndex can be done in userland now:

fn memberIndex(comptime T: type, tag: T) usize {
    const enum_info = @typeInfo(T);
    const enum_fields = enum_info.Enum.fields;
    inline for (enum_fields) |field, i| {
        if (std.mem.eql(u8, field.name, @tagName(tag))) {
            return i;
        }
    }
    unreachable;
}

test "memberIndex" {
    const Value = enum(u2) {
        Zero,
        One,
        Two,
        Three,
    };

    std.debug.warn("\n{}\n", memberIndex(Value, .Three));
    std.debug.warn("{}\n", memberIndex(Value, .One));
    std.debug.warn("{}\n", memberIndex(Value, .Two));
}

Output:

3
1
2

nodefish · 2019-04-14T07:45:48Z

EnumArray:

fn EnumArray(comptime T: type, comptime U: type) type {
    return struct {
        data: [@memberCount(T)]U,

        fn get(self: *const @This(), tag: T) U {
            return self.data[memberIndex(T, tag)];
        }

        fn set(self: *@This(), tag: T, value: U) void {
            self.data[memberIndex(T, tag)] = value;
        }
    };
}

test "EnumArray" {
    const Axis = enum {
        X,
        Y,
        Z,
    };
    var vec3 = EnumArray(Axis, f32){ .data = undefined };
    vec3.set(.X, 0.54);
    vec3.set(.Y, -0.21);
    vec3.set(.Z, 0.55);

    std.debug.warn("\n{}\n", vec3.get(.X));
    std.debug.warn("{}\n", vec3.get(.Y));
    std.debug.warn("{}\n", vec3.get(.Z));
}

Note that a branching penalty in memberIndex will be incurred if tag is not comptime-known. This can be eliminated by using a better data structure in memberIndex e.g. comptime-built hash table with a perfect hash function.

MageJohn · 2021-12-15T16:19:48Z

I think it might be reasonable to restrict this proposal to untagged enums, or tagged enums that don't override any ordinal values.

The use case for an enum array is (as far as I understand), to give names to the indexes of an array. The use case for specifying the ordinals of an enum is giving names to specific ordinal values. If those two cases ever overlap, you could actually use a constant enum array to look up the custom ordinal value. For example:

const E = enum {
    hundred,
    thousand,
    million,
};
const ord_E = [Value]u32{ 
    hundred = 100, 
    thousand = 1000, 
    million = 1000000,
};

test "access enum ordinal value" {
    try expect(ord_E[.hundred] == 100);
    try expect(ord_E[.thousand] == 1000);
    try expect(ord_E[.million] == 1000000);
}

This is slightly more verbose, but I don't think that's necessarily a bad thing in this case. Using an enum with overridden ordinal values implies more overhead, and this makes that overhead explicit. In addition, the extra verbosity is only when defining the enum, because accessing ord_E is about the same amount of code as calling @enumToInt.

MageJohn · 2021-12-15T16:58:40Z

Considering the discussion on enum slices, I'm not sure they're a good idea. Enum tags do not necessarily have a strong ordering, so ranges between them could be confusing and fragile. If you have an enum called Color, what does colors[Color.Blue .. Color.Red] mean? Will it stay the same if someone comes along and decides the enum tags would be better ordered alphabetically rather than by the color wheel?

For similar reasons, I think it's a good idea to require struct initialisation syntax for enum arrays.

InKryption · 2021-12-15T17:49:50Z

@MageJohn After reading your suggestions, a question occurs to me: would it be better to treat this concept as a struct with uniform field types, as opposed to an array? Because thinking about it, if all an Enum-Indexed array would do is give names to offsets in a series of values, that's pretty much what structs are for, with the differences being:

There isn't any enforcement for a struct to have uniform field types.
You can't associate a numerical value with the struct's field names.

Though, these appear to be relatively trivial to resolve, compared to figuring out what should and shouldn't be allowed in Enum-Indexed arrays, imo. One approach for the first issue would be introducing something like array struct or uniform struct; and one approach for the second issue would be something to the tune of 'tagged structs': const S = struct(Enum) {};

I don't want to derail discussion here too much, in case this idea should be considered separately, or if it's already been considered, so I'll leave it at that, but it's something to consider.

MageJohn · 2021-12-15T22:35:02Z

@InKryption I think that's worth discussing, yeah. Implemented correctly, it could neatly sidestep some of the issues discussed so far by reframing the concept. I think there are a few problems though, which I can't see easy solutions for.

At the conceptual level, I'm not sure it maps as neatly onto existing concepts as an enum array. In my mental model the fundamental difference between a struct and an array is that an array is uniform while a struct isn't; this model breaks in the face of uniform struct, but I'm not sure it should because they would probably still compile down to arrays.

There are also questions of access syntax. An use case of tagged structs/enum arrays over structs would be access by runtime known values, which doesn't really work with dot access (e.g. foo.name). The obvious alternative is array access (foo[i]), but if they're supposed to be a special type of struct then using array access could be confusing.

matu3ba · 2022-02-16T14:34:57Z

One use case for an enum array is to comptime-compute maximum buffers size and use the enum to point to offsets into the buffer for returning the according slice or (better?) generate at comptime an array.

What I think is missing in the proposal is to clarify if the enum is integer backed, since one could replace it with the offset address into whatever data is pointed to.
Further, I am missing in the proposal, if and how contiguous the pointed data from the enum should be.

If the data itself is contiguous and the backed integer is not necessary (as space waste), then one should be able to implement the comptime-computation of the slice, since one can generalize over the enum fields at comptime.

Can you describe your use cases for non-contiguous data?
Does anybody have data on how much performance one can expect from field reordering at maximum/average etc?
I would split the backing buffer, if only on a few fields are hot/frequently accessed or manually change the order.

Sketch of implementation with current capabilities of the language:

comptime construction

input:

array of (enum field, string) ie as array of struct fields
(any other structure that can be comptime-traversed would do)
- backed enum integers being continuous increasing (for simplicity starting at 0)

output:

concatenated string
array with (offsets,len) into string
usage: buf[offsets[2*@enumToInt(enum)..2*@enumToInt(enum)+1]]

As of now, there is currently a miscompilation, which prevents things from working smoothly: https://gist.github.com/matu3ba/07ea081b07639e7ecceb13173083484f

posted in this issue #10920

Pyrolistical · 2022-11-01T17:05:48Z

Isn't this already done? https://ziglang.org/documentation/master/std/#root;enums.EnumArray

matu3ba · 2022-11-02T10:45:07Z

@Pyrolistical So far EnumArray is only an an enum available as indexed array. I would close this once indirect indexing at comptime works and has tests.

ericlangedijk · 2024-11-05T21:11:51Z

Indexing by a (simple) enum would be nice. I always loved that in Delphi. It can make code very clear.

thejoshwolfe mentioned this issue Feb 25, 2018

Pointer Reform #770

Closed

andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Feb 25, 2018

andrewrk added this to the 0.3.0 milestone Feb 25, 2018

andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018

andrewrk modified the milestones: 0.4.0, 0.5.0 Sep 28, 2018

Hejsil mentioned this issue Dec 14, 2018

Array/Slice custom index type #1830

Open

andrewrk modified the milestones: 0.5.0, 0.6.0 May 9, 2019

andrewrk modified the milestones: 0.6.0, 0.7.0 Oct 17, 2019

andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 10, 2020

zigazeljko mentioned this issue Dec 8, 2020

Proposal: Short Vector Primitives #7295

Open

ifreund mentioned this issue Mar 18, 2021

Add some enum utilities #8171

Merged

andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021

andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 23, 2021

andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022

andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023

andrewrk modified the milestones: 0.13.0, 0.12.0 Jul 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enum Arrays #793

Enum Arrays #793

thejoshwolfe commented Feb 25, 2018 •

edited by andrewrk

Loading

thejoshwolfe commented Feb 25, 2018

raulgrell commented Feb 26, 2018

tiehuis commented Feb 27, 2018

Hejsil commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

Hejsil commented Feb 27, 2018

Hejsil commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

Hejsil commented Feb 27, 2018

raulgrell commented Feb 27, 2018

Hejsil commented Feb 27, 2018

Hejsil commented Feb 27, 2018

raulgrell commented Feb 27, 2018 •

edited

Loading

Hejsil commented Feb 27, 2018

Hejsil commented Mar 7, 2018

nodefish commented Apr 14, 2019 •

edited

Loading

nodefish commented Apr 14, 2019 •

edited

Loading

MageJohn commented Dec 15, 2021 •

edited

Loading

MageJohn commented Dec 15, 2021

InKryption commented Dec 15, 2021

MageJohn commented Dec 15, 2021

matu3ba commented Feb 16, 2022 •

edited

Loading

Pyrolistical commented Nov 1, 2022

matu3ba commented Nov 2, 2022

ericlangedijk commented Nov 5, 2024

Enum Arrays #793

Enum Arrays #793

Comments

thejoshwolfe commented Feb 25, 2018 • edited by andrewrk Loading

thejoshwolfe commented Feb 25, 2018

raulgrell commented Feb 26, 2018

tiehuis commented Feb 27, 2018

Hejsil commented Feb 27, 2018 • edited Loading

raulgrell commented Feb 27, 2018 • edited Loading

raulgrell commented Feb 27, 2018 • edited Loading

Hejsil commented Feb 27, 2018

Hejsil commented Feb 27, 2018 • edited Loading

raulgrell commented Feb 27, 2018 • edited Loading

Hejsil commented Feb 27, 2018

raulgrell commented Feb 27, 2018

Hejsil commented Feb 27, 2018

Hejsil commented Feb 27, 2018

raulgrell commented Feb 27, 2018 • edited Loading

Hejsil commented Feb 27, 2018

Hejsil commented Mar 7, 2018

nodefish commented Apr 14, 2019 • edited Loading

nodefish commented Apr 14, 2019 • edited Loading

MageJohn commented Dec 15, 2021 • edited Loading

MageJohn commented Dec 15, 2021

InKryption commented Dec 15, 2021

MageJohn commented Dec 15, 2021

matu3ba commented Feb 16, 2022 • edited Loading

Pyrolistical commented Nov 1, 2022

matu3ba commented Nov 2, 2022

ericlangedijk commented Nov 5, 2024

thejoshwolfe commented Feb 25, 2018 •

edited by andrewrk

Loading

Hejsil commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

Hejsil commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

raulgrell commented Feb 27, 2018 •

edited

Loading

nodefish commented Apr 14, 2019 •

edited

Loading

nodefish commented Apr 14, 2019 •

edited

Loading

MageJohn commented Dec 15, 2021 •

edited

Loading

matu3ba commented Feb 16, 2022 •

edited

Loading