Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vtable abstraction of some kind. traits? oop? polymorphism? interfaces? #130

Closed
babariviere opened this issue Feb 29, 2016 · 36 comments
Closed
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@babariviere
Copy link

That will be good if we have traits like in rust.

@andrewrk
Copy link
Member

andrewrk commented Feb 29, 2016

Indeed, perhaps we need an abstraction that provides a vtable of some kind. Traits seem pretty reasonable. I still need to think it all over before deciding on a particular abstraction.

One thing I want to do before closing this issue is create a GUI application with widgets and stuff. That's the classic use case for OOP so we can test out how well the vtable abstraction works.

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Feb 29, 2016
@andrewrk andrewrk changed the title Polymorphism vtable abstraction of some kind. traits? oop? polymorphism? interfaces? Oct 27, 2016
@fsaintjacques fsaintjacques mentioned this issue Nov 3, 2016
@andrewrk andrewrk added this to the 0.1.0 milestone Jan 16, 2017
@andrewrk andrewrk modified the milestones: 0.1.0, 0.2.0 Apr 20, 2017
@tiehuis tiehuis added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Sep 15, 2017
@kyle-github
Copy link

Structural polymorphism? I.e. if I have:

`
const A = struct {
foo: u16;
}

const B = struct {
bar: A;
baz: f32;
}
`

Can you pass a pointer to B where a pointer to A was needed?

@kyle-github
Copy link

It looks like if you allow fields to be reordered by the compiler, then this would not be possible. That will prevent a certain kind of polymorphism from working. This is fine as long as it is explicitly stated somewhere. See the points in issue #307.

@andrewrk
Copy link
Member

Can you pass a pointer to B where a pointer to A was needed?

do_something_with_a(&b.bar)

@kyle-github
Copy link

OK, so that would not work if for some reason the compiler could reorder fields (it might work, but it would not be guaranteed).

Even though types A and B are different, you do not have to cast the B pointer?

C lets you cast and guarantees that the order of the fields in the struct definition is the order in memory. So, in C, this always works.

I tried to cover this use case in issue #488.

andrewrk added a commit that referenced this issue Nov 7, 2017
I started working on #465 and made some corresponding std.io
API changes.

New structs:
 * std.io.FileInStream
 * std.io.FileOutStream
 * std.io.BufferedOutStream
 * std.io.BufferedInStream

Removed:
 * std.io.File.in_stream
 * std.io.File.out_stream

Now instead of &file.out_stream or &file.in_stream to get access to
the stream API for a file, you get it like this:

var file_in_stream = io.FileInStream.init(&file);
const in_stream = &file_in_stream.stream;

var file_out_stream = io.FileOutStream.init(&file);
const out_stream = &file_out_stream.stream;

This is evidence that we might not need any OOP features -
See #130.
@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Nov 14, 2017
@noonien
Copy link

noonien commented Jan 26, 2018

Please consider taking inspiration from Golang's interfaces. They greatly help with making code easy to write and understand.

For example, all types that can read/write buffers of data, and expose the functionality with methods with the correct name and signature are considered to implement io.Reader/io.Writer by default.

This makes working with, and abstracting streams(and other things) really easy.

@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018
@phase
Copy link
Contributor

phase commented Apr 23, 2018

Go's interfaces aren't the best way to do things. Structural typing seems to go against the Zen of Zig.

@alexnask
Copy link
Contributor

alexnask commented Jun 14, 2018

In my opinion, zig's metaprogramming features are strong enough that this can reasonably be written as a library feature.

I'm just going ahead and drop a link to a proof of concept implementation (written before pointer reform): https://gist.github.com/alexnask/1d39fbc01b42ce2b5b628828b6d1fb46

This implementation pretty much works like Rust traits, the interface struct either owns or points to the implementation object and the vtable pointer and the vtables are generated at compile time, one time for each new implementation type.

EDIT: The code has been updated to master zig.

@binary132
Copy link

binary132 commented Jun 15, 2018

Quick thought: having a really tiny feature set is actually a very strong language feature. Maybe this kind of abstraction should be in a library that can be imported to get extra abstraction macros? Then people can still use the core language without having to understand many features. (This echoes @alexnask's thought above.)

@phase Zig already has duck typing in comptime, doesn't it?

@cgrandfield
Copy link

Started reading a bit about Zig today, seems like a really cool project.

Looks like comptime mostly provides monomorphism/generic support and that this ticket is for runtime polymorphism support where it seems the chosen direction seems to be some sort of comptime method to generate vtables.

One thing I wonder is if it might be possible to do runtime polymorphism without vtables by placing methods at fixed offsets.

Using the Rust Animal trait structure as an example https://doc.rust-lang.org/rust-by-example/trait.html if you had Horse, Cow, Sheep objects and implemented fn "name" at offset 0x000, fn "noise" at offset 0x100, fn "talk" at offset 0x200 from the beginning of the object's executable region you'd be able to invoke the name, noise, or talk methods on a Horse/Cow/Sheep without doing a vtable lookup - instead you could just call into the beginning of the respective objects executable region + the offset. A potential intermediate level of indirection between comptime monomorphism and vtable lookup.

Not totally sure such an approach could work or provide much advantage (you still need to get the start of the object's execution region), but it does seem like the sort of low level wizardry Zig is grappling with. Regardless, good luck with the language.

@adontz
Copy link

adontz commented Feb 24, 2020

@andrewrk

One thing I want to do before closing this issue is create a GUI application with widgets and stuff. That's the classic use case for OOP so we can test out how well the vtable abstraction works.

Pleasant to develop GUI requires data binding and run-time composition, both are huge pain without some kind of rich reflection like in .Net and Java, or at least like in Qt/MFC. Also, Qt/MFC use a lot of ugly macros and still are much inferior to WPF/XAML. I don't think Zig is a good fit for such kind of task. Actually I'm pretty sure it's misfit by modern standards, which is not bad, just state of things. I did not see new MFC projects started for years and Qt shifted focus to QML/JavaScript.

MIME parsing is friendly to OOP task, various MIME headers can be subclasses of base header, various MIME parts can be subsclasses of base part. MIME will be required at some point of time for HTTP or email anyway. I wrote MIME parsers in C++ and C#. Maybe I could write it in Zig with some help.

Just please, don't do multiple inheritance :-)

Also, maybe aggregation can be easier?

const A = struct {
  fn doSomething(...) ... {};
}

const B = struct {
  a: A;
  fn doSomething = a.doSomething; // Note not declaring parameters or return value. Parameter forwarding code will be automatically generated.
}

@Sobeston
Copy link
Contributor

Just please, don't do multiple inheritance :-)

Don't worry, there won't be singular either

@adontz
Copy link

adontz commented Feb 25, 2020

Just please, don't do multiple inheritance :-)
@Sobeston

Don't worry, there won't be singular either

Why whould you need a vtable if a method cannot be overriden?

@ghost
Copy link

ghost commented Feb 25, 2020

Why whould you need a vtable if a method cannot be overriden?

vtables are not an exclusive feature of inheritance, they are just one way of doing dynamic dispatch. Just as useful for implementing traits/interfaces.

andrewrk added a commit that referenced this issue Mar 10, 2020
The main goal here is to make the function pointers comptime, so that we
don't have to do the crazy stuff with async function frames.

Since InStream, OutStream, and SeekableStream are already generic
across error sets, it's not really worse to make them generic across the
vtable as well.

See #764 for the open issue acknowledging that using generics for these
abstractions is a design flaw.

See #130 for the efforts to make these abstractions non-generic.

This commit also changes the OutStream API so that `write` returns
number of bytes written, and `writeAll` is the one that loops until the
whole buffer is written.
@andrewrk andrewrk mentioned this issue Mar 29, 2020
11 tasks
@marnix
Copy link

marnix commented Jun 8, 2020

In one of @andrewrk's live streams I recently binge-watched, this issue was briefly discussed, and I got interested. Allow me to try and contribute, by adding some knowledge I picked up over the years, which might be relevant to this issue. Let me also try to untangle some concepts that are often packaged up together (like inheritance and dynamic dispatch).

(Warning, long comment ahead. I am a Zig noob, and a programming languages amateur. So grains of salt apply. But over the years I did read a fair amount of language design papers. See relevant references at the end of this comment.)

Let me first give the terms I'll be using; hopefully I'm using them correctly.

  • virtual function. A single function signature whose implementation consists of multiple parts (methods) which may or may not exist in separate source files / modules / classes.
  • dynamic dispatch. Choosing which method to call, whenever a virtual function is called.
  • inheritance, interfaces, traits. Mechanisms that enable saying 'every X is-a Y', so that whatever you can do with a Y can be done with an X.
  • vtables. An implementation method for dynamic dispatch in the presence of inheritance/interfaces/traits, where a value (often called 'object') carries with it the list of methods that apply, one per virtual function.

The semantics

The core idea / suggestion is the following. One can think of the methods of a virtual function, as each having a predicate or condition on the parameters (like self isa Circle). And conditions can be compared to each other, in a partial order, in the sense that self isa Circle implies self isa Shape for all self, or x >= 10 implies x > 0 for all x, or b: u8 implies b: i32, for all b; in every case the former predicate is more specific than the latter. So the suggestion is to do dynamic dispatch by (1) finding out which methods apply for a given parameters tuple; (2) choosing the one with the unique most-specific predicate; and (3) failing at compile-time in case zero or multiple methods could be chosen.

This approach is called predicate dispatch. Note that this is a symmetrical way of doing dynamic dispatch: it treats all virtual function parameters equally. And using only the first parameter with subtyping, so self isa X, in the predicate results in traditional single dispatch; and using isa on multiple parameters gives multiple dispatch. So this is a generalization of multimethods.

Here is an example, using new hypothetical syntax marking methods with if:

fn fizzbuzz(n: u8) if (true) anyerror!void { // method #1
    try stdout.print("{}", .{n});
}
fn fizzbuzz(n: u8) if (n % 3 == 0) anyerror!void { // method #2
    try stdout.print("Fizz", .{});
}
fn fizzbuzz(n: u8) if (n % 5 == 0) anyerror!void { // method #3
    try stdout.print("Buzz", .{});
}
fn fizzbuzz(n: u8) if (n % 3 == 0 && n % 5 == 0) anyerror!void { // method #4
    try stdout.print("FizzBuzz", .{});
}

Note how we have no traits or whatever, but still there is a notion of dynamic dispatch: Every call to fizzbuzz() chooses one of methods. And it chooses only between methods that match the predicate.

Method #4 takes precedence over #2 and #3 (it "overrides" them), since, e.g., for all n, n % 3 == 0 && n % 5 == 0 implies n % 3 == 0; and similarly methods #2 and #3 each override #1.

If the method #4 were left out, then there are parameter values (e.g., 30, actually all n for which n % 3 == 0 && n % 5 == 0) where is there no unique most-specific method to be chosen: both #2 and #3 apply and neither is overrides the other. This should be a compile error.

If the method #4 were to use the equivalent predicate n % 15 == 0, then the compiler probably won't be smart enough to deduce that, e.g., n % 15 == 0 implies n % 5 == 0, so also in this case it would report that there is no method covering the n % 3 == 0 && n % 5 == 0 intersection.

Finally, adding more hypothetical syntax, one could perhaps use super to call an overridden method, and (in this case) one would have to explicitly choose between multiple overridden methods.

fn fizzbuzz(n: u8) if (n % 3 == 0 && n % 5 == 0) anyerror!void {
    try super[n % 3 == 0](n); // restricts to methods #1 and #2, always selects #2, prints "Fizz"
    try super[n % 5 == 0](n); // restricts to methods #1 and #3, always selects #3, prints "Buzz"
}

The implementation

Now about how this could be implemented. One way is 'vtables', where every value knows the list of methods that apply, one pointer-to-method per virtual function. And when, say the 7th virtual function is called, we do a indirect function call through the 7th pointer in that list/vtable.

These indirect function calls are notoriously difficult to optimize, requiring an analysis at each call site to see what values are possible, and optimizing if e.g. all values at that point are guaranteed to have the same vtable / list of pointers.

This 'vtables' approach corresponds to taking the following table row-by-row.

virtual_function_1 ... virtual_function_n
value_1 method ... method
value_2 method ... method
... ... ... ...
value_n method ... method

The key insight is that one can also take this table column-by-column.

That means that for every virtual function, we generate some code that does a couple of ifs on these methods' predicates, calling each of the methods in the various branches.

As an example, the above FizzBuzz code would become 4 ordinary functions, with a dispatch function that looks something like this:

fn fizzbuzz(n: u8) anyerror!void {
    if (n % 5 == 0) {
        if (n % 3 == 0) {
            try fizzbuzz_4(n);
        } else {
            try fizzbuzz_3(n);
        }
    } else {
        if (n % 3 == 0) {
            try fizzbuzz_2(n);
        } else {
            try fizzbuzz_1(n);
        }
    }
}

The great benefit here, it seems to me, is that that generated code is a lot easier to optimize, since the methods can trivially be inlined, and then all kinds of optimization can be done. (And the required compile-time knowledge about which predicates always-imply which, will lead to additional optimization opportunities.)

So ultimately the result would be equivalent to something like this:

fn fizzbuzz(n: u8) anyerror!void {
    var b3 = (n % 3 == 0);
    var b5 = (n % 5 == 0);
    if (b3) {
        try stdout.print("Fizz", .{});
    }
    if (b5) {
        try stdout.print("Buzz", .{});
    } else if (!b3) {
        try stdout.print("{}", .{n});
    }
}

(I don't know whether this alternative dual-to-vtables approach has a name in the literature.)

Summary

In summary, I think there are two messages in this comment. First, dynamic dispatch is separate from inheritance and interfaces and stuff, so one could already have the former without yet having the latter. Second, there is an alternative implementation to vtables, which seems more optimizer friendly, and which naturally allows multiple dispatch and even predicate dispatch.

(Oh, and I have no idea about how to implement this in Zig, and how much could be done in a library, and how much would need to go into the language, and whether some parts might benefit the language because they offer other new possibilities.)

References

The direct inspiration for this comment is the work done on predicate dispatch from the following papers:

as prototyped in the Cecil language (http://projectsweb.cs.washington.edu/research/projects/cecil/).

@anosovic
Copy link

anosovic commented Jun 9, 2020

how does the solution @marnix summarized solve the problem?

the reason it is desirable to have a function signature's implementation consist of multiple parts (methods) which may or may not exist in separate source files / modules / classes is to allow different programmers to write different implementations for the same api. classic oop example: you want to do logic with an animal generically, and developer A wrote a dog animal and developer B wrote a cat animal. with dynamic dispatch, you can talk to the dog code and the cat code in the same language: both animals speak, sleep, eat, etc and the burden is not on the programmer but the language to run developer A's code when a dog is used and developer B's code when a cat is used.

easy to see how this generalizes to @andrewrk's GUI widgets problem

without dynamic dispatch, at some point developer A and developer B would have to work on the same code, or the user would have to manually wire up each individual function for each api they talk to, in order to accomplish this feature for users of the animal api. this problem is why this thread exists, right? with dynamic dispatch, no one has to sort it out manually. each developer just conforms to the api spec and all users are instantly able to use it.

I wonder if we can maintain zig's goals of being a predictable language and also have dynamic dispatch. maybe, but does the solution @marnix summarized maintain that property completely?

is my summary of the problem not right? let me know!

@alexnask
Copy link
Contributor

alexnask commented Jun 9, 2020

I will just reiterate my 2018 comment on this issue which is that we can implement this in zig itself.
Here is a package that introduces interface types similar to Rust traits, with more control over the ownership and storage of the implementation object: https://github.com/alexnask/interface.zig (split off from my std.interface PR)

This should be a lot more optimizer friendly (I will compile a list of examples soon ™️ with reimplementation of std interfaces) than the current pattern used in the stdlib, e.g. for Allocator, and allows the implementation types to be completely separate from the interface type.

@data-man
Copy link
Contributor

data-man commented Jun 9, 2020

There is interface-experiment branch. ;)
I'll hope that this experiment will be implemented.

@anosovic
Copy link

anosovic commented Jun 9, 2020

additionally:

this (https://compileandrun.com/struggles-with-rust/) article is cited by @andrewrk in "Why Zig and Not Rust"?

the problem is:

[. . .] You can't add trait implementations of traits you don't own to types you don't own, and there is no workaround but newtypes. – Vladimir Matveev

does https://github.com/alexnask/interface.zig support this? does interface-experiment support this?

@alexnask
Copy link
Contributor

alexnask commented Jun 9, 2020

@anosovic
interface.zig will automatically make a vtable from the implementation type when initializing an interface object, so the type only needs to provide functions compatible with what the interface expects (to be exact, the arguments except for self need to coerce from one signature to the other as well as the return type).
You can also provide a manually written vtable if the functions don't match exactly.

In interface-experiment it seems like you provide the vtable (or at least a tuple of fn pointers) yourself when using @implement so it also detaches interfaces/traits from the implementation types.

@marnix
Copy link

marnix commented Jun 9, 2020

A quick note on the status of my earlier comment. I just wanted to provide information, not a solution.

I wanted to make clear that vtables are not the only way to do dynamic dispatch, and that this alternative is more optimizer-friendly.

This doesn't solve anything related to patterns for GUI-development, unless accompanied by some independent pattern or feature that allows to express 'X implements Y' or 'P is-a Q' (or perhaps even 'ExampleReader is-a Reader.IFace').

It also (on purpose) doesn't say anything about the interaction of separate methods with separate compilation.

Finally, I have no idea what smart tricks can be done to do part of 'predicate dispatch' outside of the language. I also don't know how you all want to trade off optimizer-friendly vs library-implementable vs easy-to-use.


So if you already have a mechanism like interface.zig, that allows values of different types to be used through a common interface, if those types cooperate, then that might be Good Enough(tm) to avoid the additional language complexity of a mechanism like predicate dispatch.

But then again, perhaps adding (some feature supporting) predicate dispatch is a small price to pay and unlocks nice new patterns with great optimization opportunities, for those people who say, "all those pointer indirections are too expensive for me, so I'll avoid those nice abstractions".

@andrewrk
Copy link
Member

andrewrk commented Oct 9, 2020

One thing I want to do before closing this issue is create a GUI application with widgets and stuff. That's the classic use case for OOP so we can test out how well the vtable abstraction works.

Hello, myself from 4 years ago. I have not done the thing you suggested for me to do regarding creating a GUI application with widgets and stuff. However I am confident in the language's ability to handle the use case without an additional language feature.

We still have #764 open which could cause this to get revisited. The generic reader/writer interface has its pros and cons. However, at this point the plan is to not add an additional dynamic dispatch language feature.

And so I am closing this proposal, with reasonable confidence in it not being re-opened.

@flavius
Copy link

flavius commented Jul 31, 2021

@andrewrk It's not quite clear, are traits/oop/polymorphism/interfaces not on the radar at all, not within scope, or not yet technically implementable, but on the radar for 1.0.0, for 2.0.0?

I think that a page summing it all up as suggested here would be useful: ziglang/www.ziglang.org#30

Personally, I think that having these features, even if opt-in, would make zig more palatable to busines-type of applications / web development, an area where rust comes rather with resistance, primarily because of performance concerns (which these kind of applications don't have). Without making this a discussion about rust: they do have the idea of TypeId, but it's not really useful in practice, maybe zig can fill that "niche"?

@vitiral
Copy link

vitiral commented Dec 7, 2022

I'm just reading up on Zig and really liking the language. I was curious about interfaces and have thought about this for my own minimalist language, so I just wanted to drop in a quick comment that the simplest possible V-table is something like below (in C)

// An Interface just a collection of methods and some data.
typedef struct {
  void (*add)(void* this, int a); // pointer to add method
  ... other methods
} MInterface;

typedef struct {
  const MInterface* m, // a pointer to methods
  void* d;         // a pointer to data of an unknown type
} Interface;

typedef struct {int a;} MyData;

void MyData_add(MyData* this, int a) {
  this->a += a;
}

const MInterface mData = (MInterface) {
  // Note: the C compiler doesn't actually let you do this and requires a cast
  .add = MyData_add,
};

Interface MyData_asInterface(MyData* d) { return (Interface) {.m = &mData, .d = d }; }

// Example use
MyData myData = (MyData) { .a = 4 };
Interface myInterface = MyData_toInterface(&myData);
myInterface.m->add(myInterface.d, 42);  // You probably make a macro for this
assert(46 == myData.a);

Note: this does not allow for a single variable implementing multiple interfaces -- something couldn't implement Debug and Eq. That's what a traditional V-table accomplishes! However, interface "single inheritance" could be done. If a sub-set of the interface's methods were identical to another interface, then the pointer conversion would be trivial. For instance, you could make:

typedef struct {
  void (*close)(void* d);
} MResource;

typedef struct {
  void (*read)(void *d);
  ...
  void (*close)(void* d);
} MFile;

And an MFile could be converted to an MResource by just bumping the pointer.

Obviously this isn't how you would represent inheritance in the syntax -- likely you would include MResource inside of MFile to make the inheritance clear.

@vitiral
Copy link

vitiral commented Dec 7, 2022

And... it looks like that's basically how things are implemented, if I'm reading things right:

https://github.com/ziglang/zig/blob/master/lib/std/mem/Allocator.zig

@CodePagesNet
Copy link

Only allow v-tables that use one level of indirection at the most! v-tables should be as close to the performance as a single function pointer as possible. Please! Don't be yet another language that messes this up.
https://engineering.purdue.edu/tgrogers/publication/zhang-asplos-2021/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests