Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design choices for a proc macro implementation #423

Closed
silvanshade opened this issue Feb 10, 2023 · 12 comments
Closed

Design choices for a proc macro implementation #423

silvanshade opened this issue Feb 10, 2023 · 12 comments
Labels
A-objc2 Affects the `objc2`, `objc2-exception-helper` and/or `objc2-encode` crates enhancement New feature or request question Further information is requested

Comments

@silvanshade
Copy link
Contributor

Since I've been working on a proof-of-concept implementation of proc macro equivalents for declare_class!, extern_methods!, etc., I encountered a number of different points where there were some interesting choices to be made in the design space and I thought it would be a good idea to discuss some of those.

For a point of reference, take a current macro_rules! based definition like this:

declare_class!(
    struct Delegate {
        text_field: IvarDrop<Id<NSTextField>, "_text_field">,
        web_view: IvarDrop<Id<WKWebView>, "_web_view">,
    }
    mod ivars;

    unsafe impl ClassType for Delegate {
        type Super = NSObject;

        const NAME: &'static str = "Delegate";
    }

    unsafe impl Delegate {
        #[method(initWithTextField:andWebView:)]
        unsafe fn __initWithTextField_andWebView(
            self: &mut Self,
            text_field: *mut NSTextField,
            web_view: *mut WKWebView,
        ) -> Option<&mut Self> {
            let this: Option<&mut Self> = msg_send![super(self), init];
            let this = this?;
            Ivar::write(&mut this.text_field, unsafe { Id::retain(text_field) }?);
            Ivar::write(&mut this.web_view, unsafe { Id::retain(web_view) }?);
            Some(this)
        }
    }
);

extern_methods!(
    unsafe impl Delegate {
        #[method_id(initWithTextField:andWebView:)]
        #[allow(non_snake_case)]
        pub fn initWithTextField_andWebView(
            this: Option<Allocated<Self>>,
            text_field: &NSTextField,
            web_view: &WKWebView,
        ) -> Id<Self>;
    }
);

The equivalent in terms of the proof-of-concept proc macros currently looks like this:

#[objc(super = NSObject)]
mod Delegate {
    struct Delegate {
        text_field: IvarDrop<Id<NSTextField>>,
        web_view: IvarDrop<Id<WKWebView>>,
    }

    unsafe impl Delegate {
        #[objc(sel = "initWithTextField:andWebView:")]
        unsafe fn __initWithTextField_andWebView(
            self: &mut Self,
            text_field: *mut NSTextField,
            web_view: *mut WKWebView,
        ) -> Option<&mut Self> {
            let this: Option<&mut Self> = msg_send![super(self), init];
            let this = this?;
            Ivar::write(&mut this.text_field, unsafe { Id::retain(text_field) }?);
            Ivar::write(&mut this.web_view, unsafe { Id::retain(web_view) }?);
            Some(this)
        }

        // NOTE: we only need this until `#173: Support super in msg_send_id!` is merged
        pub fn initWithTextField_andWebView(
            this: Option<Allocated<Self>>,
            text_field: &NSTextField,
            web_view: &WKWebView,
        ) -> Id<Self>;
    }
}

A couple of observations about this:

Originally, I was thinking it would make sense to have more separate macros like #[class], #[protocol], etc., when I proposed something looking closer to this:

#[class(extern, super = NSActionCell, inherits = NSCell, NSObject)]
struct NSPathCell;

#[class]
unsafe impl NSPathCell {
    ...
}

#[protocol]
pub unsafe trait NSPathCellDelegate {
    ...
}

But at that time I didn't realize yet that we need to be able to parse the class struct and the class impl together in order to correctly define the ::class() method (because it registers the methods when first called).

Unfortunately, there is also no practical way (that I know of) to manage state across proc-macro invocations. So the only real obvious choice as an alternative is to place the respective struct and impl items within an enclosing item so the proc macro can work similarly to declare_class. Which leads to the choice of using mod.

Given an invocation like this:

#[objc(super = <superclass>, inherits? = [<superclass>*])]
mod <ClassName> {
    ...
}

What happens is the macro expects to find, within the mod <ClassName>, a struct <ClassName> { ... }, or a type <ClassName>; (note the lack of =). The actual mod is just a dummy item and is not emitted, only the items it encloses are emitted. Furthermore, the name of the struct or type must exactly match the name of the mod, and only a single struct xor type is allowed.

Within a class #[objc(super = <superclass>)] mod C { ... }, an impl is translated in the following way.

Specifying the selector is not mandatory (if omitted, it is computed from the current camel-case/snake-case hybrid scheme we use, correctly handling trailing :).

Also, #[method] / #[method_id] are not necessary since we determine this from the method return type (looking for -> Id<...> or Result<Id<...>, ...>), although as with selectors it is possible to manually control this behavior. In that case you can specify #[objc(managed)] fn f(&self, args*) -> ... (without explicit retain semantics) or #[objc(managed = "init")] fn f(args*) -> ... (with explicit retain semantics).

For impl C, we translate methods fn(args*) -> ... { ... } as class methods, fn f(&self, args*) -> ... { ... } as instance methods, similar as for declare_class!. Methods fn(&self?, args*) -> ...; (which are not valid Rust syntax, but which we can parse) are handled as with extern_methods!.

For impl T for C, we translate the enclosed methods as protocol methods.

One choice I've been considering is splitting this behavior up a little more and using extern blocks along with mod, in the following sense.

For #[objc] mod C { ... } we would only allow a class struct and not a class type. Furthermore, we would no longer parse methods without bodies like fn f(...) -> ...; within impl items in the class mod.

Instead, to handle those cases, you would now write this:

#[objc(super = Delegate)]
unsafe extern "ObjC" {
    type Delegate;

    fn initWithTextField_andWebView(
        this: Option<Allocated<Self>>,
        text_field: &NSTextField,
        web_view: &WKWebView,
    ) -> Id<Self>;

    fn control_textView_doCommandBySelector(
        &self,
        _control: &NSControl,
        text_view: &NSTextView,
        command_selector: Sel,
    ) -> bool 
}

The obvious disadvantages to this approach are that it's maybe a little uglier (since we don't have impl C) and we'd probably still need an outer enclosing mod to handle protocol translations, since we also can't write impl P for C within extern.

Advantages are that it's arguably clearer what is happening semantically, specifically because we are using extern type here. It's also arguably easier to parse, since within extern, having type T; and fn f() -> ...; is valid syntax.

The latter part is not a huge issue, since in the case of syn, it handles those non-valid syntax cases as a raw TokenStream, it just requires re-implementing some of the parsing for those items by hand. But to be honest, I am already doing some of that in order to parse items within a class mod without backtracking (e.g., several items are ambiguous until after you parse attributes and visibility qualifiers).

This is also the approach that cxx and wasm-bindgen use with their proc-macros.

Actually, with cxx you have this:

#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type MultiBuf;

        // Functions implemented in Rust.
        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/blobstore.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type BlobstoreClient;

        // Functions implemented in C++.
        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}

where the stuff in extern "Rust" { ... } is used for generating header files for using Rust definitions from C++. AFAIK, we don't have an equivalent for that (and maybe it's out of scope), but it might be worth considering as a future option.

And there's also the part where cxx uses the include! directive in the extern "C++" block for generating bindings. Something that might be interesting for us to consider, if proc macros seem like the way to go, is making the header-translator functionality available in terms of macro invocations instead of requiring it to be run externally.

I think that's all I have to say about this for now. I didn't mention macros for static, fn, and enum, but I was planning on just re-using the #[objc] macro for that. It trivial to determine which item it is applied to, so it seemed to make sense to minimize the number of names we use for the macros. But maybe something other than #[objc] would be appropriate too.

Any thoughts or feedback on this? Does it make sense to split the functionality into extern even if it's more verbose?

@madsmtm madsmtm added enhancement New feature or request question Further information is requested A-objc2 Affects the `objc2`, `objc2-exception-helper` and/or `objc2-encode` crates labels Feb 13, 2023
@madsmtm
Copy link
Owner

madsmtm commented Feb 13, 2023

But at that time I didn't realize yet that we need to be able to parse the class struct and the class impl together in order to correctly define the ::class() method (because it registers the methods when first called).

Unfortunately, there is also no practical way (that I know of) to manage state across proc-macro invocations. So the only real obvious choice as an alternative is to place the respective struct and impl items within an enclosing item so the proc macro can work similarly to declare_class.

There is linkme, but that's a hack. But given that the runtime/ClassType::class requires it, I think it's better to just keep it in the same invocation than to try to split the macros up.

Which leads to the choice of using mod.

Given that we don't output this module, I think it violates the Rust API guideline that input syntax is evocative of the output (C-EVOCATIVE).

What's the problem with just continuing to use the existing declare_class!( ... ); syntax?
I assume the reason is that you want #[class] and #[declare_class]/#[objc] to look more similar?

Specifying the selector is not mandatory (if omitted, it is computed from the current camel-case/snake-case hybrid scheme we use, correctly handling trailing :).

I don't think we should try to support this, since we want to change the naming scheme in icrate anyhow, and in that case the feature will be basically useless, or at least more confusing than helping.

Also, #[method] / #[method_id] are not necessary since we determine this from the method return type (looking for -> Id<...> or Result<Id<...>, ...>), although as with selectors it is possible to manually control this behavior.

This is already possible to implement whether one uses proc-macros or not, but the reason I didn't do it is because it will only be a textual match; rc::Id<...> will fail, and that will be very confusing for the user!
If we want to get rid of #[method(...)] vs. #[method_id(...)], it should instead be done in the type-system (which is definitely possible, I didn't do it originally because I feared the create-a-const-and-compute-retain-semantics-from-selector would be prohibitively expensive when you didn't need it).

In that case you can specify #[objc(managed)] fn f(&self, args*) -> ... (without explicit retain semantics) or #[objc(managed = "init")] fn f(args*) -> ... (with explicit retain semantics).

I don't really want the user to be able to specify explicit retain semantics; if their use-case requires it, they can just define the method manually (see LAYERED_SAFETY.md for a bit more on the philosophy that we don't have to handle every use-case under the sun because the user can do it themselves).

Methods fn(&self?, args*) -> ...; (which are not valid Rust syntax, but which we can parse) are handled as with extern_methods!.

Now that we can use proc-macros to name the internal method something else, I'd rather just do something like:

// Input
#[method(test)]
pub fn test(&self) -> i32 {
    1
}

// Output
builder.add_method(sel!(test), Self::__test);
fn __test(&self, _cmd: Sel) -> i32 {
    1
}
pub fn test(&self) -> i32 {
    unsafe { msg_send![self, test] }
}

Advantages are that it's arguably clearer what is happening semantically, specifically because we are using extern type here. It's also arguably easier to parse, since within extern, having type T; and fn f() -> ...; is valid syntax.

Agreed, extern ABI { type T; } is also roughly what we'd output in the future, so that part indeed more correctly tells the user what is happening.

The latter part is not a huge issue, since in the case of syn, it handles those non-valid syntax cases as a raw TokenStream, it just requires re-implementing some of the parsing for those items by hand. But to be honest, I am already doing some of that in order to parse items within a class mod without backtracking (e.g., several items are ambiguous until after you parse attributes and visibility qualifiers).

Heh, yeah, no-one ever said this task would be easy - but I think macro implementation difficulty should not really be considered when trying to find ideal syntax.

Actually, with cxx you have this:

[snip]

where the stuff in extern "Rust" { ... } is used for generating header files for using Rust definitions from C++. AFAIK, we don't have an equivalent for that (and maybe it's out of scope), but it might be worth considering as a future option.

Yeah, perhaps it would indeed be smart to try to take ideas from cxx, it does seem like a lot of time has been spent there on making the macro easy to use and understand. However, I think (though I can't find it right now) that there's a somewhat competing project to cxx, which to me says that the design space is not exhausted, so it might also be interesting to see what they do (again, I wish I could find a reference).

Also, I would like to note, I think the cxx example is confusing! It is not at all clear to me what &self in put, tag and metadata refers to (BlobstoreClient, it turns out).

And there's also the part where cxx uses the include! directive in the extern "C++" block for generating bindings. Something that might be interesting for us to consider, if proc macros seem like the way to go, is making the header-translator functionality available in terms of macro invocations instead of requiring it to be run externally.

Yeah, except it somewhat conflicts with #345 (unless we start requiring clang and compiling a bunch of ClangImporter to be able to use objc2).

I didn't mention macros for static, fn, and enum,

I don't think those need to be implemented using a proc-macro, or even be pub at all.

re-using the #[objc] macro for that. It trivial to determine which item it is applied to, so it seemed to make sense to minimize the number of names we use for the macros. But maybe something other than #[objc] would be appropriate too.

At the very least, I think the name declare is important to have when we're actually declaring a class (or in the future, protocol).
Opinion-wise, I don't think the extern_class, extern_methods, declare_class names are really that bad, sure, you have to import a few more things, but on the other hand it's also much easier to document separately what each macro does. Perhaps we could consider #[objc2::declare] and #[objc2::extern] as alternatives?


Also, you should check out objrs, they have done some of the same work in this space.

@silvanshade
Copy link
Contributor Author

Which leads to the choice of using mod.

Given that we don't output this module, I think it violates the Rust API guideline that input syntax is evocative of the output (C-EVOCATIVE).

Hmm, that is a point to consider. I'm not entirely sure if I would agree that it violates those principles though.

Conceptually, a module is used as input, and everything contained is produced as output, it's just that the scope of everything in the module is imported afterwards, rendering the module redundant (except for grouping input).

Alternatively, we could leave the module in place in the output, but what purpose would that really serve?

What's the problem with just continuing to use the existing declare_class!( ... ); syntax? I assume the reason is that you want #[class] and #[declare_class]/#[objc] to look more similar?

There are really a few points here that I think make proc macros interesting to consider:

  1. I think that (after the initial groundwork), proc macros should be easier to understand and maintain. Perhaps it's just me, but I find it really difficult to read the current macro implementation (and indeed in a number of cases I've had to sort of recreate the excepted behavior by looking at the fully expanded output since I couldn't fully understand the call graph).

  2. There's also the issue of performance. It's still an open question to me how much of an impact on performance either approach will have, but I do know that in some cases (e.g., very large enums), the current approach is quite slow (although from the notes it seems to be using the tt-muncher approach you mentioned). Given the complexity of the entire macro collection, it seems likely to me that there's some non-trivial performance improvements to be had by switching to a more imperative, loop-based approach like what is possible with proc-macros.

  3. There's also the fact that it's generally easier to support more of the full rust syntax with syn (considering, e.g., corner cases like with generics and where clauses and other things that complicate the other approach). It's not necessarily that we will need or want to support all of that additional stuff, but just, it's easier to do so this way I think (and indeed I've noticed that as I've been porting the macros).

  4. Finally, we can have much better errors (both in terms of messages and in terms of location with Span).

I don't think the current approach is bad though, I mean it works quite well so far. Maybe in the end it's better to just keep it that way, I don't know.

Specifying the selector is not mandatory (if omitted, it is computed from the current camel-case/snake-case hybrid scheme we use, correctly handling trailing :).

I don't think we should try to support this, since we #284, and in that case the feature will be basically useless, or at least more confusing than helping.

True, but maybe it would still be useful for people who are using it for their own code outside of icrate? It wasn't hard to implement though so either way I don't really have a strong opinion about it.

Also, #[method] / #[method_id] are not necessary since we determine this from the method return type (looking for -> Id<...> or Result<Id<...>, ...>), although as with selectors it is possible to manually control this behavior.

This is already possible to implement whether one uses proc-macros or not, but the reason I didn't do it is because it will only be a textual match; rc::Id<...> will fail, and that will be very confusing for the user!
If we want to get rid of #[method(...)] vs. #[method_id(...)], it should instead be done in the type-system (which is definitely possible, I didn't do it originally because I feared the create-a-const-and-compute-retain-semantics-from-selector would be prohibitively expensive when you didn't need it).

Yeah, I eventually realized that what you describe would be an issue, but I thought I should at least try to implement it as an experiment. I suppose my impression (which wasn't tested) was that the error message from a hypothetical case like that would still be clear enough that the user would have some idea of what was happening.

But I certainly agree it's a hack and that handling it through the type system would be far better if feasible. This is also probably the most complex part of the macro implementation, so removing it would be nice.

In that case you can specify #[objc(managed)] fn f(&self, args*) -> ... (without explicit retain semantics) or #[objc(managed = "init")] fn f(args*) -> ... (with explicit retain semantics).

I don't really want the user to be able to specify explicit retain semantics; if their use-case requires it, they can just define the method manually (see LAYERED_SAFETY.md for a bit more on the philosophy that we don't have to handle every use-case under the sun because the user can do it themselves).

Okay, that seems reasonable to me.

Also, I would like to note, I think the cxx example is confusing! It is not at all clear to me what &self in put, tag and metadata refers to (BlobstoreClient, it turns out).

I actually agree that that example is a little confusing. The cxx docs are clearer on the use of self:

Any signature having a self parameter (the Rust name for C++'s this) is considered a method / non-static member function. If there is only one type in the surrounding extern block, it'll be a method of that type. If there is more than one type, you can disambiguate which one a method belongs to by writing self: &BlobstoreClient in the argument list.

Yeah, except it somewhat conflicts with #345 (unless we start requiring clang and compiling a bunch of ClangImporter to be able to use objc2).

Something like the latter part is what I was imagining long term I guess.

Perhaps we could consider #[objc2::declare] and #[objc2::extern] as alternatives?

Yeah, something like that could work too. The point about being more descriptive is probably a good idea.

Also, you should check out objrs, they have done some of the same work in this space.

That looks interesting. The macro design and grammar looks nice but I don't really understand what the implementation is doing (there's a lot of low-level stuff in there). Also, they seem to manage to decouple macro invocations for the class struct and class impl so I'm assuming they must be constructing and/or representing the class internals differently somehow. Any ideas about that? I'm assuming we can't really manage to do that still.

@madsmtm
Copy link
Owner

madsmtm commented Feb 13, 2023

Which leads to the choice of using mod.

Given that we don't output this module, I think it violates the Rust API guideline that input syntax is evocative of the output (C-EVOCATIVE).

Hmm, that is a point to consider. I'm not entirely sure if I would agree that it violates those principles though.

Conceptually, a module is used as input, and everything contained is produced as output, it's just that the scope of everything in the module is imported afterwards, rendering the module redundant (except for grouping input).

Alternatively, we could leave the module in place in the output, but what purpose would that really serve?

Well, we could probably just wrap things in extern "Objective-C" { ... } instead then? I think the syntax I'd prefer then would likely be something like:

// In user-code:

#[objc2::declare]
unsafe extern "Objective-C" {
    #[super = NSObject]
    // #[inherits = ...]
    #[name = "MyApplicationDelegate"]
    struct Delegate {
        text_field: IvarDrop<Id<NSTextField>>,
        web_view: IvarDrop<Id<WKWebView>>,
    }

    impl Delegate {
        #[method(initWithTextField:andWebView:)]
        unsafe fn initWithTextField_andWebView(
            this: Allocated<Self>,
            text_field: *mut NSTextField,
            web_view: *mut WKWebView,
        ) -> Option<&mut Self> {
            let this = unsafe { msg_send![super(this), init]? };
            Ivar::write(&mut this.text_field, unsafe { Id::retain(text_field) }?);
            Ivar::write(&mut this.web_view, unsafe { Id::retain(web_view) }?);
            Some(this)
        }
    }
    
    unsafe impl NSApplicationDelegate for Delegate {
        // ...
    }
}

// In `icrate`:

#[objc2::bridge]
unsafe extern "Objective-C" {
    /// Doc comment
    #[derive(PartialEq, Eq, Hash, Debug)]
    #[super = NSObject]
    // #[name = "NSString"]
    pub type NSString;

    impl NSString {
        #[method(length)]
        fn length() -> usize;
    }

    /// Doc comment
    // #[name = "NSLocking"]
    pub unsafe trait NSLocking {
        #[method(lock)]
        unsafe fn lock(&self);

        #[method(unlock)]
        unsafe fn unlock(&self);
    }
}

(Assuming that works with rustfmt).

What's the problem with just continuing to use the existing declare_class!( ... ); syntax? I assume the reason is that you want #[class] and #[declare_class]/#[objc] to look more similar?

There are really a few points here that I think make proc macros interesting to consider:

[snip]

Sorry, I meant: Convert declare_class to a proc-macro, but simply let it stay as a function-like macro?

Also, #[method] / #[method_id] are not necessary since we determine this from the method return type (looking for -> Id<...> or Result<Id<...>, ...>), although as with selectors it is possible to manually control this behavior.

This is already possible to implement whether one uses proc-macros or not, but the reason I didn't do it is because it will only be a textual match; rc::Id<...> will fail, and that will be very confusing for the user!
If we want to get rid of #[method(...)] vs. #[method_id(...)], it should instead be done in the type-system (which is definitely possible, I didn't do it originally because I feared the create-a-const-and-compute-retain-semantics-from-selector would be prohibitively expensive when you didn't need it).

Yeah, I eventually realized that what you describe would be an issue, but I thought I should at least try to implement it as an experiment. I suppose my impression (which wasn't tested) was that the error message from a hypothetical case like that would still be clear enough that the user would have some idea of what was happening.

But I certainly agree it's a hack and that handling it through the type system would be far better if feasible. This is also probably the most complex part of the macro implementation, so removing it would be nice.

That's totally cool, and a reasonable point - I'd be fine with either way in the initial implementation. Later on, once we actually have some knowledge of the compile-time cost it would impose, I could do the type-system parts for you (or mentor you in it).

Also, I would like to note, I think the cxx example is confusing! It is not at all clear to me what &self in put, tag and metadata refers to (BlobstoreClient, it turns out).

I actually agree that that example is a little confusing. The cxx docs are clearer on the use of self:

Ah, okay.

Also, you should check out objrs, they have done some of the same work in this space.

That looks interesting. The macro design and grammar looks nice but I don't really understand what the implementation is doing (there's a lot of low-level stuff in there). Also, they seem to manage to decouple macro invocations for the class struct and class impl so I'm assuming they must be constructing and/or representing the class internals differently somehow. Any ideas about that? I'm assuming we can't really manage to do that still.

Yeah, their implementation is doing things with statics with special names that the linker understand, which is why they can split it up into two macro invocations. There's various issues with that though (apart from implementation difficulty), which is why we're doing it via. the runtime instead.

@madsmtm
Copy link
Owner

madsmtm commented Feb 13, 2023

Also, noting the general principles I try to follow when designing macros (which I've already somewhat stated above, but just for completeness):

  • The C-EVOCATIVE API guideline (which is, yeah, a bit fluffy, but still)
  • There must be an unsafe marker somewhere that the user can add a // SAFETY: comment to.
  • rustfmt must be able to format things.

Also, I've explicitly opted for having the user manually specify that they implement ClassType/ProtocolType, since I think it made it clearer what the macro was doing - but I can be convinced otherwise here.

@silvanshade
Copy link
Contributor Author

silvanshade commented Feb 14, 2023

Alternatively, we could leave the module in place in the output, but what purpose would that really serve?

Well, we could probably just wrap things in extern "Objective-C" { ... } instead then?

Unfortunately, I don't think we can get that to work.

The problem is that the items allowed in extern is quite limited. See here.

There is a verbatim item there, which allows us to at least represent arbitrary syntax (if we produce the item ourselves, instead of parsing it), but I don't think we can actually parse it that way as input to the macro without getting an error from the compiler first.

You can try some examples with ast explorer, which uses syn, to see what happens, but that example doesn't seem to parse.

Those few examples I was using (e.g., fn and type without a body) seem to be corner cases that are actually allowed by the compiler (and parsed as verbatim by syn) but later throw an error if left in place (we transform them before that happens).

I think we're probably stuck with mod one way or another as long as we need to couple struct with the impl within the same invocation somehow.

@madsmtm
Copy link
Owner

madsmtm commented Feb 14, 2023

Ah yeah, you're right, I thought proc-macro attributes allowed arbitrary token streams as their input.

In that case, I think keeping declare_class!( ... ); as a function-like #[proc_macro] is preferable - again, I would quite like to avoid the mod ... if we don't actually output a module.

Perhaps I could be persuaded to instead allow something like mod __ { ... } though? Or maybe we could instead write a feature-request for the compiler to accept extern "ABI" { any token tree } until after attribute macros have been resolved?

@silvanshade
Copy link
Contributor Author

Oh, I guess I hadn't considered a function-like proc macro replacement for declare_class! (maybe that's what you meant earlier and I misunderstood). That could certainly work.

I guess the disadvantage to that approach, versus attributes, is that it looks less natural if you want to pass arguments to it beyond the syntactic input.

Perhaps I could be persuaded to instead allow something like mod __ { ... } though?

This could also work. I think I tried that with only a single _ (which doesn't work) and didn't realize that __ would parse as an identifier (but I just checked, and it does). This is easy to change from the current behavior so I might as well switch it to this for now.

Or maybe we could instead write a feature-request for the compiler to accept extern "ABI" { any token tree } until after attribute macros have been resolved?

That would be interesting, and probably useful for other projects too, but I'm not sure what the chances are that such a change would be approved by the compiler team. If you think it's worth exploring though I wouldn't be against it.

@silvanshade
Copy link
Contributor Author

Given the recent discussion with regard to how to group the items and the issues with both mod and extern, I decided to take a closer look at linkme.

It seems we could actually use that for building the ::class() method, without too much trouble, but it would require some modifications to the ClassBuilder definitions (or some additional definitions for this special case).

I haven't tried a full implementation, but based on this small example which compiles, it seems that it should be possible:

#![deny(unsafe_op_in_unsafe_fn)]

use icrate::{
    objc2::{
        declare::{Ivar, IvarDrop},
        declare_class,
        msg_send,
        rc::Id,
        ClassType,
    },
    AppKit::NSTextField,
    Foundation::NSObject,
    WebKit::WKWebView,
};
use linkme::distributed_slice;

pub trait DynMethodImplementation {
    fn callee(&self) -> objc2::encode::Encoding;
    fn ret(&self) -> objc2::encode::Encoding;
    fn args(&self) -> &'static [objc2::encode::Encoding];
    fn __imp(self) -> objc2::runtime::Imp;
}

impl<Callee, Ret, Args, F> DynMethodImplementation for F
where
    F: objc2::declare::MethodImplementation<Callee = Callee, Ret = Ret, Args = Args>,
    Callee: objc2::RefEncode + ?Sized,
    Ret: objc2::encode::__unstable::EncodeReturn,
    Args: objc2::encode::__unstable::EncodeArguments,
{
    fn callee(&self) -> objc2::encode::Encoding {
        F::Callee::ENCODING_REF
    }

    fn ret(&self) -> objc2::encode::Encoding {
        F::Ret::ENCODING_RETURN
    }

    fn args(&self) -> &'static [objc2::encode::Encoding] {
        F::Args::ENCODINGS
    }

    fn __imp(self) -> objc2::runtime::Imp {
        F::__imp(self)
    }
}

#[distributed_slice]
pub static DELEGATE_INSTANCE_METHODS: [&(dyn DynMethodImplementation + Sync)] = [..];

#[distributed_slice(DELEGATE_INSTANCE_METHODS)]
static DELEGATE_INSTANCE_METHOD_0: &(dyn DynMethodImplementation + Sync) =
    &(Delegate::__init_withTextField_andWebView as unsafe extern fn(_, _, _, _) -> _);

declare_class!(
    struct Delegate {
        text_field: IvarDrop<Id<NSTextField>, "_text_field">,
        web_view: IvarDrop<Id<WKWebView>, "_web_view">,
    }
    mod ivars;

    unsafe impl ClassType for Delegate {
        type Super = NSObject;

        const NAME: &'static str = "Delegate";
    }

    unsafe impl Delegate {
        #[method(initWithTextField:andWebView:)]
        #[allow(non_snake_case)]
        unsafe fn __init_withTextField_andWebView(
            self: &mut Self,
            text_field: *mut NSTextField,
            web_view: *mut WKWebView,
        ) -> Option<&mut Self> {
            let this: Option<&mut Self> = msg_send![super(self), init];
            let this = this?;
            Ivar::write(&mut this.text_field, unsafe { Id::retain(text_field) }?);
            Ivar::write(&mut this.web_view, unsafe { Id::retain(web_view) }?);
            Some(this)
        }
    }
);

fn main() {
}

Thoughts on this approach?

I think it would really nice from an ergonomics point of view if we could allow decoupling these invocations like how the objrs interface works (without the complexity of their underlying implementation).

@madsmtm
Copy link
Owner

madsmtm commented Feb 19, 2023

Hmm, in my eyes the points against linkme are primarily:

  1. It's an extra dependency that, while fairly popular, still seems like it hasn't seen that much real-world use yet.
  2. It feels a bit too "magical" - if Rust had first-class support for distributed slices, I wouldn't bat an eye, but it doesn't, and that makes me a bit concerned over how robust / future-proof it is?
  3. Related to the above: I'm not sure of the soundness of it? At the very least, I think I'd like to get some eyes on it from the people over at unsafe code guidelines first.

So I think I'd rather avoid it, and keep the declaration and the implementations together (e.g. inside #[objc2::declare] mod __ { ... } or declare_class!( ... )). But that may also just be me being overly cautious.

@silvanshade
Copy link
Contributor Author

Hmm, in my eyes the points against linkme are primarily:

  1. It's an extra dependency that, while fairly popular, still seems like it hasn't seen that much real-world use yet.

That's a fair point. But it is a little bit niche though, compared to say, serde or something of that nature.

  1. It feels a bit too "magical" - if Rust had first-class support for distributed slices, I wouldn't bat an eye, but it doesn't, and that makes me a bit concerned over how robust / future-proof it is?

I think robustness or future-proof-ness is probably the most convincing argument against it.

  1. Related to the above: I'm not sure of the soundness of it? At the very least, I think I'd like to get some eyes on it from the people over at unsafe code guidelines first.

There doesn't seem to be much (or really any) discussion about it's soundness that I can find. The author has a pretty good track record with developing robust crates though, for whatever that counts for. But maybe it would just be best to ask about it on the linkme repo as a starting point?

So I think I'd rather avoid it, and keep the declaration and the implementations together (e.g. inside #[objc2::declare] mod __ { ... } or declare_class!( ... )). But that may also just be me being overly cautious.

Alright, although since I've already started looking at that, I'd like to try to finish putting together a working implementation to see if it's even feasible, if only for my own curiosity. If it turns out not to be a good idea, that's okay.

But maybe another possibility (if it does work) would be to allow both approaches: we could always gate a linkme based implementation behind an (unstable) feature to discourage it's use unless/until it's deemed more appropriate.

@silvanshade
Copy link
Contributor Author

By the way, I notice that in the docs there is this statement (with regard to safety):

The implementation is based on `link_section` attributes and platform-specific linker support. It does not involve life-before-main or any other runtime initialization on any platform. This is a zero-cost safe abstraction that operates entirely during compilation and linking.

So the claim is at least made that it is safe. Also, since it does operate only during compilation and linking, I'm not sure how hypothetical problematic unsafety (in the sense of UB) could actually manifest.

@madsmtm
Copy link
Owner

madsmtm commented Nov 15, 2023

@madsmtm madsmtm added this to the Procedural macros milestone Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-objc2 Affects the `objc2`, `objc2-exception-helper` and/or `objc2-encode` crates enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants