Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make safe WindowHandle s to imilictly ref-count #158

Closed
kchibisov opened this issue Jan 8, 2024 · 27 comments
Closed

Make safe WindowHandle s to imilictly ref-count #158

kchibisov opened this issue Jan 8, 2024 · 27 comments

Comments

@kchibisov
Copy link
Member

The current handle can only work safely when the window is either end up in OnceCell, done Box::leak, or send in some sort of wrapper to other thread. However the most common way write multi window is doing something like that

struct State {
    windows: HashMap<WindowId, MyWindow>,
}

struct MyWindow {
    window: Window 
    surface: Surface, // depends on Window. Explicit bug here.
}

Such state is being owned by EventLoop or defined as part of the captured value in the closure, so it can't have correct lifetime. Rust doesn't allow to have Surface depend on Window and be stored in the same struct, but just in the right order.

This could can work, but with e.g. GBM/drm stuff it may segfault, for example, see Smithay/smithay#1102 as an example.

One solution could be to have a WindowHandle even owned one somehow retain the original object so it can't drop in the wrong order.

This can work really nice when the Window itself is not owned by the user, but rather by the library, so to remove the Window they'll have to call failing destructor and the library could just deny you dropping it because you still hold reference to it.

Such design doesn't really need a lifetime, since you just can't drop the original object. Yes it does require the library serving the handle to retain the resources, but it's safe.

Even if the handle gets invalidated, it's often safe to use, so it's not an issue and irrelevant to the suggestion.

Some sketch of the API and semantics based on the rust-windowing/winit#3367 .

pub struct State {
    surface: Option<glutin::Surface>,
}

impl Application for State {
    fn new_events(&mut self, loop_handle: &mut dyn EventLoopHandle, start_cause: StartCause) {
        let _ = loop_handle.create_window(&Default::default());
    }

    fn about_to_wait(&mut self, _: &mut dyn EventLoopHandle) {
        println!("About to wait");
    }

    fn loop_exiting(&mut self, _: &mut dyn EventLoopHandle) {
        println!("Exiting the loop");
    }
}

impl ApplicationWindow for State {
    fn created(&mut self, loop_handle: &mut dyn EventLoopHandle, window_id: WindowId) {
        let window = loop_handle.get_window(window_id).unwrap();
        // This retain the original window inside the owned event loop.
        let handle = window.window_handle();
        // This call is actually safe now, because it can't outlive the window no matter how you
        // write it, since iti retains the original one.
        self.surface = Some(glutin::Surface::new(&window).expect("failed to create glutin surface"));
    }

    fn close_requested(&mut self, loop_handle: &mut dyn EventLoopHandle, window_id: WindowId) {
        // Surface retained the `Window` inside the winit, so it can't drop it and will throw
        // an error until you actually destroy it.
        //self.surface = None;

        // The window got a close request. Try to close it without dropping the `Surface`.
        // This line will panic.
        loop_handle.remove_window(&window_id).except("failed to destroy widndow because it has
                                                     handles holding it");
    }

    fn scale_factor_changed(
        &mut self,
        _: &mut dyn EventLoopHandle,
        _: WindowId,
        scale_factor: f64,
    ) {
        println!("New scale factor {scale_factor}");
    }

    fn redraw_requested(&mut self, loop_handle: &mut dyn EventLoopHandle, window_id: WindowId) {
        let (window, surface) = match (loop_handle.get_window(window_id), self.surface.as_mut()) {
            (Some(window), Some(surface)) => (window, surface),
            _ => return,
        };

        surface.swap_buffers();
    }

    fn destroyed(&mut self, loop_handle: &mut dyn EventLoopHandle, _: WindowId) {
        if loop_handle.num_windows() == 0 {
            loop_handle.exit();
        }
    }
}

fn main() {
    let mut event_loop = <EventLoop as EventLoopRequests<State>>::new().unwrap();
    let context =
        unsafe { Context::new(&event_loop).expect("failed to create softbuffer context") };
    let state = State { context, surface: None };
 
    let proxy = EventLoopRequests::<State>::proxy(&mut event_loop);

    // Test out the proxy.
    std::thread::spawn(move || loop {
        proxy.wakeup();
        std::thread::sleep(Duration::from_millis(500));
    });

    event_loop.run(state);
}

If we state that to safely impl the window must be dropped only when it has only a single reference(from itself) it means that it can't also outlive the event loop, since it'll continue to run until all the windows are closed.

So the end API could be safe in all cases, unlike the current one where you're forced to store data in a special way due to lifetime.

Ref rust-windowing/winit#3365
Ref rust-windowing/winit#3317

@kchibisov
Copy link
Member Author

The display stuff is not that prone, but there's alacritty/alacritty#7146 at least.

With display you can create type depending on it before actually running the loop, so it'll have the current lifetime of event loop, however it'll still help if there wasn't a lifetime at all.

@kchibisov
Copy link
Member Author

Alternative is for apps to require Arc<WindowHandle> and for provider to provide it as well, but I don't see why we should make this all safety opt-in, if users want to opt-out they can use RawWindowHandle as before and convert the unsafe fn into_raw removing the retained part.

@kchibisov
Copy link
Member Author

Other point for that is that libraries based on raw-window-handle can adopt to that with just a single line change basically, since it works the same as RawWindowHandle but just retains the original window ensuring that it won't be dropped, thus the users can easily adopt instead of using the lifetime through out the library, which doesn't work as soon as the Surface is send from the thread the Window is on.

And keep in mind that Window is generally not Send however Surface, like EGLSurface or wgpu one is.

@notgull
Copy link
Member

notgull commented Jan 8, 2024

I haven't read the entire Matrix thread yet, so I don't fully understand the context behind this, but I think there's a way around this without Arc<Window>. For instance, take softbuffer's Surface.

I use Surface<Arc<Window>> in examples because it's easy. However you could also very easily use Surface<Window>, and then expose a get_ref() function to get a reference to the underlying Window. I haven't implemented this yet in softbuffer because I forgot to, but this pattern could be used to get a reference to the underlying window. As an analogy this is already done in async_io::Async.

I'm against adding any kind of window closing code to this library. There's no common way to destroy windows on all platforms, it would add a lot of dependencies and it probably wouldn't work anyways.

I know there was talk of adding a lifetime to Window in winit, is this the current plan for winit-next? If so I'd be against it; it would make usage of Window extremely fragile.

@kchibisov
Copy link
Member Author

@notgull you'll make it depend on winit actually and it won't really work in MT context because you can Send surface but not Window. There's nothing interesting in matrix thread actually.

asy. However you could also very easily use Surface, and then expose a get_ref() function to get a reference to the underlying Window. I haven't implemented this yet in softbuffer because I forgot to, but this pattern could be used to get a reference to the underlying window. As an analogy this is already done in async_io::Async.

This is more restrictive to what I propose and I'm not sure how easy it is to use a thing like that.

I'm against adding any kind of window closing code to this library. There's no common way to destroy windows on all platforms, it would add a lot of dependencies and it probably wouldn't work anyways.

This is not about closing either, it's just an indicator that you have handles alive for the library. Consumers don't need to even know about that, they should just store the WindowHandle for as long as it's valid, however e.g. Winit could check that and prevent window from being destroyed unless all the handles are gone.

So the end code will be safe and the same as it was using the RawWindowHandle form 0.5.0, e.g. zero changes to glutin other than making the calls safe.

I know there was talk of adding a lifetime to Window in winit, is this the current plan for winit-next? If so I'd be against it; it would make usage of Window extremely fragile.

No, I won't add it.

Besides, keep in mind that we don't solve innert problems here because they are not an actual problems, the problem is dropping after the window and inability to use the current handle in the case where you can't pass the entire window to some other lib, since it's just makes it pretty ugly.

Like

#[derive(Clone)]
struct RefCountedHandle {
    window_handle: Arc<RawWindowHandle>, // It's still not send/sync, but it's ref-counted because it can be used on other thread.
}

Maybe some other prototype will be better, but the thing is that winit will know that you have something still using the handle when it's dropping its window or exiting the loop, thus it'll be able to force you drop the window and not let you out unless you clear resources or by use unsafe to bypass the check.

In the end you can port e.g. glutin and wgpu to that with literally zero changes and remove all unsafe.

@notgull
Copy link
Member

notgull commented Jan 8, 2024

I'm sorry, I'm having trouble following with what you are proposing. Is the idea to add a reference counted window handle with an indicator as to whether or not the window is still alive? And this indicator is shared between winit and the window handle to synchronize this state change?

In my opinion Rust offers us the tools to work with this scenario and we should try to use these tools first instead of trying to invent our own bespoke primitives. I don't see how composition is overly restrictive. For your MySurface example above, you could do:

struct MySurface {
    inner: Surface<Window>
}

impl MySurface {
    fn window(&self) -> &Window {
        self.inner.get_ref()
    }

    fn sw_surface(&self) -> &Surface<Window> {
        &self.inner
    }
}

You can access the window just like you would in your original code. The only downside is that you don't get &mut access, but that's unsound anyways.

@kchibisov
Copy link
Member Author

But I need &mut access though on the Window so I don't start to write tons of RefCells. I'm aware of the approach you suggest, however it makes other library own the Window where it should belong either to user or winit in the first place. Window is also not Send unlike the Surface and by doing what you want you make it Send automatically, since you consume it?

Like I know that you can do what you suggest as alternative, it's just doesn't really work in my opinion, because the constraints a different, and it requires a lot of changes to all the libraries, where with my approach you just replace RawHandle with RefCountedHandle and you're safe.

notgull added a commit to rust-windowing/softbuffer that referenced this issue Jan 10, 2024
Adds the `get_ref` and `get_mut` functions, which can be used to get
references (mutable or otherwise) to the underlying window handle.

cc rust-windowing/raw-window-handle#158 (comment)

Signed-off-by: John Nunley <[email protected]>
@notgull
Copy link
Member

notgull commented Jan 10, 2024

I don't think that "friction with the current GUI ecosystem" should be evaluated as a constraint for this design. The prior way of handling window handles was the equivalent of passing raw pointers to functions. Normally I'd be a fan of the path of least harm, but unfortunately many of the crates that have built their APIs on top of this unsoundness.

The solution is to break the APIs so they actually pass around real-life windows instead of pointers. The solution shouldn't be to staple on refcounting to the existing system. I have serious doubts that this can be implemented without introducing a deadlock, leak or other bug to this system. Composition is just types inside of other types, so it's bug free right off the bat.

We should be taking a lesson from the Rust HTTP client fiasco here. My vote is to use the tools the rest of the ecosystem uses, in contrast to settling further into a system that's unsound.

But I need &mut access though on the Window so I don't start to write tons of RefCells

This can happen with the current system. Most graphics surfaces are implemented like this:

struct MyGraphicsSurface<W> {
    window: W,
    gorey_system_bits: Foobar
}

So this can just be done:

impl<W> MyGraphicsSurface<W> {
    fn get_ref(&self) -> &W {
        &self.window
    }

    fn get_mut(&mut self) -> &mut W {
        &mut self.window
    }
}

See rust-windowing/softbuffer#193 for an example of this in practice.


I'm requesting @grovesNL and any other interested wgpu maintainers to provide feedback here. It was previously stated that wgpu supported this proposal as it would lead to less friction in wgpu.

I understand why wgpu would be averse to adding display and window generics to Surface, as it would require changes to downstream code. However the current system is unsafe, and the proposed solution introduces technical debt that I would be uncomfortable having in large projects like winit, wgpu and glutin. On the other hand, the composition pattern is already used in a number of other rust crates (ref async_io::Async) and lends itself well to existing patterns.

@Lokathor
Copy link
Contributor

Just popping in to second the notion that breaks that improve soundness are absolutely worth considering strongly.

@grovesNL
Copy link
Contributor

grovesNL commented Jan 10, 2024

I'm not sure if I understand the soundness problem with the existing API, but in general wgpu would prefer approaches that doesn't require a generic parameter or lifetime for windows if possible (instead preferring explicitly or implicitly refcounted window handles).

Edit: actually I misspoke - looks like wgpu already has the lifetime generic (I missed that change) so this might be ok https://github.com/gfx-rs/wgpu/blob/f27bb443c16634596618e9b2969ee6822cc7a696/wgpu/src/lib.rs#L366

@kchibisov
Copy link
Member Author

kchibisov commented Jan 10, 2024

There's no soundness issues with what we have now, mostly convenience and boilerplate kind of thing. I don't like the current approach because generally the thing should be ref-counted and winit has its handles ref-counted already. However, to make things safe, we'd explicitly need to wrap into ref-counted object and libraries to be generic over T: HasWindowHandle.

The borrowed lifetime approach doesn't work, only Owned one.

Like in the end both work the same, I just propose to move the ref-counting into raw-window-handle's safe handle type, because it's how it's anyway and what is expected from the windowing libraries.

@Lokathor
Copy link
Contributor

Since no one else has spoken up on behalf of SDL yet in this thread maybe I should add that this whole thing would probably be a pain in the butt to work in with an SDL based window situation.

@kchibisov
Copy link
Member Author

I never worked with SDL so can't really help here, as I said, I don't really mind the way things are right now, I just think that given that it's ref-counted anyway (you need to do that to ensure to keep handle alive and loop running, otherwise you'll invalidate and break users) it could make sense to ref-count in the handle itself so the library giving the WindowHandle can track ref counting based on the handle and release the window once it's ref count is 1.

@Lokathor
Copy link
Contributor

Well SDL, the C API, lets you

  • create and destroy windows (which are not Send)
  • check the raw handle of any window

This means that the FFI bindings later to SDL can currently basically support this crate without breaking the general rule that "bindings crates should be just bindings and avoid original code/abstractions".

However, if the ref counting were mandatory then that would require the ffi bindings to not support directly getting the window handle at all, and the "rusty" safe layer of SDL would need to keep an extra ref counted thing on top of other stuff it's probably already ref counting (eg: having an SDL window open keeps the SDL api open, this is usually handled in rust with a cluster of Arc values).

In total not an impossible blocker, but as someone who works with SDL mostly and not really with winit because I always think that winit is absurdly complex, this doesn't seem like an obvious win.

@kchibisov
Copy link
Member Author

I'm not touching Raw handle, but WindowHandle though, SDL is basically Raw handle, safe wrapper for SDL could add extra as safe wrappers for other stuff in general. The current API doesn't really work for SDL either I think because you can't build the safe handle? And you also need to linger SDL windows if you do multithreaded rendering.

Again, winit has nothing to do with all of that, it'll work fine with whatever and provide safe handles.

@Lokathor
Copy link
Contributor

Ah, I follow now. Then don't worry about everything I just said.

@madsmtm
Copy link
Member

madsmtm commented Jan 10, 2024

If we added a reference-counting handle like OwnedWindowHandle, then raw-window-handle would mirror std::os::fd quite closely:

raw-window-handle std::os::fd
WindowHandle<'_> BorrowedFd<'_>
OwnedWindowHandle OwnedFd
RawWindowHandle RawFd
trait HasWindowHandle trait AsFd

Is that what you're proposing when you say that WindowHandle should ref-count?


In that past, we have decided against having that in this crate, since it would require a lot of dependencies, and a major selling point for this crate is to avoid such dependencies!

@kchibisov
Copy link
Member Author

@madsmtm yeah, something like that would be fine by me. Right now the owned handle is exposed via T: HasWindowHandle.

In that past, we have decided against having that in this crate, since it would require a lot of dependencies, and a major selling point for this crate is to avoid such dependencies!

std::sync::Arc is coming from a stdlib, like you already have a way to do ref-counting and track the count on Arc, you don't need to bring anything else.

@notgull
Copy link
Member

notgull commented Jan 11, 2024

@grovesNL

I'm not sure if I understand the soundness problem with the existing API,

The problem with the existing API (by this I mean the API that the latest released version of wgpu uses) is that it's unsafe and requires the user to be aware of the lifetimes of the respective windows and surfaces. This is unsound for the same reason passing around raw pointers is unsound in Rust.

The API that wgpu master uses, boxing up the window reference, is alright but not ideal. This requires the window to be Send and Sync, which doesn't work well with the planned design that @kchibisov has. So I think the solution for wgpu would be to replace this line with a generic W: HasWindowHandle. That way it's not constrained to being Send or Sync.

but in general wgpu would prefer approaches that doesn't require a generic parameter or lifetime for windows if possible (instead preferring explicitly or implicitly refcounted window handles).

Can you please elaborate on this? If it's a matter of friction, I think investing into a safer system is better than stapling reference counting on an unsafe system and creating technical debt.


@kchibisov

I don't like the current approach because generally the thing should be ref-counted and winit has its handles ref-counted already.

Not all windowing systems are winit. I'm not too familiar with glazier's architecture, so please correct me if I'm wrong, but for one-window applications like glazier there's little to no reference counting. While winit might be structured like this internally I'm not sure other windowing systems are.

However, to make things safe, we'd explicitly need to wrap into ref-counted object and libraries to be generic over T: HasWindowHandle.

The borrowed lifetime approach doesn't work, only Owned one.

This approach demonstrably works in both softbuffer and glutin.

I think we should be making the GUI ecosystem simpler by streamlining it and adopting Rusty idioms, rather than trying to force in new tricks for short-term wins in patch noise.

@kchibisov
Copy link
Member Author

The API that wgpu master uses, boxing up the window reference, is alright but not ideal. This requires the window to be Send and Sync, which doesn't work well with the planned design that @kchibisov has. So I think the solution for wgpu would be to replace this line with a generic W: HasWindowHandle. That way it's not constrained to being Send or Sync.

Please, read the rust-windowing/winit#3367 (comment) . There will be Send + Sync handle.

Can you please elaborate on this? If it's a matter of friction, I think investing into a safer system is better than stapling reference counting on an unsafe system and creating technical debt.

What I suggest is as safe as what you provide, since I propose to create OwnedFd and and RefCount. Fd is ref-counted on the kernel side.

Not all windowing systems are winit. I'm not too familiar with glazier's architecture, so please correct me if I'm wrong, but for one-window applications like glazier there's little to no reference counting. While winit might be structured like this internally I'm not sure other windowing systems are.

Refcount of 1 is also a refcount, if you just Own the window. In general, see @madsmtm reply on what place I target.

This approach demonstrably works in both softbuffer and glutin.

Yes, but it's a lot of work and it's unsound in e.g. winit because it doesn't linger window because of not doing proper ref-counting. Like what you suggest is already unsound if you do naive impl and not do ref-count.

@notgull
Copy link
Member

notgull commented Jan 11, 2024

What I suggest is as safe as what you provide, since I propose to create OwnedFd and and RefCount. Fd is ref-counted on the kernel side.

I'll have to see it to believe it. I'm worried that needing to coordinate across RWH, winit and graphics APIs to make sure something is safe will be a lot of effort. Also, it sounds nearly impossible to verify without ecosystem-wide crater tests.

Refcount of 1 is also a refcount, if you just Own the window. In general, see @madsmtm reply on what place I target.

My main point is that we're specializing the API for winit in a way that will make it more difficult for other GUI crates to adopt. As @Lokathor said it would probably be difficult to integrate into Rust's SDL bindings.

Yes, but it's a lot of work

I'll admit this, but it's work in favor of a system that works better with Rust's existing idioms than to keep with our current flawed design.

it's unsound in e.g. winit because it doesn't linger window because of not doing proper ref-counting. Like what you suggest is already unsound if you do naive impl and not do ref-count.

What platforms does this apply to?The platforms that I am familiar with (X11, Windows) can cause slight annoyances but not anything unsound.

@kchibisov
Copy link
Member Author

What platforms does this apply to?The platforms that I am familiar with (X11, Windows) can cause slight annoyances but not anything unsound.

rust-windowing/winit#3317 all? If you drop event loop before the Window issuing any requests on it will be crash, and in MT setup you need to linger the Window and keep event loop around until you drop all of the handles, display handles.

But again, it's not needed for winit at all, because the way new API work will be fine with whatever we have now, it's just libraries should be more aware of such a thing.

@grovesNL
Copy link
Contributor

@notgull

Can you please elaborate on this? If it's a matter of friction, I think investing into a safer system is better than stapling reference counting on an unsafe system and creating technical debt.

Mostly ergonomics - the extra type parameter adds some extra friction (especially if the type has to be written out anywhere, like when window and surface creation might happen in different places, or surface is held in a struct). It's not a big deal though, it just makes things slightly more difficult which feels like it could be avoidable.

I might be forgetting why raw window handle prefers borrowed handles in the first place, but it feels like it could be reasonable to always pass owned raw window handles between crates (even removing the lifetime if possible) assuming we can reliably refcount inside that handle. I don't know all the nuances about how this needs to work on each platform, but some platforms already internally refcount anyway, for example:

  • on WebAssembly this means cloning the JsValue to bump the refcount
  • on macOS this means bumping the NSWindow refcount (e.g., retain)

@kchibisov
Copy link
Member Author

ref counting like bumping NSWindow refcount only matters if you deal with ffi. If you stay in rust, just using Arc should be fine.

@grovesNL
Copy link
Contributor

Yeah that's true, I was just thinking about cases like the SDL use case mentioned by @Lokathor if we did want to support it directly on this owned handle type (without wrapping platform handles in Arc).

@kchibisov
Copy link
Member Author

As I said, all have nothing to do with Raw handle, only with safe handles, see @madsmtm post on what niche ref-counted could be in.

The only argument against generic is that Surface is not really tied to the window, but generic is bounded by concrete type, however render target could be basically anything, like you mix view, subview, popup, etc, and library may not develop common type for all of them at the same time making it annoying for users to store those surfaces at the same place.

I don't have strong opinion here anyway given that some folks want to pass T: HasWindowHandle, it doesn't have issues like the borrowed thing.

notgull added a commit to rust-windowing/softbuffer that referenced this issue Jan 12, 2024
Adds the `get_ref` and `get_mut` functions, which can be used to get
references (mutable or otherwise) to the underlying window handle.

cc rust-windowing/raw-window-handle#158 (comment)

Signed-off-by: John Nunley <[email protected]>
@notgull
Copy link
Member

notgull commented Feb 9, 2024

This issue was discussed in our weekly winit maintainer's meeting. As we've determined that Arc<Window> or composition can fill this niche, this issue is being closed as unplanned.

@notgull notgull closed this as not planned Won't fix, can't repro, duplicate, stale Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants