Make upstream window and app references weak to avoid reference cycles #2061

freakboy3742 · 2023-08-04T06:07:18Z

The manual test case for this is in #1215. Windows won't be deleted immediately, but once the garbage collector kicks in, both the window interface and impl will be freed. I'm not sure how we'd manufacture an automated test, since it's entirely dependent on garbage collection behavior.

The underlying problem appears to be that the relations between widgets, App and Window are complex.

Apps own Windows, and Windows keep a reference to app
Widgets keep a reference to both app and window
Interface objects point at implementation objects, and vice versa
Apps and Windows both maintain a list of widgets
Bound methods used as signal handlers retain references to the objects they were created on. Most user handlers are bound to App; the updated macOS/iOS container handling introduced by [widget Audit] toga.ScrollContainer #1969 has a handler bound to the Window impl retained by the container, which is owned by the Window.

Python's garbage collection should pick up closed cycles between objects, but there's clearly something in this nest that is preventing collection.

This PR corrects the problem by making all "upstream" references (i.e., references from impl to interface, references from window to the app that owns the window, and references to app or window from a widget) weak. This is done by:

Modifying the getter and setter for app and window on the base widget into a wrapper around a weak property, rather than a full object reference,
Adding a getter and setter for interface on the Cocoa implementation of App and Window so that these properties are a wrapper around a weak property, rather than a full object reference
Modifying Cocoa's Container to retain a weak reference to the "parent" container, rather than a bound method for callbacks

This PR builds on #2058 so that the full GUI test suite for Window, plus all the changes to Container are incorporated. For convenience this is a link to the diff that is specific to this bug fix

There is almost certainly an analogous set of changes needed on the iOS backend (although it's less critical there because you can't create secondary windows).

PR Checklist:

All new features have been tested
All new features have been documented
I have read the CONTRIBUTING.md file
I will abide by the code of conduct

…itions.

samschott · 2023-08-04T23:24:34Z

I can't argue with the results, it does fix the test case from #1215 (which I've updated to account for recent API changes).

Python's garbage collection should pick up closed cycles between objects, but there's clearly something in this nest that is preventing collection.

It is indeed strange that GC does not pick up the reference cycles here. My first instinct would have been a reference cycle involving ObjC instances, but those should be prevented by the explicit weak ObjC properties that we have introduced some time ago.

Another possibility could be that we are indeed holding on to an additional reference somewhere that is not part of a closed cycle. Replacing it by a weakref enables collecting the cycle, but might return an unexpected None somewhere at some point.

I'm a bit cautious about this fix because it may be only addressing the symptom and not the cause. The cause being (1) potentially a leftover non-cyclic reference somewhere and (2) a complex reference graph that makes it difficult to reason about.

But clearly you have gone through most of the reference graph, it does not seem too complex after all, and one or multiple of references that you replaced with a weekref fixed the issue. I'd be curious to see a minimal fix that addresses it though, just to maybe narrow down which references were problematic.

I'm not sure how we'd manufacture an automated test, since it's entirely dependent on garbage collection behavior.

Maybe holding on to a weak reference to a Window in a test, forcing a GC run, and then asserting that the reference was collected, would work?

freakboy3742 · 2023-08-05T03:11:29Z

It is indeed strange that GC does not pick up the reference cycles here. My first instinct would have been a reference cycle involving ObjC instances, but those should be prevented by the explicit weak ObjC properties that we have introduced some time ago.

That was my thought too - and my "but it can't be that" for the same reason.

Another possibility could be that we are indeed holding on to an additional reference somewhere that is not part of a closed cycle. Replacing it by a weakref enables collecting the cycle, but might return an unexpected None somewhere at some point.

That also occurred to me; but if it's true, it's a subtle one that I haven't been able to discover. I tried mapping out all the relationships in a big spiderweb of a diagram, but that has rapidly converged on being one of those "red string between photos on the wall" conspiracy maps, rather than actual enlightenment :-)

I'm a bit cautious about this fix because it may be only addressing the symptom and not the cause. The cause being (1) potentially a leftover non-cyclic reference somewhere and (2) a complex reference graph that makes it difficult to reason about.

100% agreed on this. As a fix, there's at least a consistent rationale (i.e., "upstreamness") behind why the specific references are weakrefs, but it's not a very satisfying answer specifically because circular references should be resolved by the GC.

But clearly you have gone through most of the reference graph, it does not seem too complex after all, and one or multiple of references that you replaced with a weekref fixed the issue. I'd be curious to see a minimal fix that addresses it though, just to maybe narrow down which references were problematic.

Honestly - AFAICT, this is the minimal fix (or, at least, a minimal fix). The only part that is maybe optional is the references to app - app references don't appear to cause problems deleting windows, but the implementation is so symmetrical with window that I included them in this PR. But if I only apply 1 or 2 of the three other fixes (interface references, window references, and container callback references), the GC doesn't kick in - you need all three for it to work.

…me as a title.

samschott · 2023-08-06T09:53:55Z

Ok, you have convinced me that weakrefs are indeed a pragmatic solution :)

Do you know if the window deletion issue is limited to macOS or also occurs on Windows and Gtk backends? The window and app reference fixes will apply to all backends but your interface reference fix is macOS only. Since the general structure of backrefs to the interface layer is more general, it may be worth extending the weakrefs to other backends as well. I don't currently have access to a machine with VMs to test this myself.

Curiously, widget implementations have similar backrefs to the interface layer. You haven't introduced weakrefs here, and it seems that those are indeed not needed. A similar example as in #1215 but for Widgets works just fine with the current public release 0.3.1. Admittedly, the reference graph is a lot simpler here.

Widgets memleak test

import gc
import toga
from toga.constants import COLUMN
from toga.style import Pack


class Window(toga.Window):

    instances = 0

    def __init__(self):
        super().__init__(title=f'Window: {Window.instances}', position=(200, 200))
        Window.instances += 1

    def __del__(self):
        Window.instances -= 1


class WidgetRefcountExample(toga.App):

    def startup(self):

        self.window = None

        self.button_create = toga.Button(
            'Create window',
            style=Pack(padding=10, width=120),
            on_press=self.on_create,
        )

        self.button_delete = toga.Button(
            'Close and delete window',
            style=Pack(padding=10, width=120),
            on_press=self.on_delete,
        )

        self.label = toga.Label(
            f'Instances: {Window.instances}',
            style=Pack(padding=10, width=120),
        )

        self.button_create.enabled = True
        self.button_delete.enabled = False

        self.box = toga.Box(
            children=[
                self.button_create,
                self.button_delete,
                self.label,
            ],
            style=Pack(direction=COLUMN),
        )

        self.main_window = toga.MainWindow(title="Toga")
        self.main_window.content = self.box
        self.main_window.show()

    def on_create(self, sender):
        if not self.window:
            self.window = Window()
            self.app.windows += self.window
            self.window.content = toga.Box()
            self.window.show()
            self.label.text = f'Instances: {Window.instances}'

        self.button_create.enabled = False
        self.button_delete.enabled = True

    def on_delete(self, sender):
        if self.window:
            self.window.close()
            self.window = None
            gc.collect()  # clean up ref cycles before displaying instance count
            self.label.text = f'Instances: {Window.instances}'

        self.button_create.enabled = True
        self.button_delete.enabled = False


def main():
    return WidgetRefcountExample('Example', 'org.beeware.refcount')


if __name__ == '__main__':
    app = main()
    app.main_loop()

samschott · 2023-08-06T15:19:10Z

cocoa/src/toga_cocoa/window.py

+        self._interface = weakref.ref(value)
+
+    def __del__(self):
+        print("DEL WINDOW IMPL")


Leftover debug statement?

samschott

I went spelunking around a bit and removed as many of the weak refs as I could without re-introducing the memory leak.

I could remove all but one: The weakref that a toga.Widget holds to the toga.Window it belongs to. That lead me to conclude: The window is kept alive by widgets that still reference it, therefore the widgets themselves must be kept alive by something even though they are no longer required when the window is closed.

The obvious reference cycles are all closed loops, but one point stood out: The registries toga.App.windows and toga.App.widgets. When a window is closed, it is removed from toga.App.windows by toga.Window.close():

toga/core/src/toga/window.py

Lines 273 to 275 in 9a482d3

    
           def close(self) -> None: 
        
               self.app.windows -= self 
        
               self._impl.close()

However, the window's widgets still remain in toga.App.widgets add each one of them still holds a reference to the window. There are our leftover non-cycling references!

I've drafted a minimal fix as a PR at #2066.

My original diagnosis still stands, all the cross references and the different parts of code that manage them are too complicated to track and reason about easily. Even though my suggestion fixes the memory leak, your weak ref solution would make regressions more difficult (though note that it only cleans up the references to windows, references to their widgets would have still lived on forever). If both of us are struggling, imagine someone new to the codebase.

I don't have immediate suggestions to improve things though :(

Maybe remove all cross-references that are not strictly required? The registries are very convenient though. Or centralise and simplify the code when:

Windows are added to or removed from an app.
Widgets are added to or removed from a window, a parent widget (such as a box) or a container.

There is a lot of very similar code for the above cases and a regression in any of those places can be problematic.

freakboy3742 · 2023-08-07T03:20:55Z

Do you know if the window deletion issue is limited to macOS or also occurs on Windows and Gtk backends?

It's not limited to macOS; the same test case causes problems on GTK and Windows, and the same fix resolves the issue. I've added that to the PR.

I guess conceptually the same issue exists on iOS and Android, but you can't create and destroy additional windows, so it's a moot point. I've added the weakrefs for consistency.

Curiously, widget implementations have similar backrefs to the interface layer.

Yeah - I did some poking around this and came to the same conclusion. I do wonder if a more complicated widget graph, with widget references held by the app and closures on widget event handlers would manufacture a problem, but at least simple tests of this didn't reveal a problem for me.

However, the window's widgets still remain in toga.App.widgets add each one of them still holds a reference to the window. There are our leftover non-cycling references!

This is indeed an obvious source of leakage; however, as noted on that PR, it seems to be an incomplete fix.

Maybe remove all cross-references that are not strictly required?

The problem is that it's difficult to judge what is strictly required. I guess now that we've got a (near 100%) functional test suite, we'd be in a position to prune the references that aren't actually used in practice;

There is a lot of very similar code for the above cases and a regression in any of those places can be problematic.

Agreed there's some similarities; @mhsmith has also remarked that there are some commonalities that could be cleaned up. Conceptually, I'm in favor of any cleanup that reduces duplication and turns the backends into an even more "dumb" wrapper around native widgets; however, I'm not sure I have a clear vision on exactly what that simplification would look like in practice. There's enough differences between widget lifecycle on different platforms that I'm not sure there is a completely clean abstraction of widget management. I'd definitely be happy to be proven wrong on this, though.

freakboy3742 · 2023-08-07T04:02:27Z

I've added a test case on the testbed to verify that both the window interface and impl is deleted on each platform. I've done this on the testbed rather than core/dummy because:

The dummy backend contains a bunch of additional references in the event log that are very difficult to remove
Backends can have additional complications (such as references to native objects) that the dummy backend won't exhibit.

samschott · 2023-08-09T22:06:44Z

Hm, from the last CI run, I still see the Windows test test_secondary_window_cleanup fail due to a remaining reference. Could this possibly be a similar issue as previously on macOS with the native layer holding references to the interface and implementation that introduce non-Python reference cycles?

freakboy3742 · 2023-08-10T00:50:19Z

Hm, from the last CI run, I still see the Windows test test_secondary_window_cleanup fail due to a remaining reference. Could this possibly be a similar issue as previously on macOS with the native layer holding references to the interface and implementation that introduce non-Python reference cycles?

I think that failure is a red herring (at least for now). There's no Window test backend implemented for Winforms yet, so all the windows tests are failing because they can't find the probe for a Window. The secondary window reference test fails in a slightly different way specifically because it doesn't have a fixture, so it's not failing during fixture creation like all the other tests.

@mhsmith and I have been splitting the work of the audit process; I do a first pass handling Cocoa, iOS, and GTK; he follows behind handling Windows and Android. As a side effect of this distribution of labor, the tests for Android and Winforms tend to be explosive until he's started work on whatever feature is being audited.

samschott · 2023-08-12T10:41:48Z

I think that failure is a red herring (at least for now). There's no Window test backend implemented for Winforms yet, so all the windows tests are failing because they can't find the probe for a Window. The secondary window reference test fails in a slightly different way specifically because it doesn't have a fixture, so it's not failing during fixture creation like all the other tests.

I'm not so sure about that. From the test logs for test_secondary_window_cleanup:

Traceback (most recent call last):
  File "D:\a\toga\toga\testbed\build\testbed\windows\app\src\app\tests\test_window.py", line 211, in test_secondary_window_cleanup
    assert window_ref() is None
AssertionError: assert <toga.window.Window object at 0x000001CAE3CB4FA0> is None
 +  where <toga.window.Window object at 0x000001CAE3CB4FA0> = <weakref at 0x000001CAE4F77B50; to 'Window' at 0x000001CAE3CB4FA0>()

This looks like a test case failure to me. I also finally spun up a windows VM (pythonnet works on Apple Silicon!), and my manual example from #1215 still fails with this branch. It does pass however after removing circular references from Pythonnet to the toga_winforms.Window instance (mostly from instance methods that are set as callbacks).

freakboy3742 · 2023-08-16T23:17:27Z

I'm going to close this in favor of #2066; the approach there is a lot simpler, and isn't platform dependent.

freakboy3742 added 19 commits August 3, 2023 13:54

Update docstrings and documentation for Window and MainWindow.

d3fd16d

Add changenote.

8c09ef1

Migrated core tests to pytest.

d2eec81

Complete coverage of core dialogs.

7ee6369

Remove GTK window tests.

3e6bf86

Correct a spelling issue, and document an API rename.

cec300d

Exclude some clases from interface tests.

a13683b

Cocoa GUI tests for Window (excluding dialogs and toolbars.

32097ca

Add menu items and keyboard shortcuts for close/minimize on macOS.

212fb89

Add GUI test of full screen mode.

f3bffde

Converge on US spelling for resizable, closable

86e91ae

Cocoa dialogs 100% coverage.

9f719f6

100% coverage for Window on iOS.

e3613f2

Minor coverage tweaks.

7cd3fde

GTK Window and Dialogs to 100%.

c867585

Tweak visibility handling to ensure complete coverage in testbed cond…

066a384

…itions.

Use blackbox as a testing WM.

3bd0b05

Rework file dialog setup handling to avoid CI failure.

054658d

More test updates for CI's benefit.

e18c5b1

freakboy3742 requested review from mhsmith and samschott August 4, 2023 06:16

samschott mentioned this pull request Aug 4, 2023

[widget Audit] toga.Window #2058

Merged

11 tasks

freakboy3742 added 2 commits August 5, 2023 11:38

Restructure menu creation for cocoa.

59cfb50

Rework default title handling so MainWindow defaults to the formal na…

4fffdbe

…me as a title.

samschott reviewed Aug 6, 2023

View reviewed changes

samschott mentioned this pull request Aug 6, 2023

Widget and window reference cleanup #2066

Merged

4 tasks

samschott requested changes Aug 6, 2023

View reviewed changes

freakboy3742 marked this pull request as ready for review August 7, 2023 04:03

freakboy3742 force-pushed the ownership-cleanup branch from 3c21869 to abd9397 Compare August 10, 2023 01:51

Add a test of closing window explicitly.

ce4130d

freakboy3742 added 6 commits August 16, 2023 10:17

Merge branch 'main' into audit-window

a04ad63

Insert a pause on app exit to make sure Briefcase gets all the app logs.

cc2df8b

Make app and window back references weak.

f8a8773

Roll out weakref change across other backends.

b6fdd2d

Add weakrefs to dummy, and a test of GC on the live backends.

c9ed0f8

Corrected some inheritance issues with iOS containers.

73714f6

freakboy3742 force-pushed the ownership-cleanup branch from 9c03537 to 73714f6 Compare August 16, 2023 23:08

freakboy3742 closed this Aug 16, 2023

freakboy3742 deleted the ownership-cleanup branch August 16, 2023 23:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make upstream window and app references weak to avoid reference cycles #2061

Make upstream window and app references weak to avoid reference cycles #2061

freakboy3742 commented Aug 4, 2023 •

edited

Loading

samschott commented Aug 4, 2023

freakboy3742 commented Aug 5, 2023

samschott commented Aug 6, 2023 •

edited

Loading

samschott Aug 6, 2023

samschott left a comment •

edited

Loading

freakboy3742 commented Aug 7, 2023

freakboy3742 commented Aug 7, 2023

samschott commented Aug 9, 2023

freakboy3742 commented Aug 10, 2023

samschott commented Aug 12, 2023 •

edited

Loading

freakboy3742 commented Aug 16, 2023

	def close(self) -> None:
	self.app.windows -= self
	self._impl.close()

Make upstream window and app references weak to avoid reference cycles #2061

Make upstream window and app references weak to avoid reference cycles #2061

Conversation

freakboy3742 commented Aug 4, 2023 • edited Loading

PR Checklist:

samschott commented Aug 4, 2023

freakboy3742 commented Aug 5, 2023

samschott commented Aug 6, 2023 • edited Loading

samschott Aug 6, 2023

Choose a reason for hiding this comment

samschott left a comment • edited Loading

Choose a reason for hiding this comment

freakboy3742 commented Aug 7, 2023

freakboy3742 commented Aug 7, 2023

samschott commented Aug 9, 2023

freakboy3742 commented Aug 10, 2023

samschott commented Aug 12, 2023 • edited Loading

freakboy3742 commented Aug 16, 2023

freakboy3742 commented Aug 4, 2023 •

edited

Loading

samschott commented Aug 6, 2023 •

edited

Loading

samschott left a comment •

edited

Loading

samschott commented Aug 12, 2023 •

edited

Loading