Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure garbage collection is able to clean up after bokeh server sessions are destroyed #3106

Merged
merged 7 commits into from
Oct 29, 2018

Conversation

philippjfr
Copy link
Member

@philippjfr philippjfr commented Oct 26, 2018

Cleans up plots when a bokeh server session is destroyed allowing the plots and objects to be garbage collected.

Here's a screenshot of the output of memprof of a bokeh server session where I open consecutive sessions and then close them. As you can see after all the sessions are cleaned up memory usage returns to the 200 MB baseline, which is used to keep all the sys.modules in memory and run a bokeh server.

Before:

screen shot 2018-10-26 at 10 21 41 pm

After:

screen shot 2018-10-26 at 10 13 00 pm

Without bokeh/bokeh#8368 the garbage collection will not happen immediately and things may stick around for a while. So we won't quite see the behavior shown above until bokeh 1.0.1 is released with that fix.

@philippjfr philippjfr changed the title Ensure cleanup on bokeh server session destroy Ensure garbage collections is able to clean up after bokeh server sessions are destroyed Oct 26, 2018
@philippjfr philippjfr changed the title Ensure garbage collections is able to clean up after bokeh server sessions are destroyed Ensure garbage collection is able to clean up after bokeh server sessions are destroyed Oct 26, 2018
@philippjfr philippjfr added this to the v1.10.8 milestone Oct 26, 2018
@philippjfr
Copy link
Member Author

philippjfr commented Oct 26, 2018

The graphs which shows the memory usage going back down to 200 MB were all done while I still had code in bokeh which would print any uncollected references to certain objects which itself seems to have triggered some garbage collection. Without that the memory profile is slightly worse, but in the graph below I opened 10 sessions consecutively, closed them, waited for a bit, and repeated it two more times. You can see after each of the sets of 10 sessions it returns to ~230 MB.

screen shot 2018-10-26 at 11 26 45 pm

@philippjfr
Copy link
Member Author

philippjfr commented Oct 26, 2018

Also worth noting that there are almost certainly still minor memory leaks, e.g. custom options are not getting cleaned up. This PR only deals with the most important leaks, i.e. both Elements and Streams, which can both hold actual and potentially very large data.

@@ -38,7 +38,7 @@
import IPython # noqa (API import)
from .ipython import notebook_extension
extension = notebook_extension # noqa (name remapping)
except ImportError as e:
except ImportError:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, but started raising flake errors all of a sudden.

self._document = doc
if self.subplots:
for plot in self.subplots.values():
if plot is not None:
plot.document = doc

def _session_destroy(self, session_context):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the only reason this method exists is to drop session_context before calling cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, bokeh checks signatures.


target = param.ClassSelector(class_=ViewableElement, doc="""
The target object of the link (optional).""")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shame to lose these as parameters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, although it did give me the idea of generalizing links at the panel level.



@source.setter
def source(self, source):
if self._source:
source_list = self.registry[id(self._source)]
old_source = self._source() if self._source else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you can't use a property getter when writing a setter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be fine, and if there was any more logic I'd have opted for using the getter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think using self.source is nicer than defining old_source. Doesn't need to hold up this PR though...

# Mapping from a source to a list of streams
# WeakKeyDictionary to allow garbage collection
# of unreferenced sources
registry = weakref.WeakKeyDictionary()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a dictionary type, doesn't this assume the keys are hashable? Using id meant you could always be sure you could use it as a key.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaik, none of the object types that we support as sources (elements, DynamicMaps, overlays etc.) implement any custom __hash__ method which would interfere with them being used as keys. The reason I used the id originally was to avoid references to the source sticking around, which was actually counter-productive it turns out.

@jlstevens
Copy link
Contributor

jlstevens commented Oct 29, 2018

Looks good to me. Once question, is it worth considering what happens when you use del on these various objects? For instance, we might want to think about implementing __del__ although I'm not yet convinced that the extra code would be worth it for supporting something used so rarely (and implementing __del__ is not always recommended anyway).

@philippjfr
Copy link
Member Author

philippjfr commented Oct 29, 2018

Once question, is it worth considering what happens when you use del on these various objects?

I'm not quite sure whether that's even necessary and what the custom __del__ would do. With this PR it should be sufficient to call del obj and due to the weak references that should be sufficient.

@jlstevens
Copy link
Contributor

I was thinking a custom __del__ could do more clean up properly, e.g close comms. Anyway, it was just a thought and even if we decide it is a good idea, doesn't need to go into this PR.

@philippjfr
Copy link
Member Author

I was thinking a custom del could do more clean up properly, e.g close comms

The comms are associated with plots/widgets, not with an element directly so I don't think that helps. It may end up being helpful for cleaning up custom option trees though.

@jlstevens
Copy link
Contributor

What I had in mind was calling del on a BokehPlot instance. Anyway, it isn't important here. Merging.

@jlstevens jlstevens merged commit 98b5e96 into master Oct 29, 2018
philippjfr added a commit that referenced this pull request Oct 29, 2018
philippjfr added a commit that referenced this pull request Oct 29, 2018
@philippjfr philippjfr deleted the server_session_cleanup branch November 12, 2018 18:00
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants