-
-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure multiple callbacks do not bleed wrong plot state #1034
Conversation
Does this fix holoviz/datashader#250? |
The immediate problem there is already fixed, this is a more subtle issue that occurs under certain circumstances. In the case that it came up it was in a scenario where there are two datashader plots with RangeXY callbacks, which have one linked axis and one independent axis, e.g. two different quantities over time. |
4f73b24
to
eaf6249
Compare
Works well for me. |
Some other issue happening here where callback events are clobbering each other and it times out. |
Had to make some further changes to the ACK message protocol. Previously when multiple callbacks fired consecutively it could happen that some messages were getting throttled causing one particular comm to never receive an ACK message that would unblock it. Therefore callback event messages and the response may now contain the |
34cb892
to
9aa0a0f
Compare
Need to add more tests and update existing tests. |
e447324
to
0665086
Compare
@jlstevens I believe this is now ready for review. Both issues that I fixed here are pretty subtle and caused strange bugs. Here's the example I've been using to test the fix: import holoviews as hv
import numpy as np
import pandas as pd
from holoviews.operation.datashader import datashade, aggregate, shade
hv.notebook_extension('bokeh')
%opts DynamicMap [width=900, tools=['xwheel_zoom']] Layout [shared_datasource=False]
n=100000
def f(n):
return np.cumsum(np.random.randn(n))
df = pd.DataFrame({'t': range(n), 'y1': f(n), 'y2': f(n), 'y3' : .10 * f(n)} )
def p(y):
return hv.Curve(df, kdims=['t'], vdims=[y])
layout = datashade(p('y1')) * datashade(p('y2')) + datashade(p('y3'))
layout.cols(1) A rough explanation of the changes implemented here:
|
@@ -10,7 +10,7 @@ | |||
from ..comms import JupyterCommJS | |||
|
|||
|
|||
def attributes_js(attributes): | |||
def attributes_js(attributes, handles): | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to see the docstring updated with an example showing how handles is involved...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring still needs to be updated I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, still need to do that.
# Gather the ids of the plotting handles attached to this callback | ||
# This allows checking that a stream is not given the state | ||
# of a plotting handle it wasn't attached to | ||
stream_handle_ids = defaultdict(list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to turn this bit of code into a method (get_handle_ids
?) and turn the comment into a docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
for stream in self.streams: | ||
stream.update(trigger=False, **msg) | ||
ids = self.stream_handles[stream] | ||
sanitized_msg = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this bit sounds like a 'message filter' (as opposed to sanitization) and could be its own method (with docstring). Something like message_filter(msg, ids)
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sounds good.
else: | ||
handle.callback.code += code | ||
else: | ||
self.stream_handles.update(stream_handle_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the contents of self.stream_handles
can keep growing between set_customjs
calls with no way to reset/clear it? Is this more state that hangs around when a visualization is removed (e.g a notebook cell is deleted?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it'll stick around but compared to all the plotting state I'm not worried about a few IDs.
To summarize private discussion I've had directly with Philipp, I'll just say that I think the general strategy used in this PR looks sound - labelling comms with an id and keeping track of plotting handles. I intend to use inline comments for suggestions specific to this PR and here are some of the more general (future) plans that we discussed:
|
@jordansamuels I just updated the PR fixing the issue you reported when resetting, could you test it as well? |
@jlstevens Ready for review again. |
@@ -152,39 +164,84 @@ def __init__(self, plot, streams, source, **params): | |||
self.streams = streams | |||
self.comm = self._comm_type(plot, on_msg=self.on_msg) | |||
self.source = source | |||
self.handle_ids = defaultdict(list) | |||
|
|||
|
|||
def initialize(self): | |||
plots = [self.plot] | |||
if self.plot.subplots: | |||
plots += list(self.plot.subplots.values()) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the code below could be made into a small get_handles
(or get_plot_handles
) method. Alternatively, it could be a function in utils so the code below could be:
handle_ids = self._get_handle_ids(util.get_handles(plots))
self.handle_ids.update(handle_ids)
Which I think is clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also assuming there is a good reason to use a dictionary update... i.e there may be other handle_ids
on the Callback
other than the ones currently returned by _get_handle_ids
. If that is true, it is a bit strange to see this in an initialize
method which sounds like it should only happen once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the code below could be made into a small get_handles (or get_plot_handles) method.
Originally had one, didn't seem worth it though.
If that is true, it is a bit strange to see this in an initialize method which sounds like it should only happen once.
That's true, the handles get merged if multiple streams end up being attached to the same callback, in that case the set_customjs
method below merges the callbacks:
for k, v in self.handle_ids.items():
cb.handle_ids[k] += v
I could just assign in initialize
instead of using the dictionary update, but I wanted to indicate that the handle_ids
may be modified by another callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally had one, didn't seem worth it though.
I agree it is a small bit of code and I was also wondering if it was worth it. On balance, I felt making the intent clearer for this block of code was more valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still a bit confused about self.handle_ids
- I see it mentioned in four place:
- When it is declared.
- The bit in
initialize
above. - What looks like an a dictionary access in
on_msg
. - The iteration over
k, v
you just mentioned.
In none of these places do I see self.handle_ids
being changed after it is initialized. If you mean that self.handle_ids
is being changed somewhere else in the code, then I wouldn't worry about using update
here.
In that case, I would also want to look at where the merging of self.handle_ids
is happening to consider whether that should become part of the API of Callback
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted where it's being changed above, in set_customjs
a callback will merge its own handle_ids
with another callback's handle_ids
if that callback has already been attached to the plot handle (since you can't attach multiple callbacks on one handle).
for k, v in self.handle_ids.items():
cb.handle_ids[k] += v
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I was looking for self.handle_ids
but cb
here is another Callback
instance. In that case, how about merging to a different place e.g cb.joint_handle_ids
? If it is too complicated, then I don't mind leaving it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No point keeping both around imo.
found = [] | ||
for plot in plots: | ||
for handle in self.handles: | ||
if handle not in plot.handles or handle in found: | ||
continue | ||
self.set_customjs(plot.handles[handle]) | ||
self.set_customjs(plot.handles[handle], handles) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand right, found
is a list of handles that are in some sense 'unique'. So maybe it could be something like:
unique_handles = self.unique_handles(plots)
for unique_handle in unique_handles:
self.set_customjs(plot.handles[handle], handles)
Now you might want to call unique_handles
something like filtered_handles
instead and this variable would then replace found
.
return msg | ||
|
||
|
||
def set_customjs(self, handle): | ||
def _get_handle_ids(self, handles): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like a name that conveys the filtering operation that is also happening here. Maybe _get_attached_ids
or _get_attached_handle_ids
?
data['x_range'] = (msg['x0'], msg['x1']) | ||
if 'y0' in msg and 'y1' in msg: | ||
data['y_range'] = (msg['y0'], msg['y1']) | ||
return data | ||
|
||
|
||
class RangeXCallback(Callback): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the new bits of code below seems to have the form:
if predicate(msg):
return dictionary_from_msg(msg)
else:
return {}
If you agree this is a general pattern, maybe we can just have an applicable
predicate method with the baseclass checking the predicate value. E.g
class RangeXCallback(Callback):
handles = ['x_range']
def applicable(msg):
return 'x0' in msg and 'x1' in msg
def _process_msg(self, msg):
return {'x_range': (msg['x0'], msg['x1'])}
And of course applicable
can return True
by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are multiple predicates, such as in RangeXY
? I'd prefer not to complicate this for now, although I do agree something like this is worth considering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RangeXY
is really the union of RangeX
and RangeY
. We might want to consider generalizing this idea of a union so we can build things like RangeXY
out of the component pieces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you agree with this suggestion, maybe it should be made into a new issue (feature request)? I don't think it would be hard to implement later.
Okay, implemented most of your new suggestions. Need to update the attribute_js docstring and will try to add a few unit tests for the various Callback methods. |
@jlstevens Added a unit test and revised the docstring. We can settle what to do about the callback |
I'm happy to merge now although looking at the things we discussed, there might be one thing that could still be addressed in this PR:
Here are the other suggestions made (that we should turn into issues):
|
17ad6ab
to
0dedc3b
Compare
0dedc3b
to
89921f1
Compare
Thanks for doing the renaming (which looked a bit tricky)! Merging. |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Currently there is a bug in the code that handles multiple callbacks being attached to one bokeh plot object. The issue arises when two streams are attached to one plot object, this can happen when one axis is linked between two plots. In these cases the
RangeXY
stream of one plot can end up with the y-range of another plot, which has a linked and therefore shared x-axis. The solution is to keep track of the IDs of the handles each stream is attached to, the frontend can send the matching ids along with the updated values and we can then check that only the appropriate data is actually forwarded to the stream.