Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connect streams with bokeh callbacks #889

Merged
merged 21 commits into from
Sep 30, 2016
Merged

Connect streams with bokeh callbacks #889

merged 21 commits into from
Sep 30, 2016

Conversation

philippjfr
Copy link
Member

@philippjfr philippjfr commented Sep 28, 2016

This finally allows connecting Streams with interactions on a bokeh plot, which will allow for all kinds of interactive features becoming trivial to implement. The PR is currently a WIP and only works for a DynamicMap returning a single element.

The PR works by defining Callback classes which define the attributes that should be returned from the bokeh plot and the handles the callback will be attached to. The Callback then auto-generates the code to look up those attributes and sends them back via a JupyterCommJS.

The next part I will work on is how the source of a stream is looked up to make sure it also works with overlays and layouts.

Example

Having laid out what still needs work I will outline a concrete example of what does work. The PR registers the following callback:

class PositionXYCallback(Callback):

    attributes = {'x': 'cb_data.geometry.x', 'y': 'cb_data.geometry.y'}

    handles = ['hover']


callbacks = Stream._callbacks['bokeh']
callbacks[PositionXY] = PositionXYCallback

When creating a PositionXY stream on a DynamicMap it will now create a hovertool on the source of the stream, which sends its x and y position back to the Stream.

dmap = hv.DynamicMap(lambda x, y: hv.Points([(x**i, y**i) for i in np.linspace(0, 3)],
                                            extents=(0, 0, 10, 10)),
                     kdims=[], streams=[PositionXY()])
dmap

xy

@jlstevens
Copy link
Contributor

jlstevens commented Sep 28, 2016

Fantastic!

It is probably worth noting that in the final version, users won't even need to define PositionXYCallback as we will have all those defined and registered already. This means, all that will need to be done is to import PositionXY and define the DynamicMap.

Nonetheless, it is great to see how little code is involved which suggests custom functionality will be very easy to add!

I haven't reviewed the PR yet but I am assuming the mechanism that will register these callbacks for each backend (in such a way the user doesn't need to worry about it) is something we will still need to discuss.

@philippjfr
Copy link
Member Author

It is probably worth noting that in the final version, users won't even need to define PositionXYCallback as we will have all those defined and registered already. This means, all that will need to be done is to import PositionXY and define the DynamicMap.

That's already the case, I just pulled it out to demonstrate how it works and how little code is involved. Currently there is a registry on Stream._callbacks, indexed by backend and Stream type.

@jlstevens
Copy link
Contributor

Great!

I'll review the PR when you feel it is ready and have hopefully added a few more examples.

@jbednar
Copy link
Member

jbednar commented Sep 29, 2016

Excellent! Looks really clean. And somehow you did it by removing 300 lines of code.

@philippjfr
Copy link
Member Author

philippjfr commented Sep 29, 2016

I've replaced the old callback system with this, which means some of the old downsampling callbacks should become operations (probably in another PR). There is some worry about backward compatibility, but I'd prefer not to have two separate systems around.

@jlstevens
Copy link
Contributor

There is some worry about backward compatibility, but I'd prefer not to have two separate systems around.

I agree. I don't think we need to worry too much as the old plotting callback system wasn't properly documented so I think we were probably the only people using it.

@@ -49,25 +49,6 @@ class PolygonPlot(ColorbarPlot, PathPlot):
style_opts = ['color', 'cmap', 'palette'] + line_properties + fill_properties
_plot_methods = dict(single='patches', batched='patches')

def _init_tools(self, element):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to get rid of this! I guess it now inherits it from the superclass.

@@ -119,7 +119,7 @@ def send(self, data):



class JupyterCommJS(Comm):
class JupyterCommJS(JupyterComm):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new version makes more sense...was this a bug?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, but the class wasn't used.


index = param.List(default=[])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these new stream classes look good - they will need docstrings on their parameters though....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will add them shortly.

def __init__(self, preprocessors=[], source=None, subscribers=[], **params):
super(PositionX, self).__init__(preprocessors=preprocessors, source=source,
subscribers=subscribers, **params)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting rid of these unnecessary constructors!

if self._source:
raise Exception('source has already been defined on stream.')
self._source = source
self.registry[source].append(self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should be id(source)?

@@ -121,14 +115,26 @@ def __init__(self, preprocessors=[], source=None, subscribers=[], **params):
datastructure that the stream receives events from, as supported
by the plotting backend.
"""
self.source = source
self._source = source
self.subscribers = subscribers
self.preprocessors = preprocessors
self._hidden_subscribers = []

self.uuid = uuid.uuid4().hex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the uuid attribute might no longer b necessary...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove it.

registry = OrderedDict()
registry = defaultdict(list)

_callbacks = defaultdict(dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this has an underscore, it is worth having a comment explaining what this is, as it isn't used anywhere else in this file (that I can see).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea definitely need to add docstrings in this file still. Basically it defines the registry of callbacks for each backend which supply the stream information.

def replace_models(obj):
"""
Processes references replacing Model and HasProps objects.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 'Recursively processes references, replacing Models with there .ref values and HasProps objects with their property values'. At least that is how I understand the code!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will update it.

callbacks = param.ClassSelector(class_=Callbacks, doc="""
Callbacks object defining any javascript callbacks applied
to the plot.""")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to get rid of this! Though we mustn't forget to mention its removal in the CHANGELOG.

Convert the document to a dictionary of references. Avoids
converting document to json and the performance penalty that
involves.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps 'Avoids unnecessary JSON serialization/deserialization within Python and the corresponding performance penalty'. If you don't want to use that, just make it clear that this is all within Python.

@@ -168,18 +166,56 @@ def __init__(self, element, plot=None, **params):
self.current_ranges = None
super(ElementPlot, self).__init__(element, **params)
self.handles = {} if plot is None else self.handles['plot']
element_ids = self.hmap.traverse(lambda x: id(x), [Element])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was element_ids not needed?

Copy link
Member Author

@philippjfr philippjfr Sep 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that even if a HoloMap of Overlays has only a single item it's still static.

else:
source = self.hmap.last
streams = Stream.registry.get(source, [])
registry = Stream._callbacks['bokeh']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so _callbacks contains the backend specific code? Worth clarifying where _callbacks is defined on Stream (see comment below).

if type(stream) in registry and streams}
cbs = []
for cb, group in groupby(sorted(callbacks), lambda x: x[0]):
cb_streams = [s for _, s in group]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cb_streams doesn't seem to be used...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks that's a bug.

cb_tools = []
for cb in callbacks:
for handle in cb.handles:
if handle and handle in known_tools:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am following this right, this could be simplified if there was a single (custom) tool for capturing events? I believe this code is about finding the right tool that the callback needs...

cb_tools.append(tool)
self.handles[handle] = tool

tools = cb_tools + self.default_tools + self.tools
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which also means that tools may appear because a callback requests it...which could confuse users who see tools they didn't (explicitly) request. Another reason to consider a (hidden) custom tool that all our callbacks can use...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases an actual tool is necessary e.g. to get the geometry of a selection, but I agree a hidden tool would be preferable for hover and clicking callbacks.

@@ -512,7 +552,7 @@ def initialize_plot(self, ranges=None, plot=None, plots=None, source=None):
self.handles['plot'] = plot

# Get data and initialize data source
empty = self.callbacks and self.callbacks.downsample
empty = False
if self.batched:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this temporary? If not, there is no reason for empty to appear as I don't see it being set a different value anywhere...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm going to end up getting rid of empty everywhere I think. But probably in a separate PR.

el = element.get(key)
if el is not None:
el_tools = subplot._init_tools(el, self.callbacks)
el_tools = [t for t in el_tools
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess OverlayPlot is merging the set of tools? Did it always have to do this or is this just for callbacks? If it is the latter, then I think this is another reason to have our own custom tool. :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's always been merging them.

modified bokeh plot objects.
Generates JS code to look up attributes on JS objects from
an attributes specification dictionary.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the docstring have a simple example of input and the corresponding JS output? It is hard to tell just looking at the code...


* handles : The handles define which object the callback will be
attached on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of object? HoloViews objects? If so, it should say so. Also, 'definition' is misspelt above...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now think they are in fact 'plotting handles'.

location of the attribute separated by periods.
All plotting handles such as tools, the x_range,
y_range and (data)source can be addressed in this
way.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading this description, I think I understand but I think a short example would really help here....


The callback can also define a _process_msg method, which can
modify the data sent by the callback before it is passed to the
streams.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having described these three things (handles, attributes and code) is there a performance reason that only code is a parameter and the other two are class attributes? I'm also a little confused as to why they are class attributes and not instance attributes...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user can supply some custom code to the instance to be executed alongside the callback but each Callback really only applies to certain handles (e.g. RangeX and x_range) and specific attributes on those handles (e.g. x_range.attributes.x). I have no objection making them all parameters though, you originally suggested making them class attributes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact code seems to be a class parameter as you can't set it through the constructor! Nothing wrong with that although I think my question above still holds...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, code seems to be the only reason this class is parameterized at all...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user can supply some custom code to the instance to be executed alongside the callback

Sure! In that case you'll need to call super :-)

In the end, I do think class attributes are fine now I see how they are defined on each concrete Callback...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though as the plotting code creates these things, I don't see how code could be customized per instance...

self.callbacks.update(self)

for cb in self.callbacks:
cb.initialize()
if not self.overlaid:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a little confusing that there is a _init_callbacks method which actually constructs the callback instances (based on my understanding of it) and then an initialize method as well. Maybe the former should be called _construct_callbacks instead...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

self.source = source


def initialize(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this right, the docstring could say something like 'Iterates through the plot (and any subplots), registering the necessary JS code across all relevant plotting handles'. Again, that is what I expect it to be doing from the code!


def set_customjs(self, cb_obj):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should cb_obj simply be called handle for consistency?



class Callback(param.ParameterizedFunction):
def attributes_js(attributes):
Copy link
Contributor

@jlstevens jlstevens Sep 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, I am not convinced that this class needs to be parameterized.

Lastly, it might be nice to define Callback to be backend agnostic. These methods look fairly semantically independent from bokeh to me:

def __init__(self, plot, streams, source, **params):

def initialize(self):

def on_msg(self, msg):

def _process_msg(self, msg):

No need to abstract it out in this PR though...

@jlstevens
Copy link
Contributor

I understand this is now ready to merge!

Overall, I'm really pleased with how the streams system is turning out. It was simple to implement streams in core/plotting and this PR shows how you can get access to events from Bokeh in a nice clean way.

There are still some outstanding things to address in future PRs:

  • I think some of the implementation here could be simplified using a custom Bokeh tool to captures some of the events.
  • Throttling needs to be implemented on the JS side with the intent of eventually making it backend agnostic.
  • Overlays need to be handled properly.
  • We need to replace the plotting callback system e.g re-introduce downsampling as an operation.

Together with some nice documentiation and examples on mybinder, I think this will warrant the 1.7 release!

@jlstevens jlstevens merged commit 9d80880 into master Sep 30, 2016
@philippjfr philippjfr deleted the connect_streams branch October 14, 2016 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants