Graph support #1829

philippjfr · 2017-08-28T16:22:23Z

Based on the discussion @jlstevens and I had during SciPy sprints this PR adds a Graph Element to HoloViews. The outline below represents the discussion surrounding the API we had:

Terminology

Graph: Element that may or may not visualize itself.
Nodes: Element associating indices to positions and other information. Self-visualizing.
NodePaths: dataframe of connectivity information. Self-visualizing.

Graph - Composite Element

Can be abstract
.data represents the abstract graph (i.e. start and end pairs of node indices)
Can be visualized by a bokeh layout algorithm as a plot option.
Can be visualized with straight line connections if given Nodes.
Can be visualized with arbitrary paths if given NodePaths.

data: start and end columns (where start and end are the node indices)
kdims: [start, end]
vdims: [edge label and other metadata]

Nodes

Defines concrete position of each node using three columns/kdims, x/y position plus node index/label. Any value dimensions will be exposed as hover labels.

data: x,y positions + node index + anything else
kdims: [x, y, index]
vdims: [anything else (used as hover labels)]

NodePaths

Subclass of Path using multi-interface representing the concrete paths

data: [ dfs/array, ... ]
kdims: [x, y]
df/array: [xs,ys] Defines the path for each edge defined in abstract Graph

Graph Constructors

Graph(edges) - Abstract graph (cannot be displayed).
Graph((edges, nodes)) - Defines concrete node positions
Graph((edges, nodes, nodepaths)) - Define concrete node positions and concrete paths for edges connecting paths

Graph attributes

.data - exposes just the abstract graph (i.e. start and end connections)
.nodes - exposes concrete nodes
.nodepaths - Computed automatically from .nodes (direct connections) or returns explicitly supplied paths.

ToDo

jbednar · 2017-09-01T03:52:23Z

Looks like a great plan to me.

jlstevens · 2017-09-01T11:35:17Z

@philippjfr Ping me when you think it is in a good state for me to test it out.

philippjfr · 2017-09-15T11:50:18Z

@jbednar @jlstevens This is ready for a review round. You should probably also play with it a bit so you get a feel for it. I'll mostly be focusing on tests at this point.

jbednar

Fabulous! I think this has turned out to be very clean and will be really nice for people to use. I had a variety of comments about the example notebook, mostly about things to make clear to the users how to do, but also possibly about the underlying support (not sure which).

Also, there's no example of mapping edge "weight" onto anything visible; is that supported?

And it would be useful to have a datashaded categorical example, with some large graph of edges broken into categories that show up as different colors; that always makes for pretty pictures that drive people to try to explore what the categories and network structures mean.

jbednar · 2017-09-15T13:50:26Z

examples/user_guide/Network_Graphs.ipynb

+   "source": [
+    "Visualizing and working with network graphs is a common problem in many different disciplines. HoloViews provides the ability to represent and visualize graphs very simply and easily with facilities for interactively exploring the nodes and edges of the graph, especially using the bokeh plotting interface.\n",
+    "\n",
+    "The ``Graph`` ``Element`` differs from other elements in HoloViews in that it consists of multiple sub-elements. The data of the ``Graph`` element itself are the abstract edges between the nodes, on its own this abstract graph cannot be visualized. In order to visualize it we need to give each node in the ``Graph`` a concrete ``x`` and ``y`` position in form of the ``Nodes``. The abstract edges and concrete node positions are sufficient to render the ``Graph`` by drawing straight-line edges between the nodes. In order to supply explicit edge paths we can also declare ``NodePaths``, providing explicit coordinates for each edge to follow.\n",


I know it's a bit late to be raising this, but is NodePaths really the best name here? To me that sounds like the paths of a node, whereas it's actually the paths of an edge, i.e. the path between two nodes. Why not EdgePaths (since I assume you are reserving "edge" for the logical connection, with no path information)?

jbednar · 2017-09-15T16:53:34Z

examples/user_guide/Network_Graphs.ipynb

+    "To summarize a ``Graph`` consists of three different components:\n",
+    "\n",
+    "* The ``Graph`` itself holds the abstract edges stored as a table of node indices.\n",
+    "* The ``Nodes`` hold the concrete ``x`` and ``y`` positions of each node along with a node ``index``. The ``Nodes`` may also define any number of value dimensions, which can be revealed when hovering over the nodes or to color the nodes by.\n",


I'm not sure I quite get this. There are no Nodes unless they have x,y positions? And the Graph doesn't store nodes or anything about nodes unless it includes x,y positions? This doesn't seem right to me; shouldn't we be able to represent (even if not visualize) graphs that have nodes and information about nodes, but no positions?

Thanks for the reminder, I still have to do some work around this. You will be able to pass in node information without positions in the constructor and it will add the positions automatically. Constructing an abstract Nodes object without positions doesn't make sense to me though.

Done and example added.

Ok, sounds good.

Just to clarify, is assigning positions to the nodes lazy, in that no positions will be assigned if no one ever tries to visualize it? If so, that sounds good.

Computing a circular layout isn't expensive in time, but it is expensive in storage, at least for a sparse graph -- each node then gets two extra columns of data. I'm still not sure what either of you mean by "conditional"; conditional on what? I was suggesting that the element always be displayable, just that this extra data not be computed until precisely the point where it first needs to be displayed.

The use case I imagine for that is to declare a huge Graph but then use HoloViews machinery to select only a subset of it to make a much smaller Graph that you actually display. I'm not worried about pickling it, just about making the handy tools in HoloViews available without forcing the Graph to be something displayable, as that's a separate consideration.

I was suggesting that the element always be displayable, just that this extra data not be computed until precisely the point where it first needs to be displayed.

Okay sure, I can make it a property which computes nodes the first time they are accessed, which will basically happen when you call any method.

@jlstevens, is that ok? If you veto it, I won't object; up to you.

If the idea of the property is simply to defer assigning positions in a way that you can't tell the positions weren't there all along (except by tracking memory usage) then that seems ok to me. I won't veto it and I'll let Philipp decide although it does strike me a little bit as being premature optimization for a case we don't have a real example of yet.

Pushed it now, revert it if you do object.

jbednar · 2017-09-15T17:15:05Z

examples/user_guide/Network_Graphs.ipynb

+    "#### Hover and selection policies\n",
+    "\n",
+    "Thanks to Bokeh we can reveal more about the graph by hovering over the nodes and edges. The ``Graph`` element provides an ``inspection_policy`` and a ``selection_policy``, which define whether hovering and selection highlight edges associated with the selected node or nodes associated with the selected edge, these policies can be toggled by setting the policy to ``'nodes'`` and ``'edges'``."
+   ]


Why do we have to choose? I can see why someone might want to restrict to one or the other, but can't we have the default be that if hit testing says i'm over a node, do the node hovering, and if not do the edge hovering? If that's possible, then people will almost never need to mess with this option; it will just usually do what they want.

That's an issue to raise with bokeh.

I swear there was an issue about this in bokeh but I can't find it now.

I can raise one, but it's probably better if you do it, because you're down there using the bokeh implementation and can be very precise about what's there now and what you'd like to see.

jbednar · 2017-09-15T17:15:38Z

examples/user_guide/Network_Graphs.ipynb

+   "source": [
+    "%%opts Graph [tools=['hover']]\n",
+    "bezier_graph.opts(plot=dict(inspection_policy='edges'))"
+   ]


Can we have hover info about the edge? I don't see an example of that.

Not currently, another issue to raise with bokeh.

And these are issues you haven't already raised?

No, we still don't have a way of efficiently storing additional edge information in HoloViews (although that is work I'll be doing for 1.9), so I haven't gotten around to this yet and won't in this PR.

Ok. Please raise a separate issue for that and reference it in this PR, but then leave it open.

Nevermind, the inspection_policy determines whether you can inspect nodes or edges and you can just add the edge information to the abstract graph edges, so this works now.

jbednar · 2017-09-15T17:16:25Z

examples/user_guide/Network_Graphs.ipynb

+   "metadata": {},
+   "source": [
+    "#### Animating graphs"
+   ]


Animating graphs sounds like a great thing to include, but I don't see anything in this section about that. Can you add something that actually does animate a graph, e.g. while dragging a slider?

Sure, I can add a different example.

This one also could use the padding that helps the ones above. I guess auto-padding is only for 2.0?

Unfortunately that's probably the case.

jbednar · 2017-09-15T17:17:55Z

examples/user_guide/Network_Graphs.ipynb

+    "fb_nodes = hv.Nodes(pd.read_csv('../assets/fb_nodes.csv')).sort()\n",
+    "fb_graph = hv.Graph((edges_df, fb_nodes), label='Facebook Circles')\n",
+    "fb_graph = fb_graph.redim.range(x=(-0.05, 1.05), y=(-0.05, 1.05)).opts(style=dict(cmap=colors))\n",
+    "fb_graph"


Here the hover information seems like it should be "circle: 50", not "circle: circle50". I know that means the original data needs updating, but I do think that's important to avoid being confusing.

Or it could be something like "group: circle50"...

Also, seems like the hover should have something (even if we have to make it up) to indicate the specific identity of the node. I imagine these are anonymized, but presumably we need something like "identity: person2090" to show that we can use this information to learn not just what category it's in, but who it is (the node identity).

If you pulled the latest version you should have the node index as well at least, is that sufficient?

Ok, just pulled again and got the index info. Yes, that's helpful, but in general people won't only want an index, they'll want some sort of human-meaningful information. But that's presumably available if it's in a column, so we can just say that in words ("Hovering over the node will reveal the index of that node, along with any other columns of information about that node that are available")?

BTW, the index info isn't always coming out ok:

That's pretty damn weird.

The ??? isn't the only odd bit, it's odd that it's trying to show all the node ids at all. Maybe that's useful sometimes, but I'd guess probably not, since in general that can be a huge list and it's unclear what any of those other indexes would be (and unclear which is the one node I'm hovering over). Or maybe those are edge indexes. Confusing!

Or maybe those are edge indexes.

I suspect that's it, in which case I can definitely fix it.

jbednar · 2017-09-15T17:22:41Z

examples/user_guide/Network_Graphs.ipynb

+    "fb_graph = fb_graph.redim.range(x=(-0.05, 1.05), y=(-0.05, 1.05)).opts(style=dict(cmap=colors))\n",
+    "fb_graph"
+   ]
+  },


I think we need to work on the visibility of the hovering. Right now, it's clear that nodes and edges are being selected, because they change color, but as soon as the mouse stops moving it is very difficult to see the difference between selected and unselected nodes and especially edges:

Here the selected node has a slightly different outline, which is very difficult to see, and the selected edges turn red, which is only visible because these happen to be on white; when they are in a complex nest it's super hard to make them out.

Can we make all the non-selected nodes and edges turn dim (Hunt's suggestion)? Or make the selected ones be much thicker than the rest, so that they actually stand out?

I'll look into it.

Just checked, this is another thing to file with bokeh, the nonselection style options are not exposed at the moment.

Ok, please do file those issues, and you can tag me if you need someone other than you to express a user-level opinion. But none of those things should hold up this PR.

I was wrong, can probably get this working.

I realize now what the issue was, there is no way of styling something that's not being hovered over. I can only customize non-selected nodes/edges, e.g. when using the tap or box_select tools.

Here's what that looks like:

jbednar · 2017-09-15T17:27:00Z

examples/user_guide/Network_Graphs.ipynb

+   "metadata": {},
+   "source": [
+    "## Datashading graphs"
+   ]


This plot isn't yet compelling. For one thing, the colorized edges don't really show up, which is partly because colors of fine lines are just never very visible, and partly because they are using the fire colormap, which is an inappropriate choice against a white background. Here the strongest bundles are colored yellow, which is very hard to see against white, thereby deemphasizing precisely the bits that should be most emphasized. I really think we need to get rid of the default "fire" colormap for datashader in HoloViews; it's almost never the right choice for the default white backgrounds.

In any case, this example isn't yet very compelling, perhaps because the data volume isn't high enough to need datashader in the first place. Maybe show the unbundled graph too, where the color shading can help make sense of the tangled thicket around the main clump of nodes?

Since this is a user guide I don't think a larger dataset is workable. I agree about changing the color scheme, but do you think a larger dataset is required? I'd like to avoid switching to synthetic data.

Well, try the non-bundled case; the effect of datashader's coloring should be clear there.

It just looks very messy unfortunately, I've switched to 'linear' normalization, which means that the paths are mostly black. If we can get the nodes to line up I think this is reasonable.

jbednar · 2017-09-15T17:28:27Z

examples/user_guide/Network_Graphs.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Alternatively we can select the nodes and edges by an attribute that resides on either. In this case we will select the nodes and edges for a particular circle and then overlay just the selected part of the graph on the datashaded plot. In this way a smaller subgraph can be highlighted and the larger graph can be datashaded."


This description sounds great, but I really couldn't make sense of the example; it's super hard to see what's datashaded and what isn't and what to make of it. Again I think that's probably because this dataset isn't sufficiently large for there to be a difference between what bokeh alone can do and what datashader can. Whatever it is, something needs to change for this to be a useful example.

I can understand that visually it's not hugely compelling and a change in color schemes should help here, but is it really not clear conceptually? It's simply datashading everything and then overlaying part of the graph selected based on some criterion, which could be an attribute on either the edges or nodes.

The concept is clear to me, but it's not clear visually -- I can't tell which bits are overlaid and which bits are datashaded. If I zoom quickly I can see that both are present, but I really can't visually get any idea which is which.

The other issue is that bundling shifts the start and end points of the edges slightly (and sometimes significantly) so they don't line up with the nodes anymore. I'd expect bundling to leave the start and end points unaffected.

@jbcrail ? I would also have expected bundling to leave nodes exactly where they are.

@jbednar I would expect that also. A quick look at the code indicates at least two areas where the start and end points are resampled and modified.

@philippjfr Do you have an example? If so, could you put that into a datashader issue?

@jbcrail Yes, I'll work up an example using the datashader API.

jlstevens · 2017-09-15T18:54:46Z

Also, there's no example of mapping edge "weight" onto anything visible ...

I agree, I think mapping a weight to an edge thickness/color is a very common case.

jbednar · 2017-09-15T19:01:08Z

BTW, is there any way to turn off the x,y locations of graph nodes by default for hovering? The location is usually not informative at all, given that the layout is from some algorithm instead of the data.

philippjfr · 2017-09-15T19:14:17Z

I agree, I think mapping a weight to an edge thickness/color is a very common case.

Yes, I agree, not implemented yet. Any suggestion on plot options to control this? Ideally we'd have addressed that issue size/color/alpha_index options by now and there was a general way to do this mapping. I have some further ideas about that, which I'll have to write up. The basics are that each element type declares a number of visual mappings and we provide a list of common transforms, e.g. you could do something like Graph (edge_alpha=Normalize('weight') edge_line_width=Normalize('weight', max=10) to map 'weight' to alpha and normalize it to be in the range 0-1 and do the same to the line_width but scale it to the range 0-10. I think we can build a very powerful system around that. Not sure where that leaves us in the meantime though.

The location is usually not informative at all, given that the layout is from some algorithm instead of the data.

I'll implement that.

jlstevens · 2017-09-15T19:37:14Z

One thing that I think is missing from this PR are element notebooks documenting Graph, Nodes and EdgePaths with the latter two obviously linking to Graph. One goal of this design was to have each element be meaningful in isolation with an appropriate visual representation and I think new element notebooks would be a good way to show that (and show off the difference between the matplotlib and bokeh support).

philippjfr · 2017-09-15T19:41:42Z

Yes, element notebooks and gallery examples are still missing.

jlstevens · 2017-09-15T19:56:21Z

There is a fair bit of documentation work that needs doing. For this reason, I wouldn't mind merging this PR to get the code in place with a separate PR improving the docs. That PR could improve the user guide in the ways Jim has suggested and add the various other notebooks.

philippjfr · 2017-09-15T20:05:10Z

I'd prefer to just do it in this PR tbh. Also have to add tests still.

jlstevens · 2017-09-15T20:17:24Z

I agree we want tests! At least enough to turn coveralls green again. :-)

philippjfr · 2017-09-20T12:01:04Z

@jbednar @jlstevens I've now added Element reference and gallery examples for graphs, both are very similar to sections of the user guide and the element notebooks link to the user guide for further detail.

jbednar · 2017-09-20T14:58:09Z

Ok, looks great! Can you please look over our comments above, pull out anything that you haven't been able to deal with yet into an open issue, and then we can merge this?

philippjfr · 2017-09-20T16:45:49Z

My to do list is basically:

Once we have implemented the op aesthetic mapping suggestion we should add an example that maps egde weights to line_width or alpha.
Make a pure datashader example to show how datashader bundling shifts the nodes.

jlstevens · 2017-09-20T20:18:44Z

holoviews/element/graphs.py

+    group = param.String(default='Nodes', constant=True)
+
+
+class EdgePaths(Path):


I'm wondering whether it is worth introducing EdgePaths. Why not just use Path?

jlstevens · 2017-09-20T20:22:21Z

holoviews/element/graphs.py

+    Accepts a Graph and lays out the corresponding nodes with the
+    supplied networkx layout function. If no layout function is
+    supplied uses a simple circular_layout function.
+    """


It's fine for this operation to live here for now. But if we end up with a bunch more operations, they should move to that module.

jlstevens · 2017-09-20T20:24:05Z

holoviews/operation/datashader.py

+    return [df for df in np.split(path_df, splits) if len(df) > 1]
+
+
+class bundle_graph(Operation, hammer_bundle):


Another graph operation ... but this time depending on datashader. Not sure where these should go...

jlstevens · 2017-09-20T20:25:22Z

holoviews/plotting/bokeh/graphs.py

+    inspection_policy = param.ObjectSelector(default='nodes', objects=['edges', 'nodes', None], doc="""
+        Determines policy for inspection of graph components, i.e. whether to highlight
+        nodes or edges when hovering over connected edges and nodes respectively.""")
+


Shorter names would be nice. Maybe select_by and inspect_by?

Or select_mode and hover_mode?

jlstevens · 2017-09-20T20:29:14Z

holoviews/element/graphs.py

+        """
+        Computes the node positions the first time they are requested
+        if no explicit node information was supplied.
+        """


I still think this is a bit weird. I won't revert it though but I would like to keep the option to reconsider later...

jlstevens · 2017-09-20T20:35:46Z

The tests are now green and I'm very happy with the API overall. There are three things I want us to follow up on:

Some of the new comments I have just made (see above).
The example with edge thicknesses representing weight.
Dedicated Nodes and EdgePaths element notebooks as these are elements too.

I don't think these issues should hold up this PR any longer as any PR adding a major new feature is going to need tweaks. I am happy to merge now so we can get testing and improving our new graph support!

jonberthet

lgtm

github-actions · 2024-10-24T09:31:50Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

philippjfr added the type: feature A major new feature label Aug 28, 2017

philippjfr force-pushed the graph_support branch 3 times, most recently from 6069bda to 4998b64 Compare September 14, 2017 22:36

jbednar reviewed Sep 15, 2017

View reviewed changes

philippjfr force-pushed the graph_support branch 2 times, most recently from b2ffc03 to 181f6d6 Compare September 18, 2017 12:07

philippjfr added 12 commits September 20, 2017 11:48

Added initial Graph support

20380b3

Small fix for style group handling

f3d0bae

Make graph imports conditional

4ee5c1d

Small bugfix for bokeh property handling

95bd7d3

Fix bug when computing NodePaths

8196ed6

Added control over Graph inspection/selection policy

4131560

Added bokeh version check to Graph support

1401523

Added matplotlib GraphPlot implementation

71cb107

Defined default styling for graphs

9ab946c

Added graph hover info support

9b53989

Fix handling for empty HoverTool

5fcfa8f

Fixed Graph nodepaths method

d10bc08

philippjfr added 9 commits September 20, 2017 11:48

Fixed Graph without hover/select policy

92266ef

Added Graph plot tests

b91f805

Add selenium to travis requirements

1174c8c

Added Graph redim test

19e7584

Added bokeh Graph colormapping test

8ae63db

Small fix for network graph user guide

fcd8ff1

Updated bokeh test for v0.12.9

e7155d9

Added Graph element and demo notebooks

8fd1566

Small update for Network Graphs user guide

5e646d6

philippjfr force-pushed the graph_support branch from e6a329f to 5e646d6 Compare September 20, 2017 11:33

Compute Graph Nodes lazily if none supplied

0ec07e0

philippjfr force-pushed the graph_support branch from 06765b7 to 0ec07e0 Compare September 20, 2017 18:00

jlstevens reviewed Sep 20, 2017

View reviewed changes

jlstevens merged commit ab67026 into master Sep 20, 2017

philippjfr deleted the graph_support branch September 28, 2017 20:11

pyup-bot mentioned this pull request Nov 3, 2017

Update holoviews to 1.9.0 zenotech/zCFDSuperBuild#104

Closed

pyup-bot mentioned this pull request Nov 13, 2017

Update holoviews to 1.9.1 zenotech/zCFDSuperBuild#120

Closed

pyup-bot mentioned this pull request Dec 12, 2017

Update holoviews to 1.9.2 zenotech/zCFDSuperBuild#139

Merged

reckoner mentioned this pull request Mar 12, 2019

Hover on Graph nodes *and* edges? #3555

Open

jonberthet reviewed Feb 26, 2020

View reviewed changes

github-actions bot locked as resolved and limited conversation to collaborators Oct 24, 2024

		group = param.String(default='Nodes', constant=True)


		class EdgePaths(Path):

		return [df for df in np.split(path_df, splits) if len(df) > 1]


		class bundle_graph(Operation, hammer_bundle):

Graph support #1829

Graph support #1829

Conversation

philippjfr commented Aug 28, 2017 • edited by jbednar Loading

Terminology

Graph - Composite Element

Nodes

NodePaths

Graph Constructors

Graph attributes

ToDo

jbednar commented Sep 1, 2017

jlstevens commented Sep 1, 2017

philippjfr commented Sep 15, 2017

jbednar left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

philippjfr Sep 15, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

philippjfr Sep 15, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jlstevens commented Sep 15, 2017

jbednar commented Sep 15, 2017

philippjfr commented Sep 15, 2017 • edited Loading

jlstevens commented Sep 15, 2017 • edited Loading

philippjfr commented Sep 15, 2017

jlstevens commented Sep 15, 2017 • edited Loading

philippjfr commented Sep 15, 2017

jlstevens commented Sep 15, 2017

philippjfr commented Sep 20, 2017

philippjfr commented Aug 28, 2017 •

edited by jbednar

Loading

philippjfr Sep 15, 2017 •

edited

Loading

philippjfr Sep 15, 2017 •

edited

Loading

philippjfr commented Sep 15, 2017 •

edited

Loading

jlstevens commented Sep 15, 2017 •

edited

Loading

jlstevens commented Sep 15, 2017 •

edited

Loading