-
-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update sankey algorithm to d3-sankey-v0.12.3 #4707
Conversation
Not ready for review, needs some fixes. |
This is looking great! Can you describe the remaining issues? You mentioned that it doesn't actually seem to have an effect or did I misunderstand? |
The issue: my implementation does not work yet :) Need some time to find and fix bug(s). Now, an unexpected part. I maintained a temporary flag that switches between the master and the new implementations: improved = param.Boolean(default=False, doc="""Use the improved version.""") When the job was "completed" I simply removed the flag and the master implementation and created the pr. The reason that made me think that my implementation works (and is equivalent to the master implementation) is that in JupyterLab after one enters an already computed plot variable as the last line of a cell, the computation happens for the second time. And for reasons I don't understand the flag somehow switches to the default value: # sankey.py
...
if self.p.improved:
print('improved')
<improved implementation>
else:
print('old')
<old implementation>
# JupyterLab cell BEGIN
plot = hv.Sankey(..., improved=True)
plot
# JupyterLab cell END
# prints:
# improved
# old (what?) |
@philippjfr Hi! The new implementation is ready (sorry for such a long pause), see some screenshots at the end of this comment. It may be a good idea to remind users about the The only thing I am not sure about is tests. I fixed them as well, so the following commands succeed when run locally: flake8 holoviews
python -m unittest holoviews/tests/element/testgraphelement.py
python -m unittest holoviews/tests/plotting/bokeh/testsankey.py However, they fail in CI and from logs it looks like my old implementation is tested against new tests or my new implementation is tested against the old tests. Do you have any ideas, what could go wrong? ExamplesI took examples from here and here. Within every two pictures the first one is what current implementation does and the second one is the result of the new implementation. UPD: although the first example is ok compared to the current implementation, it is obvious that it can be improved by moving the green and purple nodes to the upper part of the plot and swapping brown connections so that they don't intersect. Given that such a complex plot as in the gallery example is rendered "correctly", I don't think this is a bug of my implementation, but it can be a room for further improvements. |
@Yura52 Thanks for your answer. My understanding was that the following is a result of your new implementation: Since the vertical alignment had changed in comparison to the current implementation, I was wondering whether it was possible to optimize it in this PR. You already mentioned the possible improvements yourself: "moving the green and purple nodes to the upper part of the plot and swapping brown connections so that they don't intersect." I agree. |
Yes, it is the result of the new implementation.
Do you think this change is an improvement compared to the current implementation? If it is, we should probably start with this incremental change which is very clear of what it does (synchronizes the implementation with that of d3-sankey). After that we could move further and design some custom algorithm on top of the new implementation. However, if the new plots do not look better, please, share your feedback on that, it will also be very helpful. |
Alright, there you go :-)
Is it straightforward for you to test the same examples with d3-sankey, to see if the implementations yield the same results? If we can establish that the results are basically the same, then any improvements (e.g. regarding overlap) should maybe first be addressed in d3-sankey (?) |
Thanks a lot! Sorry for not getting back, still don't have time to finish the PR.
It turned out to be easy, because I was lucky to find this (I am not sure how to use that tool, I run cells one by one from bottom to top and it feels strange 😄 ): https://observablehq.com/@d3/sankey-diagram You can past the following content to the cell where "data" is defined to test the "PhD" example:
The result looks good, so I think there is a bug somewhere in my implementation: |
@philippjfr Seems like some progress happened: I fixed one bug and now we have a clear improvement for one of the reference examples (the second example in the following list). Below I attach four plots in the following order:
As for 3 and 4, you can also compare the results with this example of what d3-sankey yields for the same input. I can see some difference in terms of links ordering, however, it is not trivial to compare because of the difference in colors. |
I tested your latest changes on some of my sankeys and it is a great improvement! Many thanks to you! |
This is fabulous, thanks @Yura52! |
Thanks for the positive feedback! The current state is as follows:
It is easier for me to get better Energy plots with the first version. It makes me think there are still differences somewhere between my implementation and d3-sankey (or the Energy data is not a good test case). We can stop at this point and merge the version we like more. Another option is to continue searching for bugs (unfortunately, I cannot guarantee that (1) there are any (2) I will find them in the nearest future). What do you think? |
I did a comparison of a961ab6 and 75726bc and I cannot decide what I prefer. Here are html pages for tests with both versions and the source data as xlsx:
So my example "D" is some kind of outlier here, which gets messed up with the latest changes. All in all, I still prefer the content of this PR over the release version. |
@philippjfr Hi! I have noticed that the PR is still opened. I have just reviewed all the examples provided by @jnettels (thanks!) and I agree that merging the PR makes sense. What do you think? |
Apologies @Yura52, I agree this is an improvement so I'll go ahead and merge now. Really appreciate your efforts and have to apologize this slipped through the cracks. |
@philippjfr no problem! |
Can you open an issue? |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Closes #3548