Antialiased lines #1048
Conversation
@ianthomas23 just following up on the request for help xref: https://numba.discourse.group/t/numba-and-datashader/1221 So as to be able to help with this, please could you explain in more detail what the problem is with regards to the following (as I assume this is the issue?):
Some guiding questions:
Many thanks!
Thanks @stuartarchibald.
Yes. I think that this might be the key: if I leave the …
Poor performance. (I can get failure to compile but only when trying to use cupy/cudf so I have shelved that for the moment).
I suspect not. I don't think there are actually any bugs/problems with numba or toolz here; it is just the combination of my usage and datashader's. If I could start from scratch I think all would be fine, but I am constrained to work within datashader. Essentially I have done the bit I am good at (the antialiasing) and now I am inefficiently struggling with the datashader codebase, and I was hoping that someone with some complementary skills to mine could give me some guidance on what direction to go in. I know that is a big ask, but I have to ask anyway.
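For background, the antialiasing mentioned above reduces to computing a per-pixel coverage value that falls off with distance from the line centre. A generic sketch of this idea (illustrative only, not this PR's actual implementation; `coverage` and its linear falloff are assumptions):

```python
# Generic distance-based coverage for an antialiased line (illustrative only).
# A pixel's value falls off linearly with its perpendicular distance from the
# line centre, over one extra pixel of antialiased edge.
def coverage(distance, linewidth):
    half = 0.5 * (linewidth + 1.0)  # half-width plus 1px of falloff
    return max(0.0, min(1.0, half - distance))

assert coverage(0.0, 1.0) == 1.0  # on the centre line: fully covered
assert coverage(1.0, 1.0) == 0.0  # one pixel away from a 1px line: uncovered
```

Because coverage values are fractional, overlapping segments are naturally combined with a floating-point `max`, which is why the aggs must be floating-point even for `bool`/`uint32` reductions.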
One thing that you should check, as it probably would not come up in a stand-alone minimal reproducer, is whether the functions are being re-compiled several times, which would lead to poor performance.
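The recompilation trap can be illustrated without numba (a sketch; `compile_kernel` stands in for numba's expensive compile step, and the names are hypothetical): if the functions-that-generate-functions are called with a fresh key or closure each time, a memoizing cache misses on every call and the "compile" runs again.

```python
from functools import lru_cache

compile_count = 0

@lru_cache(maxsize=None)
def compile_kernel(key):
    # stands in for numba's expensive compile step
    global compile_count
    compile_count += 1
    def kernel(x):
        return x + 1
    return kernel

for _ in range(5):
    compile_kernel("count")      # stable key: compiled once, then cached
assert compile_count == 1

class FreshKey:                  # a new object per call, hashed by identity
    pass

for _ in range(5):
    compile_kernel(FreshKey())   # cache miss every call: "recompiled" each time
assert compile_count == 6
```

The analogous numba symptom is the same jitted function gaining a new compiled specialization on every invocation.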
It looks like the underlying issue is that making the …
There is significant progress here and the performance problems are resolved. The approach, following that suggested by @jbednar, was to change the code at the lower levels that uses …. This works for both ….
I have modified the existing antialiased line tests in line with the new functionality. You can see that the output is much improved by looking at the changes to the PNG files in the …. I have also added a warning when antialiased lines are attempted using a CUDA-backed data source, e.g. cudf, dask-cudf. This is not yet supported; it could be, but is beyond the scope of this PR.
For illustration here's a video of this in action recorded by @ianthomas23: Screencast.2022-03-16.10_25_57.mp4
Apart from that, I'm happy to merge!
This needs more discussion. The transform from data to canvas coordinates doesn't use

```
xx = (x_mapper(x)*sx + tx)*(width-1)/width
```

… This information is not available at the point in the code where it is needed. We have the following options: …
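The effect of the quoted transform can be sketched numerically (assumptions: `x_mapper` is the identity for a linear axis, and `sx`/`tx` scale and translate the data range onto `[0, 1]`; these names are taken from the formula above, the rest is illustrative):

```python
# Sketch of the quoted snapped transform, scaled up to pixel units.
width = 10
sx, tx = 1.0, 0.0

def x_mapper(x):
    return x  # linear axis

def to_pixel(x):
    xx = (x_mapper(x) * sx + tx) * (width - 1) / width
    return xx * width

assert abs(to_pixel(1.0) - 9.0) < 1e-9  # data maximum lands on the last pixel index
assert to_pixel(0.0) == 0.0             # data minimum lands on the first pixel index
```

The `(width-1)/width` factor is what maps the data maximum onto the last pixel *index* (`width-1`) rather than the right canvas edge, which is the behaviour under discussion for unsnapped antialiased lines.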
This is now ready to be merged. I was wrong about the transform from data to pixel coordinates for unsnapped antialiased lines and @jbednar was right. It just needed a half-pixel offset. My error was in attempting to replicate the results for the simple test case of horizontal and vertical lines at the canvas boundaries.

In the (unsnapped) antialiased case the lines sit astride the boundaries, so only half of the linewidth is within the canvas. In the snapped non-antialiased case the lines are snapped to the nearest pixels, which are the perimeter pixels of the canvas. The snapping is asymmetric in that the lines at the lower bounds are snapped in the positive x or y direction whereas lines at the upper bounds are snapped in the negative x or y direction. The rectangle formed from the centres of the 4 lines is therefore one pixel smaller in both directions in the non-antialiased case than the antialiased one. If you tile adjacent canvases then the non-antialiased case renders the boundary lines twice, once per tile, whereas the antialiased case renders more correctly in that half of each boundary line is rendered in each tile.

Although this now feels really obvious, it evidently wasn't originally, so I will post some example images here to help with understanding in the future. Test code has lines around the perimeter of the canvas, both diagonals and another line half-way up the canvas:

```python
import pandas as pd
import numpy as np
import datashader as ds
import datashader.transfer_functions as tf

s = 1.23          # start
e = s + 4.92      # end
m = 0.5*(s + e)   # midway

df = pd.DataFrame(
    dict(x=[s, e, e, s, s, e, np.nan, s, e, np.nan, s, e],
         y=[s, s, e, e, s, e, np.nan, e, s, np.nan, m, m]),
    dtype=np.float64,
)

for w in (30, 29):
    cvs = ds.Canvas(plot_width=w, plot_height=w)
    for i, linewidth in enumerate([0, 1, 2]):
        agg = cvs.line(source=df, x="x", y="y", agg=ds.count(), linewidth=linewidth)
        im = tf.shade(agg, how='linear', span=(0, 1), cmap=["white", "darkblue"])
        ds.utils.export_image(im, f"antialias_simple_{w}_{i}", background="white")
```

For a canvas size of an even number of pixels (30 here) the magnified results for non-antialiased, …
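The snapping asymmetry described above can also be shown numerically (a sketch; `snap` is a hypothetical helper, not datashader's actual code):

```python
import math

# Hypothetical nearest-pixel snap, clamped to the canvas.
def snap(coord, n_pixels):
    return min(max(math.floor(coord + 0.5), 0), n_pixels - 1)

n = 30
assert snap(-0.5, n) == 0         # lower-boundary line snaps in the +ve direction
assert snap(n - 0.5, n) == n - 1  # upper-boundary line snaps in the -ve direction
# The snapped rectangle therefore spans n-1 pixels: one pixel smaller than the
# antialiased rectangle, whose lines sit astride the canvas boundaries.
```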
Fabulous! The examples are convincing, and for each of the test cases that differ, I believe the new behavior is significantly better. Ready to merge when the tests pass after applying these minor changes.
This is a work in progress to add antialiased lines to `datashader`. It is making good progress, but I need help/advice on how to proceed with the `numba`/`toolz` decorators and the nested functions that create and return other functions.

There are 2 commits. The first adds the low-level antialiasing code but only works in certain situations using `pandas` dataframes for the input data. This is fast. The second commit restructures the code in a better way and supports `dask` dataframes, but the performance is appalling.

There is a gist (https://gist.github.com/ianthomas23/898d0d54e35abace6f9be3efcef4a8c5) that can be used to show the performance difference. You will need to use this branch of course, and change lines 216-7 of `datashader/core.py` from … to … so that you get antialiased lines by default without having to modify `holoviz` code.

The key difference between the `master` branch and this one is that the antialiasing has to occur on floating-point aggs, even if you are using an `any` or `count` reduction, which are usually `bool` and `uint32`. Drawing antialiased lines now uses a two-stage agg process. The first stage is run once for each line or group of lines and uses a floating-point `max` to deal with the antialiasing correctly, then these are combined in a reduction-specific manner to give the returned floating-point agg.

The key architectural change is therefore that the `dtype` of a `Reduction` is no longer fixed but is determined at runtime based on whether antialiasing is turned on or not. In the first commit this is botched to get it working, but it has to be done correctly in the second commit so that, for example, the correct `dask` combine functions are used. The `dtype` is now an attribute of each `Reduction` object rather than being static. The ramification of this is that I have had to turn off some of the cunning/magic decorators of the nested functions-that-return-functions, which loses the numba performance benefits.

I suspect that the first place to look is the one-line change in `compiler.py`, commenting out the `@memoize`: … so I am next inclined to add an `antialiased` flag as an argument here and pass it down through the hierarchy of functions-that-generate-functions. But it would be good to get the opinion of someone who is more familiar with these decorators.
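One possible shape for that direction can be sketched as follows (the names `make_append` and the dtype table are illustrative assumptions, not datashader's actual API): include the `antialiased` flag in the memoize key so each (reduction, flag) variant is generated, and would be numba-compiled, exactly once.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def make_append(reduction, antialiased):
    # Antialiasing forces a floating-point agg regardless of the reduction's
    # usual dtype, mirroring the runtime-determined Reduction dtype in the PR.
    dtype = "float64" if antialiased else {"count": "uint32", "any": "bool"}[reduction]

    def append(agg, value):
        # First-stage antialiased agg uses a float max; the snapped path keeps
        # the reduction's own combine (count shown here).
        return max(agg, value) if antialiased else agg + 1

    append.dtype = dtype
    return append

# The same (reduction, flag) pair returns the cached function object, so a
# numba-compiled version would not be recompiled on repeated calls.
assert make_append("count", True) is make_append("count", True)
assert make_append("count", False).dtype == "uint32"
assert make_append("count", True).dtype == "float64"
```

The point of the sketch is only that the cache key must carry everything the generated function's behaviour depends on; with the flag in the key, the `@memoize` in `compiler.py` could stay enabled.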