Make value type generic and independent from coordinate type #14

netthier · 2024-04-15T10:55:15Z

This PR adds a new trait called ContourValue (though I'm not too happy with that name and would love better suggestions), which is defined as

pub trait ContourValue: PartialOrd + Copy + Num + NumCast {}
impl<T> ContourValue for T where T: PartialOrd + Copy + Num + NumCast {}

All method signatures that took some type of values before are now rewritten to something such as

pub fn contours<V: ContourValue>(&self, values: &[V], thresholds: &[V]) -> Result<Vec<Contour<V>>>

This allows for substantial memory and performance improvements in cases where the additional precision is not needed.
The PR is currently missing tests, however, and it includes the changes from #13, so that one should be merged first.
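
For readers unfamiliar with the pattern, the blanket impl above is what lets any suitable numeric type act as a value type without an explicit impl. Below is a minimal std-only sketch of the same idea. `Value` and `count_at_or_above` are hypothetical names used only for illustration, and the bounds are a simplified stand-in for the `Num + NumCast` bounds (from `num-traits`) that the PR actually uses.

```rust
/// Hypothetical stand-in for the PR's trait: a marker trait plus a blanket
/// impl, acting as a "trait alias". The PR's real bounds also include
/// `Num + NumCast`, omitted here so the sketch needs no external crate.
pub trait Value: PartialOrd + Copy {}
impl<T> Value for T where T: PartialOrd + Copy {}

/// Counts the grid cells whose value is at or above `threshold`.
/// Works for any type covered by the blanket impl (u8, i32, f32, f64, ...).
pub fn count_at_or_above<V: Value>(values: &[V], threshold: V) -> usize {
    values.iter().filter(|&&v| v >= threshold).count()
}
```

With this shape, a caller can pass `&[u8]` or `&[f32]` grids to the same function, which is the flexibility the new signature is after.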

src/contourbuilder.rs

netthier · 2024-04-15T11:01:37Z

The geojson feature is also currently not working, because V needs to be converted to a serde_json::Value in those functions. Adding a Serialize bound shouldn't be an issue, but would make those functions fallible, adding another breaking change 🤔

mthh · 2024-04-19T12:56:15Z

Thanks for your PR and sorry it took me a while to reply.

Indeed, making value type generic and independent from coordinate type is a great thing 👍
(and it was probably a debatable design choice to have linked the numerical type of coordinates and the numerical type of values in the input grid as we saw in issue #12).

Adding a Serialize bound shouldn't be an issue, but would make those functions fallible, adding another breaking change

If this is the only way to keep the geojson feature working, we might as well take advantage of the breaking changes this PR is already making to make this one too. What do you think?

a new trait called ContourValue (though I'm not too happy with that name and would love better suggestions)

I don't have any particular problems with ContourValue. From a user point of view, we could also talk about GridValue (since it designates the type of input values) or simply Value (since this library has a single, fairly clear purpose, that of making contours from a grid of values, there's not necessarily too much confusion possible).

mthh · 2024-04-25T15:28:40Z

Hi @netthier!
Is there anything I can do to help finalise this PR?
Would you like me to continue the work you have started?

netthier · 2024-04-25T15:35:39Z

If this is the only way to keep the geojson feature working, we might as well take advantage of the breaking changes this PR is already making to make this one too. What do you think?

I'll do so then. Who knows what kind of numbers people might put in there, maybe some can indeed fail serialization.

I don't have any particular problems with ContourValue. From a user point of view, we could also talk about GridValue (since it designates the type of input values) or simply Value (since this library has a single, fairly clear purpose, that of making contours from a grid of values, there's not necessarily too much confusion possible).

I prefer GridValue over ContourValue. Value feels a bit too generic tbh.

I'll try to finalize this PR tomorrow, but I'm still unsure about the part of the code I commented on in smooth_linear. I believe that the cast should be infallible in normal usage, as NumCast will just return +Inf or -Inf if the input is outside the float's range. But it is conceivable that someone could implement NumCast for their own number type and have it fail for whatever reason.
Maybe it makes sense to create a new error kind and have smooth_linear return a Result? Most other methods seem to already do so, so it wouldn't be that extreme of a change.
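
A rough sketch of what that could look like, assuming a fallible conversion and a dedicated error kind. Everything here is hypothetical and std-only: `CastError`, `ToFloat`, `Opaque` and `crossing_fraction` are illustration names, `ToFloat` stands in for `NumCast`, and the function is not the library's actual smooth_linear.

```rust
use std::fmt;

/// Hypothetical error kind for a failed value-to-float conversion.
#[derive(Debug, PartialEq)]
pub enum CastError {
    Unrepresentable,
}

impl fmt::Display for CastError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "grid value could not be converted to a float")
    }
}

impl std::error::Error for CastError {}

/// Hypothetical std-only stand-in for `NumCast`: a conversion that may fail.
pub trait ToFloat: Copy {
    fn to_float(self) -> Option<f64>;
}

impl ToFloat for u8 {
    fn to_float(self) -> Option<f64> {
        Some(f64::from(self))
    }
}

/// A user-defined type whose conversion always fails, to exercise the error path.
#[derive(Clone, Copy)]
pub struct Opaque;

impl ToFloat for Opaque {
    fn to_float(self) -> Option<f64> {
        None
    }
}

/// Fraction of the way from `v0` to `v1` at which `threshold` is crossed.
/// Every operand is converted to a float before any arithmetic happens.
/// Assumes `v0 != v1`; equal values yield NaN or an infinity.
pub fn crossing_fraction<V: ToFloat>(threshold: V, v0: V, v1: V) -> Result<f64, CastError> {
    let t = threshold.to_float().ok_or(CastError::Unrepresentable)?;
    let a = v0.to_float().ok_or(CastError::Unrepresentable)?;
    let b = v1.to_float().ok_or(CastError::Unrepresentable)?;
    Ok((t - a) / (b - a))
}
```

Since the caller would already return a Result, the `?` operator keeps the propagation cheap to write, and a custom number type that refuses the conversion surfaces as an explicit error rather than a panic.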

mthh · 2024-04-25T15:51:18Z

Who knows what kind of numbers people might put in there, maybe some can indeed fail serialization.

I don't really know either but let's say "better safe than sorry".

I prefer GridValue over ContourValue. Value feels a bit too generic tbh.

Let's go with GridValue 👍

Maybe it makes sense to create a new error kind and have smooth_linear return a Result? Most other methods seem to already do so, so it wouldn't be that extreme of a change.

This seems like a good idea to me, with a new error type to be explicit about the reason for the error if it occurs (and since the smooth_linear caller already returns a Result it won't change the API for the user).

netthier · 2024-04-26T14:45:58Z

This could use some tests tbh, but I'm not sure if I have the time and knowledge required to write good ones, let me know how you want to proceed there 🤔
I also found out that you cannot edit this PR because I made it from an organization repo, while edits by maintainers are apparently only possible on personal repos. I'll have to find out how to do it better in the future.

netthier · 2024-04-26T15:01:09Z

I'm also unsure how my changes to ContourBuilder::smooth_linear affect performance. I added more casts than previously, as I noticed that numbers without negative values might panic at the value - v0 and v1 - v0 parts.
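
To illustrate the underflow concern with a concrete unsigned type: the sketch below uses `u8` and two hypothetical helper functions. `checked_sub` stands in for the plain `-` operator, which panics on underflow in debug builds and wraps in default release builds.

```rust
/// Difference computed in the value type itself. `None` signals the
/// underflow that a plain `v1 - v0` would hit when `v0 > v1`.
pub fn diff_in_value_type(v1: u8, v0: u8) -> Option<u8> {
    v1.checked_sub(v0)
}

/// Difference computed after converting both operands to a float first,
/// which is well defined whichever operand is larger.
pub fn diff_as_float(v1: u8, v0: u8) -> f64 {
    f64::from(v1) - f64::from(v0)
}
```

Casting each operand before subtracting costs extra conversions, which is the possible performance impact mentioned above, but it avoids the failure mode entirely.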

mthh · 2024-04-26T15:33:05Z

Thanks!

This could use some tests tbh, but I'm not sure if I have the time and knowledge required to write good ones

I don't really have an idea of how to make meaningful tests for all this yet, but I'm thinking about it a bit.

I'm also unsure how my changes to ContourBuilder::smooth_linear affect performance. I added more casts than previously, as I noticed that numbers without negative values might panic at the value - v0 and v1 - v0 parts.

I'm going to set up some benchmarks over the weekend to try to measure the difference in performance between 0.13.0 and your PR, and review your PR in the process.

mthh

Thanks for all the changes!

I did some performance testing: extending the lib's current benchmarks to test with larger datasets, in f64 and f32; and on real use cases.

There does seem to be a slight slowdown (sometimes around 5% extra time), but the variability of timings between runs (both in benchmarks with test::Bencher and when used in an application) doesn't allow me to conclude that this slowdown is a problem.

Have you had a chance to test performance on real data sets on your side?

I'm rather inclined to merge it; I think the flexibility that PR provides with regard to the type of input data is a good thing 👍

src/band.rs

src/contour.rs

src/line.rs

src/contourbuilder.rs

netthier · 2024-04-29T08:47:15Z

I guess a slowdown would be expected with a lot more handling of results now. I'm also seeing differences when running cargo bench locally, with the worst difference I've seen being 10%.
The use case I implemented this for was one that didn't require floating point precision, and in those cases performance is far better than with floats, meaning the overhead added is irrelevant.
One thing I noticed is that if smoothing is disabled, a lot of functions are infallible and returning results for no reason.
It should be possible to optimize this more 🤔
And a cast to a float failing seems like such an esoteric thing (I don't think any std types fail this) that adding all of that overhead just for that seems bad. If users upgrade and want to keep using floats, they'd get a slowdown in exchange for having error handling on a bunch of f32->f32 conversions D:
I wonder if using some infallible conversion trait instead of NumCast or straight up panicking would be better.

EDIT: Unwrapping and not propagating results at all isn't much faster, I wonder where all that overhead is coming from
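
One possible shape for the infallible option, sketched with std only. The `Into<f64>` bound, `lerp_position` and `narrow` are assumptions made for illustration, not something the PR implements.

```rust
/// Hypothetical infallible variant: any value type with a lossless
/// conversion to `f64` is accepted, so no `Result` is needed.
/// Assumes `v0 != v1`; equal values yield NaN or an infinity.
pub fn lerp_position<V: Into<f64> + Copy>(threshold: V, v0: V, v1: V) -> f64 {
    let (t, a, b): (f64, f64, f64) = (threshold.into(), v0.into(), v1.into());
    (t - a) / (b - a)
}

/// Narrowing with `as` never fails: finite values outside the `f32`
/// range become an infinity instead of producing an error.
pub fn narrow(x: f64) -> f32 {
    x as f32
}
```

The trade-off is that std only provides `From` conversions into `f64` for types that convert losslessly (for example `u8`, `u32`, `i32` and `f32`, but not `u64` or `i64`), so a bound like this would exclude some value types that a `NumCast` bound accepts.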

netthier · 2024-04-29T09:05:46Z

I'll have to profile this, I'm a bit surprised that even the no_smoothing benchmark is slower when I remove all the result propagation 🤔

mthh · 2024-04-29T09:06:30Z

if users upgrade and want to keep using floats, they'd get a slowdown in exchange for having error handling on a bunch of f32->f32 conversions
I wonder if using some infallible conversion trait instead of NumCast or straight up panicking would be better.

Unwrapping and not propagating results at all isn't much faster, I wonder where all that overhead is coming from

Indeed, there's no rush to merge this, let's investigate a little bit the overhead it adds first.

I'll have to profile this, I'm a bit surprised that even the no_smoothing benchmark is slower when I remove all the result propagation 🤔

Great if you get the chance to do this!

mthh mentioned this pull request Apr 15, 2024

Fix overflow when using large raster sizes #13

Merged

netthier force-pushed the generic-num branch 3 times, most recently from f93227c to 96465c5 Compare April 15, 2024 11:11

netthier force-pushed the generic-num branch 3 times, most recently from 0a062c2 to f645f2c Compare April 26, 2024 14:43

netthier force-pushed the generic-num branch 6 times, most recently from 744fee0 to 46483ea Compare April 26, 2024 15:00

netthier force-pushed the generic-num branch from 46483ea to 7b0b306 Compare April 26, 2024 15:01

mthh requested changes Apr 29, 2024

mthh reviewed Apr 29, 2024

netthier force-pushed the generic-num branch 4 times, most recently from b479461 to aa21a5f Compare April 29, 2024 08:33

netthier and others added 2 commits April 29, 2024 10:59

Fix typo

5995459

Signed-off-by: netthier <[email protected]>

Make ContourBuilder generic over the raster data type

2977db8

Signed-off-by: netthier <[email protected]>

netthier force-pushed the generic-num branch from aa21a5f to 2977db8 Compare April 29, 2024 08:59

netthier mentioned this pull request Apr 29, 2024

[FR] Parallelism support #15

Open

This was referenced Jun 13, 2024

[FR] Sparse representation of input #17

Open

Switch benchmarks to criterion for more accurate profiling #20

Merged