Low level optimizations for contiguous sequences #103

DeveloperPaul123 · 2023-07-25T21:33:48Z

Description

Adds low level optimizations to address #57

Optimizations

std::memset with fill()
std::memcmp with equal()
std::memcmp with compare()
std::memchr with find()
~~std::memmem with search()~~

DeveloperPaul123 · 2023-07-25T21:34:29Z

@tcbrindle I filed this PR to see if I'm on the right track (any feedback much appreciated) and also to make it known that work is in progress on that particular issue.

tcbrindle · 2023-07-25T21:59:31Z

Thanks for working on this! It looks like there's a syntax error which is tripping up the non-Windows builds, but the general approach looks good to me 👍

codecov · 2023-08-01T16:06:38Z

Codecov Report

Patch coverage: 94.59% and project coverage change: +0.12% 🎉

Comparison is base (051dce9) 97.58% compared to head (96d201f) 97.70%.
Report is 30 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #103      +/-   ##
==========================================
+ Coverage   97.58%   97.70%   +0.12%     
==========================================
  Files          66       67       +1     
  Lines        2276     2398     +122     
==========================================
+ Hits         2221     2343     +122     
  Misses         55       55

| Files Changed | Coverage Δ | |
| --- | --- | --- |
| include/flux/core/utils.hpp | `100.00% <ø> (ø)` | |
| include/flux/op/split_string.hpp | `100.00% <ø> (ø)` | |
| include/flux/op/fill.hpp | `94.11% <91.66%> (-5.89%)` | ⬇️ |
| include/flux/op/find.hpp | `96.15% <93.33%> (-3.85%)` | ⬇️ |
| include/flux/op/equal.hpp | `92.00% <94.11%> (+10.18%)` | ⬆️ |
| include/flux/op/compare.hpp | `97.22% <96.15%> (-2.78%)` | ⬇️ |
| include/flux/op/output_to.hpp | `94.44% <100.00%> (+1.11%)` | ⬆️ |

... and 9 files with indirect coverage changes


include/flux/core/utils.hpp

DeveloperPaul123 · 2023-08-03T17:55:34Z

I've noticed that with the any_of move, split() seems to have broken? I'm getting an exception with this code in test_split.cpp

{
    auto sv = "the quick brown fox"sv;

    auto split = flux::split(flux::ref(sv), ' ');

    using S = decltype(split);

    static_assert(flux::multipass_sequence<S>);
    static_assert(flux::contiguous_sequence<flux::element_t<S>>);

    static_assert(flux::multipass_sequence<S const>);
    static_assert(flux::contiguous_sequence<flux::element_t<S const>>);

    STATIC_CHECK(check_equal(std::move(split).map(to_string_view),
            std::array{"the"sv, "quick"sv, "brown"sv, "fox"sv}));
}

DeveloperPaul123 · 2023-08-03T18:51:57Z

Just looking for an initial round of feedback; I'll have to address the build issues.

tcbrindle

There are quite a few comments, but several of them are just minor style things.

This is looking in pretty good shape; we just need to make sure that the conditions for using the specialisations are correct in each case. I'd rather err on the side of caution -- it's better to use a potentially slower code path than give the wrong answer or (worse) end up with UB.
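One concrete reason for that caution, offered as an illustration rather than something raised in this thread: byte-wise comparison does not agree with `==` for every type. The sketch below assumes IEEE-754 floats; `same_bytes` and `check_float_pitfalls` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// Hypothetical helper: compares the object representations of two floats.
inline bool same_bytes(float a, float b)
{
    return std::memcmp(&a, &b, sizeof(float)) == 0;
}

// Shows two cases where value equality and byte equality disagree
// (assuming IEEE-754 floating point).
inline void check_float_pitfalls()
{
    // +0.0f and -0.0f compare equal but differ in the sign bit.
    assert(0.0f == -0.0f);
    assert(!same_bytes(0.0f, -0.0f));

    // A NaN never compares equal, even to a bit-identical copy of itself.
    float const nan = std::numeric_limits<float>::quiet_NaN();
    float const copy = nan;
    assert(nan != copy);
    assert(same_bytes(nan, copy));
}
```

A `std::memcmp`-based `equal()` would therefore give the wrong answer for floats in both directions, which is why restricting the fast path to types where the two notions coincide is the safer choice.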

There do seem to be a lot of unnecessary, unrelated formatting changes though -- I guess from clang-format messing things up? I started marking them but gave up as there are quite a few... Please put these lines back to how they were originally so that the diff only includes meaningful changes.

include/flux/core/utils.hpp

include/flux/op/compare.hpp

include/flux/op/fill.hpp

include/flux/op/find.hpp

tcbrindle · 2023-08-04T14:48:19Z

I've noticed that with the any_of move, split() seems to have broken? I'm getting an exception with this code in test_split.cpp

{
    auto sv = "the quick brown fox"sv;

    auto split = flux::split(flux::ref(sv), ' ');

    using S = decltype(split);

    static_assert(flux::multipass_sequence<S>);
    static_assert(flux::contiguous_sequence<flux::element_t<S>>);

    static_assert(flux::multipass_sequence<S const>);
    static_assert(flux::contiguous_sequence<flux::element_t<S const>>);

    STATIC_CHECK(check_equal(std::move(split).map(to_string_view),
            std::array{"the"sv, "quick"sv, "brown"sv, "fox"sv}));
}

This is caused by the new specialisation of find (which split uses) returning the wrong thing in some cases. See the review comments above (the any_of thing is just a coincidence).
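The thread does not spell out which cases returned the wrong thing, so the following is only a sketch of the details a `std::memchr`-based search has to get right: converting the returned pointer back to an index, and choosing a result for "not found" and for empty input. Returning the size for "not found" is an assumption about the convention, and `find_byte_sketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: index of the first occurrence of `needle`,
// or haystack.size() if it does not occur.
inline std::size_t find_byte_sketch(std::string_view haystack, char needle)
{
    // Avoid handing a possibly-null pointer to memchr for empty input.
    if (haystack.empty()) {
        return 0;
    }

    void const* hit = std::memchr(haystack.data(),
                                  static_cast<unsigned char>(needle),
                                  haystack.size());
    if (hit == nullptr) {
        return haystack.size(); // "not found" maps to the end position
    }

    // memchr returns a pointer into the buffer; convert it to an offset.
    auto const idx = static_cast<std::size_t>(
        static_cast<char const*>(hit) - haystack.data());
    assert(idx < haystack.size());
    return idx;
}
```

With this convention, a splitting algorithm built on top can treat a result equal to the size as "no more delimiters" without a separate flag.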


DeveloperPaul123 · 2023-08-10T12:53:19Z

I believe I've addressed most, if not all of your comments. Unit tests are also all passing for me. Please take another look when you have a moment. Thanks!

tcbrindle

This is looking really good! There's only one change still outstanding, namely the definition of can_memset in fill.hpp

include/flux/op/fill.hpp

DeveloperPaul123 · 2023-08-10T14:10:51Z

Whoops! I missed that, thanks!

tcbrindle · 2023-08-14T11:10:12Z

How would you like to handle the different cases in compare()? What would make the most sense if both sequences are empty? Or if only one sequence is empty?

For compare(), I think we get the right behaviour if we do something like:

auto const seq1_size = flux::usize(seq1);
auto const seq2_size = flux::usize(seq2);
auto min_size = std::min(seq1_size, seq2_size);

int cmp_result = 0;

if (min_size > 0) {
    auto const data1 = flux::data(seq1);
    FLUX_ASSERT(data1 != nullptr);
    auto const data2 = flux::data(seq2);
    FLUX_ASSERT(data2 != nullptr);

    cmp_result = std::memcmp(data1, data2, min_size);
}

That is, we only call the C function if the number of bytes is greater than zero, also making sure that the data pointers are not nullptr.

This is getting messier and messier, sorry!

DeveloperPaul123 · 2023-08-14T16:38:57Z

Ok I think this is ready now and all tests pass for me!

tcbrindle

We can simplify the test in equal(), because we already know that both sequences are the same size. Other than that, just a couple of tiny code style things.

CodeCov is complaining that the new "size == 0" code-paths aren't being tested, so it's probably worth adding a couple of little tests to cover this -- but I'm happy to do it if you've had enough of this PR already!

Also, it looks like the new compare() specialisation isn't actually reached by any of the existing tests... This definitely needs rectifying before we can commit this, but again, I'm happy to add some unsigned char compare() tests if you don't fancy it.

include/flux/op/equal.hpp

include/flux/op/compare.hpp

include/flux/op/fill.hpp

include/flux/op/find.hpp

include/flux/op/output_to.hpp

single_include/flux.hpp


DeveloperPaul123 · 2023-08-15T15:51:44Z

I was able to add some tests, but I'm not sure if I covered everything you had in mind. I also addressed all (I think) of your feedback from the latest review. Sorry for the silly oversights at times and thanks for all the feedback!

DeveloperPaul123 · 2023-08-15T16:05:34Z

Looks like I didn't cover all the cases for compare(). Some help with this would be appreciated.

tcbrindle

All the actual implementation code looks great now!

compare() is still missing a couple of test cases, I've made some suggestions below.

CodeCov is also complaining that we're not testing the case of two zero-length contiguous sequences in equal(). Something like

std::array<int, 0> arr;
STATIC_CHECK(flux::equal(arr, arr) == true);

in test_equal.cpp should keep it happy I think?

test/test_compare.cpp

DeveloperPaul123 · 2023-08-15T16:40:55Z

Done!

tcbrindle · 2023-08-15T16:56:30Z

It's done!! 🎉

Thanks so much!

DeveloperPaul123 mentioned this pull request Jul 25, 2023

Use low-level contiguous_sequence optimisations where appropriate #57

Closed

5 tasks

DeveloperPaul123 force-pushed the feature/low-level-optimizations branch from 6a3c81e to 2a1a5f3 Compare August 1, 2023 16:00

DeveloperPaul123 force-pushed the feature/low-level-optimizations branch from 2a1a5f3 to f1848c9 Compare August 3, 2023 13:56

tcbrindle reviewed Aug 3, 2023


include/flux/core/utils.hpp (outdated, resolved)

DeveloperPaul123 marked this pull request as ready for review August 3, 2023 18:51

DeveloperPaul123 requested a review from tcbrindle August 3, 2023 18:51

tcbrindle requested changes Aug 4, 2023


DeveloperPaul123 added 7 commits August 9, 2023 17:50

Use memset with single byte values for fill()

595e9ca

Fill now uses `std::memset` for single byte values when not constant evaluated. Also added a unit test for this specific case.

Use std::memcmp with equal()

8c06f13

Add std::memchr() optimization for find()

0490d95

Also moved `any_of` concept to `utils`

Add std::memcmp() optimization for compare()

72c5d91

Fix typo

da24411

Fix typo and rules for memcmp in equal()

f5f6e2b

Minor tweaks to compare()

77e9c39

DeveloperPaul123 force-pushed the feature/low-level-optimizations branch from 9e66c2d to 77e9c39 Compare August 10, 2023 04:45

DeveloperPaul123 added 2 commits August 10, 2023 08:51

Mark value as const in fill()

fae4219

Minor fix with find()

a5e7db6

DeveloperPaul123 requested a review from tcbrindle August 10, 2023 12:53

tcbrindle requested changes Aug 10, 2023


include/flux/op/fill.hpp (outdated, resolved)

DeveloperPaul123 requested a review from tcbrindle August 10, 2023 14:10

Small update/fix to optimization check in fill()

7d6301f

DeveloperPaul123 force-pushed the feature/low-level-optimizations branch from f7cc097 to 7d6301f Compare August 10, 2023 15:09

DeveloperPaul123 added 3 commits August 14, 2023 07:20

Style fixes

95d87f8

Add size check to output_to()

7c978dc

Add proper size check for compare()

be189ff

DeveloperPaul123 force-pushed the feature/low-level-optimizations branch from 4016f3a to be189ff Compare August 14, 2023 16:38

DeveloperPaul123 requested a review from tcbrindle August 14, 2023 16:39

tcbrindle requested changes Aug 14, 2023


DeveloperPaul123 added 10 commits August 15, 2023 00:17

Fix logic in equal()

16b1253

Made a silly blunder.

Style fix

83683bd

Minor tweaks in fill()

684b244

Minor tweaks to find()

9334df6

Minor tweaks to ouput_to()

59fdb48

Remove change from auto-generated file

4e93710

Add new tests for compare()

daa7ba6

Add fill() test for empty sequences

5c3eede

Add find() test for empty sequences

bb8e103

Add output_to() test for empty sequences

6c35b2d

DeveloperPaul123 requested a review from tcbrindle August 15, 2023 15:51

tcbrindle requested changes Aug 15, 2023


test/test_compare.cpp (outdated, resolved)

test/test_compare.cpp (resolved)

DeveloperPaul123 added 2 commits August 15, 2023 12:30

Update compare() tests

2d21cfa

Add simple empty == empty test for equal()

96d201f

DeveloperPaul123 requested a review from tcbrindle August 15, 2023 16:41

tcbrindle approved these changes Aug 15, 2023


tcbrindle merged commit 29838f3 into tcbrindle:main Aug 16, 2023
24 of 25 checks passed

DeveloperPaul123 deleted the feature/low-level-optimizations branch August 16, 2023 13:14
