Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Low level optimizations for contiguous sequences #103

Merged

Conversation

DeveloperPaul123
Copy link
Contributor

@DeveloperPaul123 DeveloperPaul123 commented Jul 25, 2023

Description

Adds low level optimizations to address #57

Optimizations

  • std::memset with fill()
  • std::memcmp with equal()
  • std::memcmp with compare()
  • std::memchr with find()
  • std::memmem with search()

@DeveloperPaul123
Copy link
Contributor Author

@tcbrindle I filled this PR to see if I'm on the right track (any feedback much appreciated) and also to make it know that the work is in progress for that particular issue.

@tcbrindle
Copy link
Owner

Thanks for working on this! It looks like there's a syntax error which is tripping up the non-Windows builds, but the general approach looks good to me 👍

@codecov
Copy link

codecov bot commented Aug 1, 2023

Codecov Report

Patch coverage: 94.59% and project coverage change: +0.12% 🎉

Comparison is base (051dce9) 97.58% compared to head (96d201f) 97.70%.
Report is 30 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #103      +/-   ##
==========================================
+ Coverage   97.58%   97.70%   +0.12%     
==========================================
  Files          66       67       +1     
  Lines        2276     2398     +122     
==========================================
+ Hits         2221     2343     +122     
  Misses         55       55              
Files Changed Coverage Δ
include/flux/core/utils.hpp 100.00% <ø> (ø)
include/flux/op/split_string.hpp 100.00% <ø> (ø)
include/flux/op/fill.hpp 94.11% <91.66%> (-5.89%) ⬇️
include/flux/op/find.hpp 96.15% <93.33%> (-3.85%) ⬇️
include/flux/op/equal.hpp 92.00% <94.11%> (+10.18%) ⬆️
include/flux/op/compare.hpp 97.22% <96.15%> (-2.78%) ⬇️
include/flux/op/output_to.hpp 94.44% <100.00%> (+1.11%) ⬆️

... and 9 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@DeveloperPaul123
Copy link
Contributor Author

I've noticed that with the any_of move, split() seems to have broken? I'm getting an exception with this code in test_split.cpp

{
    auto sv = "the quick brown fox"sv;

    auto split = flux::split(flux::ref(sv), ' ');

    using S = decltype(split);

    static_assert(flux::multipass_sequence<S>);
    static_assert(flux::contiguous_sequence<flux::element_t<S>>);

    static_assert(flux::multipass_sequence<S const>);
    static_assert(flux::contiguous_sequence<flux::element_t<S const>>);

    STATIC_CHECK(check_equal(std::move(split).map(to_string_view),
            std::array{"the"sv, "quick"sv, "brown"sv, "fox"sv}));
}

@DeveloperPaul123 DeveloperPaul123 marked this pull request as ready for review August 3, 2023 18:51
@DeveloperPaul123
Copy link
Contributor Author

Just looking for an initial round of feedback; I'll have to address the build issues.

Copy link
Owner

@tcbrindle tcbrindle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few comments, but several of them are just minor style things.

This is looking in pretty good shape, we just need to make sure that the conditions for using the specialisations is correct in each case. I'd rather err on the side of caution -- it's better to use a potentially slower code path than give the wrong answer or (worse) end up with UB.

There do seem to be a lot of unnecessary, unrelated formatting changes though -- I guess from clang-format messing things up? I started marking them but gave up as there are quite a few... Please put these lines back to how they were originally so that the diff only includes meaningful changes.

include/flux/core/utils.hpp Outdated Show resolved Hide resolved
include/flux/core/utils.hpp Outdated Show resolved Hide resolved
include/flux/core/utils.hpp Outdated Show resolved Hide resolved
include/flux/op/compare.hpp Outdated Show resolved Hide resolved
include/flux/op/compare.hpp Outdated Show resolved Hide resolved
include/flux/op/fill.hpp Outdated Show resolved Hide resolved
include/flux/op/find.hpp Outdated Show resolved Hide resolved
include/flux/op/find.hpp Outdated Show resolved Hide resolved
include/flux/op/find.hpp Outdated Show resolved Hide resolved
include/flux/op/find.hpp Outdated Show resolved Hide resolved
@tcbrindle
Copy link
Owner

I've noticed that with the any_of move, split() seems to have broken? I'm getting an exception with this code in test_split.cpp

{
    auto sv = "the quick brown fox"sv;

    auto split = flux::split(flux::ref(sv), ' ');

    using S = decltype(split);

    static_assert(flux::multipass_sequence<S>);
    static_assert(flux::contiguous_sequence<flux::element_t<S>>);

    static_assert(flux::multipass_sequence<S const>);
    static_assert(flux::contiguous_sequence<flux::element_t<S const>>);

    STATIC_CHECK(check_equal(std::move(split).map(to_string_view),
            std::array{"the"sv, "quick"sv, "brown"sv, "fox"sv}));
}

This is caused by the new specialisation of find (which split uses) returning the wrong thing in some cases. See the review comments above (the any_of thing is just a coincidence).

@DeveloperPaul123
Copy link
Contributor Author

I believe I've addressed most, if not all of your comments. Unit tests are also all passing for me. Please take another look when you have a moment. Thanks!

Copy link
Owner

@tcbrindle tcbrindle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good! There's only one change still outstanding, namely the definition of can_memset in fill.hpp

include/flux/op/fill.hpp Outdated Show resolved Hide resolved
@DeveloperPaul123
Copy link
Contributor Author

Whoops! I missed that, thanks!

@tcbrindle
Copy link
Owner

tcbrindle commented Aug 14, 2023

How would you like to handle the different cases in compare()? What would make the most sense if both sequences are empty? Or if only one sequence is empty?

For compare(), I think we get the right behaviour if we do something like:

auto const seq1_size = flux::usize(seq1);
auto const seq2_size = flux::usize(seq2);
auto min_size = std::min(seq1_size, seq2_size);

int cmp_result = 0;

if (min_size > 0) {
    auto const data1 = flux::data(seq1);
    FLUX_ASSERT(data1 != nullptr);
    auto const data2 = flux::data(seq2);
    FLUX_ASSERT(data2 != nullptr);

    cmp_result = std::memcmp(data1, data2, min_size);
}

That is, we only call the C function if the number of bytes is greater than zero, also making sure that the data pointers are not nullptr.

This is getting messier and messier, sorry!

@DeveloperPaul123
Copy link
Contributor Author

Ok I think this is ready now and all tests pass for me!

Copy link
Owner

@tcbrindle tcbrindle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify the test in equal(), because we already know that both sequences are the same size. Other than that, just a couple of tiny code style things.

CodeCov is complaining that the new "size == 0" code-paths aren't being tested, so it's probably worth adding a couple of little tests to cover this -- but I'm happy to do it if you've had enough of this PR already!

Also, it looks like the new compare() specialisation isn't actually reached by any of the existing tests... This definitely needs rectifying before we can commit this, but again, I'm happy to add some unsigned char compare() tests if you don't fancy it.

include/flux/op/equal.hpp Outdated Show resolved Hide resolved
include/flux/op/compare.hpp Outdated Show resolved Hide resolved
include/flux/op/fill.hpp Outdated Show resolved Hide resolved
include/flux/op/fill.hpp Outdated Show resolved Hide resolved
include/flux/op/find.hpp Outdated Show resolved Hide resolved
include/flux/op/find.hpp Outdated Show resolved Hide resolved
include/flux/op/output_to.hpp Outdated Show resolved Hide resolved
single_include/flux.hpp Outdated Show resolved Hide resolved
@DeveloperPaul123
Copy link
Contributor Author

I was able to add some tests, but not sure if I covered everything you had in mind. I also address all (I think) your feedback from the latest review. Sorry for the silly oversights at times and thanks for all the feedback!

@DeveloperPaul123
Copy link
Contributor Author

Looks like I didn't cover all the cases for compare(). Some help with this would be appreciated.

Copy link
Owner

@tcbrindle tcbrindle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the actual implementation code looks great now!

compare() is still missing a couple of test cases, I've made some suggestions below.

CodeCov is also complaining that we're not testing the case of two zero-length contiguous sequences in equal(). Something like

std::array<int, 0> arr;
STATIC_CHECK(flux::equal(arr, arr) == true);

in test_equal.cpp should keep it happy I think?

test/test_compare.cpp Outdated Show resolved Hide resolved
test/test_compare.cpp Show resolved Hide resolved
@DeveloperPaul123
Copy link
Contributor Author

Done!

@tcbrindle
Copy link
Owner

It's done!! 🎉

Thanks so much!

@tcbrindle tcbrindle merged commit 29838f3 into tcbrindle:main Aug 16, 2023
24 of 25 checks passed
@DeveloperPaul123 DeveloperPaul123 deleted the feature/low-level-optimizations branch August 16, 2023 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants