-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix code causing spurious Wstringop-overflow warning #3333
Conversation
Please provide a godbolt repro that demonstrates the warning. The link in the referenced issue doesn't actually show any warnings. |
Here's a reproduction that demonstrates the warning. It seems to occur when compiling any code that prints vectors or other ranges. The code needs to be compiled at -O3, because when compiled in debug mode, the FMT_ASSERT statements are left in, and these do bounds checks. Here is the code:
Edit: The initial reproduction works on fmt 9.1.0. This example is reproducible on fmt trunk as well, provided that |
// Returns the length of a codepoint. Returns 1 for invalid codepoints. | ||
// This is equivalent to | ||
// | ||
// int len = code_point_length_impl(c); | ||
// return len + !len; | ||
// | ||
// This is useful because it allows the compiler to check that the | ||
// length is within the range [1, 4] | ||
FMT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int { | ||
return static_cast<int>((0x3a55000000000000ull >> (2 * (static_cast<unsigned char>(c) >> 3))) & 0x3) + 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need both code_point_length_impl
and code_point_length_impl_2
, let's merge this logic into code_point_length_impl
and make the latter take unsigned char
to have one case instead of 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a second thought, this logic can be moved directly into code_point_length
and both impl functions removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value of code_point_length
is used as len
and as len + !len
.
First case (first and last lines):
Lines 654 to 664 in 3daf338
int len = code_point_length_impl(*s); | |
// Compute the pointer to the next character early so that the next | |
// iteration can start working on the next character. Neither Clang | |
// nor GCC figure out this reordering on their own. | |
const char* next = s + len + !len; | |
using uchar = unsigned char; | |
// Assume a four-byte character and load four bytes. Unused bits are | |
// shifted out. | |
*c = uint32_t(uchar(s[0]) & masks[len]) << 18; |
Second case:
Lines 2215 to 2223 in 3daf338
FMT_CONSTEXPR auto code_point_length(const Char* begin) -> int { | |
if (const_check(sizeof(Char) != 1)) return 1; | |
int len = code_point_length_impl(static_cast<char>(*begin)); | |
// Compute the pointer to the next character early so that the next | |
// iteration can start working on the next character. Neither Clang | |
// nor GCC figure out this reordering on their own. | |
return len + !len; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Then I think we should merge code_point_length_impl
into utf8_decode
and code_point_length_impl_2
into code_point_length
. Unless, of course, there is a similar false positive in utf8_decode
.
// This is useful because it allows the compiler to check that the | ||
// length is within the range [1, 4] | ||
FMT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int { | ||
return static_cast<int>((0x3a55000000000000ull >> (2 * (static_cast<unsigned char>(c) >> 3))) & 0x3) + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic deserves some explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Sorry for the delay.
code_point_length_impl is initially implemented like this, which gets the length of a code point based on the first character in that codepoint.
MT_CONSTEXPR inline auto code_point_length_impl(char c) -> int {
return "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\0\0\0\0\0\0\0\0\2\2\2\2\3\3\4"
[static_cast<unsigned char>(c) >> 3];
}
For valid codepoints, we return the length of the codepoint, and for invalid codepoints we return a length of 0.
When used in code_point_length
, the length of invalid codepoints gets converted to 1, so we don't actually need to distinguish between valid and invalid codepoints for code_point_length
.
As a result, we could write it like this:
MT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
return "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\2\2\2\2\3\3\4"
[static_cast<unsigned char>(c) >> 3];
}
Now we have 4 possible return values: [1, 2, 3, 4]. If we instead use [0, 1, 2, 3] and just add one, it looks like this now:
MT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
return ("\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\1\1\1\2\2\3"
[static_cast<unsigned char>(c) >> 3]) + 1;
}
Because this array only contains values 0..3, each value can be represented in 2 bits. This means that the entire array fits in a 64 bit integer.
We can get that integer like this:
arr = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,2,2,3]
result = 0
pow = 1
for x in arr:
result += pow * x
pow *= 4
print(f'0x{result:x}')
This gives us 0x3a55000000000000
.
That logic basically indexes into this integer, treating it as an array of 2-bit values, resulting in the final implementation:
FMT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
return static_cast<int>((0x3a55000000000000ull >> (2 * (static_cast<unsigned char>(c) >> 3))) & 0x3) + 1;
}
This has two benefits:
- The compiler recognizes that the return value is always between 1 and 4, so it gets rid of the stringop-overflow warning
- We no longer need to access memory when computing the length of a codepoint
That being said the code could definitely be less ugly.
I'll apply your recommendations with regard to merging code_point_length_impl
into utf8_decode, and merging code_point_length_impl_2
into code_point_length
. I haven't noticed any similar false positives in utf8_decode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks for the explanation.
Merged with some tweaks in d9bc5f1. Thanks for the PR. |
Despite attempts to disable
-Wstringop-overflow
for tests, building the library from source still results in spurious Wstringop-overflow warnings. This occurs both for downstream consumers (eg, those using CMake Package Manager), and for people building the library from scratch.Rather than attempting to disable the warning, I've modified the code such that the compiler is able to "figure out" that
fmt::detail::code_point_length
returns a value between 1 and 4, inclusive. The change also allows using bitshifts to calculate the codepoint length, which avoids a second array access.While the
Wstringop-overflow
warning was spurious in this case, the new code should be more performant, and downstream consumers will no longer have the headache of dealing with a spurious warning when compiling their code.See Issue #2989, as well as #3054.