[Grammar] Update 10-2 Replace branches with arithmetic.md
dendibakh authored Sep 22, 2024
1 parent fa93464 commit 92f53b5
Showing 1 changed file with 2 additions and 2 deletions.
@@ -36,6 +36,6 @@ test edi, edi xor eax, edi
cmovns eax, ecx
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- In our example, we shift left the input value regardless if it is positive or negative. In addition, if the input value is negative, we also XOR it with a constant (exact value is irrelevant for this scenario). In the modified version, we leverage the fact that arithmetic right shift (`>>`) turns the sign of `x` (the high order bit) into a mask of all zeros or all ones. The subsequent AND (`&`) operation produces either zero or the desired constant. The original version of the function takes ~4 cycles, while the modified version takes only 3 cycles. It's worth mentioning that Clang 17 compiler replaced the branch with a conditional select (CMOVNS) instruction, which we will cover in the next section. Nevertheless, with some smart bit manipulation we were able to improve it even further.
+ In our example, we shift left the input value regardless if it is positive or negative. In addition, if the input value is negative, we also XOR it with a constant (the exact value is irrelevant for this scenario). In the modified version, we leverage the fact that arithmetic right shift (`>>`) turns the sign of `x` (the high order bit) into a mask of all zeros or all ones. The subsequent AND (`&`) operation produces either zero or the desired constant. The original version of the function takes ~4 cycles, while the modified version takes only 3 cycles. It's worth mentioning that the Clang 17 compiler replaced the branch with a conditional select (CMOVNS) instruction, which we will cover in the next section. Nevertheless, with some smart bit manipulation, we were able to improve it even further.

- As of the year 2024, compilers are usually unable to find these shortcuts on their own, so it is up to the programmer to do it manually. If you can find a way to replace a frequently mispredicted branch with arithmetic, you will likely see a performance improvement. You can find more examples of bit manipulation tricks in other books, for example [@HackersDelight].
+ As of the year 2024, compilers are usually unable to find these shortcuts on their own, so it is up to the programmer to do it manually. If you can find a way to replace a frequently mispredicted branch with arithmetic, you will likely see a performance improvement. You can find more examples of bit manipulation tricks in other books, for example [@HackersDelight].
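
For readers of this diff who do not have the chapter's listing at hand, the trick the changed paragraph describes can be sketched roughly as follows. This is a minimal illustration, not the book's exact code: the function names, the constant `kMask`, and the helper `check_equivalence` are hypothetical, chosen only because the text says the exact constant is irrelevant. The sketch assumes a 32-bit input and an arithmetic right shift on signed integers (guaranteed since C++20, and the usual behavior of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: the text notes the exact value is irrelevant.
constexpr uint32_t kMask = 0x80200003u;

// Branchy version: always shift left; XOR with the constant only when
// the input is negative.
int32_t step_branchy(int32_t x) {
  // Shift in unsigned arithmetic to sidestep signed-shift pitfalls.
  uint32_t shifted = static_cast<uint32_t>(x) << 1;
  if (x < 0)
    shifted ^= kMask;
  return static_cast<int32_t>(shifted);
}

// Branchless version: x >> 31 smears the sign bit across the word,
// yielding all zeros for non-negative x and all ones for negative x
// (assumes an arithmetic right shift). ANDing that mask with the
// constant gives either zero or the constant itself.
int32_t step_branchless(int32_t x) {
  uint32_t sign_mask = static_cast<uint32_t>(x >> 31);
  uint32_t shifted = static_cast<uint32_t>(x) << 1;
  return static_cast<int32_t>(shifted ^ (sign_mask & kMask));
}

// Spot-check that both versions agree on a few representative inputs.
void check_equivalence() {
  const int32_t samples[] = {0, 1, -1, 42, -42, INT32_MAX, INT32_MIN};
  for (int32_t x : samples)
    assert(step_branchy(x) == step_branchless(x));
}
```

Calling `check_equivalence()` exercises both paths. Whether the branchless form is actually faster depends on the compiler and on how predictable the branch is, so it is worth measuring before adopting it.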
