-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential performance improvement for Guid.CompareTo #53981
Comments
Note to area owners: the sample code contained in this issue is a derivative of the code at https://github.com/boostorg/uuid/blob/ca0185f9f2f1c8d58af8c54ce24579e1c6ebf348/include/boost/uuid/detail/uuid_x86.ipp#L98-L130, which carries its own license. Be aware of any potential impact to .NET's third-party-notices file. |
Tagging subscribers to this area: @tannergooding Issue DetailsThe There is an opportunity to use SSE 3 to do this much faster by first re-ordering the bytes of the two I don't suspect there will be much appetite for this, given the comments/feedback on #52296, but it's an optimisation we use internally with great success, so I figured I'd contribute it in case someone finds it useful. We find that for equal
The catch is that the integer return value is always -1, 0 or 1 with I leave it to your consideration. /// <summary>
/// Shuffle mask for rearranging the bytes of a little-endian <see cref="Guid"/> so it is big-endian, but
/// with its 16 bytes in reverse order.
/// </summary>
private static readonly Vector128<sbyte> _shuffle_mask
= Vector128.Create(15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int FastCompareTo(this in Guid guid, in Guid other)
{
if (Ssse3.IsSupported)
{
// Shuffle the byte order of the two UUIDs so they are big-endian, but with their 16 bytes
// in reverse order. The byte-wise comparison between the two UUIDs must be performed from
// most significant byte to least for each UUID component to produce the same result as that
// produced by the Guid.CompareTo method.
var v1 = Ssse3.Shuffle(Unsafe.As<Guid, Vector128<sbyte>>(ref Unsafe.AsRef(guid)), _shuffle_mask);
var v2 = Ssse3.Shuffle(Unsafe.As<Guid, Vector128<sbyte>>(ref Unsafe.AsRef(other)), _shuffle_mask);
// Compare whether v1 is greater than v2 and vice versa. We must perform the comparison in both
// directions so we can establish for each byte in v1 whether it is greater than, less than or
// equal to its corresponding byte in v2. If a byte in v1_gt_v2 and v2_gt_v1 is zero, then the
// corresponding bytes in v1 and v2 are equal. If a byte in v1_gt_v2 is -1, then the corresponding
// byte in v1 is greater than the corresponding byte in v2. If a byte in v2_gt_v1 is -1, then the
// corresponding byte in v2 is creater than the corresponding byte in v1.
var v1_gt_v2 = Sse2.CompareGreaterThan(v1, v2);
var v2_gt_v1 = Sse2.CompareGreaterThan(v2, v1);
// SSE "compare greater than" operation only has a signed byte instruction (for comparing signed
// byte vectors). There is no instruction for comparing unsigned byte vectors. However, to produce
// the same comparison result as Guid.CompareTo, we must piecewise compare the unsigned bytes (in
// reverse order) of v1 and v2. Therefore, the comparison results in v1_gt_v2 and v2_gt_v1 must
// be adjusted such that if the sign (i.e. the most signficant bit) of any byte in v1 differs from
// that of the corresponding byte in v2, then the corresponding result in v1_gt_v2 and v2_gt_v1 is
// negated - i.e. 'true' (represented with a -1 value) becomes 'false' (represented with a zero value)
// and vice versa. We therefore XOR v1 and v2 such that the most significant bit of each byte in
// the result of the XOR operation indicates whether the corresponding result in v1_gt_v2 and v2_gt_v1
// must be negated.
var signs_mask = Sse2.Xor(v1, v2);
// All the information in v1_gt_v2 and v2_gt_v1 is in the most signficant bit of each byte, which
// will be written (in reverse order) to an unsigned 32-bit integer for each of v1_gt_v2 and v2_gt_v1.
// Therefore, we can invert/negate the 'true'/'false' result in each of v1_gt_v2 and v2_gt_v1 by
// XOR'ing each of v1_gt_v2 and v2_gt_v1 with the "signs mask" produced above.
v1_gt_v2 = Sse2.Xor(v1_gt_v2, signs_mask);
v2_gt_v1 = Sse2.Xor(v2_gt_v1, signs_mask);
// Write the most signficiant bit of each byte in v1_gt_v2 and v2_gt_v1 into v1_gt_v2_bits and
// v2_gt_v1_bits respectively.
var v1_gt_v2_bits = Sse2.MoveMask(v1_gt_v2);
var v2_gt_v1_bits = Sse2.MoveMask(v2_gt_v1);
// Comparing v1_gt_v2_bits and v2_gt_v1_bits performs a bit-wise comparison between each such
// that if the first bit that is different in v1_gt_v2_bits and v2_gt_v1_bits is zero in
// v1_gt_v2_bits and one in v2_gt_v1_bits, then v1_gt_v2_bits is deemed less than v2_gt_v1_bits.
// Conversely if the first different bit is one in v1_gt_v2_bits and zero in v2_gt_v1_bits, then
// v1_gt_v2_bits is deemed greater than v2_gt_v1_bits. The maximum value of v1_gt_v2_bits and
// v2_gt_v1_bits is 65,535, so we can safely subtract v2_gt_v1_bits from v1_gt_v2_bits without
// having to worry about wrapping occurring due to large numbers.
return v1_gt_v2_bits - v2_gt_v1_bits;
}
else
{
// Fall back to the standard Guid.CompareTo method.
return guid.CompareTo(other);
}
} The codegen is:
|
This falls into the same category as |
What sequence of existing/proposed |
The
Guid.CompareTo(Guid)
method currently compares each field of the givenGuid
in turn. If the givenGuid
is not equal, then this is quite fast, because typically the first field returns not equal. However, if the givenGuid
is equal, then it ends up having to check all 11 fields one-by-one, which is quite slow.There is an opportunity to use SSE 3 to do this much faster by first re-ordering the bytes of the two
Guid
s to be big-endian, then comparing them using SSE 3. The approach is inspired by the SSE-based UUID comparison in the C++ boost library (which represents UUIDs as big endian, meaning they don't have to shuffle the bytes first).I don't suspect there will be much appetite for this, given the comments/feedback on #52296, but it's an optimisation we use internally with great success, so I figured I'd contribute it in case someone finds it useful.
We find that for equal
Guid
s, the SSE 3 implementation is 7x faster and for unequalGuid
s, it is over 2x faster.The catch is that the integer return value is always -1, 0 or 1 with
Guid.CompareTo(Guid)
, but with the SSE implementation, it may be any negative number when the givenGuid
is less than, and any positive number when theGuid
is greater than.I leave it to your consideration.
The codegen is:
The text was updated successfully, but these errors were encountered: