[API Proposal]: One shot hashing of Stream
#62489
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/area-system-security, @vcsjones, @krwq
Issue Details
Background and motivation
As of .NET 5 (Hash) and .NET 6 (HMAC) we have static one-shots for computing the hash of bytes, either in the form of an array or a ReadOnlySpan<byte>. This one shot I think is a good direction for .NET. To re-cap what we have today in terms of APIs:
1. HASHALG.HashData - one shot of fixed-length buffers, array or span
2. IncrementalHash - updatable hash
3. HASHALG.Create - .NET Framework 1.0 design
My personal belief is that number three should be mostly de-emphasized for common use cases. One shots should use option one, and incremental should use two. Create should be there for cases where polymorphic behavior is desired, if ever.
There is one case remaining (I believe) where option one and two don't handle as well as three: hashing streams, because the HashAlgorithm instances have ComputeHash(Stream stream).
It's not uncommon to need to hash a stream all in one go (thus still a one-shot). I would propose that these should also be one shots to further reduce the need for option three.
As a matter of implementation, the goal is not necessarily to use platform APIs to produce the most optimal hashing API (though we can certainly do our best to optimize certain situations). Rather, the goal is to provide an API surface that does not require developers to manage an instance of a hash object.
API Proposal
namespace System.Security.Cryptography {
public abstract partial class MD5 {
public static byte[] HashData(Stream source);
public static int TryHashData(Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA1 {
public static byte[] HashData(Stream source);
public static int TryHashData(Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA256 {
public static byte[] HashData(Stream source);
public static int TryHashData(Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA384 {
public static byte[] HashData(Stream source);
public static int TryHashData(Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA512 {
public static byte[] HashData(Stream source);
public static int TryHashData(Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACMD5 {
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static int TryHashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA1 {
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static int TryHashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA256 {
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static int TryHashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA384 {
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static int TryHashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA512 {
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static int TryHashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static Task<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static Task<int> TryHashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
}
API Usage
static void Example1() {
using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
byte[] fileHash = SHA256.HashData(fs);
}
}
static void Example2() {
using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
Span<byte> buffer = stackalloc byte[32];
int written = SHA256.TryHashData(fs, buffer);
}
}
static async Task Example3() {
using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
byte[] fileHash = await SHA256.HashDataAsync(fs);
}
}
static async Task Example4() {
using (FileStream fs = File.Open("/please/hash/me", FileMode.Open)) {
Memory<byte> existingBuffer = default; // Something reasonable here.
int written = await SHA256.TryHashDataAsync(fs, existingBuffer);
}
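}
Alternative Designs
"Do nothing" is likely the best alternative to this. It's possible to do this today using IncrementalHash (read from the stream, update, done) or HASHALG.ComputeHash{Async}. Both of these options however require managing an instance of the hash object.
Risks
No response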
|
Suggestion: return ValueTask. The ReadAsync overload on Stream that takes a Memory returns ValueTask too. |
That seems sensible. Updated (I suppose this will be discussed in The API Review, assuming this gets that far). |
+1 to this proposal! Some initial feedback:
I know the int vs. void thing has been the subject of API review debate. I expect Jeremy has more coherent thoughts on this. IMO this is a minor enough detail that it shouldn't delay review. |
Presumably these APIs are going to actually be
The existing APIs already return Looking at my proposal though, the public static async Task<(bool Success, int Written)> TryHashDataAsync(
ReadOnlyMemory<byte> key,
Stream source) => throw null; |
I think the method itself does not have to be async:
ValueTask<byte[]> HashDataAsync(ReadOnlySpan<byte> key, Stream source)
{
var incrementalHash = IncrementalHash.CreateHMAC(HashAlgorithmName.SHA256, key);
return HashDataCoreAsync();
async ValueTask<byte[]> HashDataCoreAsync()
{
using (incrementalHash)
{
var buffer = new byte[4096];
while (await source.ReadAsync(buffer) is int bytesRead and not 0)
{
incrementalHash.AppendData(buffer.AsSpan(0, bytesRead));
}
return incrementalHash.GetHashAndReset();
}
}
} |
Ah. That might work. However, I would say there is one other case where a
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
class Program {
static async Task Main() {
ReadOnlySpan<byte> key = new byte[32];
byte[] destination = new byte[64];
_ = await HMACSHA512.HashDataAsync(key, Stream.Null, destination);
await File.WriteAllBytesAsync("digest.bin", destination);
}
}
public abstract partial class HMACSHA512 {
public static ValueTask<int> HashDataAsync(ReadOnlySpan<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default) => throw null;
}
That won't compile because |
Related (re: span and async): dotnet/csharplang#1331 If in practice we expect callers to have a Of course, I'm waiting until everybody goes on vacation to send the |
Well in that case I would just say take a
The API shape is.. strange in my opinion. "We accept a |
No stranger than an implicit conversion from |
I disagree - using |
@FiniteReality an API accepting spans is always better than an API accepting memories because memories can be converted to spans, but not the opposite. Almost all asynchronous methods such as those in But for HMAC, we don't have to pass the key as a memory because we only use it in the beginning, before any asynchronous operation is performed. Since the algorithm allows it, not accepting spans would be a missed opportunity. And accepting spans is better than accepting memories. |
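The argument above (the HMAC key is only needed before the first asynchronous operation) can be sketched as follows. This is a minimal illustration, not the merged implementation: `SpanKeySketch`, `HmacSha256Async`, and `CallerAsync` are hypothetical names, and the 4096-byte buffer is an arbitrary choice. The key span is consumed synchronously by `IncrementalHash.CreateHMAC`, and only the hash object is carried into the async local function.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

public static class SpanKeySketch
{
    // Hypothetical helper: not async itself, so it may take a span parameter.
    public static ValueTask<byte[]> HmacSha256Async(
        ReadOnlySpan<byte> key, Stream source, CancellationToken cancellationToken = default)
    {
        // The key is read here, before any await happens.
        IncrementalHash hmac = IncrementalHash.CreateHMAC(HashAlgorithmName.SHA256, key);
        return Core(hmac, source, cancellationToken);

        static async ValueTask<byte[]> Core(
            IncrementalHash hmac, Stream source, CancellationToken cancellationToken)
        {
            using (hmac)
            {
                byte[] buffer = new byte[4096];
                int bytesRead;
                while ((bytesRead = await source.ReadAsync(buffer.AsMemory(), cancellationToken)) != 0)
                {
                    hmac.AppendData(buffer.AsSpan(0, bytesRead));
                }
                return hmac.GetHashAndReset();
            }
        }
    }

    // A caller that is already async cannot declare a span local,
    // but it can pass memory.Span inline as an argument.
    public static async Task<byte[]> CallerAsync(ReadOnlyMemory<byte> key, Stream source)
    {
        return await HmacSha256Async(key.Span, source);
    }
}
```

Whether this shape is friendlier to callers than a `ReadOnlyMemory<byte>` key is exactly what the surrounding comments debate; the sketch only shows that it is implementable.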
A consumer of your asynchronous method can't pass a span, because they are asynchronous. Just because the implementation doesn't need the data when it goes asynchronous doesn't mean you should accept a span, because you have to think about how people will use the method. For example,
static ValueTask<bool> M(ReadOnlySpan<byte> x) {
return new(true);
}
static async ValueTask N()
{
// CS4012
ReadOnlySpan<byte> data = /* memory from somewhere: */ new byte[10];
// would work but we're async
await M(data);
ReadOnlyMemory<byte> data2 = new byte[10];
// CS1503
await M(data2);
// requires an extra .Span
await M(data2.Span);
}
I believe using a Span here effectively lulls people into a false sense of security, because while you are discarding the key material when you go async, a consumer is likely already in an async context, which means that they're keeping the key material referenced in memory for the entire execution of the asynchronous part of the code, simply as a side effect of how the C# compiler disallows the construction of span variables in async methods. |
I'm linking to the original proposal for the now completed one-shot hashing methods for span-like data: #17590 This proposal has reasoning and discussion that applies to this issue as well. |
@FiniteReality Just a reminder that since the |
Yeah, it works if you |
The key doesn't have to be heapable. In theory somebody who's security-paranoid would want to stackalloc it rather than store it in the heap. (And in fact the asp.net DataProtection implementation does just this for the majority of its key material.) This was also why I half-seriously suggested that this would be an appropriate place to implement a |
namespace System.Security.Cryptography
{
public abstract partial class MD5
{
public static byte[] HashData(Stream source);
public static int HashData(Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA1
{
public static byte[] HashData(Stream source);
public static int HashData(Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA256
{
public static byte[] HashData(Stream source);
public static int HashData(Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA384
{
public static byte[] HashData(Stream source);
public static int HashData(Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class SHA512
{
public static byte[] HashData(Stream source);
public static int HashData(Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACMD5
{
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static byte[] HashData(byte[] key, Stream source);
public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<byte[]> HashDataAsync(byte[] key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA1
{
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static byte[] HashData(byte[] key, Stream source);
public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<byte[]> HashDataAsync(byte[] key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA256
{
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static byte[] HashData(byte[] key, Stream source);
public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<byte[]> HashDataAsync(byte[] key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA384
{
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static byte[] HashData(byte[] key, Stream source);
public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<byte[]> HashDataAsync(byte[] key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
public abstract partial class HMACSHA512
{
public static byte[] HashData(ReadOnlySpan<byte> key, Stream source);
public static byte[] HashData(byte[] key, Stream source);
public static int HashData(ReadOnlySpan<byte> key, Stream source, Span<byte> destination);
public static ValueTask<byte[]> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<byte[]> HashDataAsync(byte[] key, Stream source, CancellationToken cancellationToken = default);
public static ValueTask<int> HashDataAsync(ReadOnlyMemory<byte> key, Stream source, Memory<byte> destination, CancellationToken cancellationToken = default);
}
} |
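For orientation, here is a short sketch of how the shape listed above might be called. It assumes a target framework that includes the merged implementation of these overloads; `ApprovedShapeUsage` and its method names are hypothetical wrappers added only for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

public static class ApprovedShapeUsage
{
    // Allocating one-shot: returns a new array holding the digest.
    public static byte[] HashFile(string path)
    {
        using FileStream fs = File.OpenRead(path);
        return SHA256.HashData(fs);
    }

    // Non-allocating one-shot: writes into the caller's buffer and
    // returns the number of bytes written.
    public static int HashInto(Stream source, Span<byte> destination)
    {
        return SHA256.HashData(source, destination);
    }

    // Async HMAC one-shot: the key is passed as ReadOnlyMemory<byte>.
    public static async Task<byte[]> HmacAsync(
        ReadOnlyMemory<byte> key, Stream source, CancellationToken cancellationToken = default)
    {
        return await HMACSHA256.HashDataAsync(key, source, cancellationToken);
    }
}
```

None of these calls requires the caller to create or dispose a hash object, which is the stated goal of the proposal.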
@GrabYourPitchforks @bartonjs Feedback for implementation: We can use
|
@vcsjones I can see that the implementation that was merged does not special-case |
Background and motivation
As of .NET 5 (Hash) and .NET 6 (HMAC) we have static one-shots for computing the hash of bytes, either in the form of an array or a ReadOnlySpan<byte>. This one shot I think is a good direction for .NET. To re-cap what we have today in terms of APIs:
1. HASHALG.HashData - one shot of fixed-length buffers, array or span
2. IncrementalHash - updatable hash
3. HASHALG.Create - .NET Framework 1.0 design
My personal belief is that number three should be mostly de-emphasized for common use cases. One shots should use option one, and incremental should use two. Create should be there for cases where polymorphic behavior is desired, if ever.
There is one case remaining (I believe) where option one and two don't handle as well as three: hashing streams, because the HashAlgorithm instances have ComputeHash(Stream stream).
It's not uncommon to need to hash a stream all in one go (thus still a one-shot). I would propose that these should also be one shots to further reduce the need for option three.
As a matter of implementation, the goal is not necessarily to use platform APIs to produce the most optimal hashing API (though we can certainly do our best to optimize certain situations). Rather, the goal is to provide an API surface that does not require developers to manage an instance of a hash object.
API Proposal
API Usage
Alternative Designs
"Do nothing" is likely the best alternative to this. It's possible to do this today using IncrementalHash (read from the stream, update, done) or HASHALG.ComputeHash{Async}. Both of these options however require managing an instance of the hash object.
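The IncrementalHash alternative mentioned above (read from the stream, update, done) might look roughly like the sketch below. `StreamHashSketch.HashStream` is a hypothetical helper, not an API from the proposal, and the 4096-byte buffer size is an arbitrary assumption. It shows the instance management that the proposed one-shots would hide from the caller.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class StreamHashSketch
{
    // Hypothetical helper illustrating the "do nothing" alternative.
    public static byte[] HashStream(Stream source, HashAlgorithmName algorithm)
    {
        // The caller owns the hash object and must dispose it.
        using IncrementalHash hash = IncrementalHash.CreateHash(algorithm);
        byte[] buffer = new byte[4096];
        int bytesRead;

        // Read until the stream reports end-of-stream (0 bytes).
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            hash.AppendData(buffer, 0, bytesRead);
        }

        return hash.GetHashAndReset();
    }
}
```

A call such as `StreamHashSketch.HashStream(stream, HashAlgorithmName.SHA256)` works today, but every caller has to write or copy this loop.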