Remove netfx-specific code from projects that no longer build in netfx configurations #27114
Comments
@jkotas what is our bar for introducing unsafe? It may be worth trying SIMD here. There are instructions for CRC.
Please don't bake little-endian assumptions into code unconditionally. The code may be reused in Mono, which runs on big-endian platforms. It's OK to offer an optimized version and guard it by
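A minimal sketch of what such a guarded fast path could look like. The use of `BitConverter.IsLittleEndian` as the guard, and the class and method names, are assumptions made for illustration and are not taken from the thread.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
static class EndianGuard
{
    // Reads 4 bytes as a little-endian uint on any platform.
    public static uint ReadUInt32LittleEndianGuarded(ReadOnlySpan<byte> source)
    {
        if (BitConverter.IsLittleEndian)
        {
            // Native byte order already matches, so one 32-bit read is enough.
            // MemoryMarshal.Read checks the span length and tolerates unaligned data.
            return MemoryMarshal.Read<uint>(source);
        }

        // Big-endian platform: assemble the value byte by byte.
        return (uint)(source[0] | (source[1] << 8) | (source[2] << 16) | (source[3] << 24));
    }
}
```

On runtimes that ship `System.Buffers.Binary.BinaryPrimitives`, `ReadUInt32LittleEndian` offers the same contract without a hand-written guard.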
https://github.com/dotnet/corefx/blob/master/src/System.IO.Compression/src/System/IO/Compression/Crc32Helper.Managed.cs is dead code in CoreFX. It was used in the netfx implementation of System.IO.Compression that we are not building in CoreFX anymore. The thing to do here is to clean up the netfx-specific stuff in https://github.com/dotnet/corefx/blob/master/src/System.IO.Compression/src/System.IO.Compression.csproj.
The native implementation of CRC32 that corefx is using has SIMD already: https://github.com/dotnet/corefx/blob/bffef76f6af208e2042a2f27bc081ee908bb390b/src/Native/Windows/clrcompression/zlib-intel/crc32.c#L473
@jkotas A bit off-topic, but what is the current status of WASM targeting with CoreFX + CoreRT? Someone was asking for a WASM-compatible implementation of DeflateStream, and that could still be a good use for the managed code...
It is being worked on as a side project: https://github.com/dotnet/corefx/issues/26107. The WASM implementation should be able to use the stock C implementation just fine. I do not think there is a strong reason to introduce a managed implementation just for WASM.
There may be other cleanup of project file content stemming from eade3b3 (which removed some netfx configurations). For example, src\System.Net.Http\ref\System.Net.Http.csproj has @TimLovellSmith do you have any interest in cleaning these up? No particular reason why you should,...
Well, I can only think of one particular reason: I would like this discussion to bear some fruit, e.g. by making a non-existent performance issue look non-existent. I will need to get to grips with the rest of the contribution process, but I can give it a go.
So... sanity checks wanted! These ones still look like they are meant to be needed:
Whereas these next ones seem to be the prime targets for cleanup. It looks like the 'netfx' configuration is no longer a thing for these, so I should clean up the projects accordingly:
and perhaps even
By the way, I also found that a 'netfx' configuration exists in the Configuration.props of various other projects not mentioned above, even though they don't directly test for TargetGroup 'netfx' in their project files. I haven't analyzed why, so let me know if that seems surprising...
TL;DR: I think there are a number of small tweaks that could make this code faster, along with the 'bigger' (riskier) tweaks of using unsafe code and really baking little-endian assumptions into the implementation. I am thinking of opening a pull request for this. High-level question: does it sound useful?
I saw a thread today talking about System.IO.Compression.Crc32Helper using a relatively large amount of CPU. Idly I wondered how this could be possible, so I copied the code from this repo into a project and came up with a dumb synthetic benchmark that ran millions of CRC32 computations on an array of length 12 for a few seconds. Then I looked at what the VS 2017 profiler (for .NET Framework, AnyCPU, release mode) told me the per-line costs were.
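A synthetic benchmark of the kind described above could be sketched roughly as follows. Everything here is hypothetical: `BitwiseCrc32` is a table-free stand-in rather than the `Crc32Helper` code, and `Measure` is a bare `Stopwatch` loop, not the profiler-based measurement used in this report.

```csharp
using System;
using System.Diagnostics;

// Hypothetical micro-benchmark harness, for illustration only.
static class CrcMicroBenchmark
{
    // Stand-in checksum: bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup tables.
    public static uint BitwiseCrc32(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
        }
        return ~crc;
    }

    // Runs `checksum` over the same small buffer `iterations` times and returns the elapsed time.
    // `sink` accumulates the results so the calls cannot be optimized away as unused.
    public static TimeSpan Measure(Func<byte[], uint> checksum, byte[] input, int iterations, out uint sink)
    {
        sink = 0;
        checksum(input); // warm-up call so JIT compilation is not part of the timing

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sink ^= checksum(input);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

A call such as `CrcMicroBenchmark.Measure(CrcMicroBenchmark.BitwiseCrc32, new byte[12], 5000000, out uint sink)` mirrors the "length 12, millions of iterations" setup. With inputs this small, per-call overhead is likely to weigh heavily in the result.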
What I found was that the hottest line was the inner statement of the latter loop:
Which for me was compiled (JITted?) as native code:
The first thing that jumped out at me in the disassembly was the 'jae' checks, which I guessed might be array bounds checks! Using 'unsafe' should make this faster, right? (Maybe.)
So I tried it out. I didn't notice an obviously significant change from that, but while trying to understand the disassembly further, I realized that the way 'slice by 8' was being implemented by the JIT (or whoever generated the assembly here) was far from optimal. It was literally reading 1 byte at a time from the array and zero-extending it, then XORing all the zero-extended words together, just to read (and XOR) all the asserted-as-little-endian bytes into the uint32 variable 'crc32'. I think no C or assembly programmer would do it that way...
So I switched that line out with
crc32 ^= ((uint*)buffptr)[offset / 4];
and it came out with something that looked much more sensible to me, assuming I introduced no bug.
This seems to shave several percent off the CPU samples I was spending on those particular lines, which were hot lines #2 and #3 in my scenario. Happily, these are the lines that are expected to be on the hot path when processing nice big buffer inputs...
There were still some 'crazy-looking' inverse operations happening, multiplying and dividing (using right shifts) by 4 or 8 to get array offsets, which better hand-written assembly code should be able to cancel out. It is also a bit odd that both 'offset' and 'i' get incremented each time around the loop, by 8 and 1 respectively; you should only really need one loop counter...
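The word-sized load and the single loop position discussed above can be sketched as follows. This is a from-scratch slice-by-8 CRC-32 and not the `Crc32Helper` source: the jagged `Table` layout and all names are assumptions. It also swaps the `(uint*)` pointer cast for `BinaryPrimitives.ReadUInt32LittleEndian`, so it compiles without `unsafe` and stays correct on big-endian platforms. Whether the JIT reduces that call to a single load is not verified here.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical sketch, for illustration only.
static class Crc32Sketch
{
    private const uint Poly = 0xEDB88320u; // reflected CRC-32 polynomial
    private static readonly uint[][] Table = BuildTables();

    private static uint[][] BuildTables()
    {
        var t = new uint[8][];
        for (int k = 0; k < 8; k++) t[k] = new uint[256];

        // Table 0 is the classic byte-at-a-time table.
        for (uint i = 0; i < 256; i++)
        {
            uint c = i;
            for (int bit = 0; bit < 8; bit++)
                c = (c & 1) != 0 ? (c >> 1) ^ Poly : c >> 1;
            t[0][i] = c;
        }

        // Table k holds the CRC of a byte followed by k zero bytes.
        for (int i = 0; i < 256; i++)
            for (int k = 1; k < 8; k++)
                t[k][i] = (t[k - 1][i] >> 8) ^ t[0][t[k - 1][i] & 0xFF];
        return t;
    }

    // Reference implementation: one byte per iteration.
    public static uint ComputeBytewise(ReadOnlySpan<byte> data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
            crc = (crc >> 8) ^ Table[0][(crc ^ b) & 0xFF];
        return ~crc;
    }

    // Slice-by-8: two 32-bit loads per 8 input bytes. Slicing the span is the only
    // position tracking, so no separate 'offset' and 'i' counters are needed.
    public static uint ComputeSliceBy8(ReadOnlySpan<byte> data)
    {
        uint crc = 0xFFFFFFFFu;
        while (data.Length >= 8)
        {
            uint one = BinaryPrimitives.ReadUInt32LittleEndian(data) ^ crc;
            uint two = BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(4));
            crc = Table[7][one & 0xFF] ^ Table[6][(one >> 8) & 0xFF]
                ^ Table[5][(one >> 16) & 0xFF] ^ Table[4][one >> 24]
                ^ Table[3][two & 0xFF] ^ Table[2][(two >> 8) & 0xFF]
                ^ Table[1][(two >> 16) & 0xFF] ^ Table[0][two >> 24];
            data = data.Slice(8);
        }

        // Tail: fewer than 8 bytes left, fall back to byte-at-a-time.
        foreach (byte b in data)
            crc = (crc >> 8) ^ Table[0][(crc ^ b) & 0xFF];
        return ~crc;
    }
}
```

Comparing `ComputeSliceBy8` against `ComputeBytewise` on the same input is a cheap way to cover the "assuming I introduced no bug" caveat. Both should return the standard CRC-32 check value 0xCBF43926 for the ASCII bytes of "123456789".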