SqlBulkCopy - generics to avoid boxing #358
There seem to be build errors for the PR. Please take a look at the logs, and try to build and run the tests locally before the pipelines trigger builds. You can use the steps from BUILDGUIDE to do the same.
Can you post your test program/query and I'll see if I can help on the WriteTokenLength speed issue?
@Wraith2 I pushed the branch
I've done some experimenting and have some suggestions. All of this I've only checked on netcore, not netfx, so you'll need to check the behaviour on netfx to see if the same applies. ValueConverter can be reduced to:
And then made internal (so it doesn't change the public api surface). You can then add a linked reference to the file in the test harness so you can access it if you need to. Though really I don't think there need to be specific tests for it.
All of this relies on the netcore JIT being able to elide the pattern
I will implement your I was using the unit tests to more quickly iterate on the Side note, I was kind of surprised to see the Test project as a whole didn't have internals of the SqlClient project exposed to it.
I definitely agree -- that is definitely an improvement and I will implement that.
I actually didn't need the Side note, for a
I will definitely keep the public API surface aligned between
That's a behaviour which can be changed by the JIT and by the platform. Unix and Windows may have different behaviours due to the ABI of the native code generated. It is also a behaviour that may change in future iterations of the JIT, so it's a level I don't tend to push at too hard. So I don't know 😁.
Expanded Still seeing some perf diffs, however they're much smaller (20%) when doing a like-for-like comparison using
@Wraith2 I completed your PR feedback on the Some notes: I changed the name of I'd like to resolve the perf issues above before porting to Some ideas I have:
I've been profiling using JetBrains DotTrace and haven't had success in determining why it's so much slower -- it's hard to pinpoint comparing old vs. new because the control flow has changed to let the generic value flow downwards from read -> write.
Can you export the profiler traces and upload them? I have dotTrace, so I should be able to open them and take a look. You also need to take a look at the test failures and see if there's anything that can be fixed.
@Wraith2 I can't upload to GitHub, so I sent them to the email listed on your profile. I will take a look at the tests to get that cleaned up.
I split the benchmark out into 4 different tables (decimal, string, int, bool) being copied into (instead of one table with all 4), and now I'm seeing zero performance degradation:
Weird. Incidentally, a lot of those SqlDecimal allocations will be removed in .NET 5 if dotnet/runtime#1155 gets merged.
@Wraith2 @cheenamalhotra please see my comments in this review --
@cheenamalhotra does this logic additionally need to be ported to netfx?
There was a problem with my benchmark (some randomness), and after de-randomizing the copied data, the changes in this PR appear to improve BulkCopy's overall performance (as I was hoping).
General comments:
- The GenericConverter classes you added in the NetFx and NetCore projects are not the same.
- You did not make changes to the NetFx project; any specific reason?
- Also, I did not see extensive test cases to reflect what we achieved with the PR.
I have yet to go deep down the logic path; I wanted to highlight some top observations.
Are the failing checks due to something more widespread in the SqlClient ecosystem? Looks like a lot of CreateDatabase failures in Enclave jobs + 2 functional test failures related to Always Encrypted
@cmeyertons Yes, I think these failures are not related to your PR
@cheenamalhotra @DavoudEshtehari @Wraith2 I believe this PR has all feedback completed -- the only remaining item is the decision on whether or not we want to port these changes to
Have you investigated how difficult/different the port to netfx would be? I think if it's low friction you'll almost certainly be asked to do so. There's a lot of line-of-business software out there using bulk copy which would be improved by better performance, and it would be a good selling point for driving library adoption over the system version.
Please go ahead and update the NetFx source. I think the changes look good to take forward to completion 👍 Please don't mind the Enclave pipeline failures; we're dealing with some environment issues currently. We've disabled these pipelines for PR validation for now, and will resume them when the test environments are ready again!
@cheenamalhotra @Wraith2 performed the following:
Excellent work, thanks 👍
That's great progress. I would request that you fix the failing tests in NetCore, as these failures are related to the changes here.
/me pokes @cmeyertons It'd be nice to get this into the next release
@Wraith2 I can carve out some time today. I was kind of stuck on fixing the test. I remember it being related to the stream bulk copy functionality, and I couldn't easily step through it because the test had a large amount of data unrelated to the breakage. I will dig in some more today and post my findings.
@cmeyertons Could you resolve conflicts, please?
It is quite irksome that PRs are left open for so long, and yet when you do get a chance to review them, quite often the only thing that is asked is to rebase them on the current version. If you gave us some warning we could do it ahead of time, but trying to keep all open PRs constantly rebased on master is a big time sink for contributors.
I’m currently working on patching my changes on top of main and will create a new PR - there were too many hairy conflicts I couldn’t easily resolve.
Closing as new PR #1048 replaces it.
This PR is in reference to #353
This is an attempt to reduce garbage created during a SqlBulkCopy operation for applications that are using SqlBulkCopy to stream batches of data out during a larger overall process.
The overall approach was to be more respectful of the IDataReader interface and use its typed methods where possible to eliminate unnecessary boxing. The unboxed value is passed around generically and converted back to its underlying type using a ValueTypeConverter.
I did observe some performance degradation when passing values around by value, kicking off the idea to pass everything by ref -- this is more in line with how SqlBulkCopy previously behaved, minus the boxing.
While I do see less garbage created, I am seeing some disappointing performance results (abbreviated BenchmarkDotNet results):
There is still a considerable number of heap allocations occurring, primarily involving SqlDecimal -- I have logged the following issue to address that: dotnet/runtime#1034
I think there is something different with the way my dlls are compiled? I'm seeing much slower times in places where I did not make any changes, such as TdsParser.WriteTokenLength
My WriteTokenLength: 144ms
Old WriteTokenLength: 30ms
Any assistance in understanding the perf degradation is welcomed and appreciated!
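For readers who want a concrete picture of the approach described above, here is a minimal sketch, not the code from this PR. The names GenericConverterSketch, IValueSink, SummingSink and BulkCopySketch are hypothetical and exist only for illustration. The sketch assumes the runtime's JIT can remove the box/unbox pair in `(TOut)(object)value` when both type arguments are the same value type; as noted in the thread, that is JIT- and platform-dependent behaviour and should be checked separately on netfx.

```csharp
using System;
using System.Data;

// Hypothetical stand-in for the converter discussed in the thread.
internal static class GenericConverterSketch
{
    // Reinterprets TIn as TOut. When TIn and TOut are the same value type,
    // the JIT may elide the box/unbox pair (an assumption, see lead-in).
    // Throws InvalidCastException if the runtime types are incompatible.
    public static TOut Convert<TIn, TOut>(TIn value)
    {
        return (TOut)(object)value;
    }
}

// Hypothetical write side: receives each column value generically,
// so a value type never has to travel as System.Object.
internal interface IValueSink
{
    void Write<T>(int ordinal, T value);
}

// Example sink that adds up every int it is handed.
internal sealed class SummingSink : IValueSink
{
    public long IntTotal;

    public void Write<T>(int ordinal, T value)
    {
        if (typeof(T) == typeof(int))
        {
            IntTotal += GenericConverterSketch.Convert<T, int>(value);
        }
    }
}

internal static class BulkCopySketch
{
    // Read side: prefer the typed IDataRecord accessors over GetValue,
    // which always returns a boxed object.
    public static void CopyColumn(IDataRecord record, int ordinal, IValueSink sink)
    {
        if (record.IsDBNull(ordinal))
        {
            sink.Write<object>(ordinal, DBNull.Value);
            return;
        }

        Type type = record.GetFieldType(ordinal);
        if (type == typeof(int)) sink.Write(ordinal, record.GetInt32(ordinal));
        else if (type == typeof(bool)) sink.Write(ordinal, record.GetBoolean(ordinal));
        else if (type == typeof(decimal)) sink.Write(ordinal, record.GetDecimal(ordinal));
        else if (type == typeof(string)) sink.Write(ordinal, record.GetString(ordinal));
        else sink.Write(ordinal, record.GetValue(ordinal)); // fallback: boxed
    }

    public static long Run()
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "a");
        table.Rows.Add(2, "b");

        var sink = new SummingSink();
        // DataTableReader stores values boxed internally, so this demo only
        // shows the call pattern; a streaming reader is where it pays off.
        using (IDataReader reader = table.CreateDataReader())
        {
            while (reader.Read())
            {
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    CopyColumn(reader, i, sink);
                }
            }
        }
        return sink.IntTotal;
    }
}

internal static class Program
{
    private static void Main()
    {
        Console.WriteLine(BulkCopySketch.Run()); // prints 3
    }
}
```

The sketch passes values by value for brevity; the description above mentions passing by ref instead, which would change the Write signature to take `ref T`.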