CSV.write internals overhaul #497

Merged: quinnj merged 6 commits into master from jq/write on Sep 13, 2019
quinnj (Member) commented Sep 10, 2019

This PR starts overhauling CSV.write, aiming for slightly better extensibility, simplicity, and, most of all, performance. It uses the new Ryu algorithm for writing floats. I haven't worked out all the performance kinks yet, but so far, on a 10,000 x 20 Float64 DataFrame, I see the following differences:

current master:

julia> @time CSV.write(io, df)
  0.064680 seconds (400.10 k allocations: 16.215 MiB, 6.68% gc time)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=3853698, maxsize=Inf, ptr=3853699, mark=-1)

this PR:

julia> @time CSV.write(io, df)
  0.016382 seconds (10.04 k allocations: 7.983 MiB)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=3853698, maxsize=Inf, ptr=3853699, mark=-1)

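So about a 4x speedup on this run (0.0647 s vs. 0.0164 s), with roughly 40x fewer allocations. We're still getting about one allocation per row here (10.04 k allocations for 10,000 rows), so I need to track that down.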
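
For anyone wanting to reproduce a comparison like the one above, here is a minimal sketch, not the exact script behind these numbers. It assumes CSV.jl is installed and swaps the DataFrame for a plain NamedTuple of vectors, which CSV.write should accept as a Tables.jl column table. The column names x1..x20 and all variable names are made up for illustration, and timings will vary by machine and package version.

```julia
using CSV  # assumption: CSV.jl is installed in the active environment

# Build a 10,000 x 20 table of Float64 columns as a NamedTuple of vectors.
# The column names x1..x20 are hypothetical; the original used a DataFrame.
nrows, ncols = 10_000, 20
colnames = Tuple(Symbol("x", i) for i in 1:ncols)
cols = Tuple(rand(nrows) for _ in 1:ncols)
tbl = NamedTuple{colnames}(cols)

# Warm-up call so that compilation is not counted in the measurement.
CSV.write(IOBuffer(), tbl)

# Measure wall time and allocated bytes on separate, fresh buffers.
io = IOBuffer()
elapsed = @elapsed CSV.write(io, tbl)
allocated = @allocated CSV.write(IOBuffer(), tbl)

csv = String(take!(io))
println("elapsed:     ", elapsed, " s")
println("allocated:   ", allocated / 2^20, " MiB")
println("output size: ", sizeof(csv), " bytes")
```

Running the same script once on master and once on this branch gives the two sets of numbers to compare. The warm-up call matters because the first CSV.write call in a session includes compilation time.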

codecov bot commented Sep 13, 2019

Codecov Report

Merging #497 into master will decrease coverage by 1.08%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master     #497      +/-   ##
==========================================
- Coverage   84.43%   83.34%   -1.09%     
==========================================
  Files           7        7              
  Lines        1195     1243      +48     
==========================================
+ Hits         1009     1036      +27     
- Misses        186      207      +21

Impacted Files    Coverage Δ
src/CSV.jl        78.68% <ø> (-0.06%) ⬇️
src/write.jl      85.14% <85.71%> (-8.97%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8a6f5bb...def5671.

quinnj merged commit f046da7 into master on Sep 13, 2019
quinnj deleted the jq/write branch on Sep 13, 2019, 05:13