Investigate the Ryū algorithm for a simpler/faster implementation of float -> string conversion #52811
Comments
cc @dtolnay in case this is of interest for serde
The repository can be found at https://github.com/ulfjack/ryu and the paper at https://dl.acm.org/citation.cfm?id=3192369.
I did a line-by-line translation of the C implementation to unsafe Rust in https://github.com/dtolnay/ryu. It performs better than dtoa on some inputs but the output is less human readable. For example dtoa and serde_json prefer to print 0.3 as
Benchmark commit: dtolnay/dtoa@655152b
@dtolnay : I would imagine that changing the output format would be trivial, no? On the other hand, the fact that it performs worse on
As I understand it, the fact that short numbers perform worse is an inherent aspect of the algorithm. Dtoa goes left to right and avoids generating too many noise digits, so it performs better on short numbers, as reflected in the table. Ryū generates an exact representation and then goes right to left removing noise digits, so it performs better on the longest numbers. The Ryū benchmarks all focus on a random distribution of numbers, and those have a lot of digits on average.
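To make the left-to-right versus right-to-left point concrete, here is a toy sketch in Rust. It is not Dtoa or Ryū: it just leans on std's `{:.*e}` formatting and `str::parse`, and counts how many round-trip checks each direction needs. The function names are made up for illustration, it assumes finite inputs, and the "shrinking" variant assumes that once a shorter candidate stops round-tripping no even shorter one will.

```rust
/// Left-to-right flavour (illustrative only): try 1, 2, ..., 17 significant
/// digits and stop at the first candidate that parses back to `x`.
/// Returns the chosen string and the number of round-trip checks performed.
fn shortest_growing(x: f64) -> (String, usize) {
    let mut checks = 0;
    for precision in 0..=16 {
        checks += 1;
        let candidate = format!("{:.*e}", precision, x);
        if candidate.parse::<f64>() == Ok(x) {
            return (candidate, checks);
        }
    }
    // Only reached for values that never compare equal (e.g. NaN).
    (format!("{:.16e}", x), checks)
}

/// Right-to-left flavour (illustrative only): start from 17 significant
/// digits, which is enough for any finite f64, and keep dropping a digit
/// while the shorter candidate still parses back to `x`.
fn shortest_shrinking(x: f64) -> (String, usize) {
    let mut best = format!("{:.16e}", x);
    let mut checks = 0;
    for precision in (0..16).rev() {
        checks += 1;
        let candidate = format!("{:.*e}", precision, x);
        if candidate.parse::<f64>() != Ok(x) {
            break; // assumption: shorter candidates will not round-trip either
        }
        best = candidate;
    }
    (best, checks)
}
```

With these definitions, `0.3` takes 1 check when growing and 16 when shrinking, while `0.1 + 0.2` (which needs all 17 digits) takes 17 checks when growing and 1 when shrinking. The real algorithms do far less work per digit, but the asymmetry between short and long outputs has the same shape.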
Thank you for confirming my suspicion.
I would guess that whether the number of digits is random, or not, varies by application. This makes it quite difficult to pick one algorithm or the other.
Ryū seems to be slower in a few cases versus What trade-off does
From what I understand, both Ryū and the
Retagging as T-libs, T-compiler hardly has anything to do with libstd matters.
I switched from
Triage: not aware of any changes here
Note: at CppCon19, STL presented the culmination of his work on C++17
Has the Dragonbox algorithm ever been discussed? https://github.com/jk-jeon/dragonbox/
It looks like if the float to string conversion algorithm is going to be rewritten, the Dragonbox algorithm might be an even better candidate than Ryu.
I was wondering the same. Maybe it's a code size issue? EDIT: it's fairly recent too. Was there an intention to rewrite the Here is a paper with a more detailed explanation of the algorithm.
The current Grisu3 algorithm implementation seems to have issues, but it doesn't look like a fix has been proposed. Still, it has to fall back on Dragon4 sometimes; I don't know if DragonBox would avoid this altogether. Maybe I'll check it out, though I'm sure the developers have already thought about other algorithms.
This lib provides a few pointers and implementations for the different algorithms, as a complement to the link provided by @r00ster91 (which provides detailed performance graphs): https://github.com/abolz/Drachennest
In addition to being faster, it would also help on embedded systems from a stack perspective: the current implementation uses over 1K of stack, which makes formatting strings with floats a common source of stack overflows on embedded targets. Meanwhile, dtolnay's Ryū implementation only requires a 24-byte buffer, and doesn't appear to require significantly more for local variables.
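For a feel of why 24 bytes is enough for the output, here is a small sketch that writes an `f64` in exponent form into a fixed 24-byte stack buffer using only std. It says nothing about how much stack the formatting machinery itself uses internally, which is the actual complaint above. `FixedBuf` and `format_exp` are hypothetical names for illustration, not part of std or the ryu crate.

```rust
use std::fmt::{self, Write};

/// A fixed-capacity, stack-allocated output buffer (hypothetical helper).
struct FixedBuf {
    bytes: [u8; 24],
    len: usize,
}

impl FixedBuf {
    fn new() -> Self {
        FixedBuf { bytes: [0; 24], len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only whole `&str` pieces are ever copied in, so this is valid UTF-8.
        std::str::from_utf8(&self.bytes[..self.len]).unwrap()
    }
}

impl Write for FixedBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        if end > self.bytes.len() {
            return Err(fmt::Error); // output would not fit in 24 bytes
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats `x` with `{:e}` (shortest round-trip digits, exponent form)
/// into a 24-byte buffer, failing if the text does not fit.
fn format_exp(x: f64) -> Result<FixedBuf, fmt::Error> {
    let mut out = FixedBuf::new();
    write!(out, "{:e}", x)?;
    Ok(out)
}
```

A value like `-f64::MIN_POSITIVE` prints as `-2.2250738585072014e-308`, which is exactly 24 bytes: a sign, 17 significant digits with a decimal point, and a five-character exponent.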
There's a new paper making the rounds on the topic of converting floats to their decimal string representations that claims to be both simpler and faster than prior algorithms: https://pldi18.sigplan.org/event/pldi-2018-papers-ry-fast-float-to-string-conversion . I'm particularly interested in the simplicity aspect, since I recall some old conversations regarding our current machinery for this being somewhat subtle and complicated (for speed purposes, I imagine). If we could drastically simplify the implementation without sacrificing speed, that might be a win. Good student or intern project, methinks.
(Apologies for how wishlisty this issue is, I've no clue who might be the presiding expert on our current float->string implementation.)