-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gateware floating point instructions on the kernel CPU #535
Comments
Awesome, thanks for adding this. Can you post this on the m-labs website in the ARTIQ extensions list as well? |
I intend to remove this page at some point and replace it with a link to https://github.com/m-labs/artiq/issues?utf8=%E2%9C%93&q=label%3Atype%3Afor-contract%20 |
@dhslichter mentioned basic math functions (sin, cos, log, exp). Should those be gateware-accelerated, or are slower software implementations (using gateware FP for floating addition, multiplication, etc.) OK? |
I would really prefer not to stray to far of the beaten path here, i.e. for hard FP and vectorization (which is an extremely complicated thing to reinvent) we should concentrate on the corresponding openrisc instruction sets and implement them, instead of inventing new instruction sets. Also accelerated trigonometry is of similar complexity as vectorization. |
I agree with @jordens; the most important thing to do here would be to implement a "standard" hard FP instruction set. Basically the thing to do would be to implement all the instructions defined in the openrisc standard (ORFPX32) but not actually implemented in mor1kx. The sin/cos/log/exp functions could then be constructed from these with (hopefully) reasonable performance, and we can decide later if they truly require hardware acceleration themselves. |
@whitequark could you add a short assessment of the state/quality/existence of support for or1k FP in the toolchain and the biggest roadblocks to this issue?
|
No changes required as the LLVM instructions are the same for soft or hard FP.
No changes needed as we will be using less compiler-rt code afterwards.
The OR1K backend supports the complete OR1K FPU single-precision instruction set except the MAC instruction. Extending to double precision (which we use) is trivial. Ditto for MAC. |
@sbourdeauducq would we use the mor1kx FPU or port some other FPU or write ourselves a new one or go some completely different route? |
Take the usable parts of the mor1kx FPU (e.g. its interface into the CPU), rewrite the others. |
Seems VexRiscV has a FPU now and I would expect high quality. |
Funded by Duke. |
speedup: #1777 (comment) |
#529 (comment)
The text was updated successfully, but these errors were encountered: