Release Fastor V0.5.1 · romeric/Fastor

Although tagged as a minor release, Fastor V0.5.1 includes some major changes, especially in API design, performance and stability.

SIMDVector has been reworked to fix the long-standing issue with the fall-back to non-SIMD code for non-64-bit types. The fall-back is now always to the correct scalar type where a scalar specialisation is available, i.e. float, double, int32_t and int64_t, and to a fixed array of size 1 holding the type in all other cases. The API is now a lot closer to Vc and std::experimental::simd. SIMDVector for floating-point types is now also activated at the SSE2 level, so any compiler that automatically defines __SSE2__ can vectorise Fastor's code without -march=native, since all compilers these days define __SSE2__ at -O2/-O3 levels.
Fixed a long-standing bug in network tensor contraction. The opmin_meta/cost models have been reworked to be truly compile-time recursive in terms of depth-first search. Strided contractions have been completely removed for networks and deactivated for pairs. Tensor contraction of networks now dispatches to by-pair einsum, which has many specialisations, including dispatch to matmul. This gives more than an order of magnitude performance gain in certain cases.
Extremely fast matmul/gemm routines. Fastor now provides potentially the fastest gemm routine for small to medium-sized tensors of single and double precision as far as static dispatch is concerned. Benchmarks have been added here. Many flavours of matmul implementation are now available for different sizes, with remainder handling and masked loading/storing.
AVX512 support for single- and double-precision floats
Better macro handling through a series of new FASTOR_... macros
Accurate timeit function based on rdtsc together with memory clobber and serialisation for further accuracy
Fastor is now Windows compatible. The whole test suite runs and passes on MSVC 2019
Quite a few bugs and compiler warnings have been fixed along the way
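
The SIMDVector fall-back rule described above can be illustrated with a small standalone sketch. It does not use Fastor and does not reproduce Fastor's internals: the names has_scalar_specialisation, fallback_storage_t and lanes are hypothetical and exist only for illustration. It assumes a 128-bit register at the SSE2 level and a single lane otherwise.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical trait (not Fastor's): the types the notes list as having a
// dedicated scalar specialisation.
template <typename T>
struct has_scalar_specialisation
    : std::integral_constant<bool,
          std::is_same<T, float>::value || std::is_same<T, double>::value ||
          std::is_same<T, std::int32_t>::value ||
          std::is_same<T, std::int64_t>::value> {};

// Hypothetical storage selector: the scalar type itself when a scalar
// specialisation exists, otherwise a fixed array of size 1 holding the type.
template <typename T>
using fallback_storage_t =
    typename std::conditional<has_scalar_specialisation<T>::value,
                              T, std::array<T, 1>>::type;

// Number of lanes assumed in this sketch: a 128-bit register for float and
// double when the compiler defines __SSE2__, and one lane in every other case.
template <typename T>
constexpr std::size_t lanes() {
#if defined(__SSE2__)
    return (std::is_same<T, float>::value || std::is_same<T, double>::value)
               ? 16 / sizeof(T)
               : 1;
#else
    return 1;
#endif
}
```

With this selector, fallback_storage_t<double> is plain double, while a type without a scalar specialisation, such as int16_t, becomes std::array<int16_t, 1>. The selection happens entirely at compile time, so it adds no run-time cost.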

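The idea behind an rdtsc-based timer with a memory clobber and serialisation can be sketched as follows. This is not Fastor's timeit: the names clobber_memory, read_ticks and min_ticks_per_call are hypothetical, lfence is used as the ordering fence purely as an assumption (the notes do not say which serialising instruction Fastor uses), and the sketch falls back to std::chrono::steady_clock on targets other than x86-64 with GCC or Clang.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <x86intrin.h>
#define SKETCH_HAS_RDTSC 1
#else
#define SKETCH_HAS_RDTSC 0
#endif

// Compiler-level barrier: stops the compiler from moving memory accesses
// across the timed region. It emits no machine instruction.
inline void clobber_memory() {
#if defined(__GNUC__) || defined(__clang__)
    asm volatile("" ::: "memory");
#endif
}

// Read a tick counter. On x86-64 the time-stamp counter is read between two
// lfence instructions so that earlier instructions finish before the read and
// later ones do not start before it.
inline std::uint64_t read_ticks() {
#if SKETCH_HAS_RDTSC
    _mm_lfence();
    const std::uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
#else
    // Portable fall-back: clock ticks rather than CPU cycles.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Call f() `runs` times and return the smallest tick count observed for a
// single call, which reduces the influence of interrupts and cache misses.
template <typename F>
std::uint64_t min_ticks_per_call(F&& f, std::uint64_t runs = 1000) {
    assert(runs > 0);
    std::uint64_t best = ~std::uint64_t(0);
    for (std::uint64_t i = 0; i < runs; ++i) {
        clobber_memory();
        const std::uint64_t t0 = read_ticks();
        f();
        clobber_memory();
        const std::uint64_t t1 = read_ticks();
        if (t1 - t0 < best) best = t1 - t0;
    }
    return best;
}
```

Reporting the minimum rather than the mean is a design choice of this sketch. The returned value is in counter ticks, which on x86-64 are time-stamp counter cycles and not necessarily core clock cycles.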