Tfloat float and double coexistance -- working towards that goal #7

GerHobbelt · 2021-07-13T14:52:39Z

This started out as tesseract-ocr#3490 and is the answer to the questions:

1. Can we get rid of a lot of #if/else FAST_FLOAT?
2. @stweil said he would like float and double run-times to be both present and selectable at run-time. How much effort would it take?

(Answer to Q2: clearly more than a day; plus more thought on this required as this solution at least gets you half-way, but then it stops being useful -- DRY or not to DRY 🤔 😉 )

The key idea is to use function templates (`template <class TFloat> ...`) and then let the compiler do the heavy lifting of picking and instantiating the required ones where applicable. (Turns out only one place needed explicit template instantiation -- still pretty clean, DRY-wise -- and that was halfway through the sprint, so now I wonder if those two explicit instantiations can be dropped after all. 🤔 )
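To make the idea concrete, here is a minimal sketch of a function template with two explicit instantiations. The name `DotProductGeneric` is hypothetical and is not taken from the Tesseract sources; it only illustrates the pattern described above.

```cpp
#include <cstddef>

// Hypothetical helper (not from the Tesseract sources): a dot product
// written once for any floating-point type.
template <class TFloat>
TFloat DotProductGeneric(const TFloat *u, const TFloat *v, std::size_t n) {
  TFloat total = 0;
  for (std::size_t k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}

// Explicit instantiations. These are only required when the template body
// lives in a .cpp file and other translation units see just a declaration;
// when the body is visible at the call site, the compiler instantiates the
// needed variant implicitly.
template float DotProductGeneric<float>(const float *, const float *, std::size_t);
template double DotProductGeneric<double>(const double *, const double *, std::size_t);
```

A call such as `DotProductGeneric(a, b, n)` deduces `TFloat` from the argument types, so no `#if`/`#else` is needed at the call site.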

Note, for example, the disappearance of the function duplicates in src/arch/intsimdmatrixavx2.cpp thanks to the function templates: only the float/double-specific stuff has to stay, while the rest of the code in there is now using that `template <class TFloat>` line...
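A rough sketch of that split, under the assumption that the type-specific parts are ordinary overloads and the shared part is a template. Both function names are invented for illustration; real SIMD code would wrap intrinsics where this sketch only returns lane counts for a 256-bit register.

```cpp
#include <cstddef>

// Type-specific pieces: plain overloads, one per floating-point type.
// Stand-ins for the float/double-specific SIMD code that has to stay.
inline std::size_t LanesPerRegister(float) { return 256 / 32; }   // 8 floats
inline std::size_t LanesPerRegister(double) { return 256 / 64; }  // 4 doubles

// Shared code: written once. Overload resolution picks the matching
// type-specific helper, so no #if/#else is involved.
template <class TFloat>
std::size_t RegistersNeeded(std::size_t n) {
  const std::size_t lanes = LanesPerRegister(TFloat{});
  return (n + lanes - 1) / lanes;  // round up to whole registers
}
```

`RegistersNeeded<float>(16)` and `RegistersNeeded<double>(16)` go through the same template body but reach different overloads.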

Anyway, all the work is collected in the first commit. The second one is just a few unrelated cleanup bits, which I will check against mainline to see if they came from there and then file a pullreq there for those few minor housekeeping items.

Hope you like the idea and how it works out. (Serialization becomes easy to read and review when you question the code: "what format are you reading or writing now? float or double?" -- but that's me being happy about it with 20/20 hindsight after coding this.)

GerHobbelt · 2021-07-15T13:29:31Z

[Edit: rebased & squashed to your bleeding edge tfloat branch]

Up to now Tesseract used double for training and recognition with "best" models. This commit replaces double by a new data type TFloat which is double by default, but float if FAST_FLOAT is defined. Ideally this should allow faster training. Signed-off-by: Stefan Weil <[email protected]>
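As a minimal sketch of what such a compile-time switch can look like (the `sketch` namespace is a placeholder for this illustration, and the actual declaration in the branch may differ):

```cpp
#include <type_traits>

namespace sketch {  // hypothetical namespace, used only for this example
#ifdef FAST_FLOAT
using TFloat = float;   // faster, less precise
#else
using TFloat = double;  // default
#endif
}  // namespace sketch

// Code written against TFloat compiles unchanged for either choice.
static_assert(std::is_floating_point<sketch::TFloat>::value,
              "TFloat must be a floating-point type");
```

With this shape the choice is fixed per build, which is why having both variants present in one binary needs a different mechanism.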


…aster" This partially reverts commit 122daf1, reversing changes made to 4cd56dc.

This fixes a fatal assertion for certain images: `cell_y_.size() >= 2 && cell_x_.size() >= 2:Error:Assert failed:in file ../../../src/textord/tablerecog.cpp, line 363`

Signed-off-by: Stefan Weil <[email protected]>


…g function templates for TFloat float & double implementations to co-exist in the run-time without cluttering the code with #if/#else and no run-time switches (yet).

## Observations thus far

- DRY? Check!
- The whole function template idea (let the C++ compiler do the heavy lifting) stops somewhere. Regrettably, that is at the weightmatrix.cpp code, which calls the CPU+configuration-selected SIMD implementation via a function pointer: `intSimdMatrix->matrixDotVectorFunction`. Going further would require code duplication of some kind (e.g. an FP32 callback pointer co-existing with an FP64 callback pointer in the struct, with the code picking the right one depending on the current TFloat size) and is thus deemed unsatisfactory (my opinion).
- So far, and very probably independent of any solutions for the co-existence issue at higher levels in the code, this template approach works out well, with the compiler smartly picking the one matching the current float/double choice.
- While we have double the number of specialized SIMD implementations (obviously), these do not need #if/#else checks as we can let the C++ compiler do its prototype matching job --> cleaner code.
- The template functions also help clean up the serialization/de-serialization code, as the `<T, ST>` dual-type approach there allows one to specify the run-time type (TFloat) and the file-storage type at the same time. Also note how this cleans up the 'Old' scales deserialization code, as the old file storage is simply 'float' instead of 'double'.
- The added cost there is a double copy of file data when T==ST, but that turned out negligible in the preliminary tests as that bit of code didn't even reach the Top20 CPU Guzzlers Chart, so that extra copy can wait for smarter C++ template writers to take care of when microtuning is called for.

GerHobbelt mentioned this pull request Jul 13, 2021

TFloat (FAST_FLOAT) work done & slightly different idea used to make code easily switchable between double & float tesseract-ocr/tesseract#3490

Closed

stweil force-pushed the tfloat branch from 81db6f1 to 13829cd Compare July 14, 2021 20:23

GerHobbelt force-pushed the tfloat-float-and-double-coexist branch from 5d620a5 to 25ea26f Compare July 15, 2021 13:28

stweil and others added 18 commits July 15, 2021 16:09

Fix some compiler warnings

fa1850f

Signed-off-by: Stefan Weil <[email protected]>

Optimize DotProductStdInnerProduct for float

507b8cb

Signed-off-by: Stefan Weil <[email protected]>

Avoid double / float conversion

0f9acea

Signed-off-by: Stefan Weil <[email protected]>

Implement TFloat for IntSimdMatrix

16437ca

Signed-off-by: Stefan Weil <[email protected]>

Test more implementations of DotProduct

8e77429

Signed-off-by: Stefan Weil <[email protected]>

Add unittest for dotproduct

c7034f0

Signed-off-by: Stefan Weil <[email protected]>

Support Apple Accelerate framework for training and best models

f497c18

Signed-off-by: Stefan Weil <[email protected]>

Fix TFloat builds for Apple M1

4676d22

Signed-off-by: Stefan Weil <[email protected]>

Fix DotProductNative for TFloat

82236f6

Signed-off-by: Stefan Weil <[email protected]>

bugfix of FMA port to FAST_FLOAT: 8 float FPs fit in a single 256bit …

1402521

…vector (8x32) (contrasting 4 double FPs: 4*64)

extracted from 3490: implements DotProductSSE() for FAST_FLOAT

00feac2

bugfixing the AVX2 Extract8+16 codes, where there's lines like `__m25…

dd3b5f2

…6d scale01234567 = _mm256_loadu_ps(scales)`, i.e. loading float vectors into double vector types. Extract from tesseract-ocr#3490.

Improve build code for native dotproduct

a9ff366

Signed-off-by: Stefan Weil <[email protected]>

Enhance unittest/dotproduct_test

9c6503b

Signed-off-by: Stefan Weil <[email protected]>

Remove test code for fast float dotproduct

284fdb0

Signed-off-by: Stefan Weil <[email protected]>

Implement fast float dotproduct for SSE IntSimdMatrix

a71edc9

Signed-off-by: Stefan Weil <[email protected]>

stweil force-pushed the tfloat branch from 554bd1b to ffea0f2 Compare July 15, 2021 14:26

GerHobbelt added 3 commits July 15, 2021 16:38

Place TFloat type in the tesseract namespace, same as has been done w…

77cd861

…ith all other tesseract defined types, to prevent collisions with thirdparty software.

just a couple of 'shadowed local variables' compiler warning fixes th…

8d1c1e1

…at got through while I manually extracted the template work from my mainline (warnings due to running MSVC at Level 4) [sw]: Use different fix for blamer.cpp Signed-off-by: Stefan Weil <[email protected]>

GerHobbelt force-pushed the tfloat-float-and-double-coexist branch from 25ea26f to 97834d0 Compare July 15, 2021 15:56

stweil force-pushed the tfloat branch 3 times, most recently from 3fd7053 to 0d412a8 Compare July 20, 2021 18:46

stweil force-pushed the tfloat branch 9 times, most recently from 894e7c4 to 2759788 Compare July 24, 2021 13:14

stweil closed this Jul 25, 2021

stweil deleted the branch stweil:tfloat July 25, 2021 05:50
