TFloat (FAST_FLOAT) work done & slightly different idea used to make code easily switchable between double & float #3490

Closed
wants to merge 486 commits into from
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jan 15, 2021

  1. Merge remote-tracking branch 'remotes/stweil/network-string'

    # Conflicts:
    #	src/training/combine_tessdata.cpp
    GerHobbelt committed Jan 15, 2021
    cc2f5be
  2. Merge remote-tracking branch 'remotes/UB-Mannheim/windows'

    # Conflicts:
    #	src/ccutil/errcode.h
    #	src/ccutil/serialis.cpp
    #	src/ccutil/tprintf.h
    #	src/viewer/scrollview.h
    GerHobbelt committed Jan 15, 2021
    ebfb844
  3. Merge remote-tracking branch 'remotes/stweil/fuzzers'

    # Conflicts:
    #	Makefile.am
    #	src/ccutil/helpers.h
    #	src/ccutil/scanutils.h
    #	src/ccutil/tprintf.h
    #	unittest/Makefile.am
    GerHobbelt committed Jan 15, 2021
    cacad1b
  4. 81b21e4
  5. Merge remote-tracking branch 'remotes/UB-Mannheim/windows'

    # Conflicts:
    #	dll/i686-w64-mingw32/iconv.dll
    #	dll/i686-w64-mingw32/icudt64.dll
    #	dll/i686-w64-mingw32/icuin64.dll
    #	dll/i686-w64-mingw32/icuuc64.dll
    #	dll/i686-w64-mingw32/libarchive-13.dll
    #	dll/i686-w64-mingw32/libbz2-1.dll
    #	dll/i686-w64-mingw32/libcairo-2.dll
    #	dll/i686-w64-mingw32/libcurl-4.dll
    #	dll/i686-w64-mingw32/libeay32.dll
    #	dll/i686-w64-mingw32/libexpat-1.dll
    #	dll/i686-w64-mingw32/libffi-6.dll
    #	dll/i686-w64-mingw32/libfontconfig-1.dll
    #	dll/i686-w64-mingw32/libfreetype-6.dll
    #	dll/i686-w64-mingw32/libgcc_s_sjlj-1.dll
    #	dll/i686-w64-mingw32/libgif-7.dll
    #	dll/i686-w64-mingw32/libglib-2.0-0.dll
    #	dll/i686-w64-mingw32/libgobject-2.0-0.dll
    #	dll/i686-w64-mingw32/libgomp-1.dll
    #	dll/i686-w64-mingw32/libharfbuzz-0.dll
    #	dll/i686-w64-mingw32/libintl-8.dll
    #	dll/i686-w64-mingw32/libjbig-2.dll
    #	dll/i686-w64-mingw32/libjpeg-8.dll
    #	dll/i686-w64-mingw32/liblept-5.dll
    #	dll/i686-w64-mingw32/liblz4-1.dll
    #	dll/i686-w64-mingw32/liblzma-5.dll
    #	dll/i686-w64-mingw32/liblzo2-2.dll
    #	dll/i686-w64-mingw32/libnettle-6.dll
    #	dll/i686-w64-mingw32/libnghttp2-14.dll
    #	dll/i686-w64-mingw32/libopenjp2.dll
    #	dll/i686-w64-mingw32/libpango-1.0-0.dll
    #	dll/i686-w64-mingw32/libpangocairo-1.0-0.dll
    #	dll/i686-w64-mingw32/libpangoft2-1.0-0.dll
    #	dll/i686-w64-mingw32/libpangowin32-1.0-0.dll
    #	dll/i686-w64-mingw32/libpcre-1.dll
    #	dll/i686-w64-mingw32/libpixman-1-0.dll
    #	dll/i686-w64-mingw32/libpng16-16.dll
    #	dll/i686-w64-mingw32/libssh2-1.dll
    #	dll/i686-w64-mingw32/libstdc++-6.dll
    #	dll/i686-w64-mingw32/libtiff-5.dll
    #	dll/i686-w64-mingw32/libwebp-7.dll
    #	dll/i686-w64-mingw32/libwinpthread-1.dll
    #	dll/i686-w64-mingw32/libxml2-2.dll
    #	dll/i686-w64-mingw32/libzstd-1.dll
    #	dll/i686-w64-mingw32/ssleay32.dll
    #	dll/i686-w64-mingw32/zlib1.dll
    #	dll/x86_64-w64-mingw32/iconv.dll
    #	dll/x86_64-w64-mingw32/icudt64.dll
    #	dll/x86_64-w64-mingw32/icuin64.dll
    #	dll/x86_64-w64-mingw32/icuuc64.dll
    #	dll/x86_64-w64-mingw32/libarchive-13.dll
    #	dll/x86_64-w64-mingw32/libbz2-1.dll
    #	dll/x86_64-w64-mingw32/libcairo-2.dll
    #	dll/x86_64-w64-mingw32/libcurl-4.dll
    #	dll/x86_64-w64-mingw32/libeay32.dll
    #	dll/x86_64-w64-mingw32/libexpat-1.dll
    #	dll/x86_64-w64-mingw32/libffi-6.dll
    #	dll/x86_64-w64-mingw32/libfontconfig-1.dll
    #	dll/x86_64-w64-mingw32/libfreetype-6.dll
    #	dll/x86_64-w64-mingw32/libgcc_s_seh-1.dll
    #	dll/x86_64-w64-mingw32/libgif-7.dll
    #	dll/x86_64-w64-mingw32/libglib-2.0-0.dll
    #	dll/x86_64-w64-mingw32/libgobject-2.0-0.dll
    #	dll/x86_64-w64-mingw32/libgomp-1.dll
    #	dll/x86_64-w64-mingw32/libharfbuzz-0.dll
    #	dll/x86_64-w64-mingw32/libintl-8.dll
    #	dll/x86_64-w64-mingw32/libjbig-2.dll
    #	dll/x86_64-w64-mingw32/libjpeg-8.dll
    #	dll/x86_64-w64-mingw32/liblept-5.dll
    #	dll/x86_64-w64-mingw32/liblz4-1.dll
    #	dll/x86_64-w64-mingw32/liblzma-5.dll
    #	dll/x86_64-w64-mingw32/liblzo2-2.dll
    #	dll/x86_64-w64-mingw32/libnettle-6.dll
    #	dll/x86_64-w64-mingw32/libnghttp2-14.dll
    #	dll/x86_64-w64-mingw32/libopenjp2.dll
    #	dll/x86_64-w64-mingw32/libpango-1.0-0.dll
    #	dll/x86_64-w64-mingw32/libpangocairo-1.0-0.dll
    #	dll/x86_64-w64-mingw32/libpangoft2-1.0-0.dll
    #	dll/x86_64-w64-mingw32/libpangowin32-1.0-0.dll
    #	dll/x86_64-w64-mingw32/libpcre-1.dll
    #	dll/x86_64-w64-mingw32/libpixman-1-0.dll
    #	dll/x86_64-w64-mingw32/libpng16-16.dll
    #	dll/x86_64-w64-mingw32/libssh2-1.dll
    #	dll/x86_64-w64-mingw32/libstdc++-6.dll
    #	dll/x86_64-w64-mingw32/libtiff-5.dll
    #	dll/x86_64-w64-mingw32/libwebp-7.dll
    #	dll/x86_64-w64-mingw32/libwinpthread-1.dll
    #	dll/x86_64-w64-mingw32/libxml2-2.dll
    #	dll/x86_64-w64-mingw32/libzstd-1.dll
    #	dll/x86_64-w64-mingw32/ssleay32.dll
    #	dll/x86_64-w64-mingw32/zlib1.dll
    #	src/ccutil/errcode.h
    #	src/ccutil/tprintf.h
    #	src/viewer/scrollview.h
    GerHobbelt committed Jan 15, 2021
    0b418f5
  6. e30195a

Commits on Jan 27, 2021

  1. Merge remote-tracking branch 'remotes/ulb-sachsen-anhalt/master'

    # Conflicts:
    #	configure.ac
    GerHobbelt committed Jan 27, 2021
    c84f864
  2. Merge remote-tracking branch 'remotes/tesseract-ocr/master'

    # Conflicts:
    #	Makefile.am
    #	unittest/Makefile.am
    GerHobbelt committed Jan 27, 2021
    fd58d5a

Commits on Jan 29, 2021

  1. 1d717cc
  2. updated Pix input format handling

    # Conflicts:
    #	src/api/pdfrenderer.cpp
    Kubiria authored and GerHobbelt committed Jan 29, 2021
    62dfe0b

Commits on Jan 30, 2021

  1. d92c8c7

Commits on Feb 1, 2021

  1. f3f83c5
  2. 4ee6d50

Commits on Feb 7, 2021

  1. 9e725ec
  2. d632e6c
  3. 045d491
  4. 2edbe0d

Commits on Feb 11, 2021

  1. 33e90db
  2. clang fix?

    GerHobbelt committed Feb 11, 2021
    589c139

Commits on Feb 13, 2021

  1. 34f2eb0

Commits on Feb 18, 2021

  1. c570bbf
  2. c319286
  3. update submodules

    GerHobbelt committed Feb 18, 2021
    d91dd4d

Commits on Feb 21, 2021

  1. 99be81c

Commits on Feb 22, 2021

  1. updated submodules

    GerHobbelt committed Feb 22, 2021
    1e72f9d

Commits on Feb 26, 2021

  1. Implement unpack for lstmf files

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Feb 26, 2021
    ed5e40e
  2. Support lstmf files with more than one line

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Feb 26, 2021
    5876fc4
  3. Add missing include statement for access

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Feb 26, 2021
    0995870
  4. Add info command

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Feb 26, 2021
    f439170

Commits on Feb 27, 2021

  1. eb62f07

Commits on Feb 28, 2021

  1. Use Apple Accelerate framework for training and best models

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Feb 28, 2021
    5469248
  2. Add results for Intel Core i5

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Feb 28, 2021
    f5cb128

Commits on Mar 5, 2021

  1. Implement unpack for lstmf files

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    9abc1bd
  2. Support lstmf files with more than one line

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    abe760d
  3. Add missing include statement for access

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    f689070
  4. Add info command

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    83a925b
  5. Don't use threads for loading documents

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    9f70fd1
  6. Use Apple Accelerate framework for training and best models

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    a004e14
  7. Remove unused code for serialization

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Mar 5, 2021
    1414854

Commits on Mar 8, 2021

  1. Merge commit '1ab6b0fbc6ec757e9e7be44802448bcfd62df187'

    # Conflicts:
    #	.github/workflows/sw.yml
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Mar 8, 2021
    2b9b9b9
  2. - fix merge

    - make sure everyone uses tprintf() instead of printf() / fprintf(), so any print output gets routed through the fz_error/warning/info calls and lands in a place where we can actually see/do something with it.
    GerHobbelt committed Mar 8, 2021
    99eedbe
  3. Merge remote-tracking branch 'remotes/UB-Mannheim/master'

    # Conflicts:
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Mar 8, 2021
    49a4d07
  4. Merge commit '0cde3ede98ca9f63ea0ef94c294aee67243aaaa0'

    # Conflicts:
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Mar 8, 2021
    2ce0ab9
  5. e82d245
  6. 8aa85e0
  7. Merge remote-tracking branch 'remotes/stweil/unpack'

    # Conflicts:
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Mar 8, 2021
    16c577c
  8. Merge remote-tracking branch 'remotes/stweil/accelerate'

    # Conflicts:
    #	src/arch/simddetect.cpp
    GerHobbelt committed Mar 8, 2021
    245cf59
  9. added the tesseract training tools to the MuPDF build project.

    exported each utility as a separate function (to be invoked via `mutool`)
    GerHobbelt committed Mar 8, 2021
    771c10a
  10. Merge remote-tracking branch 'remotes/tesseract-ocr/master'

    # Conflicts:
    #	src/ccutil/serialis.cpp
    #	src/ccutil/serialis.h
    #	src/ccutil/unicharcompress.cpp
    GerHobbelt committed Mar 8, 2021
    57c9fe1

Commits on Mar 11, 2021

  1. Merge remote-tracking branch 'remotes/tesseract-ocr/master'

    # Conflicts:
    #	src/ccmain/tessedit.cpp
    #	src/ccmain/tesseractclass.h
    #	src/ccutil/strngs.cpp
    #	src/ccutil/strngs.h
    #	src/lstm/lstmrecognizer.cpp
    GerHobbelt committed Mar 11, 2021
    88deb91
  2. fix errors after merge commit: missing changes that are needed too to make this codebase compile.

    GerHobbelt committed Mar 11, 2021
    33b0a77
  3. fix errors after merge commit: missing changes that are needed too to make this codebase compile.

    GerHobbelt committed Mar 11, 2021
    3216647
  4. Merge branch 'winpatch1'

    GerHobbelt committed Mar 11, 2021
    964a00e
  5. Update src/wordrec/wordrec.h

    stweil authored Mar 11, 2021
    3921273

Commits on Mar 12, 2021

  1. 712953c
  2. deee736

Commits on Mar 20, 2021

  1. Merge remote-tracking branch 'remotes/tesseract-ocr/master'

    # Conflicts:
    #	include/tesseract/baseapi.h
    #	include/tesseract/capi.h
    #	include/tesseract/export.h
    #	src/api/baseapi.cpp
    #	src/api/capi.cpp
    #	src/api/pdfrenderer.cpp
    #	src/api/tesseractmain.cpp
    #	src/arch/simddetect.cpp
    #	src/ccmain/applybox.cpp
    #	src/ccmain/output.cpp
    #	src/ccmain/pageiterator.cpp
    #	src/ccmain/paragraphs.cpp
    #	src/ccmain/resultiterator.cpp
    #	src/ccmain/tessedit.cpp
    #	src/ccmain/tesseractclass.h
    #	src/ccmain/thresholder.cpp
    #	src/ccstruct/boxread.cpp
    #	src/ccstruct/coutln.h
    #	src/ccstruct/imagedata.cpp
    #	src/ccstruct/mod128.cpp
    #	src/ccstruct/ocrblock.cpp
    #	src/ccstruct/points.h
    #	src/ccstruct/polyaprx.cpp
    #	src/ccstruct/rect.h
    #	src/ccutil/errcode.h
    #	src/ccutil/genericvector.h
    #	src/ccutil/host.h
    #	src/ccutil/params.cpp
    #	src/ccutil/scanutils.h
    #	src/ccutil/serialis.cpp
    #	src/ccutil/strngs.cpp
    #	src/ccutil/strngs.h
    #	src/ccutil/tessdatamanager.cpp
    #	src/ccutil/tessdatamanager.h
    #	src/ccutil/tprintf.cpp
    #	src/ccutil/tprintf.h
    #	src/ccutil/unicharcompress.cpp
    #	src/ccutil/unicharcompress.h
    #	src/ccutil/unicharset.cpp
    #	src/dict/dawg.cpp
    #	src/dict/permdawg.cpp
    #	src/dict/stopper.cpp
    #	src/dict/trie.cpp
    #	src/lstm/input.cpp
    #	src/lstm/lstmrecognizer.cpp
    #	src/lstm/recodebeam.cpp
    #	src/lstm/series.cpp
    #	src/lstm/tfnetwork.cpp
    #	src/lstm/tfnetwork.h
    #	src/opencl/oclkernels.h
    #	src/opencl/openclwrapper.h
    #	src/textord/blkocc.cpp
    #	src/textord/drawtord.cpp
    #	src/textord/fpchop.cpp
    #	src/textord/makerow.cpp
    #	src/textord/oldbasel.cpp
    #	src/textord/pithsync.cpp
    #	src/textord/pitsync1.cpp
    #	src/textord/strokewidth.cpp
    #	src/textord/topitch.cpp
    #	src/textord/tordmain.cpp
    #	src/textord/tospace.cpp
    #	src/textord/wordseg.cpp
    #	src/training/ambiguous_words.cpp
    #	src/training/classifier_tester.cpp
    #	src/training/cntraining.cpp
    #	src/training/combine_lang_model.cpp
    #	src/training/combine_tessdata.cpp
    #	src/training/common/commandlineflags.cpp
    #	src/training/common/commandlineflags.h
    #	src/training/common/commontraining.cpp
    #	src/training/common/commontraining.h
    #	src/training/common/mastertrainer.h
    #	src/training/dawg2wordlist.cpp
    #	src/training/lstmeval.cpp
    #	src/training/lstmtraining.cpp
    #	src/training/merge_unicharsets.cpp
    #	src/training/mftraining.cpp
    #	src/training/pango/boxchar.cpp
    #	src/training/pango/boxchar.h
    #	src/training/pango/ligature_table.cpp
    #	src/training/pango/pango_font_info.cpp
    #	src/training/pango/pango_font_info.h
    #	src/training/pango/stringrenderer.cpp
    #	src/training/pango/stringrenderer.h
    #	src/training/set_unicharset_properties.cpp
    #	src/training/text2image.cpp
    #	src/training/unicharset/icuerrorcode.cpp
    #	src/training/unicharset/icuerrorcode.h
    #	src/training/unicharset/normstrngs.cpp
    #	src/training/unicharset/unicharset_training_utils.cpp
    #	src/training/unicharset/validate_grapheme.cpp
    #	src/training/unicharset/validate_myanmar.cpp
    #	src/training/unicharset/validator.cpp
    #	src/training/unicharset_extractor.cpp
    #	src/training/wordlist2dawg.cpp
    #	src/viewer/scrollview.h
    GerHobbelt committed Mar 20, 2021
    ce5345b
  2. code inspection: all printf() -> tprintf() + make sure all error messages are prefixed with 'ERROR:' (and warning messages with 'WARNING:') for proper handling and dispatching in caller code (MuPDF-based tools which feed these messages to log file(s))

    GerHobbelt committed Mar 20, 2021
    1fe5cd5
  3. c983073
  4. tweak: allow Leptonica to yak in debug builds when the severity environment variable has not been set. This modifies the behaviour as mentioned in commit SHA-1: 55d87f6

    GerHobbelt committed Mar 20, 2021
    33f7878
  5. CMake: SW_BUILD=OFF everywhere: we don't have SW. (See also same issue in Leptonica: I guess someone created or at least *edited* the CMakefiles for both)

    GerHobbelt committed Mar 20, 2021
    d52dba8
  6. cc58bdd
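
The 'ERROR:' / 'WARNING:' prefix convention described in commit 1fe5cd5 lets calling code sort messages by severity with a plain prefix check. A minimal sketch of such a caller-side classifier follows. The names `Severity` and `ClassifyMessage` are hypothetical and not part of Tesseract or MuPDF, and treating unprefixed text as informational is an assumption.

```cpp
#include <string>

// Hypothetical severity levels, for illustration only.
enum class Severity { kInfo, kWarning, kError };

// Classify a message by its leading tag, following the convention named in
// the commit message: errors start with "ERROR:", warnings with "WARNING:".
// Anything without a recognised prefix is treated as informational
// (an assumption of this sketch).
Severity ClassifyMessage(const std::string& msg) {
  // rfind(prefix, 0) == 0 is true only when msg starts with prefix.
  if (msg.rfind("ERROR:", 0) == 0) return Severity::kError;
  if (msg.rfind("WARNING:", 0) == 0) return Severity::kWarning;
  return Severity::kInfo;
}
```

A log sink could call `ClassifyMessage` on each line it receives and route the line to the matching log level.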

Commits on Mar 21, 2021

  1. aec1500
  2. 3229481
  3. 76d2c1f
  4. Merge remote-tracking branch 'remotes/bhfo/master'

    # Conflicts:
    #	src/api/capi.cpp
    #	src/training/common/commandlineflags.cpp
    GerHobbelt committed Mar 21, 2021
    c382ad9
  5. Merge remote-tracking branch 'remotes/tesseract-ocr/master'

    # Conflicts:
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Mar 21, 2021
    059ec69

Commits on Mar 22, 2021

  1. Fix SIMD architecture detection logic.

    Originally, this code would have complained if it was ever compiled
    on a platform that didn't support it.
    
    We changed this so that every file could be built on every platform
    for simplicity of build files. Attempting to build (say) an SSE
    file on a platform that didn't support SSE will just compile away
    to nothing.
    
    Unfortunately, while making this change, I didn't remove the
    slightly strange state whereby it would be impossible to build
    without SSE optimisations on a platform that supported them.
    
    To fix this, I've removed the lines.
    robinwatts committed Mar 22, 2021
    1df5db9
  2. 0fac00f
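
The build behaviour described in commit 1df5db9 can be pictured roughly as follows: the SIMD code sits behind a preprocessor guard so that it compiles away to nothing where the instruction set is unavailable, and a separate switch allows a build without it even where it is available. This is only a sketch under stated assumptions. The macros `SKETCH_DISABLE_SSE` and `SKETCH_HAVE_SSE` and the function names are hypothetical, and SSE2 is used here purely for illustration; none of this is taken from the Tesseract sources.

```cpp
#include <cstddef>

// Hypothetical opt-out switch: define it to build without the SSE path
// even on a platform that supports SSE.
// #define SKETCH_DISABLE_SSE

#if defined(__SSE2__) && !defined(SKETCH_DISABLE_SSE)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE 1

// SSE2 dot product: two doubles per step, scalar loop for the tail.
double DotProductSSE(const double* u, const double* v, std::size_t n) {
  __m128d sum = _mm_setzero_pd();
  std::size_t i = 0;
  for (; i + 2 <= n; i += 2) {
    sum = _mm_add_pd(sum, _mm_mul_pd(_mm_loadu_pd(u + i), _mm_loadu_pd(v + i)));
  }
  double lanes[2];
  _mm_storeu_pd(lanes, sum);
  double total = lanes[0] + lanes[1];
  for (; i < n; ++i) total += u[i] * v[i];
  return total;
}
#endif  // On other platforms this whole block compiles away to nothing.

// Portable fallback that is always built.
double DotProductGeneric(const double* u, const double* v, std::size_t n) {
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) total += u[i] * v[i];
  return total;
}

// Use the SIMD path only when it was actually compiled in.
double DotProduct(const double* u, const double* v, std::size_t n) {
#if defined(SKETCH_HAVE_SSE)
  return DotProductSSE(u, v, n);
#else
  return DotProductGeneric(u, v, n);
#endif
}
```

Because the guard checks both the compiler's architecture macro and the opt-out switch, the same file builds on every platform, and SSE can still be turned off where it is supported.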

Commits on Mar 29, 2021

  1. 8fa0c71
  2. Update training simplemake makefiles to correspond to master

    Also fix a missing \ at line ending
    nickjwhite committed Mar 29, 2021
    f0bdc7b
  3. b358a6a
  4. d396db9
  5. 852345e
  6. simplemake: Ensure version.h is generated first, update to c++20, and ensure all .h files are correctly included in the build

    nickjwhite committed Mar 29, 2021
    c211742
  7. 0e86e1f
  8. 7ae9001
  9. 7338bcf
  10. 68ecee5
  11. Merge remote-tracking branch 'remotes/Artifex/artifex'

    # Conflicts:
    #	src/arch/dotproductavx.cpp
    #	src/arch/dotproductfma.cpp
    #	src/arch/dotproductsse.cpp
    #	src/arch/intsimdmatrixavx2.cpp
    #	src/arch/intsimdmatrixsse.cpp
    GerHobbelt committed Mar 29, 2021
    b42bbe8
  12. Merge commit '205cd32184dfb3b9c4ad28681405babf76dbd7d0'

    # Conflicts:
    #	src/ccmain/paragraphs.cpp
    #	src/dict/trie.cpp
    #	src/training/unicharset/icuerrorcode.h
    #	src/training/unicharset_extractor.cpp
    GerHobbelt committed Mar 29, 2021
    b6f022a
  13. Merge commit '7677b80408db08fcd97399b9f462c783dc018962'

    # Conflicts:
    #	abseil
    #	src/api/baseapi.cpp
    #	src/api/pdfrenderer.cpp
    #	src/api/tesseractmain.cpp
    #	src/ccmain/applybox.cpp
    #	src/ccmain/pagesegmain.cpp
    #	src/ccstruct/imagedata.cpp
    #	src/ccstruct/mod128.cpp
    #	src/ccstruct/pageres.cpp
    #	src/lstm/lstmrecognizer.cpp
    #	src/textord/colpartitiongrid.cpp
    #	src/textord/makerow.cpp
    #	src/textord/oldbasel.cpp
    #	src/textord/strokewidth.cpp
    #	src/textord/topitch.cpp
    #	src/training/combine_tessdata.cpp
    #	src/training/mftraining.cpp
    #	src/training/text2image.cpp
    #	src/training/unicharset/lang_model_helpers.cpp
    #	src/training/unicharset/validate_grapheme.cpp
    #	src/training/unicharset/validate_indic.cpp
    #	src/training/unicharset/validate_javanese.cpp
    GerHobbelt committed Mar 29, 2021
    e497820
  14. 3421157
  15. 81d965c
  16. Merge remote-tracking branch 'remotes/bhfo/master'

    # Conflicts:
    #	src/training/unicharset/lang_model_helpers.cpp
    GerHobbelt committed Mar 29, 2021
    b743c1c
  17. 7aa3295

Commits on Mar 30, 2021

  1. 16dc9d9

Commits on Apr 1, 2021

  1. Merge remote-tracking branch 'remotes/Alan-love/master'

    # Conflicts:
    #	src/lstm/input.cpp
    #	src/lstm/lstmrecognizer.cpp
    #	src/viewer/scrollview.h
    GerHobbelt committed Apr 1, 2021
    ebc6c2e

Commits on Apr 3, 2021

  1. 734055d

Commits on Apr 7, 2021

  1. Fix function GetFirstWords and modernize function GetPrefixes

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Apr 7, 2021
    054dba3
  2. Modernize code

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Apr 7, 2021
    8404cf1
  3. Update submodule abseil to tagged release 20210324.0

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Apr 7, 2021
    af5cb05

Commits on Apr 8, 2021

  1. Add linking fix to Leptonica

    Julian Kent committed Apr 8, 2021
    dfe8410
  2. updated submodules

    GerHobbelt committed Apr 8, 2021
    d731187

Commits on Apr 12, 2021

  1. 28756df

Commits on Apr 14, 2021

  1. 856611d
  2. 4658aee
  3. Merge remote-tracking branch 'remotes/stweil/master'

    # Conflicts:
    #	abseil
    GerHobbelt committed Apr 14, 2021
    da0e033
  4. Merge commit 'f77b1c68814b15d0a2638b17aafb08ca96e26ccd'

    # Conflicts:
    #	include/tesseract/baseapi.h
    #	include/tesseract/capi.h
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Apr 14, 2021
    0735f9d
  5. e199ec7
  6. 9f826ec
  7. bffa449
  8. 4615014

Commits on Apr 20, 2021

  1. 986f3a2

Commits on Apr 22, 2021

  1. a59cefa
  2. whitespace

    GerHobbelt committed Apr 22, 2021
    28daea4

Commits on Apr 26, 2021

  1. 217f7cf
  2. 7440da5
  3. include small bash shell script to run CMake with the required path defines, etc. so we don't have to re-invent that wheel every time around.

    GerHobbelt committed Apr 26, 2021
    46c4517
  4. 5ea3419
  5. lstmeval: Improve output by ensuring 'Truth:' text is encoded the same as OCR output
    
    This ensures that transformations like unicode normalisation are done on
    the truth output as well as the OCR output, so that you can compare
    the two properly.
    
    Before this a perfect OCR could show different lines for Truth and OCR
    if the OCR output included characters that were normalised.
    nickjwhite committed Apr 26, 2021
    f49bc18
  6. lstmeval: Improve output by ensuring 'Truth:' text is encoded the same as OCR output
    
    This ensures that transformations like unicode normalisation are done on
    the truth output as well as the OCR output, so that you can compare
    the two properly.
    
    Before this a perfect OCR could show different lines for Truth and OCR
    if the OCR output included characters that were normalised.
    nickjwhite committed Apr 26, 2021
    fb7542a
  7. 018bd6e

Commits on Apr 27, 2021

  1. 7aba1e4
  2. lstmeval: Improve output by ensuring 'Truth:' text is encoded the same as OCR output
    
    This ensures that transformations like unicode normalisation are done on
    the truth output as well as the OCR output, so that you can compare
    the two properly.
    
    Before this a perfect OCR could show different lines for Truth and OCR
    if the OCR output included characters that were normalised.
    nickjwhite authored and GerHobbelt committed Apr 27, 2021
    2acaac4

Commits on Apr 28, 2021

  1. dfcd8e0
  2. 2703237
  3. 5bb97f9
  4. Revert "lstmeval: Improve output by ensuring 'Truth:' text is encoded the same as OCR output"
    
    This reverts commit 2acaac4.
    
    # Conflicts:
    #	src/training/unicharset/lstmtester.cpp
    GerHobbelt committed Apr 28, 2021
    e7acb56

Commits on Apr 29, 2021

  1. Merge remote-tracking branch 'remotes/tesseract-ocr/master'

    # Conflicts:
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Apr 29, 2021
    cf9bc7f
  2. 06e6c72
  3. 2549a72

Commits on May 1, 2021

  1. c5ce25e

Commits on May 5, 2021

  1. b262c6c
  2. 8ea7877

Commits on May 6, 2021

  1. Remove "v" prefix for version in banner

    Instead of printing the version with an additional "v" (which leads to
    results like `Tesseract Open Source OCR Engine vv4.0.0-beta.4`),
    just print the version string in the banner text of Tesseract.
    
    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 6, 2021
    e6c9f77

Commits on May 10, 2021

  1. 1c19a9d
  2. e91de69
  3. e3a164d
  4. 5c12e2d
  5. updated submodules

    GerHobbelt committed May 10, 2021
    447b31c

Commits on May 11, 2021

  1. 9c82cc6
  2. lstmeval: Improve output by ensuring 'Truth:' text is encoded the same way as OCR output
    
    This ensures that transformations like unicode normalisation are done on
    the truth output as well as the OCR output, so that you can compare
    the two properly.
    
    Before this a perfect OCR result could show different lines for Truth and
    OCR if the OCR output included characters that were normalised.
    nickjwhite committed May 11, 2021
    6a2bf21

Commits on May 13, 2021

  1. c0eb39a
  2. d17a87f
  3. 756c188
  4. updated submodules

    GerHobbelt committed May 13, 2021
    3ed444c

Commits on May 15, 2021

  1. Fix function GetFirstWords and modernize function GetPrefixes

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 15, 2021
    b04fbb4
  2. Support Apple Accelerate framework for training and best models

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 15, 2021
    334ac9f
  3. Run unittest CI on push

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 15, 2021
    abef700

Commits on May 17, 2021

  1. fdea3ae
  2. df498d9

Commits on May 18, 2021

  1. 74705f4

Commits on May 19, 2021

  1. bdf9db9
  2. Support image width and height larger than 32767

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 19, 2021
    eb8f13b
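
A note on the 32767 figure in the commit title above: it is the largest value a signed 16-bit integer can hold. The sketch below assumes the old limit came from storing coordinates in 16 bits; the commit message does not say so, and the helper names are hypothetical rather than taken from the Tesseract sources.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: the previous limit came from 16-bit coordinate storage.
static_assert(std::numeric_limits<std::int16_t>::max() == 32767,
              "int16_t tops out at 32767");

// Hypothetical helper: true if a dimension still fits the narrow type.
inline bool fits_in_int16(std::int32_t value) {
  return value >= std::numeric_limits<std::int16_t>::min() &&
         value <= std::numeric_limits<std::int16_t>::max();
}

// Hypothetical helper: narrows only after checking the range.
inline std::int16_t to_int16(std::int32_t value) {
  assert(fits_in_int16(value));
  return static_cast<std::int16_t>(value);
}
```

An image 40000 pixels wide fails `fits_in_int16`, which is the kind of input a wider coordinate type would be needed for.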
  3. Fix warnings from LGTM

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 19, 2021
    28809d7

Commits on May 21, 2021

  1. Merge commit '19db25e5e5d6af3d50b3eb0971b82500630a3531'

    # Conflicts:
    #	abseil
    GerHobbelt committed May 21, 2021
    949f868
  2. dec75ee
  3. 7622111
  4. b952dd4
  5. Merge remote-tracking branch 'remotes/stweil/master'

    # Conflicts:
    #	src/arch/simddetect.cpp
    GerHobbelt committed May 21, 2021
    780a2ea
  6. Merge remote-tracking branch 'remotes/amitdo/threshold2'

    # Conflicts:
    #	include/tesseract/publictypes.h
    #	src/ccmain/thresholder.cpp
    GerHobbelt committed May 21, 2021
    dac82e3
  7. 1859d77
  8. fix warnings about imprecise float constants' conversion: making sure to write `0.1` as `0.1f`, etc. as these don't exactly map to an IEEE754 float32 value.

    GerHobbelt committed May 21, 2021
    7c44175
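
To illustrate the `0.1` versus `0.1f` point from the commit above, here is a minimal sketch. The helper names are made up for the example and are not from the Tesseract sources. An unsuffixed `0.1` is a double literal; assigning it to a float or mixing it into float arithmetic triggers the conversion the compiler warns about.

```cpp
#include <cassert>

// Hypothetical helper, not taken from the Tesseract sources.
inline float scale_by_tenth(float x) {
  // 0.1f is a float literal, so the whole expression has type float.
  // With a plain 0.1 the operand would be promoted to double and the
  // result narrowed back to float on return.
  return x * 0.1f;
}

// True when the float and double roundings of 0.1 are different values.
inline bool tenth_literals_differ() {
  return static_cast<double>(0.1f) != 0.1;
}

inline void check_literals() {
  // 0.1 has no finite binary expansion, so the two roundings cannot coincide.
  assert(tenth_literals_differ());
}
```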

Commits on May 22, 2021

  1. 21f45bd
  2. updated submodules

    GerHobbelt committed May 22, 2021
    e4a89fc

Commits on May 23, 2021

  1. updated submodules

    GerHobbelt committed May 23, 2021
    409b537

Commits on May 25, 2021

  1. Support image width and height larger than 32767

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 25, 2021
    1e5b5af
  2. Fix warnings from LGTM

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 25, 2021
    c380b75

Commits on May 27, 2021

  1. 4787a96

Commits on May 28, 2021

  1. Fix serialization for new larger coordinates

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed May 28, 2021
    d78b443

Commits on May 29, 2021

  1. make sure all the tesseract tools are visible to the outside in monolithic build mode: the non-supported ones for a given build will simply report that they are NIL operations anyway.

    GerHobbelt committed May 29, 2021
    8546613
  2. 0f6f954
  3. fix very nasty obscure crashes inside system std::xhash code when executing Tesseract Init code, loading the 'tesseract_best' English language file(s):
    
    - https://stackoverflow.com/questions/17885060/passing-reference-to-stl-vector-over-dll-boundary
    
    Though we DO NOT cross a DLL boundary with that stuff (all the relevant code is included in one single "monolithic" DLL and none of that C++ stuff got outside!) we still got inexplicable crashes this way.
    
    The KEY to fixing this: you MUST MAKE SURE ALL RELEVANT MSVC PROJECT FILES HAVE THE **EXACT** **SAME** COMPILER SETTINGS: this has now been 'fixed' for Debug/Win32 build mode only as a PoC: we haven't updated all libraries yet, but this was plenty enough to make the basic bulktest run succeed again (instead of crash fatally) when executing mudraw commands, writing to *.ocr.html output files.
    
    Also note another consequence of our C++ compiler settings fiddling:
    
    - https://stackoverflow.com/questions/5004858/why-is-stdmin-failing-when-windows-h-is-included
    
    we applied the `std::max<int>(a, b)` tweak mentioned there instead of looking for the proper place to plonk a NOMINMAX for windows.h as this was faster and easier, also when we consider future compiler settings changing again as we work on our 'update' script for vcxproj files (TODO!)
    GerHobbelt committed May 29, 2021
    07a7567
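
For readers who have not hit the `windows.h` min/max clash mentioned in the commit above, a minimal sketch of why the `std::max<int>(a, b)` spelling helps. The macro is simulated here so the example builds on any platform (that simulation is an assumption, as are the helper names); on Windows the real macro comes from the system headers unless NOMINMAX is defined.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for the function-like macro that <windows.h> defines when NOMINMAX
// is not set. Simulated here (an assumption) so the sketch builds anywhere.
#define max(a, b) (((a) > (b)) ? (a) : (b))

inline int larger_of(int a, int b) {
  // A function-like macro only expands when its name is followed by '('.
  // Here `max` is followed by '<', so the macro stays out of the way.
  const int result = std::max<int>(a, b);
  assert(result >= a && result >= b);
  return result;
}

inline int larger_of_parenthesized(int a, int b) {
  // Same effect: the ')' right after `max` prevents macro expansion.
  return (std::max)(a, b);
}

#undef max  // keep the simulated macro local to this sketch
```

Both spellings leave every other use of the macro untouched, which is why they are a quick local fix compared with defining NOMINMAX project-wide.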

Commits on May 30, 2021

  1. beefde5

Commits on Jun 10, 2021

  1. 80bfa8c
  2. Merge remote-tracking branch 'remotes/nickjwhite/lstmevalbetteroutput' into lstmevalshowconf
    
    # Conflicts:
    #	src/training/unicharset/lstmtester.cpp
    GerHobbelt committed Jun 10, 2021
    7507fb1
  3. Merge branch 'lstmevalshowconf'

    # Conflicts:
    #	src/training/unicharset/lstmtester.cpp
    GerHobbelt committed Jun 10, 2021
    c64754a
  4. 1883512
  5. bbf6c5b
  6. 0995a90
  7. Merge remote-tracking branch 'remotes/Shreeshrii/unpack'

    # Conflicts:
    #	src/api/tesseractmain.cpp
    GerHobbelt committed Jun 10, 2021
    ebd1837

Commits on Jun 12, 2021

  1. updated submodules

    GerHobbelt committed Jun 12, 2021
    7cc8ebd

Commits on Jun 16, 2021

  1. updated submodules

    GerHobbelt committed Jun 16, 2021
    0a4892a

Commits on Jun 18, 2021

  1. CI: Replace g++-8 by g++-11 for MacOS

    g++-8 is no longer installed, therefore CI fails for that compiler.
    
    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 18, 2021
    a954e39

Commits on Jun 19, 2021

  1. Run unittest CI on push

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 19, 2021
    b11626d
  2. Fix function GetFirstWords and modernize function GetPrefixes

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 19, 2021
    c10667d
  3. Support Apple Accelerate framework for training and best models

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 19, 2021
    664fc83

Commits on Jun 20, 2021

  1. updated submodules

    GerHobbelt committed Jun 20, 2021
    2a0017c
  2. Merge remote-tracking branch 'remotes/stweil/master'

    # Conflicts:
    #	src/arch/simddetect.cpp
    GerHobbelt committed Jun 20, 2021
    7f5ed28

Commits on Jun 21, 2021

  1. 80a4276

Commits on Jun 28, 2021

  1. Support image width and height larger than 32767

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 28, 2021
    2ad9722
  2. Fix warnings from LGTM

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 28, 2021
    9c47d71
  3. Fix serialization for new larger coordinates

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 28, 2021
    5a0e815

Commits on Jun 29, 2021

  1. Fix vector resize with init for all elements (issue tesseract-ocr#3473)

    Fixes: c8b8d26
    Fixes: 9710bc0
    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jun 29, 2021
    b836f30
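
Background for the "resize with init for all elements" title above, as a hedged sketch: `std::vector::resize(n, value)` only initializes elements that are appended beyond the old size, so existing elements keep their old contents. Whether the actual fix uses `assign` or something else is not stated in the commit message; `resize_and_fill` below is a hypothetical helper that shows one way to get the "all elements" behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: give the vector exactly n elements, all equal to value.
// `value` must not refer to an element of `v` itself.
template <typename T>
void resize_and_fill(std::vector<T> &v, std::size_t n, const T &value) {
  // v.resize(n, value) would leave the first min(old_size, n) elements as they
  // were; assign replaces the whole content.
  v.assign(n, value);
  assert(v.size() == n);
}
```

Starting from `{1, 2, 3}`, `resize(5, 9)` yields `{1, 2, 3, 9, 9}`, while `resize_and_fill(v, 5, 9)` yields five nines.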

Commits on Jul 3, 2021

  1. Run unittest CI on push

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 3, 2021
    ec6b822
  2. Fix function GetFirstWords and modernize function GetPrefixes

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 3, 2021
    0231ff7
  3. d5ab698
  4. Add TFloat

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 3, 2021
    f16e6f1

Commits on Jul 4, 2021

  1. 7394908
  2. 7189fe1
  3. dd9b988
  4. ddaadf9
  5. be648f8
  6. Merge remote-tracking branch 'remotes/stweil/master'

    # Conflicts:
    #	src/arch/simddetect.cpp
    GerHobbelt committed Jul 4, 2021
    eb7071a
  7. Merge remote-tracking branch 'remotes/stweil/tfloat'

    # Conflicts:
    #	src/arch/intsimdmatrixavx2.cpp
    #	src/arch/intsimdmatrixsse.cpp
    GerHobbelt committed Jul 4, 2021
    36307a4
  8. manual merge fixes

    GerHobbelt committed Jul 4, 2021
    0fdcdc5

Commits on Jul 5, 2021

  1. Add TFloat data type for neural network

    Up to now Tesseract used double for training and recognition
    with "best" models.
    
    This commit replaces double by a new data type TFloat which
    is double by default, but float if FAST_FLOAT is defined.
    
    Ideally this should allow faster training.
    
    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 5, 2021
    b77cd22
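
The switch described in the commit message above can be sketched roughly as follows. This is an illustration of the idea only; the actual header (src/ccutil/tfloat.h in this PR) may differ in detail, and `dot` is a hypothetical function added to show code written once against `TFloat`.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the compile-time switch: double by default, float with FAST_FLOAT.
#ifdef FAST_FLOAT
using TFloat = float;
#else
using TFloat = double;
#endif

// Hypothetical dot product written once against TFloat; it compiles to float
// or double arithmetic depending on how the build defines FAST_FLOAT.
inline TFloat dot(const TFloat *u, const TFloat *v, std::size_t n) {
  assert(n == 0 || (u != nullptr && v != nullptr));
  TFloat total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += u[i] * v[i];
  }
  return total;
}
```

Building with `-DFAST_FLOAT` switches every `TFloat` to single precision without touching the call sites.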

Commits on Jul 10, 2021

  1. Merge remote-tracking branch 'remotes/stweil/tfloat' into TFloat

    # Conflicts:
    #	src/arch/dotproductavx.cpp
    #	src/arch/intsimdmatrixavx2.cpp
    #	src/arch/intsimdmatrixsse.cpp
    #	src/arch/simddetect.cpp
    #	src/ccutil/tfloat.h
    #	src/lstm/weightmatrix.cpp
    #	src/lstm/weightmatrix.h
    #	unittest/intsimdmatrix_test.cc
    GerHobbelt committed Jul 10, 2021
    ba769d5
  2. Merge remote-tracking branch 'remotes/StarUI/master'

    # Conflicts:
    #	src/ccstruct/imagedata.cpp
    GerHobbelt committed Jul 10, 2021
    9edb035
  3. 71c8c3a
  4. fix shadowed variables (MSVC compiler warning about local vars shadowing other local vars of the same name)

    GerHobbelt committed Jul 10, 2021
    e0a9b7c
  5. d156480
  6. 0a574a6

Commits on Jul 11, 2021

  1. Part 1: redesigned the TFloat approach using templates for the Serialization and Deserialization methods. Tested Deserialization with double (i.e. standard, non-optimized) layout: run-time type == storage type.

    GerHobbelt committed Jul 11, 2021
    d397065
  2. consistent WIN32/WIN64 define check;

    use a dedicated feature check for PRId32 where applicable (printf);
    support long paths on Windows (> 260 chars) by re-defining MAX_PATH to 4096.
    GerHobbelt committed Jul 11, 2021
    301aa3b
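
A small sketch of what a "dedicated feature check for PRId32" can look like. The fallback definition and the `format_i32` helper are assumptions made for illustration; the commit itself may check the macro differently.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Feature check: fall back only if the toolchain's <cinttypes> lacks the macro.
// The fallback "d" assumes int32_t is plain int there (an assumption).
#ifndef PRId32
#define PRId32 "d"
#endif

// Hypothetical helper: formats a 32-bit integer with the portable specifier.
inline std::string format_i32(std::int32_t value) {
  char buf[16];  // enough for "-2147483648" plus the terminator
  const int written = std::snprintf(buf, sizeof(buf), "%" PRId32, value);
  assert(written > 0 && written < static_cast<int>(sizeof(buf)));
  (void)written;  // silence unused-variable warnings in NDEBUG builds
  return std::string(buf);
}
```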
  3. reduce use of SSE, AVX, etc. build defines: let the software discover the available and enabled features at run-time. Doesn't cost anything and makes the code a little less cluttered with preprocessor checks.

    GerHobbelt committed Jul 11, 2021
    21d5cbb
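
Run-time feature discovery, as mentioned in the commit above, usually comes down to picking a function pointer once. The sketch below is not the code in src/arch/simddetect.cpp; all names are hypothetical, the "AVX" kernel is a placeholder that calls the generic one, and the detection uses the GCC/Clang builtin `__builtin_cpu_supports` on x86 while reporting `false` everywhere else.

```cpp
#include <cassert>
#include <cstddef>

using DotFn = double (*)(const double *, const double *, std::size_t);

// Portable reference implementation.
inline double dot_generic(const double *u, const double *v, std::size_t n) {
  assert(n == 0 || (u != nullptr && v != nullptr));
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) total += u[i] * v[i];
  return total;
}

// Placeholder: a real build would keep the intrinsics in a separate
// translation unit compiled with AVX enabled.
inline double dot_avx_placeholder(const double *u, const double *v, std::size_t n) {
  return dot_generic(u, v, n);
}

// Asks the CPU at run time instead of relying on a build-time define.
inline bool cpu_has_avx() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx") != 0;
#else
  return false;  // unknown toolchain or architecture: stay on the generic path
#endif
}

// Chooses the implementation once; callers just invoke the returned pointer.
inline DotFn select_dot() {
  return cpu_has_avx() ? dot_avx_placeholder : dot_generic;
}
```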
  4. Continued work on SHA-1: d397065 --> Part 2: completed the redesign of the TFloat approach using templates for the Serialization and Deserialization methods. Tested Deserialization with float (i.e. FAST_FLOAT) layout: run-time type (float) << storage type (double).
    
    Also tweaked the SSE/FMA/AVX/AVX2 code sections to use their optimized code while we use TFloat=float instead of TFloat=double. (WARNING: edited, compiles okay, but has not been field tested yet!)
    GerHobbelt committed Jul 11, 2021
    31de23d
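
The two "Part 1 / Part 2" commits above describe templated (de)serialization where the stored type (double) may differ from the run-time type (float). A minimal sketch of that idea, assuming a raw in-memory byte buffer in native byte order; `deserialize_as` is a hypothetical name and not the PR's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: reads `count` values stored as `Stored` from a raw byte
// buffer and converts each one to the run-time type `Runtime`.
// Assumes the buffer uses the host's byte order.
template <typename Stored, typename Runtime>
std::vector<Runtime> deserialize_as(const unsigned char *bytes, std::size_t count) {
  assert(count == 0 || bytes != nullptr);
  std::vector<Runtime> out;
  out.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    Stored value;
    std::memcpy(&value, bytes + i * sizeof(Stored), sizeof(Stored));
    out.push_back(static_cast<Runtime>(value));  // narrows when Runtime is float
  }
  return out;
}
```

With `Stored = double` and `Runtime = double` this is a plain copy (the "run-time type == storage type" case); with `Runtime = float` each value is narrowed on load, so existing model files stay readable.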

Commits on Jul 12, 2021

  1. building the monolithic unit tests. Tweaking the use of DISABLED_LEGACY_ENGINE and HAS_LIBICU defines to ensure both the regular monolithic build and the unit tests compile. Abseil is a mess and ditched for now; the LSTM tests have been tweaked to compile and link without obnoxious errors (absl::StrCat was ditched as it's the same as C++ std::string concatenation)

    GerHobbelt committed Jul 12, 2021
    38d777d
  2. fix bugs in tesseract FAST_FLOAT DotProductSSE implementation

    added quick & hacky benchmark code to tesseract-unittests to check the (relative) performance of the various DotProduct implementations.
    GerHobbelt committed Jul 12, 2021
    9c7d4ed

Commits on Jul 13, 2021

  1. Add TFloat data type for neural network

    Up to now Tesseract used double for training and recognition
    with "best" models.
    
    This commit replaces double by a new data type TFloat which
    is double by default, but float if FAST_FLOAT is defined.
    
    Ideally this should allow faster training.
    
    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    59af8dd
  2. Fix some compiler warnings

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    c64ab2e
  3. Optimize DotProductStdInnerProduct for float

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    78871a9
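
One detail that matters when using `std::inner_product` with float data: the accumulator takes the type of the initial value. The sketch below shows that general point; it is a guess at what an optimization "for float" would need, not a copy of the commit, and `dot_inner_product` is a hypothetical name.

```cpp
#include <cassert>
#include <numeric>

// Hypothetical wrapper: accumulates in the element type T.
template <typename T>
T dot_inner_product(const T *u, const T *v, int n) {
  assert(n >= 0);
  // Passing 0.0 here would make the accumulator a double, so every float
  // product would be widened and the result narrowed at the end.
  return std::inner_product(u, u + n, v, static_cast<T>(0));
}
```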
  4. Avoid double / float conversion

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    1b9e462
  5. Implement TFloat for IntSimdMatrix

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    93e9022
  6. Test more implementations of DotProduct

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    00e4283
  7. Add unittest for dotproduct

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    e2529dd
  8. Support Apple Accelerate framework for training and best models

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    01ae69e
  9. abe2d3b
  10. Merge remote-tracking branch 'remotes/stweil/tfloat' into TFloat

    # Conflicts:
    #	src/arch/dotproduct.h
    #	src/arch/dotproductavx.cpp
    #	src/arch/dotproductfma.cpp
    #	src/arch/dotproductsse.cpp
    #	src/arch/intsimdmatrix.h
    #	src/arch/intsimdmatrixavx2.cpp
    #	src/arch/intsimdmatrixneon.cpp
    #	src/arch/intsimdmatrixsse.cpp
    #	src/arch/simddetect.cpp
    #	src/ccutil/tfloat.h
    #	src/lstm/weightmatrix.cpp
    #	unittest/intsimdmatrix_test.cc
    GerHobbelt committed Jul 13, 2021
    91d1f34
  11. efc7601
  12. bugfix of FMA port to FAST_FLOAT: 8 float FPs fit in a single 256bit vector (8x32), in contrast to 4 double FPs (4x64)

    GerHobbelt committed Jul 13, 2021
    30bf263
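
The lane arithmetic behind the bugfix above (8 floats versus 4 doubles per 256-bit register) can be written down so the loop stride follows the element type. This is a portable sketch with hypothetical names; it uses a plain inner loop as a stand-in for the FMA intrinsics and assumes 8-bit bytes and a 4-byte float.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kRegisterBits = 256;

// Number of elements of type T that fit in one 256-bit register.
template <typename T>
constexpr std::size_t lanes_per_register() {
  return kRegisterBits / (sizeof(T) * 8);
}

static_assert(lanes_per_register<float>() == 8, "8 x 32 bits");
static_assert(lanes_per_register<double>() == 4, "4 x 64 bits");

// Portable stand-in for a vectorized loop: the block size follows the lane
// count, so switching from double to float changes the stride from 4 to 8.
template <typename T>
T dot_in_blocks(const T *u, const T *v, std::size_t n) {
  assert(n == 0 || (u != nullptr && v != nullptr));
  constexpr std::size_t step = lanes_per_register<T>();
  T total = 0;
  std::size_t i = 0;
  for (; i + step <= n; i += step) {
    for (std::size_t lane = 0; lane < step; ++lane) {
      total += u[i + lane] * v[i + lane];
    }
  }
  for (; i < n; ++i) total += u[i] * v[i];  // leftover elements
  return total;
}
```

Keeping the stride at 4 after switching to float would process only half of each register's worth of data per iteration, which is the class of bug the commit title describes.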
  13. Fix TFloat builds for Apple M1

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    a09531a
  14. a220999
  15. Fix DotProductNative for TFloat

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    1a59b6f
  16. Merge branch 'tfloat-patch-4' into TFloat

    # Conflicts:
    #	src/arch/simddetect.cpp
    GerHobbelt committed Jul 13, 2021
    b233ed4
  17. c114c1b
  18. same as patch-4 (tesseract-ocr#3494) but now with reduced code duplication: for TFloat to work, we don't need to duplicate the integer work functions, as only the ExtractResults16[8,16] functions need different implementations for float vs. double. These are therefore common to both implementations:
    
    ```
    static void PartialMatrixDotVector64(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                         int num_in, TFloat *v) {
    
    static void PartialMatrixDotVector32(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                         int num_in, TFloat *v) {
    
    static void PartialMatrixDotVector16(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                         int num_in, TFloat *v) {
    
    static inline void PartialMatrixDotVector8(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                               int num_in, TFloat *v) {
    
    static void matrixDotVector(int dim1, int dim2, const int8_t *wi, const TFloat *scales,
                                const int8_t *u, TFloat *v) {
    ```
    GerHobbelt committed Jul 13, 2021
    d2eb7bd
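The split described in this commit message (shared integer work, type-specific result extraction) can be sketched as follows. This is a heavily simplified scalar illustration under stated assumptions, not the AVX2 code from the commit: `ExtractResult` and `MatrixDotVectorSketch` are hypothetical names, the bias handling of the real kernels is omitted, and a template parameter stands in for the compile-time `TFloat` typedef so that both variants can be shown side by side.

```cpp
#include <cstdint>

// Type-specific step: convert the integer accumulator to the target
// floating-point type and apply the scale. Only this part differs
// between the float and the double build.
inline void ExtractResult(std::int32_t acc, const float *scale, float *out) {
  *out = static_cast<float>(acc) * *scale;
}
inline void ExtractResult(std::int32_t acc, const double *scale, double *out) {
  *out = static_cast<double>(acc) * *scale;
}

// Shared step: the int8 multiply-accumulate does not depend on the
// floating-point type, so it is written once.
template <typename TFloat>
void MatrixDotVectorSketch(int rows, int cols, const std::int8_t *wi,
                           const TFloat *scales, const std::int8_t *u,
                           TFloat *v) {
  for (int r = 0; r < rows; ++r) {
    std::int32_t acc = 0;
    for (int c = 0; c < cols; ++c) {
      acc += static_cast<std::int32_t>(wi[r * cols + c]) * u[c];
    }
    ExtractResult(acc, &scales[r], &v[r]);
  }
}
```

Overload resolution on the `scales` pointer type picks the matching extraction routine, which is the same idea as keeping the `PartialMatrixDotVector*` functions common and swapping only the extract step.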
  19. adb1c5a
  20. 2068d61
  21. e9de4a2
  22. 6d01734
  23. Merge pull request tesseract-ocr#1 from GerHobbelt/tfloat-patch-2

    bugfix of FMA port to FAST_FLOAT: 8 float FPs fit in a single 256bit
    stweil authored Jul 13, 2021
    b3adfdd
  24. - added tfloat float+double DotProduct benchmark for the various incantations: `unittest/tfloat_benchmark.cc`
    
    - working towards float+double co-existence as desired in stweil#2 (comment) using function templates for DRY as per query in stweil#2 (comment)
    - fix typo in OpenMP code. (Probably mine from earlier this morning, too hurried.)
    GerHobbelt committed Jul 13, 2021
    6b59323
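A function template is one way to let float and double dot products co-exist without duplicating the loop, and a small timing helper is enough to compare them. The sketch below is an assumption-laden illustration and does not reproduce `unittest/tfloat_benchmark.cc`: `DotProductGeneric` and `TimeDotProduct` are hypothetical names.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// One definition serves both float and double (DRY via a function template).
template <typename T>
T DotProductGeneric(const T *u, const T *v, std::size_t n) {
  T total = 0;
  for (std::size_t k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}

// Hypothetical timing helper: runs `rounds` dot products over vectors of
// length n and returns the elapsed wall-clock time in seconds. The sum is
// written to *checksum so the compiler cannot discard the work.
template <typename T>
double TimeDotProduct(std::size_t n, int rounds, T *checksum) {
  std::vector<T> u(n, T(1)), v(n, T(2));
  T sum = 0;
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < rounds; ++r) {
    sum += DotProductGeneric(u.data(), v.data(), n);
  }
  const auto stop = std::chrono::steady_clock::now();
  *checksum = sum;
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling `TimeDotProduct<float>` and `TimeDotProduct<double>` with the same `n` and `rounds` gives a rough comparison; a real benchmark would also vary the vector length and repeat the measurement.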
  25. 9cc4a33
  26. 29e2379
  27. bugfixing the AVX2 Extract8+16 codes, where there's lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`, i.e. loading float vectors into double vector types. Extract from tesseract-ocr#3490.
    GerHobbelt committed Jul 13, 2021
    81b69b0
  28. d23ec1d
  29. 5d16bab
  30. bugfixing the AVX2 Extract8+16 codes, where there's lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`, i.e. loading float vectors into double vector types. Extract from tesseract-ocr#3490.
    GerHobbelt authored and stweil committed Jul 13, 2021
    4e3c112
  31. HMMM. This is where the float/double co-existence stuff starts to become NOT NICE: code repetition at another level.
    
    TODO: Better idea? --> Maybe namespaces and double kernel projects or compile via #define+#include-all-source-files hack collective source code pages? (Latter approach may become a problem when debugging, or will the compiler suite cope well? Will know only once done & tested.)
    
    At least this is about the point where the function template solution stops being useful. The desired run-time switching between float and double is doable, but not by using #ifdef/#else throughout, nor by templating all the way up the TFloat usage call tree.
    GerHobbelt committed Jul 13, 2021
    8d40552
  32. Reverting so we have a useful and still 'kinda clean' codebase.

    Revert previous commit: "HMMM. This is where the float/double co-existence stuff starts to become NOT NICE: code repetition at another level."
    
    This reverts commit 8d40552.
    GerHobbelt committed Jul 13, 2021
    603831b
  33. Merge branch 'tfloat-AVX-SSE-etc' into TFloat

    # Conflicts:
    #	src/arch/dotproductsse.cpp
    GerHobbelt committed Jul 13, 2021
    160949a
  34. Merge branch 'tfloat-patch-4' into TFloat

    # Conflicts:
    #	src/arch/intsimdmatrixavx2.cpp
    GerHobbelt committed Jul 13, 2021
    02d94bc
  35. Improve build code for native dotproduct

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    15f7549
  36. Enhance unittest/dotproduct_test

    Signed-off-by: Stefan Weil <[email protected]>
    stweil committed Jul 13, 2021
    3eae6d7
  37. Merge remote-tracking branch 'remotes/stweil/tfloat' into TFloat

    # Conflicts:
    #	src/arch/dotproduct.cpp
    #	src/arch/dotproductsse.cpp
    #	src/arch/intsimdmatrixavx2.cpp
    GerHobbelt committed Jul 13, 2021
    f32b9de
  38. Looks like defined(_OPENMP) is what's known in the MSVC(2019) world: added that one as another enabling condition since benchmarks have shown MSVC2019's `/openmp:experimental` to deliver. :-) (See tesseract-ocr#3486 benchmark reports on @stweil's DotProductNative() implementation)
    GerHobbelt committed Jul 13, 2021
    a5d45b9
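The enabling condition described here might look roughly like the sketch below. It is an illustration under assumptions rather than the committed code: `DotProductNativeSketch` is a hypothetical name, `TFloat` is fixed to `float` for the example, and `OPENMP_SIMD` is assumed to be a macro supplied by the build system, whereas `_OPENMP` is the macro compilers define themselves when OpenMP support is switched on.

```cpp
// Stand-in for the project's float/double typedef (assumed float here).
using TFloat = float;

// Plain dot product; the pragma is only emitted when one of the two
// macros is defined, so the function still builds without OpenMP.
TFloat DotProductNativeSketch(const TFloat *u, const TFloat *v, int n) {
  TFloat total = 0;
#if defined(OPENMP_SIMD) || defined(_OPENMP)
#pragma omp simd reduction(+ : total)
#endif
  for (int k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}
```

Without either macro the loop compiles as ordinary scalar code, so the extra condition only affects builds where the compiler reports OpenMP support.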
  39. Looks like defined(_OPENMP) is what's known in the MSVC(2019) world: added that one as another enabling condition since benchmarks have shown MSVC2019's `/openmp:experimental` to deliver. :-) (See tesseract-ocr#3486 benchmark reports on @stweil's DotProductNative() implementation)
    GerHobbelt committed Jul 13, 2021
    d025c78
  40. 44a8f41