[BUG] Segmentation fault when extracting ts file #1234

Closed
jamoore5 opened this issue Feb 19, 2020 · 26 comments

@jamoore5

CCExtractor version: 0.88

Necessary information

  • Is this a regression (i.e. did it work before)? NO, First time using the software
  • What platform did you use? Linux
  • What were the used arguments? -out=spupng -quiet

Video links

https://drive.google.com/open?id=1xs7GyYPR-DPd75CP3XJLj7CPPdVJHAVp

Additional information

!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 209
Segmentation fault

@NilsIrl
Contributor

NilsIrl commented Feb 19, 2020

I can't reproduce the issue on master or on 0.88.

@jamoore5
Author

jamoore5 commented Feb 19, 2020

I am on a raspberry pi 4 if that helps troubleshoot why I am getting this error.

@NilsIrl
Contributor

NilsIrl commented Feb 19, 2020

I am on a raspberry pi 4 if that helps troubleshoot why I am getting this error.

How did you install ccextractor?
Did you build it yourself or is it included in the repos (if yes please mention the distribution you're using) or something else?

@jamoore5
Author

jamoore5 commented Feb 19, 2020

I followed this tutorial https://magpi.raspberrypi.org/articles/make-comics-from-tv-recordings

cd
git clone https://github.com/CCExtractor/ccextractor.git
sudo apt install -y libglfw3-dev cmake gcc libcurl4-gnutls-dev tesseract-ocr tesseract-ocr-dev libleptonica-dev
cd ccextractor/linux
./build
sudo mv ./ccextractor /usr/local/bin/

However, I had to remove tesseract-ocr-dev from the install

@NilsIrl
Contributor

NilsIrl commented Feb 19, 2020

However, I had to remove tesseract-ocr-dev from the install

Why? tesseract-ocr-dev (AFAIK) is required to have OCR support.

@jamoore5
Author

sudo apt install -y tesseract-ocr-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package tesseract-ocr-dev is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
  libtesseract-dev

E: Package 'tesseract-ocr-dev' has no installation candidate

@NilsIrl
Contributor

NilsIrl commented Feb 19, 2020

Could you try with the libtesseract-dev package instead? (They changed the name of the package)

@jamoore5
Author

installed it but still getting the same error

@NilsIrl
Contributor

NilsIrl commented Feb 19, 2020

installed it but still getting the same error

segfault? Also from what I can tell, the .ts file you're using doesn't have any subtitles.

Once you've installed libtesseract-dev, you need to rebuild ccextractor.
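
A minimal sketch of the reinstall-and-rebuild sequence, reusing the commands from the tutorial quoted earlier with libtesseract-dev swapped in. The `run` and `rebuild_ccextractor` helpers and the `DRY_RUN` variable are hypothetical names introduced only for illustration, and the default source directory assumes the clone lives in the home directory, as in the tutorial.

```shell
# Hypothetical helper: with DRY_RUN=1, print a command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: install the renamed dev package, rebuild, and replace
# the previously installed binary. The body runs in a subshell so the
# caller's working directory is left unchanged.
rebuild_ccextractor() (
    src_dir="${1:-$HOME/ccextractor}"
    run sudo apt install -y libtesseract-dev libleptonica-dev &&
    run cd "$src_dir/linux" &&
    run ./build &&
    run sudo mv ./ccextractor /usr/local/bin/
)
```

Setting DRY_RUN=1 before calling rebuild_ccextractor previews the commands; without it they are executed. The last step matters because the older binary in /usr/local/bin would otherwise keep being picked up.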

@MatejMecka
Contributor

  Duration: 00:43:07.59, start: 0.400000, bitrate: 1005 kb/s
  Program 1 
    Metadata:
      service_name    : 
      service_provider: 
    Stream #0:0[0x12c](eng): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
    Stream #0:1[0xc8](eng): Audio: mp2 ([3][0][0][0] / 0x0003), 44100 Hz, stereo, fltp, 128 kb/s
    Stream #0:2[0x64]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, bt470bg/bt470bg/bt709, progressive), 720x480 [SAR 32:27 DAR 16:9], 25 fps, 25 tbr, 90k tbn, 50 tbc

Both VLC and ffmpeg detect there is a subtitle stream but when playing through VLC no subtitles were shown.
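
To double-check which subtitle streams a tool reports, the stream listing can be filtered down to its subtitle lines. A small sketch, assuming the listing format shown above; `list_subtitle_streams` is a hypothetical helper name.

```shell
# Hypothetical helper: read an ffmpeg/ffprobe stream listing on stdin and
# keep only the lines that describe subtitle streams.
list_subtitle_streams() {
    grep -E 'Stream #[0-9]+:[0-9]+.*Subtitle:'
}
```

With ffprobe installed, something like `ffprobe S3E01.ts 2>&1 | list_subtitle_streams` would apply it to the file from this report (the listing goes to stderr, hence the redirect).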

@jamoore5
Author

Also from what I can tell, the .ts file you're using doesn't have any subtitles.
Thanks, I will investigate what went wrong in the conversion. The last file I had, with burned-in subtitles, did not error; it just produced no output. In VLC, I see the subtitle track but it does not show up when played.

just thought the error was strange, but my file is definitely bad.

@NilsIrl
Contributor

NilsIrl commented Feb 19, 2020

just thought the error was strange, but my file is definitely bad.

I don't have an error though. It shouldn't seg fault.

@canihavesomecoffee
Member

But why is that assert being triggered? I think that might be the cause of that segfault :)
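
The assertion text reads as a check that the process locale is "C". A rough way to see which locale the environment would select, assuming the usual POSIX precedence (LC_ALL, then the category variable, then LANG) and assuming the program adopts the environment's locale at startup; `effective_ctype` is a hypothetical helper name.

```shell
# Hypothetical helper: print the locale name the environment selects for
# LC_CTYPE. A non-empty LC_ALL wins, then LC_CTYPE, then LANG; if none is
# set, the default "C" locale applies.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}
```

If this prints anything other than C or POSIX, the environment is asking for a non-C locale, which would be consistent with the assert firing on one machine and not on another.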

@cfsmp3
Contributor

cfsmp3 commented Feb 19, 2020

I can't reproduce on a Raspberry either.

No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 8.0 (jessie)
Release:        8.0
Codename:       jessie
$ ./ccextractor -out=spupng  S3E01.ts
CCExtractor 0.88, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: S3E01.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .xml] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: S3E01.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode

Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0

Min PTS:                                00:00:00:400
Max PTS:                                00:00:00:400
Length:                          00:00:00:000
Done, processing time = 5 seconds
$ ./ccextractor --version
CCExtractor 0.88, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
CCExtractor detailed version info
        Version: 0.88
        Git commit: 1b17a04b25dbd6e7306d8cb499b7a28515ce6480
        Compilation date: 2020-02-19
        File SHA256: 80ed2bcab1204a5f05be7d50f9a5902a93bb592d7859c360d7f67651f9cee1eb
Libraries used by CCExtractor
        Tesseract Version: 3.03
        Leptonica Version: leptonica-1.71
        libGPAC Version: 0.7.2-DEV
        zlib: 1.2.11
        utf8proc Version: 2.4.0
        protobuf-c Version: 1.3.1
        libpng Version: 1.6.35
        FreeType
        libhash
        nuklear
        libzvbi

@jamoore5
Author

jamoore5 commented Feb 20, 2020

To compare

ccextractor -out=spupng  S3E01.ts
CCExtractor 0.88, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: S3E01.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .xml] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: S3E01.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 209
Segmentation fault


ccextractor --version
CCExtractor 0.88, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
CCExtractor detailed version info
	Version: 0.88
	Git commit: 1b17a04b25dbd6e7306d8cb499b7a28515ce6480
	Compilation date: 2020-02-20
	File SHA256: Could not open file
Libraries used by CCExtractor
	Tesseract Version: 4.0.0
	Leptonica Version: leptonica-1.76.0
	libGPAC Version: 0.7.2-DEV
	zlib: 1.2.11
	utf8proc Version: 2.4.0
	protobuf-c Version: 1.3.1
	libpng Version: 1.6.35
	FreeType
	libhash
	nuklear
	libzvbi

Also, any tips on how to convert an mp4 with subtitles to a ts with subtitles?

@NilsIrl
Contributor

NilsIrl commented Feb 21, 2020

Seems to segfault inside of tesseract.

@cfsmp3
Contributor

cfsmp3 commented Feb 21, 2020

@jamoore5 Run it with valgrind, maybe it will give us some clue. Also note that you are using tesseract 4, not tesseract 3, which is really likely to be the reason we're seeing different things.

valgrind ccextractor -out=spupng  S3E01.ts

Would tell us more.

@NilsIrl Yes, but let's assume it's our code causing that segfault somehow :-)

@NilsIrl
Contributor

NilsIrl commented Feb 21, 2020

@NilsIrl Yes, but let's assume it's our code causing that segfault somehow :-)

Yes, I was mentioning that so that nobody goes looking for baseapi.cpp in our code without ever finding it.

@jamoore5
Author

@cfsmp3 I got the following output, then it hung without exiting:

valgrind ccextractor -out=spupng  S3E01.ts
==9483== Memcheck, a memory error detector
==9483== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==9483== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==9483== Command: ccextractor -out=spupng S3E01.ts
==9483==
--9483-- WARNING: Serious error when reading debug info
--9483-- When reading debug info from /lib/arm-linux-gnueabihf/ld-2.28.so:
--9483-- Ignoring non-Dwarf2/3/4 block in .debug_info
--9483-- WARNING: Serious error when reading debug info
--9483-- When reading debug info from /lib/arm-linux-gnueabihf/ld-2.28.so:
--9483-- Last block truncated in .debug_info; ignoring
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401A5D0: index (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401A5D4: index (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4008040: _dl_dst_count (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4008288: expand_dynamic_string_token (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401AA80: strlen (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401AA84: strlen (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4017F68: malloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4017F74: malloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B5E8: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B608: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B618: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B634: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B63C: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B664: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B664: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B68C: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B6A0: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B6A4: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B6B0: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x40180A4: calloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4017FA8: malloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401A160: mmap (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Syscall param mmap2(start) contains uninitialised byte(s)
==9483==    at 0x401A174: mmap (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Syscall param mmap2(length) contains uninitialised byte(s)
==9483==    at 0x401A174: mmap (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Syscall param mmap2(offset) contains uninitialised byte(s)
==9483==    at 0x401A174: mmap (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4017F44: malloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x400BDD0: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B5F4: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B630: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BD7C: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BD98: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401AA14: strdup (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B660: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B660: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B688: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4008E20: _dl_map_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Syscall param openat(filename) contains uninitialised byte(s)
==9483==    at 0x4019F4C: __open64_nocancel (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x40180E4: free (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x400BB84: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BB9C: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BBBC: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BBC0: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BBE0: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BC50: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BC64: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BCA8: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BCC0: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401AA30: strlen (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401AA48: strlen (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B628: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400BD9C: _dl_new_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4005D98: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4005DC4: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4005DD0: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4005E50: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4005E9C: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4005EA0: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400602C: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x40060C0: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x40060DC: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4006114: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4006A0C: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x40061D4: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4010FF4: _dl_name_match_p (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4011008: _dl_name_match_p (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401A620: strcmp (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4010FFC: _dl_name_match_p (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4013660: _dl_get_origin (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B690: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B694: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B6B0: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B6B4: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4013674: _dl_get_origin (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x400832C: expand_dynamic_string_token (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x40082E4: expand_dynamic_string_token (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4008114: _dl_dst_substitute (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401A67C: strcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4008198: _dl_dst_substitute (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B65C: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B65C: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B684: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x400926C: _dl_map_object (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B7A0: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4006630: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x400664C: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B218: memmove (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B2A8: memmove (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B2D0: memmove (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x4006668: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B900: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B850: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B6AC: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B6A8: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B6AC: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B6B4: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B668: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B668: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B66C: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B66C: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x4017FF0: malloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483==
==9483== More than 100 errors detected.  Subsequent errors
==9483== will still be recorded, but in less detail than before.
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B658: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B658: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Use of uninitialised value of size 4
==9483==    at 0x401B680: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
==9483== Conditional jump or move depends on uninitialised value(s)
==9483==    at 0x401B648: memcpy (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==
--9483-- WARNING: Serious error when reading debug info
--9483-- When reading debug info from /lib/arm-linux-gnueabihf/libm-2.28.so:
--9483-- Ignoring non-Dwarf2/3/4 block in .debug_info
--9483-- WARNING: Serious error when reading debug info
--9483-- When reading debug info from /lib/arm-linux-gnueabihf/libm-2.28.so:
--9483-- Last block truncated in .debug_info; ignoring
==9483== Use of uninitialised value of size 4
==9483==    at 0x4005FCC: _dl_map_object_from_fd (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==9483==

@NilsIrl
Contributor

NilsIrl commented Feb 21, 2020

I can't reproduce on x86 (NixOS) with tesseract 4.1.0 (in addition to tesseract 3). I'd expect valgrind to give a backtrace.

Can you try to compile with linux/builddebug instead of linux/build and rerun with valgrind? (Never mind, linux/build also gives a backtrace, so the problem seems to be elsewhere.)

@canihavesomecoffee
Member

@jamoore5 The triggered assert looks similar to tesseract-ocr/tesseract#1670.

Can you try to explicitly set the LANG variable to "C" before running CCExtractor?
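
One way to try this without changing the whole shell session is to force the locale for a single invocation. A minimal sketch; `with_c_locale` is a hypothetical helper name, and setting LC_ALL alongside LANG is an assumption made here because LC_ALL takes precedence when both are set.

```shell
# Hypothetical helper: run a command with the locale forced to "C" for that
# one invocation, leaving the calling shell's environment untouched.
with_c_locale() {
    env LC_ALL=C LANG=C "$@"
}
```

For the command from this report that would be `with_c_locale ccextractor -out=spupng S3E01.ts`.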

@jamoore5
Author

I installed my packages for tesseract, and installed the binaries, which seems to have fixed the issue.

However, I got the latest version:

tesseract -v
tesseract 5.0.0-alpha
 leptonica-1.76.0
  libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found OpenMP 201511
 Found libcurl/7.64.0 GnuTLS/3.6.7 zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.5) libssh2/1.8.0 nghttp2/1.36.0 librtmp/2.3

and ccextractor still sees it as 4.0.0

CCExtractor detailed version info
	Version: 0.88
	Git commit: 1b17a04b25dbd6e7306d8cb499b7a28515ce6480
	Compilation date: 2020-02-20
	File SHA256: Could not open file
Libraries used by CCExtractor
	Tesseract Version: 4.0.0
	Leptonica Version: leptonica-1.76.0
	libGPAC Version: 0.7.2-DEV
	zlib: 1.2.11
	utf8proc Version: 2.4.0
	protobuf-c Version: 1.3.1
	libpng Version: 1.6.35
	FreeType
	libhash
	nuklear
	libzvbi

@NilsIrl
Contributor

NilsIrl commented Feb 22, 2020

ccextractor still sees it as 4.0.0

Could it be that you have both versions installed?
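
A quick way to check is to see which shared libraries the ccextractor binary actually resolves to. A sketch, assuming the binary links Tesseract dynamically and that `ldd` is available; `linked_libs` is a hypothetical helper name.

```shell
# Hypothetical helper: list the shared libraries a binary resolves to whose
# names match a pattern (default: tesseract or leptonica).
linked_libs() {
    binary="$1"
    pattern="${2:-tesseract|lept}"
    ldd "$binary" | grep -E -i "$pattern"
}
```

Running `linked_libs "$(command -v ccextractor)"` would show the library path in use, and on a Debian-based system `dpkg -l | grep -i tesseract` lists the installed Tesseract packages for comparison.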

@cfsmp3
Contributor

cfsmp3 commented Feb 22, 2020

Labeling as medium because we can't reproduce.

@shikharmn

Hey!
I would like to contribute to this issue. Can someone give me a direction as to where to start?

@canihavesomecoffee
Member

@cfsmp3 Closing this (reopen if you don't agree).

Original poster fixed the issue by installing tesseract 5.0 (this is also mentioned here: tesseract-ocr/tesseract#1670 (comment)).

I encountered this on the SP (Tesseract 4.0.0), and there it's fixed by exporting LC_ALL.

export LC_ALL=C

So we could add this to our README/compilation guide somewhere, but I think there's nothing to fix in our code.
