Silent self-termination of script using regex. #8330

Closed
p5pRT opened this issue Feb 8, 2006 · 7 comments

Comments

p5pRT commented Feb 8, 2006

Migrated from rt.perl.org#38470 (status was 'resolved')

Searchable as RT38470$

p5pRT commented Feb 8, 2006

From [email protected]

The following script self terminates silently. No segfault or warning or Panic.
On my system it occurs when the repetition count moves above 21165.

On someone else's Win32 system it aborts at around 31300.

#! perl -w
use strict;
#use re 'debug';

my $str = 'a' x 100000;
my $char = 'a';

for my $n ( map{ $_ * 1000 + 165 } 1 .. 30 ) {
  pos($str)=0;
  $str =~ /(?:.*?$char){$n}/g;
  print "Pos of '$char' #$n in \$str is ", pos($str), "\n";
}

__END__
C:\test>junk
Pos of 'a' #1165 in $str is 1165
Pos of 'a' #2165 in $str is 2165
Pos of 'a' #3165 in $str is 3165
Pos of 'a' #4165 in $str is 4165
Pos of 'a' #5165 in $str is 5165
Pos of 'a' #6165 in $str is 6165
Pos of 'a' #7165 in $str is 7165
Pos of 'a' #8165 in $str is 8165
Pos of 'a' #9165 in $str is 9165
Pos of 'a' #10165 in $str is 10165
Pos of 'a' #11165 in $str is 11165
Pos of 'a' #12165 in $str is 12165
Pos of 'a' #13165 in $str is 13165
Pos of 'a' #14165 in $str is 14165
Pos of 'a' #15165 in $str is 15165
Pos of 'a' #16165 in $str is 16165
Pos of 'a' #17165 in $str is 17165
Pos of 'a' #18165 in $str is 18165
Pos of 'a' #19165 in $str is 19165
Pos of 'a' #20165 in $str is 20165
Pos of 'a' #21165 in $str is 21165
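
For anyone hitting this on an affected perl, one possible workaround is to avoid the large `{$n}` quantifier altogether and walk the string with `index`. This is only a sketch: `pos_after_nth` is a hypothetical helper name, not something from the report, and unlike `.` in the regex it does not stop at newlines.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the report): return the offset just past the
# $n-th occurrence of $needle in $str, or nothing if there are fewer than $n.
sub pos_after_nth {
    my ($str, $needle, $n) = @_;
    return if $n < 1 || !length $needle;
    my $from = 0;
    for (1 .. $n) {
        my $at = index($str, $needle, $from);   # next occurrence at or after $from
        return if $at < 0;                      # ran out of occurrences
        $from = $at + length $needle;           # continue just past this one
    }
    return $from;
}

my $str  = 'a' x 100000;
my $char = 'a';

for my $n ( map { $_ * 1000 + 165 } 1 .. 30 ) {
    my $p = pos_after_nth($str, $char, $n);
    print "Pos of '$char' #$n in \$str is $p\n";
}
```

The loop keeps its state in two scalars, so the work per occurrence does not depend on how deep the regex engine would have had to nest.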

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.6:

Configured by ActiveState at Mon Dec 13 09:51:32 2004.

Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
  Platform:
    osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -Gf -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT  -DNO_HASH_SEED -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX',
    optimize='-MD -Zi -DNDEBUG -O1',
    cppflags='-DWIN32'
    ccversion='', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf  -libpath:"C:\Perl\lib\CORE"  -machine:x86'
    libpth=\lib
    libs=  oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib  comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib msvcrt.lib
    perllibs=  oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib  comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib msvcrt.lib
    libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
    gnulibc_version='undef'
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf  -libpath:"C:\Perl\lib\CORE"  -machine:x86'

Locally applied patches:
    ACTIVEPERL_LOCAL_PATCHES_ENTRY
    21540 Fix backward-compatibility issues in if.pm
    23565 Wrong MANIFEST.SKIP


@INC for perl v5.8.6:
    C:/Perl/lib
    C:/Perl/site/lib
    .


Environment for perl v5.8.6:
    HOME (unset)
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=C:\Perl\bin\;C:\PROGRA~1\COMMON~1\GTK\2.0\bin;c:\ruby\bin;c:\cl;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\UnixTools;c:\Mozart\bin;c:\ghc\ghc-6.4\bin;c:\svn\bin;c:\parrot;c:\dmd\bin;c:\dm\bin;;C:\Program Files\Common Files\GTK\2.0\bin;C:\OCaml\Objective Caml\bin;C:\OCaml\bin;
    PERL_BADLANG (unset)
    SHELL (unset)




p5pRT commented Feb 18, 2006

From @nwc10

On Wed, Feb 08, 2006 at 03:15:30PM -0800, nigelsandever @ btconnect. com wrote:

The following script self terminates silently. No segfault or warning or Panic.
On my system it occurs when the repetition count moves above 21165.

On someone else's Win32 system it aborts at around 31300.

#! perl -w
use strict;
#use re 'debug';

my $str = 'a' x 100000;
my $char = 'a';

for my $n ( map{ $_ * 1000 + 165 } 1 .. 30 ) {
  pos($str)=0;
  $str =~ /(?:.*?$char){$n}/g;
  print "Pos of '$char' #$n in \$str is ", pos($str), "\n";
}

__END__

Curiously, with blead on FreeBSD under valgrind it terminates without any perl
error, and without any valgrind errors about uninitialised reads etc.
Without valgrind it exits with a SIGBUS.

The stack trace under gdb looks like this:

#245 0x08171e24 in S_regmatch (my_perl=0x81f0000, prog=0x81ee258)
  at regexec.c:4026
#246 0x0816f4cd in S_regmatch (my_perl=0x81f0000, prog=0x81ee264)
  at regexec.c:3520
#247 0x08171e24 in S_regmatch (my_perl=0x81f0000, prog=0x81ee258)
  at regexec.c:4026
#248 0x0816f4cd in S_regmatch (my_perl=0x81f0000, prog=0x81ee264)
  at regexec.c:3520

etc

Line 3520 is the call to regmatch in:

  if (n < cc->min) {
      cc->cur = n;
      cc->lastloc = locinput;
      if (regmatch(cc->scan))
          sayYES;
      cc->cur = n - 1;
      cc->lastloc = lastloc;
      sayNO;
  }

Line 4026 is the call to regmatch in the TRYPAREN in:

  /* PL_reginput == old now */
  if (locinput != old) {
      ln = 1; /* Did some */
      if (regrepeat(scan, count) < count)
          sayNO;
  }
  /* PL_reginput == locinput now */
  TRYPAREN(paren, ln, locinput);
  PL_reginput = locinput; /* Could be reset... */
  REGCP_UNWIND(lastcp);
  /* Couldn't or didn't -- move forward. */
  old = locinput;
  if (do_utf8)
      locinput += UTF8SKIP(locinput);
  else
      locinput++;
  count = 1;

I don't know why it's blowing the stack like this.

Nicholas Clark
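
The alternating frames in that trace (3520, 4026, 3520, ...) fit a matcher that makes two nested calls for every iteration of the group and only returns once the whole match is decided. Below is a toy model of that shape, not the real engine: `match_group` and `match_rest` are hypothetical stand-ins for the two call sites, and the model ignores backtracking entirely.

```perl
use strict;
use warnings;
no warnings 'recursion';   # Perl warns past 100 nested sub calls; silence it for the demo

my $max_depth = 0;

# Toy stand-in for "try one more iteration of (?:.*?c)".
sub match_group {
    my ($str, $c, $pos, $left, $depth) = @_;
    $max_depth = $depth if $depth > $max_depth;
    return $pos if $left == 0;                 # all iterations done
    my $at = index($str, $c, $pos);            # lazy .*? up to the next $c
    return -1 if $at < 0;
    return match_rest($str, $c, $at + length $c, $left - 1, $depth + 1);
}

# Toy stand-in for "now match whatever follows", which re-enters the group.
sub match_rest {
    my ($str, $c, $pos, $left, $depth) = @_;
    $max_depth = $depth if $depth > $max_depth;
    return match_group($str, $c, $pos, $left, $depth + 1);
}

# Returns (end position, deepest nesting level reached) for $n iterations.
sub max_depth_for {
    my ($n) = @_;
    $max_depth = 0;
    my $end = match_group('a' x $n, 'a', 0, $n, 1);
    return ($end, $max_depth);
}

my ($end, $depth) = max_depth_for(1000);
print "matched up to $end, deepest call level $depth\n";
```

In this model the deepest level is 2n + 1 for n iterations, so the nesting grows linearly with the repetition count. Perl subs keep their frames on the heap, so the model itself does not crash; a C function recursing the same way would consume C stack with every level.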

p5pRT commented Feb 18, 2006

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Feb 20, 2006

From [email protected]

2006-02-18 16:48:59, "Nicholas Clark via RT" <perlbug-followup@perl.org> wrote:

On Wed, Feb 08, 2006 at 03:15:30PM -0800, nigelsandever @ btconnect. com wrote:

The following script self terminates silently. No segfault or warning or
Panic.
On my system it occurs when the repetition count moves above 21165.

On someone else's Win32 system it aborts at around 31300.

#! perl -w
use strict;
#use re 'debug';

my $str = 'a' x 100000;
my $char = 'a';

for my $n ( map{ $_ * 1000 + 165 } 1 .. 30 ) {
  pos($str)=0;
  $str =~ /(?:.*?$char){$n}/g;
  print "Pos of '$char' #$n in \$str is ", pos($str), "\n";
}

__END__

Curiously, with blead on FreeBSD under valgrind it terminates without any perl
error, and without any valgrind errors about uninitialised reads etc.
Without valgrind it exits with a SIGBUS.

The stack trace under gdb looks like this:

#245 0x08171e24 in S_regmatch (my_perl=0x81f0000, prog=0x81ee258)
  at regexec.c:4026
#246 0x0816f4cd in S_regmatch (my_perl=0x81f0000, prog=0x81ee264)
  at regexec.c:3520
#247 0x08171e24 in S_regmatch (my_perl=0x81f0000, prog=0x81ee258)
  at regexec.c:4026
#248 0x0816f4cd in S_regmatch (my_perl=0x81f0000, prog=0x81ee264)
  at regexec.c:3520

etc

Line 3520 is the call to regmatch in:

  if (n < cc->min) {
      cc->cur = n;
      cc->lastloc = locinput;
      if (regmatch(cc->scan))
          sayYES;
      cc->cur = n - 1;
      cc->lastloc = lastloc;
      sayNO;
  }

Line 4026 is the call to regmatch in the TRYPAREN in:

  /* PL_reginput == old now */
  if (locinput != old) {
      ln = 1; /* Did some */
      if (regrepeat(scan, count) < count)
          sayNO;
  }
  /* PL_reginput == locinput now */
  TRYPAREN(paren, ln, locinput);
  PL_reginput = locinput; /* Could be reset... */
  REGCP_UNWIND(lastcp);
  /* Couldn't or didn't -- move forward. */
  old = locinput;
  if (do_utf8)
      locinput += UTF8SKIP(locinput);
  else
      locinput++;
  count = 1;

I don't know why it's blowing the stack like this.

Nicholas Clark

Nicholas,

Whether this is useful or not I'm not sure, but if I run the snippet under a
debugger I get:

Perl.exe has encountered an exception code of -1073741571 (0xC00000FD) at
0x2805B3A1.

and it shows a stack overflow:

PID: 1932 (0x78C) TID: 1552 (0x610) Exception: STACK_OVERFLOW Address: 2805B3A1

and the stack trace lists 42340 calls to Perl_regexec_flags(), which is pretty
close to 2 times the 21165 I mentioned in the original report. The last 3 of
these are listed as:

PID: 656 TID: 1812 - Stack Contents for 0x2805B3A1
0x2805B3A1: perl58.dll:Perl_regexec_flags + 0x0CED
0x2805DA71: perl58.dll:Perl_regexec_flags + 0x33BD
0x2805CB80: perl58.dll:Perl_regexec_flags + 0x24CC

The last two of which repeat 43300 times until

0x2805DA71: perl58.dll:Perl_regexec_flags + 0x33BD
0x2805CB80: perl58.dll:Perl_regexec_flags + 0x24CC
0x2805CAE3: perl58.dll:Perl_regexec_flags + 0x242F
0x2805B368: perl58.dll:Perl_regexec_flags + 0x0CB4
0x2803E54C: perl58.dll:Perl_sv_compile_2op + 0x4E60
0x2805FE6E: perl58.dll:Perl_runops_standard + 0x000C
0x2808A5A0: perl58.dll:RunPerl + 0x0086
0x00401012: perl.exe+0x00001012
0x77E814C7: GetCurrentDirectoryW + 0x0044

This looks to me like runaway recursion, but the source doesn't show any direct
recursion, so it could be that the stack trace just reflects corruption.

If I can help in any way, tell me what I can do.

Cheers, njs
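
Since the script dies without printing anything, one way to make the failure visible without a debugger is to run the match in a child interpreter and inspect `$?` afterwards. This is a rough sketch: `run_child` is a hypothetical helper, and the signal test follows the Unix meaning of `$?`. On Win32 an exception such as 0xC00000FD would more likely surface as an unusual exit status than as a signal number, but that is an assumption and not something reported in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: run a snippet in a child perl and describe how it ended.
sub run_child {
    my ($code) = @_;
    my $rc = system($^X, '-e', $code);          # $^X is the running perl binary
    return "could not start child: $!" if $rc == -1;
    return sprintf('killed by signal %d', $? & 127) if $? & 127;
    return sprintf('exited with status %d', $? >> 8);
}

# The failing match from the report, with the largest count the loop reaches.
my $snippet = q{
    my $str = 'a' x 100000;
    my $n   = 30165;
    $str =~ /(?:.*?a){$n}/g;
    print "Pos is ", pos($str), "\n";
};

print run_child($snippet), "\n";
```

On a perl without the problem the child prints the position and the parent reports a zero exit status; anything else means the child did not finish normally.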

p5pRT commented Feb 20, 2006

From @nwc10

On Mon, Feb 20, 2006 at 09:56:46PM -0000, Nigel Sandever wrote:

This looks to me like runaway recursion, but the source doesn't show any direct
recursion, so it could be that the stack trace just reflects corruption.

If I can help in any way, tell me what I can do?

I'm afraid I can't help with any suggestions here. We have identified it as
runaway recursion in the regexp engine, but working out how to fix that is
well beyond my current knowledge.

Nicholas Clark

p5pRT commented Mar 29, 2006

From @smpeters

[nigelsandever@btconnect.com - Wed Feb 08 15:15:30 2006]:

To: perlbug@perl.org
Subject: Silent self-termination of script using regex.
Cc: support@ActiveState.com
Reply-To: c52qlbm02 @ sneakemail . com
Message-Id: <5.8.6_548_1139438697@D4KG9X0J>

This is a bug report for perl from c52qlbm02 @ sneakemail . com,
generated with the help of perlbug 1.35 running under perl v5.8.6.

-----------------------------------------------------------------
[Please enter your report here]

The following script self terminates silently. No segfault or warning
or Panic.
On my system it occurs when the repetition count moves above 21165.

On someone else's Win32 system it aborts at around 31300.

#! perl -w
use strict;
#use re 'debug';

my $str = 'a' x 100000;
my $char = 'a';

for my $n ( map{ $_ * 1000 + 165 } 1 .. 30 ) {
  pos($str)=0;
  $str =~ /(?:.*?$char){$n}/g;
  print "Pos of '$char' #$n in \$str is ", pos($str), "\n";
}

__END__
C:\test>junk
Pos of 'a' #1165 in $str is 1165
Pos of 'a' #2165 in $str is 2165
Pos of 'a' #3165 in $str is 3165
Pos of 'a' #4165 in $str is 4165
Pos of 'a' #5165 in $str is 5165
Pos of 'a' #6165 in $str is 6165
Pos of 'a' #7165 in $str is 7165
Pos of 'a' #8165 in $str is 8165
Pos of 'a' #9165 in $str is 9165
Pos of 'a' #10165 in $str is 10165
Pos of 'a' #11165 in $str is 11165
Pos of 'a' #12165 in $str is 12165
Pos of 'a' #13165 in $str is 13165
Pos of 'a' #14165 in $str is 14165
Pos of 'a' #15165 in $str is 15165
Pos of 'a' #16165 in $str is 16165
Pos of 'a' #17165 in $str is 17165
Pos of 'a' #18165 in $str is 18165
Pos of 'a' #19165 in $str is 19165
Pos of 'a' #20165 in $str is 20165
Pos of 'a' #21165 in $str is 21165

It sounds like you were getting some sort of silent GPF or something
related. This problem seems to have been resolved with change #27598.

steve@kirk:~/smoke/perl-current$ perl rt_38470.pl
Pos of 'a' #1165 in $str is 1165
Segmentation fault (core dumped)
steve@kirk:~/smoke/perl-current$ ./perl -Ilib rt_38470.pl
Pos of 'a' #1165 in $str is 1165
Pos of 'a' #2165 in $str is 2165
Pos of 'a' #3165 in $str is 3165
Pos of 'a' #4165 in $str is 4165
Pos of 'a' #5165 in $str is 5165
Pos of 'a' #6165 in $str is 6165
Pos of 'a' #7165 in $str is 7165
Pos of 'a' #8165 in $str is 8165
Pos of 'a' #9165 in $str is 9165
Pos of 'a' #10165 in $str is 10165
Pos of 'a' #11165 in $str is 11165
Pos of 'a' #12165 in $str is 12165
Pos of 'a' #13165 in $str is 13165
Pos of 'a' #14165 in $str is 14165
Pos of 'a' #15165 in $str is 15165
Pos of 'a' #16165 in $str is 16165
Pos of 'a' #17165 in $str is 17165
Pos of 'a' #18165 in $str is 18165
Pos of 'a' #19165 in $str is 19165
Pos of 'a' #20165 in $str is 20165
Pos of 'a' #21165 in $str is 21165
Pos of 'a' #22165 in $str is 22165
Pos of 'a' #23165 in $str is 23165
Pos of 'a' #24165 in $str is 24165
Pos of 'a' #25165 in $str is 25165
Pos of 'a' #26165 in $str is 26165
Pos of 'a' #27165 in $str is 27165
Pos of 'a' #28165 in $str is 28165
Pos of 'a' #29165 in $str is 29165
Pos of 'a' #30165 in $str is 30165

p5pRT commented Mar 29, 2006

@smpeters - Status changed from 'open' to 'resolved'
