SET and $INCREMENT() work correctly without abnormal process termination in a very rare case #217

Closed
nars1 opened this issue Apr 27, 2018 · 0 comments

nars1 commented Apr 27, 2018

Final Release Note

SET and $INCREMENT() operations on a global variable work correctly without abnormal termination in a very rare case. Previously, it was possible for a process doing the operation to terminate abnormally with a SIG-11. This was only observed in internal testing, and there was no risk of database damage from this issue. (#217)

Description

A test failure during in-house testing exposed a longstanding, rare issue in the database code. A process can terminate abnormally with a SIG-11 while updating a database (SET or $INCR on a gvn) in an environment where concurrent processes are updating the same database. Although the process terminates with a core in these extremely rare circumstances, the database is guaranteed to be clean (i.e. no database damage). Nevertheless, this is an issue that needs to be fixed.

Draft Release Note

SET and $INCR operations on a gvn work correctly in YottaDB. Previously, it was possible in very rare situations for a process doing the SET/$INCR to terminate abnormally with a SIG-11. This was only observed in internal testing. There was no risk of database damage due to this issue.

@nars1 nars1 added this to the r130 milestone Apr 27, 2018
@nars1 nars1 self-assigned this Apr 27, 2018
nars1 added a commit to nars1/YottaDB that referenced this issue Apr 30, 2018
nars1 added a commit to nars1/YottaDB that referenced this issue Apr 30, 2018
…use SIG-11 otherwise)

Below is a test case that demonstrates the SIG-11. TERM1 and TERM2 are two terminals.

TERM1: > rm mumps.gld mumps.dat
TERM1: > setenv ydb_gbldir mumps.gld
TERM1: > $ydb_dist/gde exit
TERM1: > $ydb_dist/mupip create
TERM1: > $ydb_dist/mumps -run x
TERM1: > gdb $ydb_dist/mumps
TERM1: (gdb) b gvcst_expand_prev_key
       Function "gvcst_expand_prev_key" not defined.
       Make breakpoint pending on future shared library load? (y or [n]) y
       Breakpoint 1 (gvcst_expand_prev_key) pending.
TERM1: (gdb) r -run onemore^x
       Starting program: mumps -run onemore^x
       [Thread debugging using libthread_db enabled]
       Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
       Breakpoint 1, gvcst_expand_prev_key (pStat=0x6ca8c8, srch_key=0x62c040, exp_key=0x62b840) at sr_port/gvcst_expand_key.h:40
       40 {

Now switch to TERM2 terminal

TERM2: > setenv ydb_gbldir mumps.gld
TERM2: > mumps -run kill^x

Now switch back to TERM1 terminal

TERM1: (gdb) cont
       Continuing.
       Program received signal SIGSEGV, Segmentation fault.
       0x00007ffff6bf9d55 in gvcst_put2 (val=0x62e860, parms=0x7fffffff6c00) at sr_port/gvcst_put.c:2374
       2374                                    n = (int)((sm_long_t)*cp2 - (sm_long_t)*cp1);
       (gdb) where
       #0  0x00007ffff6bf9d55 in gvcst_put2 (val=0x62e860, parms=0x7fffffff6c00) at sr_port/gvcst_put.c:2374
       #1  0x00007ffff6be5cd7 in gvcst_put (val=0x62e860) at sr_port/gvcst_put.c:300
       #2  0x00007ffff6cfc930 in op_gvput (var=0x62e860) at sr_port/op_gvput.c:74
       #3  0x00007ffff7ff61c8 in ?? ()
       #4  0x00007ffff7373490 in ?? () from libyottadb.so
       #5  0x00007fffffff9860 in ?? ()
       #6  0x0000000000000000 in ?? ()

And you see the SIG-11 without the code fixes in this commit.

> cat x.m
init    ;
        for i=1:1:825 set ^x(i)=i
        quit
kill    ;
        kill ^x(825)
        set i=826,^x(i)=""
        quit
onemore ;
        tstart ():serial
        set ^x(826)=""
        tcommit
        quit
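
The two-terminal steps above could also be driven from a single terminal with a gdb command file. This is only a sketch, not part of the original test case: the file name repro.gdb is hypothetical, and it assumes that ydb_dist and ydb_gbldir are exported in the environment that starts gdb and that the database has already been created and populated with `$ydb_dist/mumps -run x`. It uses gdb's `shell` command to run the TERM2 step while the TERM1 process is stopped at the breakpoint.

```
# repro.gdb (hypothetical file name): single-terminal sketch of the TERM1/TERM2 steps.
# Assumes ydb_dist and ydb_gbldir are exported in gdb's environment and that
# ^x(1) through ^x(825) already exist (created by "mumps -run x").

# Accept the pending breakpoint without prompting; libyottadb.so is not loaded yet.
set breakpoint pending on
break gvcst_expand_prev_key

# TERM1 step: start the transaction that sets ^x(826) and stop at the breakpoint.
run -run onemore^x

# TERM2 step: while the first process is stopped, run the concurrent KILL and SET.
shell $ydb_dist/mumps -run kill^x

# Resume the stopped process, then print the stack wherever it ends up.
continue
backtrace
```

It would be invoked as `gdb -batch -x repro.gdb $ydb_dist/mumps`. On a build without the fix, the `continue` would be expected to end in the SIGSEGV shown above; on a fixed build the process may instead stop at the breakpoint again or exit normally.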
nars1 added a commit that referenced this issue May 1, 2018
…-11 otherwise)

@nars1 nars1 closed this as completed May 1, 2018
@ksbhaskar ksbhaskar changed the title SIG-11 from YottaDB in rare cases while doing SET or $INCR on a gvn SET and $INCREMENT() in a very rare case work correctly without abnormal process termination May 8, 2018
@ksbhaskar ksbhaskar changed the title SET and $INCREMENT() in a very rare case work correctly without abnormal process termination SET and $INCREMENT() work correctly without abnormal process termination in a very rare case May 10, 2018