Help Wanted with Calls to SimpleAPI #2

Closed
jpaulking opened this issue Mar 5, 2018 · 2 comments

jpaulking commented Mar 5, 2018

We have a simple C application prototype which takes input over a (ZeroMQ) TCP socket, interprets incoming data requests, and calls ydb_get_s() or ydb_set_s() as appropriate.

With a one-second delay after each call to YottaDB, it works. But if the program returns immediately to the top of the message-reading loop, the call to read the next message fails, citing "Interrupted system call" as the cause.

The error returns if the delay call is placed above the ydb_* call.
The error vanishes if the ydb_* call is omitted.
In all cases ydb_set_s and ydb_get_s return YDB_OK.

We are running YottaDB as built from the simpleapi branch of nars1/YottaDB circa 28 Feb, 2018. We are building and testing primarily on CentOS 7 Linux, with GCC version 5.1, though we have also tried it on a Debian system, with the same outcome.

A rough pseudo-code outline follows below. In the interest of clarity, I've omitted data transformation, error checking, debug messaging and the like.

Any thoughts or suggestions would be deeply appreciated.


// variable declarations and initializations
int result = ydb_init();
// Check good result or print error message.

while(1)
{
    // This call blocks, waiting on an incoming message.
    // Without the one second delay after the call to ydb_get_s,
    // this call returns an error code and reports 'Interrupted system call'.
    result = zmq_recv(incomingBuffer);
    // Check good result or print last error message.
    
    // Parse the incoming message here and prepare
    // the needed ydb_buffer_t objects.
    
    result = ydb_get_s(...);
    // Check good result or print error message.
    
    // Delay for one second. With this call in place, the application
    // works, but without it, we see the aforementioned error.
    sleep(1);
    
    // Translate the result to text for the client.
        
    // Reply to the client.
    result = zmq_send(outgoingBuffer);
    // Check good result or print last error message.
}

nars1 (Owner) commented Mar 5, 2018

ydb_set_s() on a global would start a timer (to flush the dirty database buffers to disk) and that is most likely interrupting the recv() call on the socket.

You need to restart the recv() call when it fails with EINTR, as is done in the file below from the YottaDB source repo. Various other system calls can also return EINTR, and the C program needs to handle each of them by restarting the call. See eintr_wrappers.h for other such usages. Hope this helps.

sr_port/eintr_wrappers.h
    #define RECV(SOCKET, BUF, LEN, FLAGS, RC)                       \
    {                                                               \
            do                                                      \
            {                                                       \
                    RC = (int)recv(SOCKET, BUF, (int)(LEN), FLAGS); \
            } while (-1 == RC && EINTR == errno);                   \
    }
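
The retry loop in the macro above can be tried in isolation. What follows is a minimal sketch, not YottaDB code: recv_retry(), on_alarm(), demo_interrupted_recv() and the counters are hypothetical names made up for this illustration, and an interval timer stands in for the flush timer mentioned in the comment. The timer interrupts a blocked recv() on a socket pair several times, and the wrapper restarts the call until data arrives.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose sigaction/setitimer under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

static int eintr_retries = 0;            /* demo-only: restarts after EINTR */
static volatile sig_atomic_t ticks = 0;  /* timer signals delivered so far */
static int demo_write_fd = -1;           /* peer socket the handler writes to */

/* Hypothetical helper: call recv() again for as long as it fails with EINTR. */
static ssize_t recv_retry(int fd, void *buf, size_t len, int flags)
{
    ssize_t rc;

    assert(buf != NULL);
    for (;;) {
        rc = recv(fd, buf, len, flags);
        if (rc != -1 || errno != EINTR)
            return rc;                   /* success, EOF, or a real error */
        eintr_retries++;                 /* interrupted by a signal: restart */
    }
}

/* Stand-in for a library timer: on the third tick it also sends the message. */
static void on_alarm(int signo)
{
    int saved_errno = errno;             /* write() may clobber errno */

    (void)signo;
    if (++ticks == 3) {
        ssize_t ignored = write(demo_write_fd, "ping", 4);
        (void)ignored;
    }
    errno = saved_errno;
}

/* Returns the number of bytes received (4 on success), or -1 on failure. */
static ssize_t demo_interrupted_recv(void)
{
    int sv[2];
    char buf[16];
    struct sigaction sa, old_sa;
    struct itimerval tick, off;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    demo_write_fd = sv[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;            /* no SA_RESTART, so recv() sees EINTR */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old_sa);

    memset(&tick, 0, sizeof tick);
    tick.it_value.tv_usec = 20000;       /* first tick after 20 ms */
    tick.it_interval.tv_usec = 20000;    /* then every 20 ms */
    setitimer(ITIMER_REAL, &tick, NULL);

    n = recv_retry(sv[0], buf, sizeof buf, 0);

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);  /* stop the timer */
    sigaction(SIGALRM, &old_sa, NULL);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

In the ZeroMQ loop from the original post, the same shape would presumably apply to zmq_recv(), which is documented to return -1 with errno set to EINTR when a signal is delivered before a message is available.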

jpaulking (Author) commented

Thanks for the speedy reply. Very helpful indeed!

nars1 added a commit that referenced this issue Apr 30, 2018
…use SIG-11 otherwise)

Below is a test case that demonstrates the SIG-11. TERM1 and TERM2 are two terminals.

TERM1: > rm mumps.gld mumps.dat
TERM1: > setenv ydb_gbldir mumps.gld
TERM1: > $ydb_dist/gde exit
TERM1: > $ydb_dist/mupip create
TERM1: > $ydb_dist/mumps -run x
TERM1: > gdb $ydb_dist/mumps
TERM1: (gdb) b gvcst_expand_prev_key
       Function "gvcst_expand_prev_key" not defined.
       Make breakpoint pending on future shared library load? (y or [n]) y
       Breakpoint 1 (gvcst_expand_prev_key) pending.
TERM1: (gdb) r -run onemore^x
       Starting program: mumps -run onemore^x
       [Thread debugging using libthread_db enabled]
       Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
       Breakpoint 1, gvcst_expand_prev_key (pStat=0x6ca8c8, srch_key=0x62c040, exp_key=0x62b840) at sr_port/gvcst_expand_key.h:40
       40 {

Now switch to TERM2 terminal

TERM2: > setenv ydb_gbldir mumps.gld
TERM2: > mumps -run kill^x

Now switch back to TERM1 terminal

TERM1: (gdb) cont
       Continuing.
       Program received signal SIGSEGV, Segmentation fault.
       0x00007ffff6bf9d55 in gvcst_put2 (val=0x62e860, parms=0x7fffffff6c00) at sr_port/gvcst_put.c:2374
       2374                                    n = (int)((sm_long_t)*cp2 - (sm_long_t)*cp1);
       (gdb) where
       #0  0x00007ffff6bf9d55 in gvcst_put2 (val=0x62e860, parms=0x7fffffff6c00) at sr_port/gvcst_put.c:2374
       #1  0x00007ffff6be5cd7 in gvcst_put (val=0x62e860) at sr_port/gvcst_put.c:300
       #2  0x00007ffff6cfc930 in op_gvput (var=0x62e860) at sr_port/op_gvput.c:74
       #3  0x00007ffff7ff61c8 in ?? ()
       #4  0x00007ffff7373490 in ?? () from libyottadb.so
       #5  0x00007fffffff9860 in ?? ()
       #6  0x0000000000000000 in ?? ()

And you see the SIG-11 without the code fixes in this commit.

> cat x.m
init    ;
        for i=1:1:825 set ^x(i)=i
        quit
kill    ;
        kill ^x(825)
        set i=826,^x(i)=""
        quit
onemore ;
        tstart ():serial
        set ^x(826)=""
        tcommit
        quit
nars1 added a commit that referenced this issue May 1, 2018
nars1 pushed a commit that referenced this issue Aug 15, 2018
…B#334)

Module changes:

- sr_port/bm_getfree.c - Issues: free_bit (line 216), map_size (line 216), cs1 (line 223), and depth (line 221) are all uninitialized. Additionally, blkhist.cr and blkhist.cycle are undefined at line 238.
  - Loop changed so that it always does one iteration. This has fixed all of the above issues (no longer reported by scan).

- sr_port/dse_m_rest.c - Issue: The variable r_top was flagged as uninitialized in line 87.
  - The variable r_top was re-initialized a few lines further down so line 87 was not necessary and was removed.
  - Some reformatting of pre-existing non-standard formatting.

- sr_port/dse_shift.c - Issue: lbp undefined in line 99
  - Initialize 'lbp' to NULL earlier so that no path leaves it undefined.

- sr_port/gtmsource_ctl_init.c - Issue: tmp_ctl resource leak at line 182.
  - Free 'tmp_ctl' when returning or erroring out.

- sr_port/is_canonic_name.c - Issue: Noted via an issue in a different module. In the parse_gv_name_and_subscripts() routine, near the top, seq is set to *subscripts but is never referenced again.
  - Remove variable 'seq' which is not used and may have issues dereferencing 'subscripts' when not provided.

- sr_port/mu_int_blk.c - Issue: free_blk_base resource leak going out of scope line 323.
  - Free malloc'd storage (in 'free_blk_base') before return.

- sr_port/mupip_set_journal.c - Issue: Resource leak rewriting gds_info, gds_info not cleaned up.
  - Remove unused gds_info allocation (not plugged into anything) and changed entire clause to FILE_CNTL_INIT_IF_NULL() macro to do initialization correctly.

- sr_port/mur_output_show.c - Issue: first_time uninitialized in line 328.
  - Initialize 'first_time' to TRUE so no path to it being uninitialized.

- sr_port/op_fnzsocket.c - Issue: tls_options_mask uninitialized at 598.
  - Move initialization of 'tls_options_mask' up earlier so no paths to uninitialized use.

- sr_port/op_indincr.c - Issue: s is uninitialized at line 101
  - Only do put_tref() if something to put. While this leaves v undefined if there was an error, this does not matter because comp_fini() ignores v in that case.

- sr_port/stp_gcol_src.h - Issue: uninitialized value of cstr at line 1063.
  - Make sure cstr is initialized whether expansion succeeds or fails by moving point where cstr is set.

- sr_port/tcp_open.c - Issues: 1. resource leak - ai_ptr going out of scope at line 190.
                               2. remote_ai_head uninitialized at line 302.
                               3. calling close (gtm_close) without checking return value at line 355.
                               4. resource leak - remote_ai_head out of scope at line 340.
  - Remove some old Tru64 stuff.
  - Remove unnecessary initialization of remote_ai_ptr.
  - Remove some unused variables.
  - Move hostname validation to client section and remove the conditional parsing of the host (required for client) which solves #2.
  - Change close() to CLOSEFILE() macros which solves #3.
  - Make sure to release linked list at ai_ptr which solves #1.
  - Remove assert when erroring with timeout (now tested by online_bkup/online6). This was required since the tests in online6 now time out instead of getting GETADDRINFO for invalid passive hosts being specified.
  - Make sure to release linked list at remote_ai_head (which solves #4).

- sr_port/ydberrors.msg
  - Added TCPCONNTIMEOUT message as the other timeout messages didn't quite do what was needed. There were a couple of util_out_print() timeout messages in tcp_open() but it really needed to be a real error - especially since it was now being tested in a test (online_bkup/online6).

- sr_unix/anticipatory_freeze.c - Issue: Resource leak - handle goes out of scope at line 358.
  - Add an FCLOSE() and error check for same prior to module returns.
  - Modify CLEAR_ANTICIPATORY_FREEZE() macro to set FREEZE_CLEARED to FALSE if it doesn't set it to TRUE so FREEZE_CLEARED is ALWAYS set by this macro.

- sr_unix/bin_load.c - Issue: Several allocated blocks and buffers need cleanup on exit.
  - Implement mechanism to track all of the allocated buffers and free any that are allocated if an error occurs or on normal routine return.

- sr_unix/cli_lex.c - Issue: retptr uninitialized at line 488 (return from module).
  - Initialize return value to null so is initialized when nothing is read.

- sr_unix/gtm_getpwuid.c
  - Removed an unneeded return value check from malloc (gtm_malloc does not return if no storage).
  - Reformatted an #ifdef block.

- sr_unix/gtmcrypt_entry.c - Issue: Resource leak - handle goes out of scope at line 135.
  - release/close the handle before return

- sr_unix/gtmrecv_end.c - Issue: jnlpool_strm_seqno[idx] uninitialized at line 194.
  - Only do the loop to dump the streams if the journal pool exists.

- sr_unix/gtmsecshr.c - Issue: Resource leak - procstrm goes out of scope at line 1026.
  - Close 'procstrm' before return on error.

- sr_unix/gtmsource_shutdown.c - Issue: maxindex is uninitialized at line 317.
  - Add '!auto_shutdown' to clause as 'maxindex' is only set when !auto_shutdown.

- sr_unix/iosocket_tls.c - Issue: 1. errlen2 is uninitialized at line 416.
                                  2. errlen is uninitialized at line 466.
  - Errors aren't using the correct method of getting the error out there. The 'errlen' var is only seldom set. Change to use LEN_AND_STR() macro to provide length.

- sr_unix/mu_all_version_standalone.c - Issue: save_errno is uninitialized at line 172.
  - Save 'errno' to 'save_errno' to initialize it for the error message.

- sr_unix/mucblkini.c - Issue: Vars bp1, bp2, and bmp were not released before errors.
  - Free these vars appropriately before leaving routine.

- sr_unix/op_zlink.c - Issue: srcnamelen uninitialized line 227.
  - Move initialization of 'srcstr' closer to where used and only for those options where srcnamelen is set.

- sr_unix/relinkctl.c - Issue: shm_hdr uninitialized in line 1069
  - Initialize 'shm_hdr' to prevent usage when is still uninitialized.
  - Add error checks and messages for SHMDT invocations.

- sr_unix/ss_anal_shdw_file.c - Issue: bitmap_buffer and bp allocations not freed prior to error returns (lines 118 and 123).
  - Add frees for bitmap_buffer and bp before error returns.

- sr_unix/trigger_source_read_andor_verify.c - Issue: rttabent uninitialized on line 356.
  - Initialize 'rttabent' to NULL
  - If not initialized, set 'rttabent' to our rtn_names list to begin search.

- sr_unix/util_output.c - Issue: Origin of this change is lost. It was from an issue raised for another module and digging down through the calls, we ended up here where a var was uninitialized after this loop because of the lack of a default clause such that 'length' didn't get set IIRC (the chosen switch value in the scan's simulation did not match a case).
  - Add default clause to switch with an assert in it. Also added code to keep static scan from complaining about 'length'.

- sr_unix/wait_for_disk_space.c - Issue: freeze_cleared uninitialized at line 161.
  - Add a default for 'freeze_cleared' so it is initialized.

- sr_unix_cm/gtcm_bgn_net.c - Issue: Resource leak - ai_ptr goes out of scope at 135, 148.
  - Release 'ai_ptr' linked list before normal and error returns.

- sr_unix_cm/omi_prc_conn.c - Issue: Neither agname nor ag_pass allocated memory is released before normal or error return.
  - Define OMI_FREE() macro for better cleanup. Add to any place before error or normal
    return.
  - Initialize 'agname' and 'ag_pass' so know if they are allocated.
  - Remove clauses that test memory allocation result. Note gtm_malloc() does not return
    if allocation fails.

- sr_unix_cm/rc_srvc_xct.c - Issue: Resource leak - elst var goes out of scope at line 198.
  - Added a macro to do the cleanups and added it before all return points.

- sr_unix_gnp/cmi_init.c - Issue: Resource leak - local_ai_ptr out of scope lines 67, 74, 85, 95, 122.
  - Some minor formatting changes for code standards.
  - Make sure local_ai_ptr linked list is freed before leave routine.

- sr_unix_gnp/cmi_open.c - Issue: Resource leak - ai_head out of scope line 123.
  - Some minor formatting changes for code standards.
  - Make sure ai_head linked list is freed before leave routine.

- sr_unix_gnp/cmj_get_port.c - Issue: Resource leak - ai_ptr out of scope line 123.
  - Removed - no users of this routine were found.

- sr_unix_gnp/cmj_getsockaddr.c - Issue: Resource leak - ai_ptr out of scope line 152.
  - Make sure ai_ptr linked list is freed before leave routine with this error return.

- sr_unix_gnp/cmu_getclb.c - Issue: Resource leak - ai_ptr out of scope lines 52, 57.
  - Minor changes for coding standards.
  - Make sure ai_ptr linked list is freed before leave routine.

- sr_unix_gnp/gtcm_gnp_server_main.c - Issue: status uninitialized line 287.
  - Change static routines to STATICFN{DCL,DEF} (coding standards)
  - The 'status' variable was not set until after it was referenced. Removed the test of the uninitialized status after the fetch.

- sr_unix_gnp/gtcm_open_cmerrlog.c - Issue: Potential to overrun end of lfn_path at line 80.
  - Change max usable length before truncation to make room for a null terminator.
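
Several entries in the list above concern an addrinfo list (ai_ptr, ai_head, remote_ai_head) leaking when a routine returns early. The general pattern can be sketched as follows. This is an illustration under stated assumptions, not the actual YottaDB change: probe_address() is a hypothetical helper, and AI_NUMERICHOST is used only so that the example needs no DNS lookup.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose getaddrinfo under strict -std= modes */
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a stream socket can be created for one
 * of the addresses that host/port resolve to, and -1 otherwise. */
static int probe_address(const char *host, const char *port)
{
    struct addrinfo hints, *ai_head = NULL, *ai_ptr;
    int rc = -1;
    int fd;

    assert(host != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;     /* numeric addresses only, no DNS */

    if (getaddrinfo(host, port, &hints, &ai_head) != 0)
        return -1;                       /* no list was returned on failure */

    for (ai_ptr = ai_head; ai_ptr != NULL; ai_ptr = ai_ptr->ai_next) {
        fd = socket(ai_ptr->ai_family, ai_ptr->ai_socktype, ai_ptr->ai_protocol);
        if (fd == -1)
            continue;                    /* try the next address */
        close(fd);
        rc = 0;
        break;                           /* fall through to the cleanup below */
    }

    /* Both the success path and the "no usable address" path end up here,
     * so the whole linked list is released exactly once. */
    freeaddrinfo(ai_head);
    return rc;
}
```

Routing every exit after a successful getaddrinfo() through one freeaddrinfo() call is a design choice that keeps later edits from adding a return that skips the cleanup.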