Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NARS1] [ksbhaskar] %HD should also work on hexadecimal numbers with a 0x or 0X prefix #45

Merged
merged 4 commits into from
Oct 4, 2017

Conversation

nars1
Copy link
Collaborator

@nars1 nars1 commented Oct 4, 2017

No description provided.

@@ -1,6 +1,9 @@
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; ;
; Copyright 1987,2001 Sanchez Computer Associates, Inc. ;
; Copyright 1987,2001 Sanchez Computer Associates, Inc. ;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a matter of esthetics, but the semi-colon box doesn't line up, at least on the Github review.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is likely because I have replaced the end of the line with tabs (instead of spaces) and that is how all lines in the rest of the copyright notice are. I do see all lines aligned in the actual code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

sr_port/hd.mpt Outdated
@@ -11,13 +14,17 @@
%HD ;GT.M %HD utility - hexadecimal to decimal conversion program
;invoke at INT with %HD in hexadecimal to return %HD in decimal
;invoke at FUNC as an extrinsic function
;Note: works even if input hex number contains 0x or 0X prefix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Ignores any prefix of 0x or 0X

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in next commit. Please review that.

;if you make heavy use of this routine, consider $ZCALL
;
s %HD=$$FUNC(%HD)
q
INT r !,"Hexidecimal: ",%HD s %HD=$$FUNC(%HD)
q
FUNC(h)
n prefix
set prefix=$zextract(h,1,2)
set:(("0X"=prefix)!("0x"=prefix)) h=($zextract(h,3,99))
q:$tr(h,"E","e")<0 ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is not part of this change, but this seems like a strange and inefficient way to test whether the input is a decimal number less than zero. Why not just something like q:+h<0 ""? Or do I not understand what this line is doing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is most likely to avoid NUMOFLOW errors. See example below E vs e.

YDB>s h="1E98" 
YDB>w h<0
%GTM-E-NUMOFLOW, Numeric overflow

YDB>s h="1e98"
YDB>w h<0
0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

;if you make heavy use of this routine, consider $ZCALL
;
s %HD=$$FUNC(%HD)
q
INT r !,"Hexidecimal: ",%HD s %HD=$$FUNC(%HD)
q
FUNC(h)
n prefix
set prefix=$zextract(h,1,2)
set:(("0X"=prefix)!("0x"=prefix)) h=($zextract(h,3,99))
q:$tr(h,"E","e")<0 ""
n c,d,dg
s d=0,h=$tr(h,"abcdef","ABCDEF")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of inserting lines 25 through 27, why not change line 30 to: s d=0,h=$tr(h,"abcdefx","ABCDEFX") s:"0X"=$e(h,1,2) $e(h,3,$l(h))?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will do.

@nars1
Copy link
Collaborator Author

nars1 commented Oct 4, 2017

Is it okay if FUNC^%HD returns 0 for negative numbers? It used to return "" in that case before? Below is new behavior.

YDB>w $$FUNC^%HD("-1")
0

sr_port/hd.mpt Outdated
n c,d,dg,l
s d=0,h=$tr(h,"abcdefx","ABCDEFX"),l=$l(h)
s:("0X"=$e(h,1,2)) h=($zextract(h,3,l)),l=l-2
f c=1:1:l s dg=$f("0123456789ABCDEF",$e(h,c)) q:'dg s d=(d*16)+(dg-2)
q d
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be the most efficient implementation of FUNC(h), making use of compile time evaluation of operations on constants:

FUNC(h)
        q:$t(h,"E","e")<0 ""
        new p,z
        s h=$tr(h,"0123456789ABCDEFabcdefXx",$zch(0)_$zch(1)_$zch(2)_$zch(3)_$zch(4)_$zch(5)_$zch(6)_$zch(7)_$zch(8)_$zch(9)_$zch(10)_$zch(11)_$zch(12)_$zch(13)_$zch(14)_$zch(15)_$zch(10)_$zch(11)_$zch(12)_$zch(13)_$zch(14)_$zch(15))
        s z=0,p=1/16 f i=$l(h):-1:1 s p=p*16,z=p*$a($e(h,i))+z
        q z

@ksbhaskar
Copy link
Member

ksbhaskar commented Oct 4, 2017 via email

@nars1
Copy link
Collaborator Author

nars1 commented Oct 4, 2017

I noticed test failures after incorporating your suggestion so I have reverted that. And as discussed have gone ahead with committing this change. You can commit enhancements to this, if any, afterwards as a separate change.

@nars1 nars1 merged commit d167deb into YottaDB:master Oct 4, 2017
@nars1 nars1 deleted the hd_enhanced branch October 4, 2017 21:33
chathaway-codes pushed a commit that referenced this pull request Nov 27, 2018
…ops while a V4 block is dirty and SimpleThreadAPI is active

The simplethreadapi/lockst subtest failed as follows when the test framework randomly chose V4 format blocks.

%YDB-F-MAXRTSERRDEPTH Error loop detected - aborting image with core

And produced a core with the following C-stack

 #0  __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
 #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2  send_msg_va (csa=0x0, arg_count=9, var=0x7f25a0f7ff40) at sr_unix/send_msg.c:123
 #3  send_msg (arg_count=9) at sr_unix/send_msg.c:74
 #4  gtm_assert2 (condlen=17, condtext=0x7f25a62dc0d6 "!timer_in_handler", file_name_len=45, file_name=0x7f25a62dc0a8 "sr_unix/send_msg.c", line_no=125) at sr_port/gtm_assert2.c:34
 #5  send_msg_va (csa=0x0, arg_count=9, var=0x7f25a0f80570) at sr_unix/send_msg.c:125
 #6  send_msg (arg_count=9) at sr_unix/send_msg.c:74
 #7  gtm_assert2 (condlen=17, condtext=0x7f25a62dc0d6 "!timer_in_handler", file_name_len=45, file_name=0x7f25a62dc0a8 "sr_unix/send_msg.c", line_no=125) at sr_port/gtm_assert2.c:34
 .
 .
 #32 send_msg_va (csa=0x0, arg_count=9, var=0x7f25a0f83d20) at sr_unix/send_msg.c:125
 #33 send_msg (arg_count=9) at sr_unix/send_msg.c:74
 #34 gtm_assert2 (condlen=17, condtext=0x7f25a6305658 "!timer_in_handler", file_name_len=51, file_name=0x7f25a6305320 "sr_port/gtm_malloc_src.h", line_no=685) at sr_port/gtm_assert2.c:34
 #35 gtm_malloc (size=4096) at sr_port/gtm_malloc_src.h:685
 #36 wcs_wtstart (region=0x5558a88e1170, writes=0, cr_list_ptr=0x0, cr2flush=0x0) at sr_unix/wcs_wtstart.c:538
 #37 wcs_stale (tid=93839273365872, hd_len=8, region=0x5558a8979628) at sr_port/t_end_sysops.c:1387
 #38 timer_handler (why=14) at sr_unix/gt_timers.c:815
 #39 <signal handler called>
 #40 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5558a88e2ea8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 #41 __pthread_cond_wait_common (abstime=0x0, mutex=0x5558a88e2e40, cond=0x5558a88e2e80) at pthread_cond_wait.c:502
 #42 __pthread_cond_wait (cond=0x5558a88e2e80, mutex=0x5558a88e2e40) at pthread_cond_wait.c:655
 #43 ydb_stm_thread (parm=0x0) at sr_unix/ydb_stm_thread.c:92
 #44 start_thread (arg=0x7f25a0f85700) at pthread_create.c:463
 #45 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The primary issue is failure in the assert "assert(!timer_in_handler)" inside the PTHREAD_MUTEX_LOCK_IF_NEEDED
macro at frame # 35 (gtm_malloc). This is because we are about to make a pthread_mutex_lock() call which should
not be done while inside a signal handler (according to the man pages). To fix this issue, we skip flushing
that particular cache-record in wcs_wtstart() thereby avoiding the need to call gtm_malloc() (and in turn
the pthread_mutex_lock() call) while inside the timer handler.
nars1 added a commit that referenced this pull request Aug 7, 2020
…that can fail in rare cases

* The below assert failed once in 1500 r126/ydb464 subtest runs across a dozen boxes.

  ```c
  assert((1 != forced_exit) || GET_DEFERRED_EXIT_CHECK_NEEDED);

  #27 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
  #28 wcs_wtstart (region=0x7f6970, writes=0, cr_list_ptr=0x0, cr2flush=0x0) at sr_unix/wcs_wtstart.c:830
  #29 wcs_stale (tid=8350064, hd_len=8, region=0x872fe8) at sr_port/t_end_sysops.c:1386
  #30 timer_handler (why=14, info=0x7fadda7caea8 <stapi_signal_handler_oscontext+10728>, context=0x7fadda7caf28 <stapi_signal_handler_oscontext+10856>) at sr_unix/gt_timers.c:773
  #31 ydb_os_signal_handler (sig=14, info=0x7ffdee359eb0, context=0x7ffdee359d80) at sr_unix/ydb_os_signal_handler.c:66
  #32 <signal handler called>
  #33 generic_signal_handler (sig=2, info=0x7fadda7c9588 <stapi_signal_handler_oscontext+4296>, context=0x7fadda7c9608 <stapi_signal_handler_oscontext+4424>) at sr_unix/generic_signal_handler.c:244
  #34 ydb_os_signal_handler (sig=2, info=0x7ffdee35a870, context=0x7ffdee35a740) at sr_unix/ydb_os_signal_handler.c:88
  #35 <signal handler called>
  #36 clock_nanosleep () from /usr/lib64/libc.so.6
  #37 m_usleep (useconds=1000) at sr_unix/sleep.c:28
  #38 wcs_sleep (sleepfactor=1) at sr_port/wcs_sleep.c:28
  #39 wcs_phase2_commit_wait (csa=0x7f4840, cr=0x7fadd7b88a50) at sr_port/wcs_phase2_commit_wait.c:268
  #40 t_qread (blk=6, cycle=0x7ffdee35c568, cr_out=0x7ffdee35c558) at sr_port/t_qread.c:668
  #41 gvcst_search (pKey=0x7f6040, pHist=0x0) at sr_port/gvcst_search.c:438
  #42 gvcst_put2 (val=0x7fadda7d3940 <increment_delta_mval>, parms=0x7ffdee35f8b0) at sr_port/gvcst_put.c:980
  #43 gvcst_put (val=0x7fadda7d3940 <increment_delta_mval>) at sr_port/gvcst_put.c:299
  #44 gvcst_incr (increment=0x7ffdee362030, result=0x7ffdee362050) at sr_port/gvcst_incr.c:60
  #45 op_gvincr (increment=0x7ffdee362030, result=0x7ffdee362050) at sr_port/op_gvincr.c:60
  #46 ydb_incr_s (varname=0x7ffdee362b10, subs_used=0, subsarray=0x7ffdee3629d0, increment=0x7ffdee362aa0, ret_value=0x7ffdee362a90) at sr_unix/ydb_incr_s.c:156
  #47 runProc (settings=0x7ffdee363490, curDepth=0) at /usr/library/gtm_test/T998/simpleapi/inref/randomWalk.c:447
  #48 runProc_driver (settings=0x7ffdee363490) at /usr/library/gtm_test/T998/simpleapi/inref/randomWalk.c:145
  #49 main () at /usr/library/gtm_test/T998/simpleapi/inref/randomWalk.c:93

  (gdb) f 28
  #28 wcs_wtstart (region=0x7f6970, writes=0, cr_list_ptr=0x0, cr2flush=0x0) at sr_unix/wcs_wtstart.c:830
  830             ENABLE_INTERRUPTS(INTRPT_IN_WCS_WTSTART, prev_intrpt_state);

  (gdb) p forced_exit
  $5 = 1

  (gdb) p deferred_signal_handling_needed
  $6 = 0

  ```

* This is because the signal handler for SIGALRM (sig=14) happened after line 183 but before line 193.

    ```c
    sr_port/have_crit.h
    -------------------
    183         forced_exit = 1;                                                                        \
    184         if (in_os_signal_handler)                                                               \
    185         {       /* If we are inside an OS signal handler and therefore had to defer exit        \
    186                  * handling, set "outofband" also to TRUE as this is checked by lots of         \
    187                  * potentially long-running commands in the runtime (e.g. HANG etc.) and we     \
    188                  * want all of those to automatically trigger process exit handling.            \
    189                  */                                                                             \
    190                 outofband = deferred_signal;                                                    \
    191         }                                                                                       \
    192         /* Whenever "forced_exit" gets set to 1, set the corresponding deferred event too */    \
    193         SET_DEFERRED_EXIT_CHECK_NEEDED;                                                         \
    ```

* We cannot easily protect against this situation and so it is better to remove the assert.
nars1 added a commit that referenced this pull request Jan 19, 2022
… failure in a Simple Threaded API application

Background
----------
* The ideminter_rolrec/interrupted_rollback_or_recover subtest failed on a RHEL8 x86_64 in-house system
  as well as an AARCH64 system with the following symptom.

  ```diff
  107a108,239
  > ideminter_rolrec_0/interrupted_rollback_or_recover/savedjoboutput_0_20220107_233728/impjob_imptp0.mje4
  > %YDB-F-MAXRTSERRDEPTH Error loop detected - aborting image with core
  ```

* The C-stack of the core file had the following (frames with repeating traces not shown below)

  ```c
  #0  pthread_kill () from /usr/lib64/libpthread.so.0
  #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:74
  #2  rts_error_va (csa=0x0, argcnt=7, var=0x7ffc809d7070) at sr_unix/rts_error.c:144
  #3  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
  #4  ydb_stm_invoke_deferred_signal_handler () at sr_unix/ydb_stm_invoke_deferred_signal_handler.c:95
  .
  .
  #38 rts_error_va (csa=0x0, argcnt=7, var=0x7ffc809dadb0) at sr_unix/rts_error.c:163
  #39 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
  #40 ydb_stm_invoke_deferred_signal_handler () at sr_unix/ydb_stm_invoke_deferred_signal_handler.c:95
  #41 deferred_signal_handler () at sr_port/deferred_signal_handler.c:64
  #42 eintr_handling_check () at sr_port/eintr_handling_check.c:29
  #43 gtm_fprintf (stream=0x7f7f6c14b600 <_IO_2_1_stderr_>, format=0x7f7f6d7e1005 "%s") at sr_unix/gtm_stdio.c:75
  #44 util_out_print_vaparm (message=0x0, flush=1, var=0x7ffc809dbb20, faocnt=2147483647) at sr_unix/util_output.c:873
  #45 util_out_print (message=0x0, flush=1) at sr_unix/util_output.c:911
  #46 util_cond_flush () at sr_unix/util_output.c:952
  #47 ch_cond_core () at sr_unix/ch_cond_core.c:76
  #48 rts_error_va (csa=0x0, argcnt=7, var=0x7ffc809dbd20) at sr_unix/rts_error.c:198
  #49 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
  #50 ydb_stm_invoke_deferred_signal_handler () at sr_unix/ydb_stm_invoke_deferred_signal_handler.c:95
  #51 deferred_signal_handler () at sr_port/deferred_signal_handler.c:64
  #52 grab_lock (reg=0xe8e960, is_blocking_wait=1, onln_rlbk_action=2) at sr_unix/grab_lock.c:111
  #53 tp_tend () at sr_port/tp_tend.c:1261
  #54 op_tcommit () at sr_port/op_tcommit.c:494
  #55 stkok3 () at sr_x86_64/opp_tcommit.s:30

  (gdb) f 50
  #50 0x00007f7f6d51efb2 in ydb_stm_invoke_deferred_signal_handler () at sr_unix/ydb_stm_invoke_deferred_signal_handler.c:95
  95                      assert((SIGINT == stapi_signal_handler_oscontext[sig_hndlr_ctrlc_handler].sig_num) || USING_ALTERNATE_SIGHANDLING);

  (gdb) list
  93       if (STAPI_IS_SIGNAL_HANDLER_DEFERRED(sig_hndlr_ctrlc_handler))
  94       {
  95   -->         assert((SIGINT == stapi_signal_handler_oscontext[sig_hndlr_ctrlc_handler].sig_num) || USING_ALTERNATE_SIGHANDLING);
  96               ydb_stm_invoke_deferred_signal_handler_type = sig_hndlr_continue_handler;
  97               ctrlc_handler(DUMMY_SIG_NUM, NULL, NULL);
  98               ydb_stm_invoke_deferred_signal_handler_type = sig_hndlr_none;
  99       }

  (gdb) p/x stapi_signal_handler_deferred
  $15 = 0x404
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_none].sig_num
  $3 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_continue_handler].sig_num
  $4 = 18
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_ctrlc_handler].sig_num
  $5 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_dbcertify_signal_handler].sig_num
  $6 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_generic_signal_handler].sig_num
  $7 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_jobexam_signal_handler].sig_num
  $8 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_jobinterrupt_event].sig_num
  $9 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_job_term_handler].sig_num
  $10 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_op_fnzpeek_signal_handler].sig_num
  $11 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_suspsigs_handler].sig_num
  $12 = 0
  (gdb) p stapi_signal_handler_oscontext[sig_hndlr_timer_handler].sig_num
  $13 = 14
  (gdb) p simpleThreadAPI_active
  $1 = 1
  ```

* As can be seen from the gdb analysis of frame 50, the global variable `stapi_signal_handler_deferred` has 2 bits set.
  The ones corresponding to `sig_hndlr_ctrlc_handler` and `sig_hndlr_timer_handler`.

  But the `stapi_signal_handler_oscontext[]` global variable indicates that the 2 signals that were deferred corresponded to
  `sig_hndlr_continue_handler` and `sig_hndlr_timer_handler`.

  This disconnect between `sig_hndlr_ctrlc_handler` and `sig_hndlr_continue_handler` across the 2 global variables
  contributed to the assert failure.

Issue
-----
* After a code review, I have a theory on the cause of the issue. And that is the below macro got invoked
  by multiple threads at the same time for 2 SIGCONT signals (multiple threads possible since this is
  a Simple Thread API application as seen by the non-zero value of `simpleThreadAPI_active` in gdb).

  **sr_unix/sig_init.h**
  ```c
     80 #define STAPI_SET_SIGNAL_HANDLER_DEFERRED(SIGHNDLRTYPE, SIG_NUM, INFO, CONTEXT)         \
     81 {                                                                                       \
     82         SAVE_OS_SIGNAL_HANDLER_SIGNUM(SIGHNDLRTYPE, SIG_NUM);                           \
     83         SAVE_OS_SIGNAL_HANDLER_INFO(SIGHNDLRTYPE, INFO);                                \
     84         SAVE_OS_SIGNAL_HANDLER_CONTEXT(SIGHNDLRTYPE, CONTEXT);                          \
     85         SHM_WRITE_MEMORY_BARRIER;                                                       \
     86         stapi_signal_handler_oscontext[SIGHNDLRTYPE].sig_forwarded = TRUE;              \
     87         SHM_WRITE_MEMORY_BARRIER;                                                       \
     88         /* Need interlocked add below since multiple threads can execute this code      \
     89          * at the same time (e.g. if SIGALRM and SIGTERM happen at same time).          \
     90          * Just like SET_DEFERRED_TIMERS_CHECK_NEEDED macro in "have_crit.h",           \
     91          * we need to check the value before doing the interlocked add.                 \
     92          */                                                                             \
     93         if (!STAPI_IS_SIGNAL_HANDLER_DEFERRED(SIGHNDLRTYPE))                            \
     94                 INTERLOCK_ADD(&stapi_signal_handler_deferred, (1 << SIGHNDLRTYPE));     \
     95         SHM_WRITE_MEMORY_BARRIER;                                                       \
     96         SET_DEFERRED_STAPI_CHECK_NEEDED;                                                \
     97 }
  ```

* Threads 1 and 2 both went to line 93 and found `STAPI_IS_SIGNAL_HANDLER_DEFERRED(sig_hndlr_continue_handler)`
  FALSE. And then both went to proceed with the `INTERLOCK_ADD` call resulting in TWO increments for the
  same SIGCONT signal. This caused the value of the global variable `stapi_signal_handler_deferred` to
  have the bit following `sig_hndlr_continue_handler` set i.e. the `sig_hndlr_ctrlc_handler` bit (in the
  enum sequence). Thus we ended up with `stapi_signal_handler_deferred` having the least significant 4
  in its `0x404` value even though this process never had a Ctrl-C sent by the test (but SIGCONT could be sent
  by processes in the test at any time and not just one but more than one is also possible and is likely what
  happened in the test).

* While `INTERLOCK_ADD` is an interlocked operation, the decision to do it is based on checking the
  value of the global variable and that happens outside of the interlock and so the two together do
  not constitute an atomic operation like they should be.

Fix
---
* As part of e175d54, the global variables `deferred_signal_handling_needed`
  and `stapi_signal_handler_deferred` were updated to use `INTERLOCK_ADD` to prevent out-of-design values
  with multiple threads updating the same global variable at the same time.

* As part of a later commit 98e8638, `deferred_signal_handling_needed` was
  updated to use `COMPSWAP_LOCK` instead of `INTERLOCK_ADD`.

* But `stapi_signal_handler_deferred` continued to use `INTERLOCK_ADD`.

* The fix is simple and is to make `stapi_signal_handler_deferred` use `COMPSWAP_LOCK` instead of `INTERLOCK_ADD`.

* Towards this, the global variable now has a type of `global_latch_t`.

* A new `BIT_SET_INTERLOCKED` macro has been introduced in `sr_unix/sig_init.h`. This is based largely on
  the pre-existing `SET_DEFERRED_CONDITION` macro except that the new macro allows for the latch variable
  to be passed as a parameter.

  The following macros have been changed to use the new `BIT_SET_INTERLOCKED` macro.
  - STAPI_SET_SIGNAL_HANDLER_DEFERRED macro in sr_unix/sig_init.h
  - STAPI_CLEAR_SIGNAL_HANDLER_DEFERRED macro in sr_unix/sig_init.h
  - SET_DEFERRED_CONDITION macro in sr_port/have_crit.h
  - CLEAR_DEFERRED_CONDITION macro in sr_port/have_crit.h

* A new `BIT_GET_INTERLOCKED` macro gets the latch value. This is based on the pre-existing
  `GET_DEFERRED_CONDITION` macro in sr_port/have_crit.h.

  All places that need to check the value of the `stapi_signal_handler_deferred` global variable now use this
  new macro. the pre-existing `GET_DEFERRED_CONDITION` macro has also been changed to use this new macro.

* Unnecessary `GBLREF`s of `stapi_signal_handler_deferred` in a few files have been removed.

* Note that these changes are done based on a theory of what could have caused the failure. It is not yet
  known for sure if that is indeed the cause as the failure happens very rarely (it is not reproducible
  on demand even after running dozens of tests on lots of systems). Only time will tell if this fix is correct.

Misc changes
------------
* Noticed a longstanding typo in `sr_unix/ydb_stm_invoke_deferred_signal_handler.c` and fixed it.
  Not yet sure though if there is a user visible issue due to this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants