Remove Certain Known Inefficiencies in Call-Ins #32 (callinperf) #41

Merged
merged 1 commit into from
Oct 12, 2017

Commits on Oct 12, 2017

  1. Remove Certain Known Inefficiencies in Call-Ins YottaDB#32 (callinperf)

    ```
    Files changed in this commit:
    
    - sr_i386/g_msf.si
      1. Remove SFF_CI flag and resequence flags
    
    - sr_port/alias_funcs.c
      1. Change SFF_CI flag usage to SFT_CI type usage & restructure checks due to GTM$CI
         frame no longer being there.
    
      A review question asked why this change removed an "fp = fp->old_frame_pointer" statement. The answer is that it
      has to do with the elimination of the GTM$CI frame. The frame chain used to look like this:
    
        CI-base-frame <- GTM$CI <- M routine
    
      Now it looks like:
    
        CI-base-frame <- M routine
    
      In the previous scheme the GTM$CI frame had a "flag value" SFF_CI that indicated the M call was a call-in frame.
      In the new scheme the base frame has a "type value" with SFT_CI set that indicates this is a call-in frame.
      The removed statement existed because, once we had found the frame flagged as a call-in frame, we had to
      back up one more frame to get to its base frame. We no longer have to do that as the frame with the type flag IS
      the base frame.
    
    - sr_port/dollar_zlevel.c
      1. Spruce up some comments to make more sense.
    
    - sr_port/f_text.c
      1. Remove GTM$CI from the list of suppressed values (no longer exists).
    
    - sr_port/fgncal.h
      1. Remove declaration for fgncal_lookup.c (no longer used).
    
    - sr_port/fgncal_lookup.c
      1. Deleted - no longer used
    
    - sr_port/gbldefs.c
      1. Remove param_list global (no longer used).
    
    - sr_port/mdb_condition_handler.c
      1. Cleanups - remove #ifdef for UNIX (leaving code in place since all is UNIX now).
      2. Cleanups - remove VMS conditional code.
      3. Change SFF_CI flag usage to SFT_CI type usage.
      4. Fix loop looking for $ETRAP defined when $ZTRAP set to explicit NULL to stop at frame 0
         instead of frame 1. This allows it to find call-in base frame (now frame 0 instead of 1)
         and also the mumps -run base frame.
    
    - sr_port/mdef.h
      1. Move DBGALS macro here from alias.h so can define DEBUG_ALIAS right here instead of
        requiring it to be a compiler option.
    
    - sr_port/mprof_funcs.c
      1. Cleanups - remove #ifdef for UNIX (leaving code in place since all is UNIX now).
      2. Cleanups - remove VMS conditional code.
      3. When running the stack, skip over both TRIGGER and CALLIN base frames.
    
    - sr_port/op_bindparm.c
      1. Minor cosmetic changes and comment clarifications.
    
    - sr_port/op_clralsvals.c
      1. Change SFF_CI flag usage to SFT_CI type usage.
      2. Modify the loop to no longer assume a GTM$CI frame when call-ins are in use (no longer
         need to unwind one more frame to the base frame as the flagged frame IS the base frame).
    
    - sr_port/op_halt.c
      1. If call-in mode, do NOT just exit - make sure to return to the call-in caller.
    
    - sr_port/op_unwind.c
      1. Add include of gtmio.h for debugging macro(s).
    
    - sr_port/op_zg1.c
      1. If call-in mode, restart the frame we unwind to. This returns to the caller.
    
    - sr_port/op_zgoto.c
      1. Cleanups - remove #ifdef for UNIX (leaving code in place since all is UNIX now).
      2. Cleanups - remove VMS conditional code.
      3. Change SFF_CI flag usage to SFT_CI type usage.
    
    - sr_port/op_zhalt.c
      1. If call-in mode, do NOT just exit - make sure to return to the call-in caller.
    
    - sr_port/parm_pool.c
      1. Move the push_parm() routine out to sr_port/push_parm_src.h so it can be included in
         two forms:
         a. Form that gets its arguments from the varargs parameter list.
         b. Form that gets its arguments from a parmblk_struct.
    
    - sr_port/stack_frame.h
      1. Remove SFF_CI flag and resequence flags to fill the gap.
      2. Add SFT_CI type id.
      3. Modify SKIP_BASE_FRAME() macro to skip both TRIGGER and CALL-IN base frames.
    
    - sr_port/tp_unwind.c
      1.  Remove #ifdef DEBUG wrapper on debugging includes so they work with a pro build too.
    
    - sr_port/unw_mv_ent.c
      1. Change debugging output to be more clear which path is being taken.
    
    - sr_port/zlput_rname.c
      1. Since SKIP_BASE_FRAME macro now handles call-in base frames and because there's one fewer
         frames than there used to be when GTM$CI was used, reorganize the loop(s) traversing the
         M stack so it is more effectively utilized.
    
    - sr_port/zshow_stack.c
      1. Change SFF_CI flag usage to SFT_CI type usage.
      2. Add marker for when call-in base frame shows up in stack.
    
    - sr_unix/ci_ret_code.c
      1. Change SFF_CI flag usage to SFT_CI type usage.
      2. Remove the ci_ret_code() and ci_ret_code_exit() routines as not needed.  The ci_ret_code()
         routine did a longjmp() to "return" from the call-in - now replaced by a normal unwind
         procedure with no system calls. The ci_ret_code_exit() routine was set into the call-in
         base-frame and driven when ZGOTO 0 was done. It is removed because it too did a longjmp()
         which we are trying to avoid. That condition is now special cased in ZGOTO.
      3. The ci_ret_code_quit() routine (called to unwind the CALL-IN base frame) had comments
         added to better describe the dependencies.
    
    - sr_unix/error_return.c
      1. Change SFF_CI flag usage to SFT_CI type usage.
      2. Don't let CALL-IN mode do an EXIT here - force return to caller.
    
    - sr_unix/fgncalsp.h
      1. Get rid of no longer needed pointer fields (now passed directly)
    
    - sr_unix/gtm_startup.c
      1. Add comments on odd happenings.
    
    - sr_unix/gtm_trigger.c
      1. Change an M stack running loop to work without GTM$CI frame.
    
    - sr_unix/gtm_unlink_all.c
      1. Remove check for GTM$CI executable.
    
    - sr_unix/gtmci.c
      1. Removed use of longjmp() to effect return from call-ins.
      2. Change SFF_CI flag usage to SFT_CI type usage.
      3. Changed parameter handling to eliminate need for GTM$CI to reformat parameters
         appropriately for call-in to op_bindparm in generated M code.
      4. Eliminated need to drive op_extcall/op_extexfun to start M routine. We now call
         push_parm to set up called-routine parameters ourselves.
    
    - sr_unix/gtmci.h
      1. Removed GTM$CI as an intermediate call-in frame.
      2. Modified SET_CI_ENV() macro to setup actual base frame instead of intermediate frame.
    
    - sr_unix/gtmci_isv.c
      1. Change SFF_CI flag usage to SFT_CI type usage.
      2. Some comment cleanup.
    
    - sr_unix/gtmci_signals.c
      1. Remove reference to invocation flag MUMPS_GTMCI which was never being set anywhere.
    
    - sr_unix/invocation_mode.h
      1. Remove MUMPS_GTMCI_OFF macro as unused.
      2. Remove MUMPS_GTMCI mode and resequence modes to fill gap (MUMPS_GTMCI never set).
    
    - sr_unix/jobchild_init.c
      1. Don't create GTM$CI anymore (make_cimode() not needed and removed).
      2. Create single base CI frame instead of base + GTM$CI frame.
    
    - sr_unix/make_cimode.c
      1. Removed - no longer needed.
    
    - sr_unix/make_mode.c
      1. Removed all capability to create GTM$CI as this "program" is no longer needed.
      2. Removed all #ifdef of __ia64, __hpux, and _AIX.
    
    - sr_unix/ojchildparms.c
      1. Change the "dummy" base frame generated from GTM$CI frame to GTM$DMOD frame. It doesn't
         matter what type of frame this is so long as there is one that can be used to return.
    
    - sr_unix/op_fnfgncal.c
      1. Change SFF_CI flag usage to SFT_CI type usage.
    
    - sr_unix/relinkctl.c
      1. Change M stack frame loop to compensate for the lack of the GTM$CI frame and which frame
         now has the CI marker (was flag field in GTM$CI frame, now type field in CI base frame).
    
    - sr_unix/rtnhdr.h
      1. Modify CLEANUP_COPIED_RECURSIVE_RTN() macro such that a null routine header address in a
         base frame doesn't screw things up and cause that NULL to be dereferenced.
    
    - sr_x86_64/ci_restart.s
      1. Remove this parameter massaging routine as it is no longer necessary.
    
    - sr_x86_64/g_msf.si
      1. Remove SFF_CI (no longer used) and SFF_ETRAP_ERR (still defined but not used in assembler).
    
    sr_port/op_fnview.c
      1. Add "ENVIRONMENT", which returns the possible environment tokens "MUMPS", "MUPIP", "CALLIN", or "TRIGGER".
         Multiple comma separated tokens may be returned.
    
    sr_port/viewtab.h
      1. Add "ENVIRONMENT" option.
      2. Remove non-UNIX environment #ifdef code.
    
    - sr_linux/release_name.h - Change to R1.10
    
    sr_port/alias.h:
      1. Move DBGALS related macros to mdef.h so we can put the define for DEBUG_ALIAS right there and not
         have to specify a compiler option.
    
    sr_port/alias_funcs.c:
      1. The LVMON* view was overloaded. There was the LVMON command that monitors given/specified local variables
         for when they get changed and there were the LVMON* view commands that help to debug aliases and are active
         only when DEBUG_ALIAS is defined. When this define was enabled, the LVMON command was giving a VIEWAMBIG
         error since the view commands that followed it in the table were LVMON*. This update changes the name of the
         VIEW commands, the related routines and fields to be lvamon* instead of lvmon*. There are no tests of LVMON*
         in the test system due to it requiring a compiler flag to be incorporated so no tests were affected.
    
    sr_port/gbldefs.c:
    sr_port/lv_val.h:
    sr_port/op_view.h
    sr_port/tp_unwind.c
    sr_port/viewtab.h
      1. Rename lvmon* to lvamon*.
    
    sr_port/mdef.h
      1. Comment out define for DEBUG_ALIAS
    
    -----------------------------------
    
    This project significantly changes how call-ins work. To properly describe it, we need to first describe the
    current (original) operation of callins. The following write-up between the ------ lines below describes the
    current (r100 and previous versions) call-in operation:
    
    -----------------------------------
    
    How call-ins work:
    
    gtm_init() - Initializes YDB runtime
      1.   - image_type set to GTM_IMAGE
      2.   - invocation mode set to MUMPS_CALLIN
      3.   - init_gtm()
      4.     - gtm_startup()
      5.       - allocate M stack
      6.       - create (dummy) base frame with return addr of gtm_ret_code()
      7.       - jobchild_init()
      8.         - make_cimode() with returned addr of created routine set into "base_addr". This dynamically created routine
                   is called GTM$CI.
      9.         - gtm_init_env(base_addr, transfer_addr)
     10.           - base_frame(base_addr transfer_addr) - creates "another" base frame with an initial execution addr
                     of gtm_ret_code() but with rvector set to addr of GTM$CI.
     11.           - new_stack_frame(base_addr, PTEXT_ADR(base_addr), transfer_addr) - creates an executable frame for the
                     GTM$CI routine.
     12.         - Runs SET_CI_ENV() which changes the executable frame created so it has the SFF_CI call-in flag set, and
                   changes the base frame's execution address to ci_ret_code_exit(). This base frame is only used when
                   there's a ZGOTO 0.
    
    Stack after gtm_init():
      0. BaseFrame (created by gtm_startup()) - probably just to have *something* on the stack in case later initialization fails.
         - This baseframe has a zeroed rvector with an execution address of gtm_ret_code(). There is no previous pointer nor is there
           an unwind frame pointer hiding behind the frame like most base frames have
         - This is the true bottom of the stack in terms of stack frames though an mv_stent is on the stack before this.
         - In practise, we won't see this frame as it isn't used past initialization.
      1. BaseFrame (created by base_frame() as called by gtm_init_env()).
         - rvector is for GTM$CI routine.
         - Execution address is ci_ret_code_exit() (again, only used for ZGOTO 0 aka process exit call from M).
         - The address field on the stack immediately prior to this base frame pointer is an actual unwind address to base frame 0.
      2. Executable frame (created by new_stack_frame() as called by gtm_init_env()).
         - rvector pointer is set to GTM$CI routine
         - Frame is flagged as a call-in
    
    gtm_ci[p]() - Perform call-in:
      1. Executable frame set up in step #11 of gtm_init() above has its execution field modified to point to the GTM$CI routine.
      2. Creates parameter block including address of call-in routine and address of either op_extcall() or op_extexfun() which
         will add a new stack frame and have the call-in parameters set up.
      3. Drives dm_start() to invoke the callin. The actual call path goes like this:
           a. dm_start() - establishes mdb_condition_handler() and sets up to drive M code. Invokes top stack frame (GTM$CI).
           b. ci_restart() - this is the first thing done in GTM$CI. This takes the parmblk_struct passed in the global var
              param_list and restructures the inputs for a call to op_extcall() or op_extexfun() and jumps to it.
           c. op_extcall() or op_extexfun() allocate a new stack frame and "return" to it such that the actual routine we are
              calling into is invoked.
           d. When the M code returns, it returns to GTM$CI and runs ci_ret_code() which drives a longjmp back to where
              mdb_condition_handler was setup in dm_start().
           e. dm_start() then returns to gtm_ci[p] for return value processing.
    
    The old parameter flow is:
      1. Parms are passed in as part of the call to gtm_ci[p]().
      2. gtm_ci[p]() reformats the parameters into a "parm_blk".
      3. When ci_restart is driven, parms are converted from parm_blk to register/stack parms.
      4. When op_extexfun() is driven, parms are shifted around for call to push_parm().
      5. When push_parm() is driven from op_extexfun(), register/stack parms are converted to parm_blk parms.
      6. When called-in routine is driven, its parms are picked up from parm_blk by op_bindparm() and bound to local
         vars in the called routine.
    
    Note, the entire purpose of the GTM$CI routine is to create the stack frame the call-in will run in and to set up its
    parameters in a fashion that op_bindparm() can read them. Since the op_bindparm of the day expected a varargs list (albeit
    in a different fashion from the varargs list passed into gtm_ci[p]()), at that time, the GTM$CI was written to provide the
    parameter conversion necessary at the time. The new support described next has made some changes that make this extra
    conversion step unnecessary.
    
    -----------------------------------
    
    Following (between the *********** lines) is a description of how call-ins work with this new support in place:
    
    ***********************************
    
    How the new call-ins functionality works:
    
    gtm_init() - Initializes YDB runtime
      1.   - image_type set to GTM_IMAGE
      2.   - invocation mode set to MUMPS_CALLIN
      3.   - init_gtm()
      4.     - gtm_startup()
      5.       - allocate M stack
      6.       - create (dummy) base frame with return addr of gtm_ret_code()
      7.       - jobchild_init()
      8.         - base_frame(base_addr transfer_addr) - creates "another" base frame with an initial execution addr
                   of gtm_ret_code() and a NULL rvector (there is no GTM$CI routine anymore).
      9.         - Runs SET_CI_ENV() which sets up this base frame as the call-in base frame: its type gets SFT_CI set
                   and its execution address becomes gtm_levl_ret_code(). ci_ret_code_exit() is gone; ZGOTO 0 is now
                   special cased in ZGOTO.
    
    Stack after gtm_init():
      0. BaseFrame (created by gtm_startup()) - probably just to have *something* on the stack in case later initialization fails.
         - This baseframe has a zeroed rvector with an execution address of gtm_ret_code(). There is no previous pointer nor is there
           an unwind frame pointer hiding behind the frame like most base frames have
         - This is the true bottom of the stack in terms of stack frames though an mv_stent is on the stack before this.
         - In practise, we won't see this frame as it isn't used past initialization.
      1. BaseFrame (created by base_frame() as called by jobchild_init()).
         - rvector is NULL as this frame is just for return.
         - Execution address is gtm_levl_ret_code() - an entry point in dm_start() used to return without unwinding the frame.
         - The address field on the stack immediately prior to this base frame pointer is an actual unwind address to base frame 0.
    
    gtm_ci[p]() - Perform call-in:
      1. Create new executable frame (new_stack_frame()) for the called M routine to run in.
      2. Creates parameter block which contains a newly constructed lv_val for each incoming parameter.
      3. Calls push_parm_ci() which takes the parameter block built by gtm_ci[p]() and moves the parms to the parameter area used by
         op_bindparm() which gets called at the top of the called target routine. See parameter flow below.
      4. Drives dm_start() to invoke the callin. Since the M stack frame for the called routine is on top, it gets immediately driven.
      5. When the M code returns, it returns to gtm_levl_ret_code which does a simple return to dm_start() without using longjmp().
      6. dm_start() then returns to gtm_ci[p] for return value processing.
    
    The parameter flow is:
      1. Parms are passed in as part of the call to gtm_ci[p]().
      2. gtm_ci[p]() reformats the parameters into a "parm_blk".
      3. gtm_ci[p]() drives push_parm_ci() to setup the parms for processing by op_bindparm() (moves them into a specific area used
         for buffering parameters).
      4. When called-in routine is driven (op_bindparm() gets driven at all entryrefs with parameter lists defined - even if empty)
         op_bindparm pulls the parms from the parm pool parameter space created by push_parm_ci() and binds them to local vars in
         the called routine then releases the parameter space for reuse.
    
    ***********************************
    
    -----------------------------------
    
    Operational changes with this project in the execution of a call-in (note steps are from original code):
    
    - No longer call make_cimode() to generate the internal GTM$CI routine (step gtm_init #8).
    - Call base_frame() instead of gtm_init_env() to create ONLY the call-in base frame instead of the base frame plus
      the GTM$CI frame (step gtm_init #9).
    - No longer setup GTM$CI as part of gtm_ci[p](). Instead, we allocate the frame using new_stack_frame() called
      directly (gtm_ci[p]() #1).
    - After the call-in execution frame is created by gtm_ci[p](), a variant of the push_parm routine is called that takes
      the parameters from the parm block that gtm_ci[p] created from the incoming arguments and converts them so they can
      be processed by op_bindparm() which is a routine called by generated code anytime a label with arguments occurs in
      the M source code (gtm_ci[p]() #2 and #3).
    - The execution flow used to look like:
        * C [possibly main] routine
        * call gtm_ci[p]()
          * drive dm_start() to kick off transition to M mode execution
            * drive GTM$CI "build" routine.
              * op_extexfun()
              * Drive target M routine
              * Target routine returns to GTM$CI which then drives ci_ret_code (longjmp back to dm_start).
            * return to gtm_ci[p] via longjmp() in ci_ret_code() which returns to dm_start() and returns from there.
        * Return to C caller
    - The new flow has similar steps but one less layer and returns without the use of a (longjmp) system call:
        * C [possibly main] routine
        * call gtm_ci[p]()
          * drive dm_start() to kick off transition to M mode execution
            * drive target M routine
            * M returns to gtm_levl_ret_code, which returns to dm_start() without unwinding the M frame.
            * return to gtm_ci[p]
        * return to C caller.
    
    -----------------------------------
    
    Notes on the removal of GTM$CI.
    
    - A call-in previously did not return "normally" by a return statement that unwound the stack. Instead it invoked
      a system call (longjmp()) to unwind the stack and return to dm_start() which then returned "normally" to the caller.
    - A much faster way to return would be to just return, but GTM$CI complicated that. The purpose of the GTM$CI "routine"
      was to set up arguments and call an assembler "glue code" routine to put the parms where they needed to be before
      calling the call-in routine.
    - Specifically, GTM$CI was a "constructed" routine (much like GTM$DMOD) in that it was not built from M source but
      was created by the make_mode() routine. Here's the operation:
        a. gtm_ci[p]() builds a parm_blk that contains the name of the glue routine to call (op_extexfun or op_extcall
           depending on whether it has args or a return value). The parm_blk also contains the parameters and other things
           needed to effect a call.
        b. gtm_ci[p]() drives dm_start() which enters M mode and drives the top routine on the M stack which happens to
           be GTM$CI.
        c. GTM$CI's first function is to drive ci_restart() which is an assembler routine sort of like a specialized
           version of callg() that takes the routine to call (e.g. op_extexfun), the routine/label to call and all the
           input and output parameters for the call-in routine and drives it to create the stack.
        d. When op_extexfun() completes, it drives the top routine on the M stack which is now the call-in frame.
        e. When the callin frame returns to GTM$CI, it drives ci_ret_code() which does a longjmp() to return.
    - So GTM$CI's primary purpose was to allocate the stack frame, setup the arguments and drive the glue code that
      made the actual call. This means the arguments were reformatted at least twice. Wanted to avoid that. Also, the
      system call flavor of return is unnecessary. Wanted to avoid that too.
    - Two changes allowed us to be rid of the extra GTM$CI overhead:
        1. With the routine gtm_levl_ret_code added to GT.M for triggers, it became possible for a stackframe to return
           without being unwound. We make use of that so we can do the simple unwind of call-in levels.
        2. By changing the push_parm() routine (in parm pool) so it can pull the arguments for the call-in routine
           directly out of the parm block created by gtm_ci[p] instead of having to push them on the stack and pull them
           back off in the glue routine. This avoided one of the argument restructurings that were happening.
    - Because GTM$CI went away, the SFF_CI flag was changed to a type flag instead (since that is what it actually is) and
      it was moved to the base frame itself which is how triggers also do it.
    
    -----------------------------------
    
    User visible changes in this project (for specification in release note):
    
    1. There is no GTM$CI level anymore (no longer needed). So this name no longer shows up in stack listings.
    2. Because there is no GTM$CI level anymore, the $STACK and $ZLEVEL SVNs show one less than they used to in a call-in
       environment. These SVNs now mirror the levels one would get by using mumps -run. The first level executing routine
       is $ZLEVEL=1 and $STACK=0 instead of the previous $ZLEVEL=2 and $STACK=1.
    3. ZSHOW "S" shows the entire stack. It used to stop at the first call-in frame and not report any further back. A
       stack marker shows where call-in base frames are located in the stack. Where a call-in frame is detected, the text
       "(Call-In Level Entry)" appears in the stack list.
    4. When replacing a routine that is active on the stack, we run the stack backwards to check whether the routine is in use.
       You can still replace an active routine but it is a special case. Unfortunately, this loop also used to stop at the first
       call-in frame and not look further back. If a routine being replaced was on the stack further back than that call-in
       base-frame, ugly stuff was likely to occur when we unwound back to that earlier frame. The loop now runs past call-in
       base frames.
    5. Similar issue when M-Profiling is looking up an entry ref on the stack. Need to figure out what happens when it doesn't
       find it.
    6. $VIEW("ENVIRONMENT") returns one or more comma separated environment tokens: "MUMPS", "MUPIP", "CALLIN", or "TRIGGER".
    7. [Z]HALT in a call-in no longer halts the process; it now returns to the caller, as it should have done previously.
    
    ```
    estess committed Oct 12, 2017
    Commit 2a03972
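
Illustration of the frame-chain note under sr_port/alias_funcs.c in the commit message. This is a minimal sketch, not the YottaDB source: the `frame` struct, the bit values given to `SFF_CI` and `SFT_CI`, and the two lookup helpers (`ci_base_old`, `ci_base_new`) are simplified assumptions. Only the names SFF_CI and SFT_CI and the idea of stepping back through the old frame pointer come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit values, for illustration only. */
#define SFF_CI 0x01u /* old scheme: flag carried by the GTM$CI frame */
#define SFT_CI 0x01u /* new scheme: type bit carried by the call-in base frame */

/* Hypothetical, heavily simplified stand-in for an M stack frame. */
struct frame {
    const char   *name;
    unsigned      type;              /* frame type bits */
    unsigned      flags;             /* frame flag bits */
    struct frame *old_frame_pointer; /* previous (older) frame, NULL at the bottom */
};

/* Old scheme: CI-base-frame <- GTM$CI <- M routine.
 * The flagged frame is GTM$CI, so one more step back is needed to reach the base frame. */
struct frame *ci_base_old(struct frame *fp)
{
    for (; fp != NULL; fp = fp->old_frame_pointer) {
        if (fp->flags & SFF_CI) {
            /* A GTM$CI frame always sat on top of its base frame. */
            assert(fp->old_frame_pointer != NULL);
            return fp->old_frame_pointer; /* the extra step this commit removes */
        }
    }
    return NULL;
}

/* New scheme: CI-base-frame <- M routine.
 * The frame carrying the SFT_CI type IS the base frame, so no extra step. */
struct frame *ci_base_new(struct frame *fp)
{
    for (; fp != NULL; fp = fp->old_frame_pointer) {
        if (fp->type & SFT_CI)
            return fp;
    }
    return NULL;
}
```

Starting from the top frame of either chain, both helpers end up at the call-in base frame. The new lookup gets there without the second dereference, which is why the `fp = fp->old_frame_pointer` statement could be dropped.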
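
The difference between the two return paths described under "Notes on the removal of GTM$CI" can be sketched in plain C. Everything here is a stand-in chosen for illustration (`dispatch_env`, `target_routine`, `old_glue`, `call_in_old`, `call_in_new` are hypothetical names, not YottaDB routines). The sketch only shows the general shape: returning through `longjmp()` to a saved context versus an ordinary return.

```c
#include <setjmp.h>

/* Stands in for the context established by the dispatcher (assumption). */
static jmp_buf dispatch_env;

/* Stand-in for the called M routine. */
static int target_routine(int arg)
{
    return arg * 2;
}

/* Old shape: an intermediate layer runs the target and then never returns
 * normally; it jumps straight back to the saved dispatcher context. */
static void old_glue(int arg, volatile int *result)
{
    *result = target_routine(arg);
    longjmp(dispatch_env, 1);
}

int call_in_old(int arg)
{
    /* volatile: the value is modified between setjmp() and longjmp(). */
    volatile int result = 0;

    if (setjmp(dispatch_env) == 0)
        old_glue(arg, &result); /* control comes back via longjmp, not return */
    return result;
}

/* New shape: no intermediate layer, and the target simply returns. */
int call_in_new(int arg)
{
    return target_routine(arg);
}
```

Both functions produce the same result for the same argument. The second form needs neither a saved context nor the extra layer, which mirrors the reasoning given for dropping GTM$CI and ci_ret_code().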