Error handler support #93

Merged
merged 2 commits into from
Mar 28, 2022
marshallward
Member

This patch introduces some limited support for error handling in a POSIX
environment. A partial interface to the POSIX system calls is included
as part of the implementation of this feature.

An internal global logical flag, ignore_fatal, is added to the
MOM_error_handler module to control the behavior of FATAL
errors. If true, a FATAL error raises a signal (here SIGUSR1)
whose handler restores the program to an earlier state.

This change allows us to test FATAL errors at runtime without stopping
program execution, so that we can cycle through multiple errors.

This feature is toggled with two new functions, disable_fatal_errors()
and enable_fatal_errors().

This is implemented in part with POSIX system calls from the C standard
library (libc), accessed through the C interoperability layer. Details
that the POSIX standard leaves unspecified, such as the values of signals
or the sizes of datatypes, are defined in a header (posix.h) which can
optionally be configured at compile time.

The header defaults to typical Linux POSIX values, and may be
configured by autoconf for more exotic platforms in a later patch.

The previous state is saved and restored by the setjmp/longjmp (or
sigsetjmp/siglongjmp) functions. Although these functions behave as
expected on all tested platforms, code that uses them is not
standards-conforming. If issues arise in the future, error handling
may not be available on the affected platforms, and compile-time
configuration may be required to disable these features.

No existing code uses these features, but they are expected to be used
in a future patch.

Error handler API:

  • disable_fatal_errors
  • enable_fatal_errors

New POSIX API:

  • chmod
  • signal
  • kill
  • getpid
  • getppid
  • sleep
  • setjmp
  • sigsetjmp
  • longjmp
  • siglongjmp

New derived types:

  • jmp_buf
  • sigjmp_buf

New interface (for signal handlers):

  • handler_interface

New compile-time macros:

  • JMP_BUF_SIZE
  • SIGJMP_BUF_SIZE
  • SIGSETJMP_NAME
  • POSIX_SIGUSR1


codecov bot commented Mar 18, 2022

Codecov Report

Merging #93 (838298e) into dev/gfdl (2ac64b3) will decrease coverage by 0.02%.
The diff coverage is 1.36%.

@@             Coverage Diff              @@
##           dev/gfdl      #93      +/-   ##
============================================
- Coverage     29.02%   28.99%   -0.03%     
============================================
  Files           245      246       +1     
  Lines         72223    72296      +73     
============================================
+ Hits          20962    20963       +1     
- Misses        51261    51333      +72     
Impacted Files                        Coverage Δ
src/framework/posix.F90               0.00% <0.00%> (ø)
src/framework/MOM_error_handler.F90   41.57% <4.00%> (-14.68%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

marshallward force-pushed the errhandle branch 3 times, most recently from 82eee31 to 778a186, on March 23, 2022 15:57
@marshallward
Member Author

posix.F90 and posix.h were moved from config_src/os to src/framework, mostly to avoid having to pass a new directory to mkmf.

While src/framework is not necessarily the most appropriate place for them, it is the simplest place to park them for now. We can move them when we have a more robust method of autogenerating the file paths.

Member

adcroft commented Mar 28, 2022
