Produce proper exit status in case image has been built with timeout (#35282)

When we build the image with a timeout, we fork the process, set an
alarm, and create a new process group so that all the parallel build
processes can be killed easily by sending a termination signal to the
process group on timeout. To wait for the forked process to complete
we used waitpid, but we did not handle the status code properly, so if
the build failed, we still returned with exit code 0.
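
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the Breeze implementation: `run_with_timeout` and its parameters are hypothetical names, and a sleeping child stands in for the real build. It assumes a Unix-like system and that it runs in the main thread (a requirement of `signal.signal`).

```python
import os
import signal
import time


def run_with_timeout(timeout_seconds: int, child_sleep_seconds: float) -> int:
    """Fork a child in its own process group and return its raw wait status."""
    pid = os.fork()
    if pid == 0:
        # Child: become a process group leader, then simulate a long build.
        os.setpgid(0, 0)
        time.sleep(child_sleep_seconds)
        os._exit(0)

    def on_timeout(signum, frame):
        # Parent: on SIGALRM, terminate the child's whole process group.
        try:
            os.killpg(pid, signal.SIGTERM)
        except OSError:
            pass  # the group may already be gone

    previous_handler = signal.signal(signal.SIGALRM, on_timeout)
    signal.alarm(timeout_seconds)
    try:
        # waitpid returns (pid, status); discarding the status hides failures.
        _, status = os.waitpid(pid, 0)
    finally:
        signal.alarm(0)  # cancel a pending alarm
        if previous_handler is None:
            previous_handler = signal.SIG_DFL
        signal.signal(signal.SIGALRM, previous_handler)
    return status


if __name__ == "__main__":
    status = run_with_timeout(timeout_seconds=1, child_sleep_seconds=30)
    print("killed by signal:", os.WIFSIGNALED(status))
```

The point of the sketch is the second element of the tuple returned by `os.waitpid`: if the caller drops it and simply returns, a child that failed or was killed on timeout is indistinguishable from one that succeeded.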

This had the side effect that "Build CI image" did not fail; instead,
the next step (generating source providers) failed, and it was not
obvious that the failing CI image build was the root cause.

This PR properly retrieves the wait status and converts it to an
exit code. Since we still support Python 3.8, this is done with a
somewhat ugly set of if statements: the `os.waitstatus_to_exitcode`
function is only available in Python 3.9+, so we cannot use it yet.
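
For illustration, one way to bridge both Python versions is to prefer the built-in when it exists and fall back to the manual checks otherwise. This is only a sketch under stated assumptions: `wait_status_to_exit_code` and `wait_for_child` are hypothetical helpers, not part of this change. The built-in raises `ValueError` for a stopped process, so the sketch routes that case to the manual branch.

```python
import os
import signal


def wait_status_to_exit_code(status: int) -> int:
    """Convert a raw wait status to an exit code (negative for signals)."""
    builtin = getattr(os, "waitstatus_to_exitcode", None)  # Python 3.9+
    if builtin is not None and not os.WIFSTOPPED(status):
        return builtin(status)
    # Manual fallback for Python 3.8 (and for stopped processes).
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSTOPPED(status):
        return -os.WSTOPSIG(status)
    return 1


def wait_for_child(action) -> int:
    """Fork, run `action` in the child, and return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        action()
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    return status


if __name__ == "__main__":
    failed = wait_for_child(lambda: os._exit(3))
    killed = wait_for_child(lambda: os.kill(os.getpid(), signal.SIGKILL))
    print(wait_status_to_exit_code(failed))
    print(wait_status_to_exit_code(killed))
```

A normal exit yields the child's own exit code, while termination by a signal yields the negated signal number, which is the same convention `subprocess` uses for `returncode`.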
potiuk authored Oct 31, 2023
1 parent 6276c40 commit 92c2c3f
Showing 1 changed file with 21 additions and 2 deletions.
23 changes: 21 additions & 2 deletions dev/breeze/src/airflow_breeze/commands/ci_image_commands.py
@@ -206,6 +206,20 @@ def kill_process_group(build_process_group_id: int):
         pass


+def get_exitcode(status: int) -> int:
+    # In Python 3.9+ we will be able to use
+    # os.waitstatus_to_exitcode(status) - see https://github.com/python/cpython/issues/84275
+    # but until then we need to do this ugly conversion
+    if os.WIFSIGNALED(status):
+        return -os.WTERMSIG(status)
+    elif os.WIFEXITED(status):
+        return os.WEXITSTATUS(status)
+    elif os.WIFSTOPPED(status):
+        return -os.WSTOPSIG(status)
+    else:
+        return 1
+
+
 @ci_image.command(name="build")
 @option_python
 @option_run_in_parallel
@@ -277,8 +291,13 @@ def run_build(ci_image_params: BuildCiParams) -> None:
             atexit.register(kill_process_group, pid)
             signal.signal(signal.SIGALRM, handler)
             signal.alarm(build_timeout_minutes * 60)
-            os.waitpid(pid, 0)
-            return
+            child_pid, status = os.waitpid(pid, 0)
+            exit_code = get_exitcode(status)
+            if exit_code:
+                get_console().print(f"[error]Exiting with exit code {exit_code}")
+            else:
+                get_console().print(f"[success]Exiting with exit code {exit_code}")
+            sys.exit(exit_code)
         else:
             # turn us into a process group leader
             os.setpgid(0, 0)