Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: add --action-env cli option #2136

Closed
wants to merge 2 commits into from

Conversation

rst0git
Copy link
Member

@rst0git rst0git commented Mar 22, 2023

In some use cases, action scripts might require user-specified environment variables to be passed. For example, when creating a checkpoint of a Pod, we would like to be able to pass the container name and pod name to the action script. To enable this functionality, this pull request adds a new CLI option --action-env that takes as input a string in the following format <NAME>=<VALUE> . This option can be used multiple times.

Example:

criu dump --action-script /foo/bar/script.sh  --action-env "FOO1=BAR1"  --action-env "FOO2=BAR2" ...

@adrianreber
Copy link
Member

CI is broken because of it, but this is a case where you could extend the unit tests to make sure this use case is also tested.

Every time strings are parsed a unit test is probably a useful addition. Especially error paths can be tested during unit tests.

There is also some command line option and configuration file testing which can be enhanced to ensure that the new options are tested.

Overall I think this sounds useful.

For example, when creating a checkpoint of a Pod, we would like to be able to pass the container name and pod name to the action script.

Not sure about this argument. How is the container and pod name relevant on the CRIU level which only deals with PIDs? Maybe you can elaborate a bit more how this can be useful.

@codecov-commenter
Copy link

codecov-commenter commented Mar 23, 2023

Codecov Report

Patch coverage: 75.47% and no project coverage change.

Comparison is base (b845846) 70.36% compared to head (48f828c) 70.36%.

❗ Current head 48f828c differs from pull request most recent head e8dc491. Consider uploading reports for the commit e8dc491 to get more accurate results

Additional details and impacted files
@@            Coverage Diff            @@
##           criu-dev    #2136   +/-   ##
=========================================
  Coverage     70.36%   70.36%           
=========================================
  Files           133      133           
  Lines         33894    33946   +52     
=========================================
+ Hits          23848    23887   +39     
- Misses        10046    10059   +13     
Impacted Files Coverage Δ
criu/crtools.c 67.03% <ø> (ø)
criu/unittest/mock.c 8.16% <0.00%> (-0.35%) ⬇️
criu/action-scripts.c 77.17% <63.15%> (-3.91%) ⬇️
criu/config.c 82.11% <82.60%> (+0.02%) ⬆️
criu/unittest/unit.c 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@rst0git rst0git force-pushed the action-env branch 3 times, most recently from f82a553 to 4f0aa7f Compare March 23, 2023 12:29
@rst0git
Copy link
Member Author

rst0git commented Mar 23, 2023

Thanks Adrian! I've updated the unit tests with a new test for the --action-env option

How is the container and pod name relevant on the CRIU level which only deals with PIDs?

When a distributed application that consist of multiple containers is running across different nodes, each container could have dependency on one or more others containers. In order to synchronise and coordinate the checkpointing/restoring of multiple containers we can use action-scripts. However, the information currently provided to an action script is insufficient to identify the container that is being checkpointed/restored.

The --action-env option allows to pass arbitrary information to the action script. In particular, the container name and pod name can be used to coordinate at CRIU-level the checkpointing / restoring across multiple nodes.

@rst0git rst0git marked this pull request as ready for review March 23, 2023 12:41
@adrianreber
Copy link
Member

Your changes are partially not covered by tests. Adding something to https://github.com/checkpoint-restore/criu/blob/criu-dev/test/others/config-file/run.sh would help.

@rst0git rst0git force-pushed the action-env branch 4 times, most recently from 19fe7ff to 14d6e32 Compare March 23, 2023 20:46
@rst0git
Copy link
Member Author

rst0git commented Mar 23, 2023

I've updated the pull request with a test that covers the action-script functionality.

In some use cases, action scripts might require user-specified
environment variables to be passed. For example, when creating a
checkpoint of a Pod, we would like to be able to pass the container
name and pod name to the action script.

To enable this functionality, this patch adds a new CLI option
`--action-env` that takes as input a string in the following format
"<NAME>=<VALUE>". This option can be used multiple times.

Example:

	criu dump --action-script /foo/bar/script.sh \
		--action-env "FOO1=BAR1" \
		--action-env "FOO2=BAR2" ...

Signed-off-by: Radostin Stoyanov <[email protected]>
Copy link
Member

@adrianreber adrianreber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Very good test coverage and the code looks correct regarding memory management.

@github-actions
Copy link

A friendly reminder that this PR had no activity for 30 days.

@avagin
Copy link
Member

avagin commented Apr 27, 2023

Can we execute criu with all these variables and then action scripts will inherit them?

@rst0git
Copy link
Member Author

rst0git commented Apr 27, 2023

Can we execute criu with all these variables and then action scripts will inherit them?

Yes, these variables don't affect how criu works. We are currently working on a tool called criu-coordinator. It is invoked as an action script that coordinates the dump/restore process of multiple criu instances to create a consistent global checkpoint and to enable rollback-recovery of distributed applications. This is similar to how dmtcp_coordinator would synchronise the checkpoint/restore sequence but instead of "barriers" we use the "hooks" available in CRIU.

For example, a configuration file that would enable this functionality looks something like the following:

$ cat looper-1.conf 
action-script /usr/local/bin/criu-coordinator
action-env CR_ADDR=127.0.0.1
action-env CR_PORT=8080
action-env CR_ID=looper-looper-1
action-env CR_DEPS=looper-looper-2:looper-looper-3

where CR_ADDR / CR_PORT are address and port the criu-coordinator client would use to connect to a criu-coordinator server. CR_ID is an identifier of the current CRIU instance (e.g., in the case of Kubernetes this could be <pod>-<container>). CR_DEPS contains a colon-separated list of dependencies (i.e., identifiers of other CRIU instances that the current CRIU instance depends on).

@avagin
Copy link
Member

avagin commented May 3, 2023

@rst0git Do I understand it right that you don't need this change any more?

@rst0git
Copy link
Member Author

rst0git commented May 3, 2023

@rst0git Do I understand it right that you don't need this change any more?

No, we do need it.

@avagin
Copy link
Member

avagin commented May 7, 2023

Can we execute criu with all these variables and then action scripts will inherit them?

Yes, ...

Could you answer this question? How do you run criu? Why can't you set this variables for the criu process?

@rst0git
Copy link
Member Author

rst0git commented May 7, 2023

@avagin I'm sorry, I misunderstood your question. Our goal is enable coordination between multiple criu instances when checkpointing Kubernetes pods. In this case, we use a kubelet API to initiate the container checkpoint via CRI-O and runc. We use the action-env option to setting up the coordination parameters via a configuration file.

@avagin
Copy link
Member

avagin commented May 8, 2023

Is it a per-container configuration file? Is an action script a shell script? If the answer is yest, can it sources a file with env variables?

@rst0git
Copy link
Member Author

rst0git commented May 8, 2023

Is it a per-container configuration file?

Yes

Is an action script a shell script? If the answer is yest, can it sources a file with env variables?

Good question. The criu-coordinator action script is a client-server implemented in Rust, but it could load the configuration parameters from a file.

@rst0git
Copy link
Member Author

rst0git commented Jun 6, 2023

I'm closing this pull request for now as we have modified criu-coordinator to load the connection parameters from a configuration file.

I've moved the commit adding a test for the --action-script option in #2191.

@rst0git rst0git closed this Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants