Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DevOps Dojo: CI - JSON Schema Validation of YAML Parameter File(s) #232

Closed
3 tasks done
kaspermeck-arm opened this issue Apr 13, 2023 · 39 comments
Closed
3 tasks done
Assignees

Comments

@kaspermeck-arm
Copy link

kaspermeck-arm commented Apr 13, 2023

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I've agreed with the maintainers that I can plan this task.

Description

JSON Schema validation to ensure that parameter file(s) are valid.

Purpose

Ensure that parameter files are valid to mitigate misconfiguration of ROS nodes.

Possible approaches

E.g., https://github.com/marketplace/actions/frontmatter-json-schema-validator

Definition of done

Pass/Fail for the JSON Schema validation using draft-07.
Pass/Fail for /<path_to_package>/schema/*.schema.json validation of parameter file(s) in /<path_to_package>/config/*.param.json.

@kaspermeck-arm
Copy link
Author

@xmfcx @mitsudome-r

I spoke to @ambroise-arm earlier today. Arm is willing to take this issue on and with your input and feedback, add the validation into GitHub Actions.

@xmfcx
Copy link
Contributor

xmfcx commented May 25, 2023

json-yaml-validate might be a better option since frontmatter-json-schema-validator is for YAML front matter. Maybe we can also use the same action to validate the JSON files too.

@ambroise-arm
Copy link
Contributor

ambroise-arm commented Jun 16, 2023

autowarefoundation/autoware.universe#4002 is a PR for validating the JSON Schema files against their meta-schema. This is the first half of the work. The other half is validating the yaml configuration files against the schema files, I still need to look at that.

EDIT: I am not using the suggested json-yaml-validate for that because it is designed to take in a single schema to validate multiple files against. Whereas our usecase is to have multiple pairs of yaml/json where we want to validate each yaml config file against its associated json schema.

@kaspermeck-arm
Copy link
Author

kaspermeck-arm commented Jun 27, 2023

autowarefoundation/autoware.universe#4002 is a PR for validating the JSON Schema files against their meta-schema. This is the first half of the work. The other half is validating the yaml configuration files against the schema files, I still need to look at that.

EDIT: I am not using the suggested json-yaml-validate for that because it is designed to take in a single schema to validate multiple files against. Whereas our usecase is to have multiple pairs of yaml/json where we want to validate each yaml config file against its associated json schema.

@ambroise-arm thanks for taking this on!

There might be cases where the are more than one *.param.yaml files in the config/ directory.

@kaspermeck-arm
Copy link
Author

kaspermeck-arm commented Jul 6, 2023

The GitHub action which validates the parameter file using the schema is now merged!
autowarefoundation/autoware.universe#4122

@kaspermeck-arm
Copy link
Author

@ktro2828
Copy link

ktro2828 commented Sep 28, 2023

@kaspermeck-arm @ambroise-arm I found that current CI only allows single schema and configuration file in a package. Therefore, I created the PR to fix this. Could you check that?

@ktro2828 ktro2828 reopened this Sep 28, 2023
@ambroise-arm
Copy link
Contributor

ambroise-arm commented Sep 28, 2023

I don't think a package is meant to have several schema files. But it can have several different configuration files that comply to its schema, which I think your PR would not allow to check anymore.

EDIT: Ah, for packages that have several nodes you need several schema files, I see.

EDIT2: The documentation currently shows a single schema file and several config files for it (https://autowarefoundation.github.io/autoware-documentation/main/contributing/coding-guidelines/ros-nodes/directory-structure/) so that would need to be changed accordingly. So it seems like more than a CI change, but a design change.

@ktro2828
Copy link

ktro2828 commented Sep 28, 2023

Ah, make sense, I misunderstood. So, each package can only have single schema file even if it has multiple configuration files, right?
Then I will close the PR.

@ambroise-arm
Copy link
Contributor

ambroise-arm commented Sep 28, 2023

So, each package can only have single schema file even if it has multiple configuration files, right?

Yes, at least currently.

But the situation in autowarefoundation/autoware.universe#4902 looks like a valid use case for having several schemas. I don't know if you can make it work another way.

@kaspermeck-arm
Copy link
Author

@ktro2828 @ambroise-arm

Could a solution be to have a nested schema and parameter file?

@kaspermeck-arm
Copy link
Author

@ktro2828 @ambroise-arm

Could a solution be to have a nested schema and parameter file?

I'm in discussion with @ambroise-arm and we'll have an proposal soon which should resolve this issue.

@ambroise-arm
Copy link
Contributor

We propose having:

  • one schema file per node, with a NODE_NAME.schema.json name (which is already the current design, except that the design expected one node per package)
  • one or more configuration files per node, with a NODE_NAME*.param.yaml name

So the delta to the current design would only be a a stricter naming convention for the yaml files. In the current main branch I think only autoware_auto_msgs_adapter would need a modification with the renaming of its yaml files. And I think it is also only a small deviation to what you currently have @ktro2828

@kaspermeck-arm
Copy link
Author

@KYabuuchi thanks for your comment and helping us create a more rigid way with the parameters aspect of Autoware.

I'm not quite following your thought process and I don't understand where the complication is. I've tried to describe the process in a few bullet points:

  • A ROS package can have none or many nodes
  • Each node has one schema associated with it
  • A node has one or more parameter files associated with it
  • The schema validates the parameter file(s)

Am I addressing the issue with these bullet points?

We have not yet started the DevOps Dojo: ROS Node Launch. I do believe that overwriting parameters can be useful but also has risks, as you also mention. As soon as a parameter is configured outside of the parameter file, it cannot be validated by the schema, or at least not with how we validate parameter files today.

@ambroise-arm

@KYabuuchi
Copy link

@kaspermeck-arm Thank you for describing.

I noticed that a PR has been developed to address case 3 of my concerns. This will eventually be resolved. 👍

However, the bullet points do not consider nodes that load multiple parameter files. Some nodes store their parameters in separate files and load multiple parameter files during launch.

For example, probabilistic_occupancy_grid_map loads two files.
2023-12-12_17-07

Additionally, parameter overrides within launch files might lead to misunderstandings and bugs. Nevertheless, removing parameters determined at startup is challenging, so it would be helpful to have a method for handling this on the validation side.

@kaspermeck-arm
Copy link
Author

kaspermeck-arm commented Dec 12, 2023

@KYabuuchi

Just to understand.

  1. Why is it desirable to provide multiple parameter files to launch a single node?
  2. When would you want to determine parameters at startup?

@KYabuuchi
Copy link

@kaspermeck-arm

  1. Why is it desirable to provide multiple parameter files to launch a single node?

In cases where developers want to prepare presets corresponding to several modes, they create multiple parameter files.
For example, the probabilistic_occupancy_grid_map has parameter files that vary based on observation methods and another set of parameter files for independent update rules.
If all parameters are stored in a single file, the number of files becomes large and difficult to maintain when there are many preset combinations.

2023-12-12_17-07

Moreover, when some parameters are shared with other nodes, multiple parameter files may also be passed to a node.
For example, both the lane_departure_checker and vehicle_cmd_gate nodes require parameters such as the vehicle's wheel_base and max_steer_angle. These parameters are stored in the vehicle_info.param.yaml file.
These nodes, in order to read both the vehicle-related parameters and their specific parameters, load multiple parameter files.

2023-12-13_09-11

  1. When would you want to determine parameters at startup?

The paths for maps and DNN models and the localization mode are provided by the user during startup.
These parameters need to be provided not only to a single node but also to several nodes. Therefore, they are not managed in parameter files but are passed to the nodes from the launch file using <param name= instead of <param from=config.yaml.

@kaspermeck-arm
Copy link
Author

@KYabuuchi

  1. Thanks, that makes sense. Removing the parameters from being required is one way to resolve this. @ambroise-arm - what do you think?
  2. Would it be possible to set this to an environment variable?
    Schema:
        "test_env_var": {
          "type": "string",
          "description": "Environment variable",
          "default": "${ENV_VAR}",
          "enum": ["${ENV_VAR}"]
        }

and parameter file

    test_env_var: ${ENV_VAR}

@ambroise-arm
Copy link
Contributor

ambroise-arm commented Dec 14, 2023

Removing the parameters from being required is one way to resolve this. @ambroise-arm - what do you think?

Yes, It is one way to resolve this.

But I think the launch_ros rules in https://github.com/ros2/launch_ros/blob/humble/launch_ros/launch_ros/actions/node.py#L175-L182 are not that complex. And they would allow to keep all schema fields as required, even if some of the parameter from the config file are meant to always be overridden. In order to avoid bugs, I think the only thing we need to ensure is that the default parameter file of a package uses a "/**" namespace, which should already be enforced in the schema files (for example: https://github.com/autowarefoundation/autoware.universe/blob/main/perception/lidar_apollo_segmentation_tvm_nodes/schema/lidar_apollo_segmentation_tvm_nodes.schema.json#L87). It may indeed hurt clarity, as changing some of the values of the package's parameter file will have an effect, but not for others. But I think it is already not clear where parameters are set and how they are propagated in launch files, so it seems like a minor issue to me.

Also about validating multiple parameter files, I think the goal was never to validate the parameter files held by the launch packages. So I don't think multiple parameter file is an issue beyond validating those default parameters files. Please someone correct me if my assumption is incorrect.

And keeping all fields required seems especially important if we intend to rework the launch files in the future, as it keeps things clean and consistent.

(but also I am not in charge of maintaining the launch and parameter files working well together, and I understand the reluctance to change what currently works)

Would it be possible to set this to an environment variable?

It might, but I don't think we want to introduce another way to set parameters.

@KYabuuchi
Copy link

@ambroise-arm @kaspermeck-arm

Would it be possible to set this to an environment variable?
test_env_var: ${ENV_VAR}

Sorry, is this method actually valid for ros2 node? I have never seen this approach, and when I tried it, the variables were passed as strings without being interpreted.
If we want to reference environment variables during the startup, it is more common to use the $(env ) command within the launch file. And I do not think this would work in a parameter file.

Even if it is a valid method, I also do not think we want to introduce a new way to set parameters.

@kaspermeck-arm
Copy link
Author

@ambroise-arm @KYabuuchi

I suggest the following:

  1. Let's keep the the required fields in the schema and it will validate a default parameter file. For testing purposes, having a single complete parameter file is important. Then allow for parameters to be overwritten in the launch file. Does this sound OK?

  2. Ok, let's pass on my ${ENV_VAR} idea. For variables which use environmental variables, let's rely on the $(env ) command. E.g.,

Schema:

        "data_path": {
          "type": "string",
          "description": "Absolute path to data and artifacts packages directory.",
          "default": "abs_path_to_data_and_artifacts"
        }

and parameter file:

/**:
  ros__parameters:
    data_path: abs_path_to_data_and_artifacts

Which will be overwritten using:

<arg name="data_path" default="$(env HOME)/autoware_data">

Does this sound OK?

@ambroise-arm
Copy link
Contributor

ambroise-arm commented Dec 15, 2023

is this method actually valid for ros2 node?

@KYabuuchi Yes, you just need to allow the launch file to interpret it correctly. See yolo_v2_tiny_example.param.yaml and allow_substs="true" in yolo_v2_tiny_example.launch.xml

Ok, let's pass on my ${ENV_VAR} idea. For variables which use environmental variables, let's rely on the $(env ) command.

@kaspermeck-arm I think the point of @KYabuuchi was that those $... were meant to be used in the launch file, as opposed to the parameter file. Not ${...} vs $(env ...).

For the data path of the artifacts, I think the yolo_v2_tiny links above are also a very good example of what can be done then.

But then I think there is a distinction between using $HOME or $(find-pkg-share ...) in a parameter file (which don't require the user to do anything and I think are quite useful to have in parameter files), and creating custom new environment variables (which is what I thought we were talking about above and I am against doing)

@kaspermeck-arm
Copy link
Author

@KYabuuchi - I aligned with @ambroise-arm and this is our proposal.

All parameters which are expected to be used in a node must exist in the schema and parameter file according to the contribution guide. This is necessary for rendering the table view in the documentation and for testing purposes. There is nothing preventing multiple parameter files in the launch file.

The yolov2 tiny example resolves the environmental parameter issue. In the schema

the env command is used:

        "data_path": {
          "type": "string",
          "default": "$(env HOME)/autoware_data",
          "description": "Packages data and artifacts directory path."
        }

And the corresponding parameter file

/**:
  ros__parameters:
    ...
    data_path: $(env HOME)/autoware_data

Finally, in the launch file

the following line needs to be added:

<param from="$(var tvm_utility_example_node_param_path)" allow_substs="true"/>

Note allow_substs="true", which enables the commands from the parameter file to be resolved.

Does this solve the problems which you're facing?

@KYabuuchi
Copy link

KYabuuchi commented Dec 18, 2023

@kaspermeck-arm Thank you for explaining the method using environment variables. But, it is difficult for me to judge whether that method is appropriate or not.
If you are going to proceed with using new environment variables, it would be better to summarize the modifications and ask for feedback not only in this thread but also in the discussion space. 🙏

@kaspermeck-arm
Copy link
Author

@KYabuuchi - that's a good point that this could be a larger discussion. I'll bring this topic up in the next OADK meeting!

In my opinion there is no technical difference to setting the environmental parameters in the parameter file vs the launch file. It is however more transparent and follows the current contribution implementation details to do this in the parameter file.

@doganulus
Copy link
Member

When and who performs the variable substitution here? As late as node initialization or at an earlier step?

@kaspermeck-arm
Copy link
Author

kaspermeck-arm commented Dec 21, 2023

@doganulus - my understanding is that it's resolved when running the ros launch command, so during node initialization.

Edit: https://docs.ros.org/en/iron/Releases/Release-Galactic-Geochelone.html#use-launch-substitutions-in-parameter-files

@isamu-takagi
Copy link
Contributor

Launch configuration can also be used. I think this is easier to use because it has a narrower scope than environment variables.

test.param.yaml

/**:
  ros__parameters:
    foo: $(var foo)
    bar: data-bar
    baz: data-baz

test.launch.xml

<group>
  <let name="foo" value="test-foo"/>
  <node pkg="test_pkg" exec="node" name="node">
    <param from="$(find-pkg-share test_pkg)/config/test.param.yaml" allow_substs="true"/>
  </node>
</group>

output

foo: test-foo
bar: data-bar
baz: data-baz

@doganulus
Copy link
Member

I am leaning towards that keeping *.param.yml node parameter files free of any variable substitution as this means the implicit use of ros launch, which performs the substitution. If we want to use Eclipse Bluechi or Ankaios at some point, these tools probably would not support variable substitution as ros launch.

So the following solution would be more robust if my last assumption is true.

First, we have a pure yaml parameter file.

# test.param.yml
/**:
  ros__parameters:
    foo: /opt/autoware/<component>/<node>/config/test.param.yaml # Hardcoded conventional path
    bar: data-bar
    baz: data-baz

Then, I assume parameter overwrite will work as follows (needs to be verified):

# test.launch.xml
<group>
  <node pkg="test_pkg" exec="node" name="node">
    <param from="$(find-pkg-share test_pkg)/config/test.param.yaml" allow_substs="true"/>
    <param name="foo" value="test-foo"/>
  </node>
</group>

@kaspermeck-arm
Copy link
Author

@doganulus - I'm not following your reasoning regarding Bluechi and Ankaios. These are orchestrators, like k3s. Are you suggesting that these will remove the need to use the ros launch command entirely?

@doganulus
Copy link
Member

@kaspermeck-arm Yes, in the long run though. Those projects are at early stages but I think they can start ROS/Zenoh executables or containers directly.

@kaspermeck-arm
Copy link
Author

@doganulus - moving away from using ros launch to launch ROS nodes is something which I've never considered and I don't know how that would work in practice. I think this topic in and of itself is a way bigger topic of discussion.

The experience I have with BlueChi is that it is Kubernetes yaml compatible, translates these files to systemd service files and then runs containers as systemd services. When the container is launched, a command/file inside that environment is run (e.g., entry point file) and for ROS that would be the ros launch command. Can you share more information on how these orchestrators will address launching ROS nodes without using the ros launch command?

@doganulus
Copy link
Member

I think this topic in and of itself is a way bigger topic of discussion.

Definitely needs a bigger discussion. Below is a starting example.

For the example, let's work in a test container:

docker run --rm -it ros:humble

And we install some example ROS nodes inside the container, create an example param.yml file, and start the ROS node parameter_blackboard directly as follows:

apt update && apt install -y ros-humble-demo-nodes-cpp
printf '/**:\n  ros__parameters:\n    a_string: $(env HOME)/autoware_data\n' >> param.yml
/opt/ros/humble/lib/demo_nodes_cpp/parameter_blackboard --ros-args --params-file param.yml & # Launch the node directly
ros2 param get parameter_blackboard a_string                                                 # Check the parameter set                         

Normally, the command ros launch is a convenient way to start (many) nodes via launch files but we also can start nodes directly as in the example. And here we are able to set parameters using --params-file argument. First observation, starting nodes directly does not substitute variables as you can see in the output. I think it is a feature of ros launch. Therefore, I suggest not using variable or shell commands in the parameter files, which can be used outside ros launch.

Then, if we can start nodes directly in this way, systemd can start them via unit files. And this is the connection with other projects. I think BlueChi can orchestrate all ROS node unit files all at some point.

@kaspermeck-arm
Copy link
Author

@doganulus

I understand your point of not being dependent on ros launch and I think this is fair. However, not using variable/shell commands in the parameter files complicate things. This is a very useful feature of ROS as hard-coded environmental variables would be very tedious to change and not flexible, i.e., going against our goal of making Autoware software-defined.

How about we find a way to evaluate these variable/shell commands and then write the output to a new interpreted parameter file which is free of variable/shell commands and with the correct, e.g., paths to local directories. What do you think?

@doganulus
Copy link
Member

I am not absolutely against parameter file templates and environment variables in parameter files. They can be very handy when they are used right.

The current implementation does not feel right, as parameter files reside at a lower level than launch descriptions. Hence, parameter files should not be aware of specific ros launch features. If we could bring the substitution functionality to the lower level, then my objection would be resolved.

But this is not an easy task to handle properly. So I do not suggest immediate action as we gonna use ros launch for some time anyway. Still, avoiding variables as much as possible would be better. Setting some path conventions in the OpenAD Kit may help reduce the need for configuration.

@kaspermeck-arm
Copy link
Author

@doganulus

In the end, Autoware is a ROS project and I am in favor of using the features which ROS provides to make it more convenient to configure nodes. The situation with adding a schema was different as there was no solution to this built into ROS already. I suggest that if someone has different needs, i.e., no dynamic variables in the parameter file, then they can make these changes themselves with, e.g., a parameter resolving script. If you want to bring this topic further, I suggest you create a new discussion with all the background.

@doganulus
Copy link
Member

The clash between ROS practices and modern cloud-native practices is unavoidable. I side with the latter when they overlap.

As a compromise: We can at least restrict it to $VAR or ${VAR} syntax for environment variables. That can be substituted by external tools like envsubst. We don't have this opportunity for $(find-pkg-share) and friends.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

8 participants