Writing custom check functions vignette #127

Open
wants to merge 14 commits into
base: main
12 changes: 10 additions & 2 deletions .github/workflows/pkgdown-netlify-preview.yaml
Original file line number Diff line number Diff line change
@@ -54,12 +54,20 @@ jobs:
branch: gh-pages
folder: docs

- id: deploy-dir
name: Determine dev status
run: |
if [[ $(grep -c -E 'sion. ([0-9]*\.){3}' ${{ github.workspace }}/DESCRIPTION) == 1 ]]; then
echo 'dir=./docs/dev' >> $GITHUB_OUTPUT
else
echo 'dir=./docs' >> $GITHUB_OUTPUT
fi
- name: Deploy PR preview to Netlify
@annakrystalli (Contributor Author) commented on Oct 4, 2024:

These changes fix broken dev preview URLs, which result from us now having both release and dev versions of the docs.

if: contains(env.isPush, 'false')
id: netlify-deploy
uses: nwtgck/actions-netlify@v2
uses: nwtgck/actions-netlify@v3
with:
publish-dir: './docs'
publish-dir: '${{ steps.deploy-dir.outputs.dir }}'
production-branch: main
github-token: ${{ secrets.GITHUB_TOKEN }}
deploy-message:
2 changes: 2 additions & 0 deletions DESCRIPTION
@@ -55,9 +55,11 @@ Imports:
yaml
Suggests:
covr,
DT,
gert,
kableExtra,
mockery,
pak,
readr,
rmarkdown,
testthat (>= 3.2.0),
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,5 +1,9 @@
# hubValidations (development version)

* Added:
- new vignette on how to create custom validation checks for hub validations (#121)
- new section on how to manage additional dependencies required by custom validation functions (#22).

# hubValidations 0.7.0

* Added function `create_custom_check()` for creating custom validation check function files from templates (#121).
7 changes: 5 additions & 2 deletions _pkgdown.yml
@@ -67,8 +67,11 @@ navbar:
- text: Validating submissions locally
href: articles/validate-submission.html
- text: -------
- text: Including custom validation functions
href: articles/custom-functions.html
- text: "Custom validation checks"
- text: Writing custom validation functions
href: articles/writing-custom-fns.html
- text: Deploying custom validation functions
href: articles/deploying-custom-functions.html
development:
mode: auto

3 changes: 2 additions & 1 deletion tests/testthat/_snaps/check_tbl_values_required.md
@@ -224,7 +224,8 @@
---

Code
check_for_errors(validate_submission(hub_path, file_path))
check_for_errors(validate_submission(hub_path, file_path,
skip_submit_window_check = TRUE))
Message

-- 2024-10-02-UMass-HMLR.parquet ----
5 changes: 4 additions & 1 deletion tests/testthat/test-check_tbl_values_required.R
@@ -277,7 +277,10 @@ test_that("(#123) check_tbl_values_required works with all optional output types
)
# Ensure that req_vals check is the only one that fails
expect_snapshot(
check_for_errors(validate_submission(hub_path, file_path)),
check_for_errors(validate_submission(
hub_path, file_path,
skip_submit_window_check = TRUE
Contributor Author commented:

Just fixing an oversight in a previously added test, which was now returning a submission window check error.

)),
error = TRUE
)
})
24 changes: 24 additions & 0 deletions vignettes/articles/children/_add-deps-pkg.Rmd
@@ -0,0 +1,24 @@
### Deploying custom functions as a package

To deploy custom functions managed as a package in `src/validations`, you can use the `pkg` configuration property in the `validations.yml` file to specify the package namespace.

For example, if you have created a simple package in `src/validations/` with a `cstm_check_tbl_example.R` script in `src/validations/R` that defines a `cstm_check_tbl_example()` function, you can use the following configuration in your `validations.yml` file to source the function from the installed `validations` package namespace:

```
default:
validate_model_data:
custom_check:
fn: "cstm_check_tbl_example"
pkg: "validations"
```

To ensure the package (and any dependencies it has) is installed and available during validation, you must add the package to the `setup-r-dependencies` step in the `hubverse-actions` `validate-submission.yaml` GitHub Action workflow of your hub, like so:

```yaml
- uses: r-lib/actions/setup-r-dependencies@v2
with:
packages: |
any::hubValidations
any::sessioninfo
local::./src/validations
```
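
As a rough illustration of what specifying `fn` and `pkg` implies, the sketch below resolves a function from a package namespace by name using base R. It is an assumption about the general idea, not a description of how `hubValidations` resolves the function internally. The base `stats` package and its `median()` function stand in for the hypothetical `validations` package and `cstm_check_tbl_example()`.

```r
# Illustrative only: look up an exported function from an installed package
# namespace, given the `pkg` and `fn` values of a configuration entry.
# `stats::median` stands in for `validations::cstm_check_tbl_example`.
check_config <- list(fn = "median", pkg = "stats")
check_fn <- getExportedValue(check_config$pkg, check_config$fn)

# The resolved object is an ordinary function that can then be called.
check_fn(c(1, 3, 5))
```

The lookup only succeeds if the package is installed, which is why the package must be added to the `setup-r-dependencies` step.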
53 changes: 53 additions & 0 deletions vignettes/articles/children/_add-deps-source.Rmd
@@ -0,0 +1,53 @@

## Available dependencies

**All exported `hubValidations` functions are available** for use in your custom check functions, as are functions from the hubverse packages **`hubUtils`**, **`hubAdmin`** and **`hubData`**.

```{r, echo=FALSE}
get_deps <- function(pkg) {
suppressMessages(pak::pkg_deps(pkg))
}
memoise_pkg_deps <- memoise::memoise(get_deps)
pkgs <- memoise_pkg_deps(".")[, c("package", "version")]
```

In addition, **functions in packages from the `hubValidations` dependency tree are also generally available**, both locally (once `hubValidations` is installed) and in the hubverse `validate-submission` GitHub Action.

Functions from these packages can be used in your custom checks without specifying them as additional dependencies.

```{r, echo=FALSE}
pkgs[order(pkgs$package), ] |>
DT::datatable()
```


## Additional dependencies

If any custom functions you are deploying depend on additional packages, you will need to ensure these packages are available during validation.

The simplest way to ensure they are available is to edit the `setup-r-dependencies` step in the `hubverse-actions` [`validate-submission.yaml`](https://github.com/hubverse-org/hubverse-actions/blob/main/validate-submission/validate-submission.yaml) GitHub Action workflow of your hub and add any additional dependency to the `packages` field list.

In the following illustrative example, we add a hypothetical `additionalPackage` package to the list of standard dependencies:

```yaml
- uses: r-lib/actions/setup-r-dependencies@v2
with:
packages: |
any::hubValidations
any::sessioninfo
any::additionalPackage
```

Note that this ensures the additional dependency is available during validation on GitHub, but it does not guarantee that it will be installed locally for hub administrators or submitting teams. Such missing dependencies could lead to execution errors in custom checks when running `validate_submission()` locally.

You could use documentation, such as your hub's README, to communicate the additional dependencies required for validation to submitting teams. Even better, you could add a check to the top of your function to catch missing dependencies and provide a helpful error message to the user.

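```{r, eval=FALSE}
# Stop early with a clear message if the additional dependency is not installed
if (!requireNamespace("additionalPackage", quietly = TRUE)) {
  stop(
    "Package 'additionalPackage' must be installed to run the full validation check. ",
    "Please install it and try again.",
    call. = FALSE
  )
}
```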
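
If a custom check relies on several additional packages, the same guard can be generalised. The following is a minimal sketch using base R only. The helper name `assert_pkgs_installed()` is hypothetical and is not part of `hubValidations`.

```r
# Hypothetical helper (not part of hubValidations): stop with a single,
# informative error listing every required package that is not installed.
assert_pkgs_installed <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(pkg) requireNamespace(pkg, quietly = TRUE),
    logical(1)
  )
  missing_pkgs <- pkgs[!installed]
  if (length(missing_pkgs) > 0L) {
    stop(
      "The following packages must be installed to run the full validation check: ",
      paste(missing_pkgs, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# 'stats' and 'utils' ship with R, so this call passes silently.
assert_pkgs_installed(c("stats", "utils"))
```

Calling such a helper at the top of a custom check function reports all missing packages at once, rather than failing on the first one.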

26 changes: 26 additions & 0 deletions vignettes/articles/children/_custom-fn-available-args.Rmd
@@ -0,0 +1,26 @@
Each of the `validate_*()` functions contains a number of standard objects in its call environment. These are passed automatically to custom function arguments of the same name, so they do not need to be included in a function's `args` configuration, although they can be overridden there during deployment.

The exact set of objects available depends on the calling `validate_*()` function:

- **`validate_model_file`:**
    - `file_path`: character string of path to file being validated relative to the `model-output` directory.
    - `hub_path`: character string of path to hub.
    - `round_id`: character string of `round_id`
    - `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details.
- **`validate_model_data`:**
    - `tbl`: a tibble of the model output data being validated.
    - `tbl_chr`: a tibble of the model output data being validated with all columns coerced to character type.
    - `file_path`: character string of path to file being validated relative to the `model-output` directory.
    - `hub_path`: character string of path to hub.
    - `round_id`: character string of `round_id`
    - `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details.
    - `round_id_col`: character string of name of `tbl` column containing `round_id` information.
    - `output_type_id_datatype`: character string. The value of the `output_type_id_datatype` argument. This value is useful in functions like `hubData::create_hub_schema()` or `hubValidations::expand_model_out_grid()` to set the data type of `output_type_id` column.
    - `derived_task_ids`: character vector or `NULL`. The value of the `derived_task_ids` argument, i.e. the names of task IDs whose values depend on other task IDs.
- **`validate_model_metadata`:**
    - `file_path`: character string of path to file being validated relative to the `model-output` directory.
    - `hub_path`: character string of path to hub.
    - `round_id`: character string of `round_id`
    - `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details.

The `args` configuration can be used to override objects from the caller environment as well as defaults during deployment.
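
To illustrate the name matching and overriding described above, the sketch below uses base R only to pass objects from a stand-in caller environment to a custom function by argument name, and then to override a default through `args`. This is not how `hubValidations` is implemented internally. The function `cstm_check_tbl_example()`, its `max_rows` argument, the `call_custom_check()` helper and the file path are all hypothetical.

```r
# Hypothetical custom check: `tbl` and `file_path` match objects that
# `validate_model_data()` makes available; `max_rows` is a custom argument.
cstm_check_tbl_example <- function(tbl, file_path, max_rows = 100L) {
  list(
    file_path = file_path,
    passed = nrow(tbl) <= max_rows
  )
}

# Illustrative helper: pass caller objects whose names match the function's
# arguments, letting values configured in `args` take precedence.
call_custom_check <- function(fn, caller_objects, args = list()) {
  matched <- caller_objects[intersect(names(caller_objects), names(formals(fn)))]
  do.call(fn, utils::modifyList(matched, args))
}

# Objects standing in for the `validate_model_data()` call environment.
caller_objects <- list(
  tbl = data.frame(horizon = 1:3, value = c(10, 12, 15)),
  file_path = "team1-modelA/2024-10-02-team1-modelA.csv", # hypothetical path
  hub_path = "path/to/hub"
)

# `tbl` and `file_path` are supplied automatically; `hub_path` is ignored
# because the function has no argument of that name.
default_result <- call_custom_check(cstm_check_tbl_example, caller_objects)

# An `args` entry overrides the default value of `max_rows`.
strict_result <- call_custom_check(
  cstm_check_tbl_example, caller_objects,
  args = list(max_rows = 2L)
)
```

In this sketch the three-row table passes with the default `max_rows` and fails once `max_rows` is overridden to 2. The practical point is that naming your function's arguments after the standard objects is enough to receive them.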
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: "Include custom validation functions"
title: "Deploying custom validation functions"
---

```{r, include = FALSE}
@@ -36,30 +36,13 @@ Within the default configuration, individual checks can be configured for each o
- **`source:`** Path to `.R` script containing function code to be sourced. If relative, should be relative to the hub's directory root. Must be supplied if function is not part of a package and only exists as a script.
- **`args`:** A yaml dictionary of key/value pairs or arguments to be passed to the custom function. Values can be yaml lists or even executable R code (optional).

Note that each of the `validate_*()` functions contain a standard objects in their call environment which are passed automatically to any custom check function and therefore do not need including in the `args` configuration.

- **`validate_model_file`:**
- `file_path`: character string of path to file being validated relative to the `model-output` directory.
- `hub_path`: character string of path to hub.
- `round_id`: character string of `round_id`
- `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details.
- **`validate_model_data`:**
- `tbl`: a tibble of the model output data being validated.
- `file_path`: character string of path to file being validated relative to the `model-output` directory.
- `hub_path`: character string of path to hub.
- `round_id`: character string of `round_id`
- `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details.
- `round_id_col`: character string of name of `tbl` column containing `round_id` information.
- **`validate_model_metadata`:**
- `file_path`: character string of path to file being validated relative to the `model-output` directory.
- `hub_path`: character string of path to hub.
- `round_id`: character string of `round_id`
- `file_meta`: named list containing `round_id`, `team_abbr`, `model_abbr` and `model_id` details.

The `args` configuration can be used to override objects from the caller environment as well as defaults.


Here's an example configuration for a single check (`opt_check_tbl_horizon_timediff()`) to be run as part of the `validate_model_data()` validation function which checks the content of the model data submission files.

```{r child="children/_custom-fn-available-args.Rmd", echo=FALSE, results="asis"}
```

#### Deploying optional `hubValidations` functions

Here's an example configuration for a single optional `hubValidations` check (`opt_check_tbl_horizon_timediff()`) to be run as part of the `validate_model_data()` validation function which checks the content of the model data submission files.

```{r, eval=FALSE, code=readLines(system.file('testhubs/flusight/hub-config/validations.yml', package = 'hubValidations'))}
```
@@ -79,6 +62,19 @@ default:
timediff: !expr lubridate::weeks(2)
```

#### Deploying custom functions

The above example involved an optional `hubValidations` function. To deploy a custom function that is not part of the `hubValidations` package, you should store the script containing the function in `src/validations/R/` and include the path to the script in the `source` property of the configuration file.

```
default:
validate_model_data:
custom_check:
fn: "cstm_check_tbl_example"
source: "src/validations/R/cstm_check_tbl_example.R"
```
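
To make the role of `source` more concrete, the sketch below writes a hypothetical `cstm_check_tbl_example.R` script to a temporary directory standing in for the hub root, then loads it with base R's `source()`. The function body and the file path passed to it are illustrative assumptions, and the way `hubValidations` itself sources and executes the script may differ.

```r
# Temporary directory standing in for the hub root (assumption for this sketch).
hub_root <- tempfile("hub")
script_dir <- file.path(hub_root, "src", "validations", "R")
dir.create(script_dir, recursive = TRUE)

# Hypothetical contents of the custom check script.
script_path <- file.path(script_dir, "cstm_check_tbl_example.R")
writeLines(
  c(
    "cstm_check_tbl_example <- function(tbl, file_path) {",
    "  # Illustrative rule: the submitted table must contain at least one row",
    "  list(file_path = file_path, passed = nrow(tbl) > 0L)",
    "}"
  ),
  script_path
)

# Source the script into its own environment, then look the function up by
# the name given in the `fn` property of the configuration.
check_env <- new.env()
source(script_path, local = check_env)
custom_check <- get("cstm_check_tbl_example", envir = check_env)

result <- custom_check(
  tbl = data.frame(value = 1:2),
  file_path = "hypothetical-model-output.csv"
)
```

Because the function is found by name after sourcing, the value of `fn` must match the name of the function defined in the script exactly.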


### Round specific configuration

Additional round specific configurations can be included in `validations.yml` that can add to or override default configurations.
@@ -159,6 +155,12 @@ arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations")
```


## Managing dependencies of custom sourced functions
# Managing dependencies of custom functions

TODO
If any custom functions you are deploying depend on additional packages, you will need to ensure these packages are available during validation.

```{r child="children/_add-deps-source.Rmd", echo=FALSE, results="asis"}
```

```{r child="children/_add-deps-pkg.Rmd", echo=FALSE, results="asis"}
```
6 changes: 3 additions & 3 deletions vignettes/articles/validate-pr.Rmd
Original file line number Diff line number Diff line change
@@ -17,7 +17,7 @@ library(hubValidations)

The `validate_pr()` function is designed to be used to validate team submissions through Pull Requests on GitHub.
Only model output and model metadata files are individually validated using `validate_submission()` or `validate_model_metadata()` respectively on each file according to file type
(_See the end of this article for details of the standard checks performed on each file. For more information on deploying optional or custom functions please check the article on [including custom functions](articles/custom-functions.html) (`vignette("custom-functions")`)_).
(_See the end of this article for details of the standard checks performed on each file. For more information on deploying optional or custom functions please check the article on [including custom functions](articles/deploying-custom-functions.html) (`vignette("deploying-custom-functions")`)_).
As part of checks, however, hub config files are also validated.
Any other files included in the PR are ignored but flagged in a message.

@@ -76,7 +76,7 @@ Supplying the names of derived task IDs to argument `derived_task_ids` will igno

#### Warning

Ignoring derived task IDs means that the validity of derived task ID value combinations will not be check. It is therefore **important to ensure that the values of derived task IDs are correctly derived from other task IDs through custom checks**. For example, the values of `target_end_date` can be checked by deploying optional check `opt_check_tbl_horizon_timediff()`. See the article on [including custom functions](articles/custom-functions.html) for more information.
Ignoring derived task IDs means that the validity of derived task ID value combinations will not be checked. It is therefore **important to ensure that the values of derived task IDs are correctly derived from other task IDs through custom checks**. For example, the values of `target_end_date` can be checked by deploying optional check `opt_check_tbl_horizon_timediff()`. See the article on [including custom functions](articles/deploying-custom-functions.html) for more information.

</div>

@@ -260,6 +260,6 @@ arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations")

#### Custom checks

The standard checks discussed here are the checks deployed by default by the `validate_pr` function. For more information on deploying optional or custom functions please check the article on [including custom functions](articles/custom-functions.html) (`vignette("custom-functions")`).
The standard checks discussed here are the checks deployed by default by the `validate_pr` function. For more information on deploying optional or custom functions please check the article on [deploying custom functions](articles/deploying-custom-functions.html) (`vignette("deploying-custom-functions")`).

</div>
2 changes: 1 addition & 1 deletion vignettes/articles/validate-submission.Rmd
Original file line number Diff line number Diff line change
@@ -163,6 +163,6 @@ arrow::read_csv_arrow(system.file("check_table.csv", package = "hubValidations")
#### Custom checks

The standard checks discussed here are the checks deployed by default by the `validate_submission` or `validate_model_metadata` functions.
For more information on deploying optional/custom functions or functions that require configuration please check the article on [including custom functions](articles/custom-functions.html) (`vignette("custom-functions")`).
For more information on deploying optional/custom functions or functions that require configuration please check the article on [including custom functions](articles/deploying-custom-functions.html) (`vignette("deploying-custom-functions")`).

</div>