add details about sample target #8
base: main
✅ Hub correctly configured! 2024-10-16 21:35:54 UTC
Looks very good, thanks @sbidari! A few tweaks/suggestions.
@@ -169,7 +169,7 @@ Values in the `output_type` column are either
 - "quantile" or
 - "samples".

-This value indicates whether that row corresponds to a quantile forecast or sample trajectories for weekly incident hospital admissions.
+This value indicates whether that row corresponds to a quantile forecast or sample trajectories for weekly incident hospital admissions. Samples can be submitted either for individual modeling tasks, where each `horizon` and `location` is treated independently, or as a part of a compound modeling task that encodes dependencies across forecast `horizon` and `location`.
-This value indicates whether that row corresponds to a quantile forecast or sample trajectories for weekly incident hospital admissions. Samples can be submitted either for individual modeling tasks, where each `horizon` and `location` is treated independently, or as a part of a compound modeling task that encodes dependencies across forecast `horizon` and `location`.
+This value indicates whether that row corresponds to a quantile forecast or sample trajectories for weekly incident hospital admissions. Samples can be submitted either for individual modeling tasks, where each `horizon` and `location` is treated independently, or as a part of a compound modeling task that encodes predictive statistical dependency across forecast `horizon`s and/or `location`s.
We want to allow both single-location, multiple-horizon joint samples and single-horizon, multiple-location joint samples.
 "min_samples_per_task": 100,
-"max_samples_per_task": 100
+"max_samples_per_task": 100,
I think we could consider allowing more than this. Main concern would be disk space / file size.
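To put the file-size concern in rough numbers, the sketch below counts the sample rows one submission would contain for a given per-task sample limit. The function name `sample_rows` and the location and horizon counts are hypothetical, chosen only for illustration; they are not taken from the hub configuration.

```python
def sample_rows(n_locations: int, n_horizons: int, samples_per_task: int) -> int:
    """Rows of sample output in one submission.

    Assumes one row per (location, horizon, sample index) combination.
    """
    return n_locations * n_horizons * samples_per_task


# Hypothetical hub shape: 50 locations and 4 horizons (assumed values).
for limit in (100, 500, 1000):
    print(limit, sample_rows(50, 4, limit))
```

The row count grows linearly with the limit, so raising `max_samples_per_task` tenfold would make the sample portion of a file roughly ten times larger.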
@@ -217,8 +217,18 @@ Teams must provide the following 23 quantiles:

 #### sample output

-When the predictions are samples, values in the `output_type_id` column are indexes for the samples.
-*More details to be added here*
+When the predictions are samples, values in the `output_type_id` column are indexes for the samples. The `output_type_id` is used to indicate the dependence across multiple task id variables when samples come from a joint predictive distribution. For example, samples from a joint predictive distribution across `horizon`, will share `output_type_id` for predictions for different horizons within a same `location` as below:
-When the predictions are samples, values in the `output_type_id` column are indexes for the samples. The `output_type_id` is used to indicate the dependence across multiple task id variables when samples come from a joint predictive distribution. For example, samples from a joint predictive distribution across `horizon`, will share `output_type_id` for predictions for different horizons within a same `location` as below:
+When the predictions are samples, values in the `output_type_id` column are indexes for the samples. The `output_type_id` is used to indicate the dependence across multiple task id variables when samples come from a joint predictive distribution. For example, samples from a joint predictive distribution across `horizon`s for a given `location`, will share `output_type_id` for predictions for different horizons within a same `location` as below:
+| 2024-10-15 | 0 | MA | sample | s1 | - |
+| 2024-10-15 | 1 | MA | sample | s1 | - |
+
+Here, `output_type_id = s0` specifies that the predictions for horizons -1, 0, and 1 are part of the same joint distribution. More details on sample output can be found in the [hubverse documentation of sample output type](https://hubverse.io/en/latest/user-guide/sample-output-type.html).
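One way to read such a table is to group rows by `output_type_id` and list the task id combinations each index covers. The following is a minimal sketch of that reading, not hub tooling: the helper `joint_groups` is hypothetical, the rows are made up to mirror the table layout, and the column order (reference date, horizon, location, output type, output type id) is an assumption inferred from the rows shown.

```python
from collections import defaultdict

# Hypothetical rows mirroring the table layout; the value column is omitted
# because it plays no role in the grouping.
rows = [
    ("2024-10-15", -1, "MA", "sample", "s0"),
    ("2024-10-15", 0, "MA", "sample", "s0"),
    ("2024-10-15", 1, "MA", "sample", "s0"),
    ("2024-10-15", -1, "MA", "sample", "s1"),
    ("2024-10-15", 0, "MA", "sample", "s1"),
    ("2024-10-15", 1, "MA", "sample", "s1"),
]


def joint_groups(rows):
    """Map each output_type_id to the (location, horizon) pairs that share it."""
    groups = defaultdict(list)
    for _date, horizon, location, _output_type, sample_id in rows:
        groups[sample_id].append((location, horizon))
    return dict(groups)


for sample_id, members in joint_groups(rows).items():
    print(sample_id, members)
```

Rows that land in the same group are treated as one draw from the joint distribution; here each index spans three horizons for a single location.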
For extra clarity, could you give an example with a second location, so people know how to indicate that MA trajectories are not joint with, e.g., NH trajectories, but are joint across horizons within each location? And maybe also give an example of a submission of trajectories that are joint across both locations and horizons?
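The two cases raised here could be sketched as follows. This is an illustration under stated assumptions rather than the hub's required scheme: the helper `sample_ids` is hypothetical, and it assumes that rows sharing an `output_type_id` are read as one joint draw, so locations that are not modeled jointly must not reuse each other's indexes.

```python
def sample_ids(locations, horizons, n_samples, joint_across_locations):
    """Return (location, horizon, output_type_id) rows for sample submissions.

    Horizons within a location always share an index. Locations share an
    index only when joint_across_locations is True.
    """
    rows = []
    for loc_index, location in enumerate(locations):
        # For independent locations, offset the index so no ID is reused
        # across locations (assumed convention).
        offset = 0 if joint_across_locations else loc_index * n_samples
        for sample in range(n_samples):
            for horizon in horizons:
                rows.append((location, horizon, f"s{offset + sample}"))
    return rows


# Joint across horizons only: MA gets s0 and s1, NH gets s2 and s3.
per_location = sample_ids(["MA", "NH"], [0, 1], 2, joint_across_locations=False)

# Joint across locations and horizons: MA and NH both use s0 and s1.
fully_joint = sample_ids(["MA", "NH"], [0, 1], 2, joint_across_locations=True)
```

In the first case, no index appears under both MA and NH, which signals that their trajectories are independent; in the second, a shared index ties an MA trajectory to the NH trajectory from the same draw.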
add more details re. submitting samples from a marginal/joint distribution with an example data table.