---
title: "Models"
editor: source
execute:
  echo: false
  warning: false
  message: false
---
## Model Definition
A Bayesian generalized linear model with a Gaussian family (that is, a linear regression) was fit with the [stan_glm()](https://mc-stan.org/rstanarm/reference/stan_glm.html) function. The dependent variable is the margin of victory and the independent variable is the absolute value of the point spread. The model was fit to 8,206 observations.
$$y_i = \beta_0 + \beta_1 x_{1,i} + \epsilon_i$$
where $y_i$ is the margin of victory (`victory_margin`) in game $i$, $x_{1,i}$ is the absolute point spread (`spread`), and $\epsilon_i \sim N(0, \sigma^2)$.
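
The same model can be restated in distributional form, which makes the role of each parameter explicit: every margin is treated as a draw from a normal distribution centered on the regression line, so $\beta_1$ is the expected change in margin of victory for a one-point increase in the absolute spread. This is only a rewriting of the equation above, with no additional assumptions:

$$y_i \mid x_{1,i} \sim N(\beta_0 + \beta_1 x_{1,i},\ \sigma^2), \qquad E[y_i \mid x_{1,i}] = \beta_0 + \beta_1 x_{1,i}$$
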
```{r}
library(tidyverse)
library(rstanarm)
```
```{r}
# Data files
list_of_files <- list.files(path = "data",
                            recursive = TRUE,
                            pattern = "\\.csv$",
                            full.names = TRUE)
```
```{r}
# Import data: read every CSV, keep complete rows, and derive the
# absolute spread and the absolute margin of victory
x <- read_csv(list_of_files, id = "file_name", show_col_types = FALSE) |>
  janitor::clean_names() |>
  drop_na() |>
  mutate(date = as.Date(date, format = "%m/%d/%Y"),
         spread = abs(line)) |>
  mutate(victory_margin = abs(home_score - visitor_score)) |>
  select(date, home_team, visitor, home_score, visitor_score, line, spread, victory_margin) |>
  rename(home = home_team)
```
```{r}
# Model: Gaussian regression of margin of victory on absolute spread
obj_fit <- stan_glm(data = x,
                    formula = victory_margin ~ spread,
                    family = gaussian,
                    refresh = 0,
                    seed = 9)
```
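
As an illustration of how the fitted object might be used, the sketch below draws from the posterior predictive distribution for a single hypothetical game. It assumes `obj_fit` from the chunk above. The 7-point spread, the names `new_game` and `pred_draws`, and the reuse of the seed are illustrative choices, not part of the original analysis.

```{r}
# Posterior predictive draws for one hypothetical game.
# Assumes obj_fit from the model chunk above; the 7-point spread is illustrative only.
new_game <- tibble(spread = 7)
pred_draws <- posterior_predict(obj_fit, newdata = new_game, seed = 9)

# Median and 95% interval of the predicted margin of victory
quantile(pred_draws[, 1], probs = c(0.025, 0.5, 0.975))
```

Because the likelihood is Gaussian, some predictive draws can fall below zero even though the observed margin of victory is never negative.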
```{r}
# Table of coefficient estimates from the fitted model
gtsummary::tbl_regression(obj_fit)
```