-
Notifications
You must be signed in to change notification settings - Fork 0
/
Template.Rmd
139 lines (103 loc) · 3.62 KB
/
Template.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
---
title: "**Étude de l’évolution du coût des APC**"
subtitle: "Phase 1-2-3. Sous-titre"
output:
html_document:
theme: paper
toc: yes
toc_float: yes
toc_depth: 6
code_folding: hide
include:
after_body: footer.html
knit: (
function(inputFile, encoding) {
rmarkdown::render(inputFile, params = "ask",
encoding = encoding,
output_dir = "../reports",
output_file = paste0(tools::file_path_sans_ext(inputFile), ".html")) })
---
<style>
body {
text-align: justify
}
</style>
```{r setup, include=FALSE}
# Settings summarytools
library(summarytools)
st_options(plain.ascii = FALSE # This is very handy in all Rmd documents
, style = "rmarkdown" # This too
, footnote = NA # Avoids footnotes which would clutter the results
, subtitle.emphasis = FALSE
# This is a setting to experiment with - according to the theme used, it might improve the headings layout
)
# General settings
knitr::opts_chunk$set(
eval = TRUE,
fig.align = "center",
fig.show = "hold",
message = FALSE,
warning = FALSE,
collapse = TRUE,
out.width = "100%",
results = 'asis'
)
```
<br>
[![logo_datactivist](https://nextcloud.datactivist.coop/s/o53wzfMNnFosQni/preview)](https://datactivist.coop/fr/){width=25%} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [![logo_pleiade](https://nextcloud.datactivist.coop/s/GnAMYqr7Ec3STFS/preview)](https://pleiade.nl/fr/){width=45%}
<br>
----
**Objectif** : objectif
**Source** : [lien vers les données]()
**Date début de l'analyse** : jour mois 2022
<br>
---
## I - Données brutes
```{r libraries et import}
# Packages nécessaires à l'analyse
packages = c("tidyverse", "data.table", "summarytools", "plotly", "kableExtra", "knitr", "DT")
## Installation des packages si besoin et chargement des librairies
package.check <- lapply(
packages,
FUN = function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE)
library(x, character.only = TRUE)
}
}
)
# Import des données
#lien page données
start_time <- Sys.time()
data <- fread("https://storage.gra.cloud.ovh.net/v1/AUTH_32c5d10cb0fe4519b957064a111717e3/bso_dump/bso-publications-20220118.csv.gz", na.strings=c("","NA"))
end_time <- Sys.time()
# Configuration du temps d’exécution de la commande
import_time <- end_time - start_time
import_time <- paste(round(as.numeric(import_time, units.difftime(import_time)),2), units.difftime(import_time))
```
**Temps d'exécution de l'import des données** : `r import_time`.
**Dimensions du jeu de données brut** : `r nrow(data)` observations et `r ncol(data)` variables.
**Les 100 premières observations** :
```{r}
kable(data[c(1:100),], format = "html") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = T,
font_size = 9,
fixed_thead = TRUE, #fix headers when scrolling
html_font = "sans-serif") %>%
scroll_box(width = "100%", height = "600px")
```
<br>
---
## II - Statistiques descriptives
<br>
+ Nombre de **valeurs manquantes** :
```{r}
# On compte le nombre de valeurs manquantes
nb_NA <- as.data.frame(apply(is.na(data), 2, sum)) %>%
rename(`nombre de NA` = `apply(is.na(data), 2, sum)`) %>%
mutate(pourcentage = `nombre de NA`/nrow(data)*100) %>%
mutate(pourcentage = round(pourcentage, 2))
nb_NA %>% datatable(options=list(pageLength=ncol(data), searching=F), width = '60%')
```
<br>