# Appendix - `motus` - Data filtering functions {#appendixD}
```{r tidyr11, echo = FALSE, message = FALSE, warning = FALSE}
library(knitr)
opts_chunk$set(tidy.opts=list(width.cutoff=50), tidy = FALSE)
```
The `motus` R package provides functions that assign probabilities to tag detections and filter detections based on those probabilities. For example, as you clean false positive and ambiguous detections from your data (see Chapter \@ref(dataCleaning)), you may determine that some detections do not belong to your tag(s). Instead of filtering those detections out with an R script alone, you can use these functions to create and save a custom filter in your .motus file. The filter assigns a probability between 0 and 1 to each runID supplied to it.
The data filtering functions work at the level of a run: a group of consecutive detections of a tag on a receiver. In general, a run of length 2 has a high probability of being a false positive. The probabilities associated with each runID can be generated in several ways. At the simplest level, you can assign a 0 to each run you would like to exclude and a 1 to each run you would like to include. Alternatively, you might develop a model that assigns a probability to each runID in your data.
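As a minimal sketch of the simplest approach, the chunk below builds a table of 0/1 probabilities, one row per run, from a small set of detections. The detection values, the run-length cutoff of 2, and the object names `toy.detections` and `filter.df` are illustrative assumptions, not a recommended rule; only base R is used.

```{r filterSketchD, eval = FALSE}
# Toy detections (made-up values): one row per detection, with the run it
# belongs to, the tag detected, and the length of that run.
toy.detections <- data.frame(
  runID      = c(101L, 101L, 102L, 102L, 102L, 103L, 103L),
  motusTagID = c(16011L, 16011L, 16011L, 16011L, 16011L, 16035L, 16035L),
  runLen     = c(2L, 2L, 3L, 3L, 3L, 2L, 2L)
)

# Reduce to one row per run.
filter.df <- unique(toy.detections[, c("runID", "motusTagID", "runLen")])

# Probability 0 (exclude) for runs of length 2 or less, 1 (include) otherwise.
filter.df$probability <- ifelse(filter.df$runLen <= 2, 0, 1)

# Keep only the three columns that a runs filter stores.
filter.df <- filter.df[, c("runID", "motusTagID", "probability")]
filter.df
```

A data frame shaped like `filter.df` is what `writeRunsFilter` (section \@ref(writeRunsFilter)) expects as its `df` argument.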
## `listRunsFilters` {#listRunsFilters}
### Description
Returns a dataframe containing the filterIDs, logins, names, projectIDs and descriptions for a given tag or receiver projectID available in the local database.
### Arguments
**`src`** is the SQLite object that you get from loading a .motus file into R, e.g., 'sql.motus' file in Chapter \@ref(accessingData).
### Example
```{r listRunsFilters.C, eval = FALSE}
filt.df <- listRunsFilters(src = sql.motus)
```
## `createRunsFilter` {#createRunsFilter}
### Description
Users will mostly call this function to modify the properties of an existing filter (e.g., its description or projectID). It is also called internally by `writeRunsFilter` (section \@ref(writeRunsFilter)) to generate a new filterID. To save the actual filter records, you must use `writeRunsFilter`. The function returns the `filterID` (integer) in the local database that matches the new or existing filter with the provided `filterName`. If a filter with the same name already exists, the function generates a warning and returns the ID of the existing filter.
### Arguments
**`src`** is the SQLite object that you get from loading a .motus file into R, e.g., 'sql.motus' file in Chapter \@ref(accessingData).
**`filterName`** the name you would like to assign to the filter. The function only creates a new filter if the name does not already exist locally.
**`motusProjID`** the numeric ID associated with a project, e.g., 176 for the sample data used throughout this book. The function defaults to `motusProjID = NA` when project ID is not supplied, which is the recommended value for now. The project ID assigned to a filter will mostly be useful for future synchronization of filters with the Motus server. The detection records contained in the filter do not have to belong to the project assigned to the filter.
**`descr`** optional description of the filter (default `NA`).
**`update`** boolean (default `FALSE`). If the filter already exists, determines whether its properties (e.g., `descr`) are preserved or updated.
### Example
Create a new filter called “myfilter” for the sql.motus database which is not attached to a specific project:
```{r createRunsFilter2, eval = FALSE}
createRunsFilter(sql.motus, "myfilter")
# OR add assignment to project
createRunsFilter(sql.motus, "myfilter", motusProjID = 176)
# OR add project and description, possibly updating any previous version called myfilter.
createRunsFilter(sql.motus, "myfilter", motusProjID = 176,
                 descr = "assign probability of 0 to false positives",
                 update = TRUE)
```
## `getRunsFilters` {#getRunsFilters}
### Description
Returns a SQLite table reference to the runsFilters records saved in the local database (`runID`, `motusTagID`, and `probability`) that are associated with a specific filter name (and optionally a project). For examples of how to merge the returned table with your detection data, refer to section \@ref(saveFilter) in Chapter 5.
### Arguments
**`src`** is the SQLite object that you get from loading a .motus file into R, e.g., 'sql.motus' file in Chapter \@ref(accessingData).
**`filterName`** the name you used when you created or saved your filter. The function returns a warning if the `filterName` doesn't exist.
**`motusProjID`** the numeric ID associated with a project, e.g., 176 for the sample data used throughout this book. The function defaults to `motusProjID = NA` when project ID is not supplied.
### Example
```{r getRunsFilters, eval = FALSE}
library(dplyr)

tbl.filt <- getRunsFilters(src = sql.motus, filterName = "myfilter")
tbl.filt2 <- getRunsFilters(sql.motus, "myfilter2")
# drop records from df that tbl.filt assigns a probability of 0; runs that
# are absent from the filter are treated as probability 1
df <- left_join(df, tbl.filt, by = c("runID", "motusTagID")) %>%
  mutate(probability = if_else(is.na(probability), 1, probability)) %>%
  filter(probability > 0) %>%
  select(-probability)  # avoids a column-name clash in the next join
# you can apply a second filter, tbl.filt2, to the result of the previous filter
df <- left_join(df, tbl.filt2, by = c("runID", "motusTagID")) %>%
  mutate(probability = if_else(is.na(probability), 1, probability)) %>%
  filter(probability > 0)
```
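To see what the join does without a .motus database, the same logic can be sketched on small in-memory tables. The object names `toy.df`, `toy.filt`, and `kept`, and all values in them, are made up for illustration; the sketch assumes only that `dplyr` is installed.

```{r applyFilterSketch, eval = FALSE}
library(dplyr)

# Toy detections and a toy filter table (made-up values).
toy.df <- tibble(
  runID      = c(101L, 102L, 103L, 104L),
  motusTagID = c(16011L, 16011L, 16035L, 16035L)
)
toy.filt <- tibble(
  runID       = c(101L, 103L),
  motusTagID  = c(16011L, 16035L),
  probability = c(0, 0.8)
)

# Runs absent from the filter get NA after the join; treat them as
# probability 1 so that only runs explicitly set to 0 are dropped.
kept <- left_join(toy.df, toy.filt, by = c("runID", "motusTagID")) %>%
  mutate(probability = if_else(is.na(probability), 1, probability)) %>%
  filter(probability > 0)

kept
```

Here run 101 is removed because its probability is 0, run 103 is kept with its assigned probability of 0.8, and runs 102 and 104 are kept because the filter does not mention them.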
## `writeRunsFilter` {#writeRunsFilter}
### Description
Writes the content of a dataframe containing `runID`, `motusTagID`, and assigned `probability` to the local database (SQLite file). If the `filterName` provided does not exist, the function calls `createRunsFilter` (section \@ref(createRunsFilter)) to create it in your database. By default, new records from the dataframe are appended to the existing or new filter called `filterName`; records that are already present (same `runID` and `motusTagID`) are replaced (`overwrite = TRUE`); and existing records that are not included in the dataframe are retained (`delete = FALSE`). To replace the existing filter records entirely with those of the new dataframe, use `delete = TRUE`. The function returns a SQLite table reference to the filter, as `getRunsFilters` does (section \@ref(getRunsFilters)).
### Arguments
**`src`** is the SQLite object that you get from loading a .motus file into R, e.g., 'sql.motus' file in Chapter \@ref(accessingData).
**`filterName`** the name of the filter to which the records are written.
**`motusProjID`** the numeric ID associated with a project, e.g., 176 for the sample data used throughout this book. Defaults to `NA` when project ID is not supplied.
**`df`** dataframe which contains the `runID` (integer), `motusTagID` (integer), and `probability` (float) of detections you would like to assign a filter to. `motusTagID` should be the actual tag ID, and not the negative `ambigID` associated with ambiguous detections.
**`overwrite`** default `TRUE`. When `TRUE`, existing records in the local database with the same `filterName`, `runID`, and `motusTagID` are replaced.
**`delete`** default `FALSE`. When `TRUE`, removes all existing filter records associated with the `filterName` and re-inserts the ones contained in `df`. Use this option if `df` contains the entire set of filter records you want to save.
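The interaction between `overwrite` and `delete` can be easier to follow on plain data frames. The sketch below is not how `motus` implements the function; `combineFilterRecords` is a hypothetical helper, written in base R, that only mimics the behaviour described above, and the records are made up.

```{r overwriteDeleteSketch, eval = FALSE}
# Hypothetical helper (not part of motus): mimics, on plain data frames, the
# behaviour described for the `overwrite` and `delete` arguments.
combineFilterRecords <- function(existing, new,
                                 overwrite = TRUE, delete = FALSE) {
  # delete = TRUE: discard everything stored so far, keep only the new records
  if (delete) return(new)

  # a record is identified by its runID and motusTagID
  key <- function(d) paste(d$runID, d$motusTagID)
  inNew <- key(existing) %in% key(new)
  inOld <- key(new) %in% key(existing)

  if (overwrite) {
    # replace matching records with the new values, keep the rest
    rbind(existing[!inNew, ], new)
  } else {
    # leave matching records untouched, append only unseen ones
    rbind(existing, new[!inOld, ])
  }
}

existing <- data.frame(runID = c(1L, 2L), motusTagID = c(10L, 10L),
                       probability = c(0, 1))
new <- data.frame(runID = c(2L, 3L), motusTagID = c(10L, 10L),
                  probability = c(0, 0))

combineFilterRecords(existing, new)                     # runs 1, 2, 3; run 2 becomes 0
combineFilterRecords(existing, new, overwrite = FALSE)  # runs 1, 2, 3; run 2 stays 1
combineFilterRecords(existing, new, delete = TRUE)      # runs 2 and 3 only
```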
### Examples
```{r writeRunsFilter, eval = FALSE}
# write a dataframe containing filter records (runID, motusTagID and
# probability) to “myfilter”
writeRunsFilter(src = sql.motus, filterName = "myfilter", df = filter.df)
# write a dataframe containing filter records (runID, motusTagID and
# probability) to “myfilter”, overwriting a previous version entirely
writeRunsFilter(src = sql.motus, filterName = "myfilter", df = filter.df,
                delete = TRUE)
# write a dataframe containing filter records (runID, motusTagID and
# probability) to "myfilter", but only append new records, leaving previously
# created ones intact
writeRunsFilter(src = sql.motus, "myfilter", df = filter.df, overwrite = FALSE)
```
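Before writing a filter, it can help to confirm that the dataframe matches the requirements listed under Arguments. The chunk below is a sketch of such a check in base R; `checkFilterDf` is a hypothetical helper, not a `motus` function, and the example records are made up.

```{r checkFilterDf, eval = FALSE}
# Hypothetical helper (not part of motus): basic checks on a filter dataframe
# before it is passed to writeRunsFilter().
checkFilterDf <- function(df) {
  needed <- c("runID", "motusTagID", "probability")
  absent <- setdiff(needed, names(df))
  if (length(absent) > 0) {
    stop("missing column(s): ", paste(absent, collapse = ", "))
  }
  # negative IDs are ambigIDs, which should not be used in a filter
  if (any(df$motusTagID < 0, na.rm = TRUE)) {
    stop("negative motusTagID found: use the actual tag ID, not the ambigID")
  }
  if (any(is.na(df$probability)) ||
      any(df$probability < 0 | df$probability > 1)) {
    stop("probability must be between 0 and 1, with no missing values")
  }
  invisible(df)
}

toy.filter.df <- data.frame(runID = c(101L, 103L),
                            motusTagID = c(16011L, 16035L),
                            probability = c(0, 0))
checkFilterDf(toy.filter.df)  # passes without error
```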