update doc
DanChaltiel committed Nov 30, 2023
1 parent dffd718 commit ce5ecdc
Showing 6 changed files with 31 additions and 30 deletions.
10 changes: 5 additions & 5 deletions NEWS.md
@@ -6,7 +6,7 @@ EDCimport is a package designed to easily import data from EDC software TrialMas

# EDCimport 0.4.0 <sub><sup>2023/xx/xx</sup></sub>

#### New features
### New features

- New function `check_subjid()` to check if a vector is not missing some patients (#8).
```r
@@ -33,15 +33,15 @@ tibble(subjid=c(1:10, 1)) %>% assert_no_duplicate() %>% nrow()
- You can now use the syntax `read_trialmaster(split_mixed=c("col1", "col2"))` to split only the datasets you need to (#10).


#### Bug fixes & Improvements
### Bug fixes & Improvements

- Reading with `read_trialmaster()` from cache will output an error if parameters (`split_mixed`, `clean_names_fun`) are different (#4).

- `split_mixed_datasets()` is now fully case-insensitive.

- Non-UTF8 characters in labels are now identified and corrected during reading (#5).

#### Minor breaking changes
### Minor breaking changes

- `read_trialmaster(use_cache="write")` is now the default. Reading from cache is not stable yet, so you should opt-in rather than opt-out.

@@ -52,7 +52,7 @@ tibble(subjid=c(1:10, 1)) %>% assert_no_duplicate() %>% nrow()

# EDCimport 0.3.0 <sub><sup>2023/05/19</sup></sub>

#### New features
### New features

- New function `edc_swimmerplot()` to show a swimmer plot of all dates in the database and easily find outliers.

@@ -66,7 +66,7 @@ tibble(subjid=c(1:10, 1)) %>% assert_no_duplicate() %>% nrow()

- New helper `unify()`, which turns a vector of duplicate values into a vector of length 1.

#### Bug fixes
### Bug fixes

- Reading errors are now handled by `read_trialmaster()` instead of failing. If one XPT file is corrupted, the resulting object will contain the error message instead of the dataset.
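
The 0.4.0 entries above mention `check_subjid()` (is a vector missing some patients?) and `assert_no_duplicate()`. As a rough illustration of the two ideas only, here is a minimal base R sketch. It does not call EDCimport and is not the package's implementation; the variable names and the toy data are assumptions made for the example.

```r
# Hypothetical stand-ins written in base R; they are NOT the EDCimport
# implementations of check_subjid() or assert_no_duplicate().

# Reference subject IDs, e.g. what `enrolres$subjid` might hold (assumed data).
ref_subjid <- 1:10

# A dataset column that lacks patients 4 and 9 and repeats patient 1.
x <- c(1, 1, 2, 3, 5, 6, 7, 8, 10)

# Completion check: which reference IDs never appear in `x`?
missing_ids <- setdiff(ref_subjid, x)

# Duplicate check: which IDs appear more than once in `x`?
duplicated_ids <- unique(x[duplicated(x)])

print(missing_ids)
print(duplicated_ids)
```

In this toy case the completion check flags patients 4 and 9, and the duplicate check flags patient 1. The package functions presumably report such findings in their own way (warnings or errors); see their help pages for the actual behaviour.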

4 changes: 2 additions & 2 deletions R/helpers.R
@@ -4,7 +4,7 @@
# User helpers --------------------------------------------------------------------------------


#' Find a keyword
#' Find a keyword in the whole database
#'
#' Find a keyword in all names and labels of a list of datasets.
#'
@@ -73,7 +73,7 @@ find_keyword = function(keyword, data=getOption("edc_lookup"), ignore_case=TRUE)



#' Check completion of subject ID column
#' Check the completion of the subject ID column
#'
#' Compare a subject ID vector to the study's reference subject ID (usually something like `enrolres$subjid`).
#'
35 changes: 18 additions & 17 deletions README.md
@@ -6,7 +6,7 @@
[![R-CMD-check](https://github.com/DanChaltiel/EDCimport/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/DanChaltiel/EDCimport/actions/workflows/check-standard.yaml)
<!-- badges: end -->

EDCimport is a package designed to easily import data from EDC software TrialMaster.
EDCimport is a package designed to easily import data from EDC software [TrialMaster](https://www.anjusoftware.com/trial-master/).

## Installation

@@ -20,15 +20,15 @@ devtools::install_github("DanChaltiel/EDCimport")

You will also need [`7-zip`](https://www.7-zip.org/download.html) installed, and preferably added to the [`PATH`](https://www.java.com/en/download/help/path.html).

### Windows-only

This package was developed to work on Windows and is unlikely to work on any other OS. Feel free to submit a PR if you manage to get it to work on another OS.
> [!WARNING]
> This package was developed to work on Windows and is unlikely to work on any other OS.
> You are very welcome to submit a PR if you manage to get it to work on Mac or Linux.
## TrialMaster

### Load the data

First, you need to request an export of type `SAS Xport`, with the checkbox "Include Codelists" ticked. This export should generate a `.zip` archive.
Inside TrialMaster, you should request an export of type `SAS Xport`, with the checkbox "Include Codelists" ticked. This export should generate a `.zip` archive.

Then, simply use `read_trialmaster()` with the archive password (if any) to retrieve the data from the archive:

@@ -37,7 +37,7 @@ library(EDCimport)
tm = read_trialmaster("path/to/my/archive.zip", pw="foobar")
```

The resulting object `tm` is a list containing all the datasets, plus the date of extraction (`datetime_extraction`) and a dataset summary (`.lookup`).
The resulting object `tm` is a list containing all the datasets, plus metadata.

You can now use `load_list()` to import the list in the global environment and use your tables:

@@ -46,23 +46,26 @@ load_list(tm) #this also removes `tm` to save memory
mean(dataset1$column5)
```

There are other options available, e.g. colnames cleaning & table splitting), see `?read_trialmaster` for more details.
There are many other options available (e.g. colnames cleaning & table splitting), see `?read_trialmaster` for more details.

## Utils
### Database management tools

`EDCimport` include a set of useful tools that help with using the imported database.
`EDCimport` includes a set of useful tools that help with using the imported database. See [References](https://danchaltiel.github.io/EDCimport/reference/index.html) for a complete list.

### Search the whole database
#### Database summary

`.lookup` is a dataframe containing for each dataset all its column names and labels.
Reading a database using `read_trialmaster()` generates the `.lookup` dataframe, which contains for each dataset the number of rows, columns, patients, and the CRF name.

Its main use is to work with `find_keyword()`. For instance, say you do not remember in which dataset and column is located the "date of ECG". `find_keyword()` will search every column name and label and will give you the answer:
`.lookup` is used by many other tools inside EDCimport, so be careful not to modify or delete it.

``` r
find_keyword("date")
```
#### Search the whole database

Using `find_keyword()`, you can run a global search of the database.

For instance, say you do not remember which dataset and column hold the "date of ECG". `find_keyword()` will search every column name and label and will give you the answer:

``` r
find_keyword("date")
#> # A tibble: 10 x 3
#> dataset names labels
#> <chr> <chr> <chr>
@@ -78,8 +81,6 @@ find_keyword("date")
#> 10 vs VISITDT Visit Date
```
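
To make the idea of a global search over column names and labels concrete, here is a small base R sketch. It is only an illustration under stated assumptions: the function `find_keyword_sketch()`, the toy datasets, and the use of a `"label"` attribute on columns are hypothetical and are not taken from the EDCimport source.

```r
# Hypothetical sketch in base R; this is not the EDCimport source code.
# Two toy datasets whose columns may carry a "label" attribute (assumed layout).
ecg <- data.frame(SUBJID = 1:2, ECGDAT = as.Date(c("2023-01-05", "2023-02-10")))
attr(ecg$ECGDAT, "label") <- "Date of ECG"
vs <- data.frame(SUBJID = 1:2, WEIGHT = c(70, 82))
attr(vs$WEIGHT, "label") <- "Weight (kg)"
datasets <- list(ecg = ecg, vs = vs)

find_keyword_sketch <- function(keyword, datasets, ignore_case = TRUE) {
  # Build one row per column: dataset name, column name, column label.
  rows <- lapply(names(datasets), function(ds) {
    df <- datasets[[ds]]
    labels <- vapply(df, function(col) {
      lab <- attr(col, "label")
      if (is.null(lab)) NA_character_ else as.character(lab)
    }, character(1))
    data.frame(dataset = ds, names = names(df), labels = unname(labels),
               stringsAsFactors = FALSE)
  })
  lookup <- do.call(rbind, rows)
  # Keep rows where the keyword matches the column name or its label.
  hit <- grepl(keyword, lookup$names, ignore.case = ignore_case) |
    grepl(keyword, lookup$labels, ignore.case = ignore_case)
  lookup[hit, , drop = FALSE]
}

res <- find_keyword_sketch("date", datasets)
print(res)
```

Here the only match is the `ECGDAT` column of `ecg`, found through its label rather than its name, which is the situation the README describes.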

Note that `find_keyword()` uses the `edc_lookup` option as its second argument, automatically set by `read_trialmaster()`.

### Swimmer Plot

The `edc_swimmerplot()` function will create a swimmer plot of all date variables in the whole database.
8 changes: 4 additions & 4 deletions _pkgdown.yml
@@ -17,7 +17,7 @@ navbar:


reference:
- title: "Main function"
- title: "Reading databases"
- contents:
- read_trialmaster
- read_tm_all_xpt
@@ -31,15 +31,15 @@ reference:
- edc_swimmerplot
- title: "Helpers"
- contents:
- find_keyword
- assert_no_duplicate
- check_subjid
- unify
- extend_lookup
- find_keyword
- get_lookup
- get_datasets
- get_key_cols
- split_mixed_datasets
- extend_lookup
- get_lookup
- title: "List Utils"
- contents:
- load_list
2 changes: 1 addition & 1 deletion man/check_subjid.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/find_keyword.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
