differences for PR #273
actions-user committed Apr 10, 2024
1 parent d91fbb7 commit ec3501d
Showing 6 changed files with 156 additions and 12 deletions.
111 changes: 105 additions & 6 deletions 01-r-basics.md
Expand Up @@ -255,16 +255,42 @@ longer exists.
Error: object 'gene_name' not found
```

## Understanding object data types (modes)
## Understanding object data types (classes and modes)

In R, **every object has two properties**:
In R, **every object has several properties**:

- **Length**: How many distinct values are held in that object
- **Mode**: What is the classification (type) of that object.
- **Class**: A property assigned to an object that determines how a function
will operate on it.

We will get to the "length" property later in the lesson. The **"mode" property**
**corresponds to the type of data an object represents**. The most common modes
you will encounter in R are:
**corresponds to the type of data an object represents** and the **"class" property determines how functions will work with that object.**


::::::::::::::::::::::::::::::::::::::::: callout

## Tip: Classes vs. modes

The difference between modes and classes is a bit **confusing** and the subject of
several [online discussions](https://stackoverflow.com/questions/35445112/what-is-the-difference-between-mode-and-class-in-r).
Often, these terms are used interchangeably. Do you really need to know
the difference?

Well, perhaps. This section is important for giving you a better understanding
of how R works and how to write usable code. That said, you might not come across
a situation where the difference is crucial while you are taking your first steps
in learning R. The overarching concept, **that objects in R have these properties and that you can use functions to check or change them**, is very important!

In this lesson we will mostly stick to **mode**, but we will throw in a few
examples of `class()` and `typeof()` so you can see where
it may make a difference.

::::::::::::::::::::::::::::::::::::::::::::::::::
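The following is a minimal sketch of a case where `mode()`, `class()`, and `typeof()` disagree. The object names (`whole`, `fraction`, `identity_fn`) are illustrative and do not appear elsewhere in the lesson.

```r
# Illustrative objects; the names are assumptions made for this example
whole <- 3L        # the L suffix asks R to store an integer
fraction <- 3.5    # a number with a decimal part is stored as a double

mode(whole)        # "numeric"
class(whole)       # "integer"
typeof(whole)      # "integer"

mode(fraction)     # "numeric"
class(fraction)    # "numeric"
typeof(fraction)   # "double"

# Functions are objects too, and the three checks differ again
identity_fn <- function(x) x
mode(identity_fn)    # "function"
class(identity_fn)   # "function"
typeof(identity_fn)  # "closure"
```

Both numbers share the mode "numeric", so `mode()` is the coarsest of the three checks, while `typeof()` reports how R stores the value internally.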



The most common modes you will encounter in R are:

| Mode (abbreviation) | Type of data |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
Expand All @@ -279,9 +305,9 @@ Data types are familiar in many programming languages, but also in natural
language where we refer to them as the parts of speech, e.g. nouns, verbs,
adverbs, etc. Once you know if a word - perhaps an unfamiliar one - is a noun,
you can probably guess you can count it and make it plural if there is more than
one (e.g. 1 [Tuatara](https://en.wikipedia.org/wiki/Tuatara), or 2 Tuataras). If
one (e.g., 1 [Tuatara](https://en.wikipedia.org/wiki/Tuatara), or 2 Tuataras). If
something is an adjective, you can usually change it into an adverb by adding
"-ly" (e.g. [jejune](https://www.merriam-webster.com/dictionary/jejune) vs.
"-ly" (e.g., [jejune](https://www.merriam-webster.com/dictionary/jejune) vs.
jejunely). Depending on the context, you may need to decide if a word is in one
category or another (e.g., "cut" may be a noun when it's on your finger, or a verb
when you are preparing vegetables). These concepts have important analogies when
Expand Down Expand Up @@ -353,6 +379,69 @@ Error in eval(expr, envir, enclos): object 'pilot' not found

::::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::: challenge


## Exercise: Create objects and check their class using "class"

Using the objects created in the previous challenge, use the `class()` function
to check their classes.

::::::::::::::: solution

## Solution





```r
class(chromosome_name)
```

```{.output}
[1] "character"
```

```r
class(od_600_value)
```

```{.output}
[1] "numeric"
```

```r
class(chr_position)
```

```{.output}
[1] "character"
```

```r
class(spock)
```

```{.output}
[1] "logical"
```


```r
class(pilot)
```

```{.error}
Error in eval(expr, envir, enclos): object 'pilot' not found
```

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::::

Notice that in the two challenges, `mode()` and `class()` return the same results. This time.

Notice from the solution that even if a series of numbers is given as a value,
R will consider them to be in the "character" mode if they are enclosed in
single or double quotes. Also, notice that you cannot take a string of alphanumeric
Expand All @@ -373,6 +462,16 @@ mode(pilot)
[1] "character"
```


```r
pilot <- "Earhart"
typeof(pilot)
```

```{.output}
[1] "character"
```
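The point about quoted numbers can be checked directly. This is a small sketch; the value assigned to `chr_position` here is illustrative, and `chr_position_num` is a hypothetical name introduced only for this example.

```r
# Illustrative value: a number wrapped in quotes is stored as text
chr_position <- "1001701"
mode(chr_position)        # "character"

# Convert the text to a number when arithmetic is needed
chr_position_num <- as.numeric(chr_position)
mode(chr_position_num)    # "numeric"
chr_position_num + 1      # 1001702

# Text that does not look like a number cannot be converted and becomes NA
# (R also raises a warning, which is silenced here)
suppressWarnings(as.numeric("chr1"))
```

Checking the mode before doing arithmetic is a cheap way to catch values that were read in as text.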

## Mathematical and functional operations on objects

Once an object exists (which by definition also means it has a mode), R can
Expand Down
53 changes: 49 additions & 4 deletions 03-basics-factors-dataframes.md
Expand Up @@ -300,6 +300,53 @@ Ok, thats a lot up unpack! Some things to notice.
by the object mode (e.g. chr, int, etc.). Notice that before each
variable name there is a `$` - this will be important later.



::::::::::::::::::::::::::::::::::::::: challenge

## Exercise: Revisiting modes and classes

Remember when we said mode and class are sometimes different? If you do, here
is a chance to check. What happens when you try the following?

1. `mode(variants)`
2. `class(variants)`

::::::::::::::: solution

## Solution




```r
mode(variants)
```

```{.output}
[1] "list"
```




```r
class(variants)
```

```{.output}
[1] "data.frame"
```

This result makes sense because mode describes how an object is stored, and R
stores a data frame as a **list**. A data frame is in some sense a "fancy" list.
However, data frames do have some specific properties, so they have their own
class (**data.frame**), which is useful for functions (and programmers) to know.
:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::::
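The "fancy list" idea can be explored with a small stand-in data frame. This is a sketch under assumptions: `toy_variants` and its values are made up for illustration and are not the lesson's `variants` data.

```r
# Hypothetical two-column data frame standing in for `variants`
toy_variants <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  POS = c(9972, 263235, 281923)
)

mode(toy_variants)      # "list"
typeof(toy_variants)    # "list"
class(toy_variants)     # "data.frame"

# Because a data frame is stored as a list of columns,
# list tools work on it and each element is one column
is.list(toy_variants)   # TRUE
length(toy_variants)    # 2, the number of columns
```

Functions such as `summary()` and `print()` look at the class, which is why a data frame prints as a table rather than as a plain list.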


## Introducing Factors

Factors are the final major data structure we will introduce in our R genomics
Expand Down Expand Up @@ -442,7 +489,7 @@ possible SNP we could generate a plot:
plot(factor_snps)
```

<img src="fig/03-basics-factors-dataframes-rendered-unnamed-chunk-14-1.png" style="display: block; margin: auto;" />
<img src="fig/03-basics-factors-dataframes-rendered-unnamed-chunk-16-1.png" style="display: block; margin: auto;" />

This isn't a particularly pretty example of a plot but it works. We'll be
learning much more about creating nice, publication-quality graphics later in
Expand Down Expand Up @@ -479,7 +526,7 @@ Now we see our plot has be reordered:
plot(ordered_factor_snps)
```

<img src="fig/03-basics-factors-dataframes-rendered-unnamed-chunk-16-1.png" style="display: block; margin: auto;" />
<img src="fig/03-basics-factors-dataframes-rendered-unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

Factors come in handy in many places when using R. Even using more
sophisticated plotting packages such as ggplot2 will sometimes require you
Expand Down Expand Up @@ -1397,5 +1444,3 @@ write.csv(Ecoli_metadata, file = "exercise_solution.csv")
- Base R has many useful functions for manipulating your data, but all of R's capabilities are greatly enhanced by software packages developed by the community

::::::::::::::::::::::::::::::::::::::::::::::::::


Binary file modified fig/03-basics-factors-dataframes-rendered-unnamed-chunk-16-1.png
4 changes: 2 additions & 2 deletions md5sum.txt
Expand Up @@ -4,9 +4,9 @@
"config.yaml" "b91cd97fa3b408bd1ac0a00e67ab3219" "site/built/config.yaml" "2024-04-04"
"index.md" "7f9c30e6487338a0c3f8ecc4018873ab" "site/built/index.md" "2024-04-04"
"episodes/00-introduction.Rmd" "e1354ed92fb458179c8c00b00ee1cf55" "site/built/00-introduction.md" "2024-04-04"
"episodes/01-r-basics.Rmd" "2f4b7fd244990f97e0c2fe88bae2618b" "site/built/01-r-basics.md" "2024-04-04"
"episodes/01-r-basics.Rmd" "b41108befd1f768bdd55b070c01d4972" "site/built/01-r-basics.md" "2024-04-10"
"episodes/02-data-prelude.Rmd" "ab2b1fd3cdaae919f9e409f713a0a8ad" "site/built/02-data-prelude.md" "2024-04-04"
"episodes/03-basics-factors-dataframes.Rmd" "78de77380f2b4bfd76622a8fc1e10f99" "site/built/03-basics-factors-dataframes.md" "2024-04-10"
"episodes/03-basics-factors-dataframes.Rmd" "fac5d2dfcc1df976dcdd5f4770a9d622" "site/built/03-basics-factors-dataframes.md" "2024-04-10"
"episodes/04-bioconductor-vcfr.Rmd" "10eb69b4697d7ecb9695d36c0d974208" "site/built/04-bioconductor-vcfr.md" "2024-04-04"
"episodes/05-dplyr.Rmd" "f74055bd8677338a213e0a0c6c430119" "site/built/05-dplyr.md" "2024-04-04"
"episodes/06-data-visualization.Rmd" "0b45534421bad05f040b24c40b6da71b" "site/built/06-data-visualization.md" "2024-04-04"
Expand Down
