R Markdown for everyday use: a mini tutorial
Narrative, code and visualizations
last updated: 17 Sept. 2021 - changelog
Prologue
Hi, and welcome to this streamlined, fast-paced crash course on basic R Markdown. I never intended to write a Prologue, but it serves well as the only unnumbered (`{.unnumbered}` or `{-}`) section of the Table of Contents (TOC) to the left. Once you download the original `rmd` file from the dropdown menu at the top right of this document, you’ll notice that my YAML contains an element instructing the document’s sections to be numbered (`number_sections: true`). Nevertheless, the heading-level setting in Prologue, `{.unnumbered}`, takes precedence and overrides the general YAML setting.
Prologue 2
This next section, Prologue 2, is not only unnumbered but also unlisted, even though the YAML again instructs otherwise. If you look to the left, Prologue 2 is not included in the TOC. Keep in mind that to be unlisted, a heading must also be unnumbered; you can’t unlist a heading that is numbered. So the syntax for this is `{.unnumbered .unlisted}`.
This is a nice opportunity to say that my YAML also instructs all of my code chunks, like the one below this paragraph, to be hidden by default, courtesy of this syntax: `code_folding: hide`. They appear only when you click the Code button at the top-right corner of the chunk. Again, any “contra legem” instruction inside the code chunk takes precedence, allowing me to show the chunk on load despite the general rule of hiding them all by default. The syntax for that is `{r, class.source = 'fold-show'}`.
> #Prologue 2 {.unnumbered .unlisted}
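For readers who prefer to set these options from R rather than in the YAML header, here is a minimal sketch. It assumes the stock `rmarkdown::html_document()` format instead of the modified `rmdformats` theme used for this page, and `my_report.Rmd` is a hypothetical file name.

```r
library(rmarkdown)

# Rough R equivalent of the YAML options discussed above (assumed format:
# the stock html_document, not the theme used by this page).
fmt <- html_document(
  toc = TRUE,               # YAML: toc: true
  number_sections = TRUE,   # YAML: number_sections: true
  code_folding = "hide"     # YAML: code_folding: hide
)

# Rendering a (hypothetical) file with this format; this step needs pandoc.
# render("my_report.Rmd", output_format = fmt)
```

Heading attributes such as `{.unnumbered .unlisted}` and chunk options such as `class.source = 'fold-show'` still live in the document itself and override these defaults locally.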
1 Introduction
And now let’s move on to the official introductions! You can find more about me if you look down to the lower left pane of this (quite modified) theme from the `rmdformats` package by Julien Barnier. The document you’re reading is a ‘living’ and ‘breathing’ HTML document. I can write this narrative in it, then spice it up with some executable code that you can hide/show and run; I can prepare visualizations that you can interact with; and, all in all, I can express myself with the absolute freedom of explaining my ideas and providing evidence outside the inherent restrictions of all (known to me) editing platforms. It’s minimal, it’s reproducible, it’s highly effective. As Donald Knuth¹ eloquently put it:
“Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.”
This file is meant to be my perpetual educational playground in R, and I thought it could prove useful to share it and possibly help beginners quickly overcome the hurdle of setting up their own reporting environment. This document summarizes all the hints, tips, tricks and perks I come across in my readings from various scattered sources.
(If you clicked the previous link you just learned about internal links and how you can easily let your readers navigate through any section of your site).
I would like this file to end up being an elegant and meticulous example of what can be done with R Markdown for data analysis through the production of HTML outputs. This means that I won’t discuss PDF/DOC output or related options like `\newpage` (page break), since in HTML (unless printed on paper) a page break has no practical meaning. My main focus is to gradually standardize the reporting process and all the necessary tools (packages, coding tips, best practices, etc.) and then use this file as an open template for the projects to come.
P.S.: don’t forget to download the original `rmd` file from the menu at the top of the document to accompany you as you study these lines. (How did I do this? Just by adding the `code_download: true` element to my YAML.) If you want to embed arbitrary files in the HTML output file for download (e.g., source data files), use these instructions.
Until you do so, here is a quick fix from the `xfun` package that gives you a link to download a file:
> my_file <- here::here('empty_excel.xlsx')
> xfun::embed_file(my_file)
And here is another one, `downloadthis`, which adds a nifty button:
> library(downloadthis)
> list(mtcars, iris) %>%
+ download_this(
+ output_name = "mtcars and iris datasets",
+ output_extension = ".xlsx",
+ button_label = "Download 'mtcars' and 'iris' datasets as xlsx",
+ button_type = "warning",
+ has_icon = TRUE,
+ icon = "fa fa-save"
+ )
2 Basic document structure
R Markdown is all about literate and reproducible programming, so every markdown document could really benefit from following a more or less standard structure, consisting of four top chunks right after the YAML section: setup, libraries, functions, reads.
- setup: set the options you want to define globally (echo, eval, include, cache, figs, etc.), i.e.:
```r
> knitr::opts_chunk$set(message = FALSE, warning = FALSE, collapse = TRUE, prompt = TRUE)
```
- library: put all your library calls.
```r
> library(tidyverse)
> library(reticulate)
> library(here)
> # (...)
```
- functions: all your functions in one place.
- reads: read the data you are going to be using in the document, i.e.:
```r
> data <- read_csv("data/my_dataset.csv")  # the path must be a quoted string
```
3 Formatting
3.1 A few words about the formatting of this document
How did I make this section collapsible? 😉
You can use this handy code in case you need collapsible elements in your report. There are of course obvious formatting problems that need adjustment, but one can take care of them according to needs and time available. Pro tip: add `summary h6 { display: inline-block; }` to your styles.css and tweak accordingly if you want the arrow to appear next to the header (thank me later!).
So, the document you are reading is an R Markdown document exported to HTML. Its format is based on a modified version of the readthedown theme from the `rmdformats` package. This is a very handy theme for my needs, but I felt an aesthetic intervention was necessary to personalize some formatting options. To do that, I located the main CSS file of the theme (styles.css), copied it into R’s working directory, referenced it in the YAML section (css: custom.css) and made all the necessary changes there. From that point on, one should experiment to see what fits one’s needs and what doesn’t (for more information visit the ‘How to’ section of this page). The winking face emoji was copied and pasted directly from emojipedia.
So, with R Markdown, one can write a strikeout bold sentence and italicize it. Also, one can write subscripts, like chemistry formulas (H₂O), or superscripts (2¹⁰ is 1024), and this is something I would like to underline. I can also choose the color of some words with good old plain HTML.
Apart from the really basic formatting that has already been introduced above, which you can explore by downloading the `rmd` file, one can use an array of options to beautify a report. These options are not always aligned with R Markdown’s ‘canon’, but they are clever and, more importantly, they do the job.
The ‘canon’, by the way, formulated by Markdown’s creator, John Gruber², goes like this:
Markdown is intended to be as easy-to-read and easy-to-write as is feasible. Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. While Markdown’s syntax has been influenced by several existing text-to-HTML filters, including Setext, atx, Textile, reStructuredText, Grutatext, and EtText, the single biggest source of inspiration for Markdown’s syntax is the format of plain text email. To this end, Markdown’s syntax is comprised entirely of punctuation characters, which punctuation characters have been carefully chosen so as to look like what they mean. E.g., asterisks around a word actually look like emphasis. Markdown lists look like, well, lists. Even blockquotes look like quoted passages of text, assuming you’ve ever used email.
3.2 Line blocks
A nice way to preserve the division of lines in the output as well as any leading spaces is the line blocks syntax. Let’s write some poetry!
April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
Winter kept us warm, covering
Earth in forgetful snow, feeding
A little life with dried tubers. (…)
excerpt from The Waste Land by T. S. Eliot.
Although Pandoc documentation says that Inline formatting (such as emphasis) is allowed in the content, but not block-level formatting (such as block quotes or lists), I apparently used block quotes syntax and it worked…
3.3 Lists
What follows is a collection of some list types.
3.3.1 Bullet lists
Use either `*`, `+` or `-`.
- one syntax error
- two syntax errors
- three syntax errors
3.3.2 Nested lists
objects
- chair
- table
- spoon
colors
- blue
- red
- white
3.3.3 Task lists
- [ ] an unchecked task list item
- [x] checked item
3.3.4 Definition lists
In a data analytics project the use of definition lists would be quite useful for stakeholders or/and team members and this is a handy way of making it happen:
- Aggregation
The process of collecting or gathering many separate pieces into a whole.
- Analytical skills in programming
Qualities and characteristics as well as computational tools associated with using facts to solve problems. E.g.,
R markdown, R packages, etc. (this box was created with just 4 tabs)
- Area chart
A data visualization that uses individual data points for a changing variable connected by a continuous line with a field in area underneath.
3.3.5 Numbered example lists
There are cases where a concise way of referring to your examples is necessary. To achieve this, you can use the `(@)` special marker before the example sentence, followed by a space. The numbering will continue automagically throughout the document.
- For example, this will be my first example.
After that I will continue my report, but all of a sudden the need to provide more examples arises. What is going to happen if I use the same syntax, `(@)`, once again?
- This is my second example and apparently the numbering continues.
Finally, to ‘cut off’ lists, you can insert some non-indented content, such as an HTML comment, which won’t produce visible output in any format: `<!-- end of list -->` or similar. By ‘breaking’ the list, normal formatting can go on.
3.4 Boxes and chunks
3.4.1 A simple box
A nice `div` to wrap up conclusions:
- This is my first conclusion
- This is my second conclusion
3.4.2 Scrollable chunks
Some code for fixing a chunk’s maximum height:
<style type="text/css">
pre {
max-height: 300px;
overflow-y: auto;
}
pre[class] {
max-height: 200px;
}
</style>
Above we defined some CSS rules to limit the height of code blocks. Now we can test if these rules work on code blocks and text output by looking for the presence of a vertical scroll bar:
> # pretend that we have a lot of code in this chunk
> if (1 + 1 == 2) {
+ # of course that is true
+ print(mtcars)
+ # we just printed a lengthy data set
+ }
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
Next we add rules for a new class, `scroll-100`, to limit the height to 100px, and add the class to the output of a code chunk via the chunk option `class.output`. This way, we can manually define each chunk’s height:
> print(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
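The rule for `scroll-100` is not visible in the rendered page, so the following is only a sketch of what it might look like. The CSS is written to a hypothetical file, `scroll.css`, from R; in practice you would more likely paste the rule into a `<style>` block or into your custom stylesheet.

```r
# Assumed CSS rule for the scroll-100 class (not copied from the original rmd).
css_rule <- c(
  ".scroll-100 {",
  "  max-height: 100px;",
  "  overflow-y: auto;",
  "}"
)

# Hypothetical file name; reference it in the YAML header with `css: scroll.css`.
writeLines(css_rule, "scroll.css")

# In the .Rmd, the chunk header would then carry the class:
#   {r, class.output = "scroll-100"}
print(mtcars)
```

Because the class is attached per chunk, different chunks can use different classes (and therefore different heights) in the same document.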
4 Visualizations
4.1 Interactive graphics
> library(ggplot2)
> library(plotly)
> library(gapminder)
>
> p <- gapminder %>%
+ filter(year==1977) %>%
+ ggplot( aes(gdpPercap, lifeExp, size = pop, color=continent)) +
+ geom_point() +
+ scale_x_log10() +
+ theme_bw()
>
> ggplotly(p)
4.2 Tables & Figures
4.2.1 Simple tables
The simplest table is the one formatted with plain markdown syntax.
So, a syntax like this:
| | 2020 | 2021 | both |
|---|---|---|---|
| files | 160 | 80 | 1 |
| folders | 53 | 23 | 0 |
ends up like this:
| | 2020 | 2021 | both |
|---|---|---|---|
| files | 160 | 80 | 1 |
| folders | 53 | 23 | 0 |
For a convenient table generator, take a look here.
4.2.1.1 A simple `kable` table
> require(knitr)
> require(kableExtra)
> mtcars %>%
+ head() %>%
+ kable(digits = 1, caption = 'example of kable table') %>%
+ kable_styling(full_width = FALSE, position = 'left') %>%
+ row_spec(0,
+ bold = T,
+ color = 'white',
+ background = 'black')
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.9 | 2.6 | 16.5 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.9 | 2.9 | 17.0 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.9 | 2.3 | 18.6 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.1 | 3.2 | 19.4 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.1 | 3.4 | 17.0 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.8 | 3.5 | 20.2 | 1 | 0 | 3 | 1 |
4.2.1.2 A less simple `kable` table
A table with custom column names, custom alignment and a caption:
> iris2 <- head(iris)
> knitr::kable(iris2, col.names = c('We', 'Need', 'Five', 'Names', 'Here'), align = "lccrr", caption = "An example table caption.")
We | Need | Five | Names | Here |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | setosa |
4.9 | 3.0 | 1.4 | 0.2 | setosa |
4.7 | 3.2 | 1.3 | 0.2 | setosa |
4.6 | 3.1 | 1.5 | 0.2 | setosa |
5.0 | 3.6 | 1.4 | 0.2 | setosa |
5.4 | 3.9 | 1.7 | 0.4 | setosa |
4.2.1.3 An example of `kableExtra` table
Smaller fonts (`kableExtra` package required; documentation here):
> kable(head(iris, 5), booktabs = TRUE) %>%
+ kable_styling(font_size = 8)
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | setosa |
4.9 | 3.0 | 1.4 | 0.2 | setosa |
4.7 | 3.2 | 1.3 | 0.2 | setosa |
4.6 | 3.1 | 1.5 | 0.2 | setosa |
5.0 | 3.6 | 1.4 | 0.2 | setosa |
The following example takes a portion of a dataset and turns it into a table via `knitr::kable()`:
> data <- faithful[1:4, ]
> knitr::kable(data,
+ caption = "Table with kable")
eruptions | waiting |
---|---|
3.600 | 79 |
1.800 | 54 |
3.333 | 74 |
2.283 | 62 |
This one filters the content of the table:
> summary(cars$dist)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.00 26.00 36.00 42.98 56.00 120.00
> summary(cars$speed)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.0 12.0 15.0 15.4 19.0 25.0
4.2.1.4 A row/column `kableExtra` table formatting
Various formatting options:
> kable(head(iris, 5), align = 'c', booktabs = TRUE) %>%
+ row_spec(1, bold = TRUE, italic = TRUE) %>%
+ row_spec(2:3, color = 'white', background = 'black') %>%
+ row_spec(4, underline = TRUE, monospace = TRUE) %>%
+ row_spec(5, angle = 45) %>%
+ column_spec(5, strikeout = TRUE)
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | setosa |
4.9 | 3.0 | 1.4 | 0.2 | setosa |
4.7 | 3.2 | 1.3 | 0.2 | setosa |
4.6 | 3.1 | 1.5 | 0.2 | setosa |
5.0 | 3.6 | 1.4 | 0.2 | setosa |
4.2.1.5 A `kableExtra` column grouping
For row grouping, see here:
> iris2 <- iris[1:5, c(1, 3, 2, 4, 5)]
> names(iris2) <- gsub('[.].+', '', names(iris2))
> kable(iris2, booktabs = TRUE) %>%
+ add_header_above(c("Length" = 2, "Width" = 2, " " = 1)) %>%
+ add_header_above(c("Measurements" = 4, "More attributes" = 1))
Sepal | Petal | Sepal | Petal | Species |
---|---|---|---|---|
5.1 | 1.4 | 3.5 | 0.2 | setosa |
4.9 | 1.4 | 3.0 | 0.2 | setosa |
4.7 | 1.3 | 3.2 | 0.2 | setosa |
4.6 | 1.5 | 3.1 | 0.2 | setosa |
5.0 | 1.4 | 3.6 | 0.2 | setosa |
4.2.1.6 More `kableExtra` options
Examples taken directly from the author’s website:
> library(kableExtra)
> dt <- mtcars[1:5, 1:6]
> dt %>%
+ kbl(caption = "Recreating booktabs style table") %>%
+ kable_classic_2(bootstrap_options = "striped", "hover", "condensed", full_width=F, position="float_right", html_font = "Cambria", font_size=16) %>%
+ column_spec(5:7, bold = T) %>%
+ row_spec(3:5, bold = T, color = "white", background = "#D7261E") %>%
+ add_header_above(c(" ", "Group 1" = 2, "Group 2" = 2, "Group 3" = 2)) %>%
+ add_header_above(c(" ", "Group 4" = 4, "Group 5" = 2))
Header groups: Group 4 spans mpg to hp and Group 5 spans drat and wt; beneath them, Group 1 spans mpg and cyl, Group 2 spans disp and hp, and Group 3 spans drat and wt.
| | mpg | cyl | disp | hp | drat | wt |
|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 |
4.2.2 Table with filtering slider
This is a very fancy `DT` package table with a slider for filtering!
> library(DT)
> datatable(mtcars, rownames = FALSE, filter="top", options = list(pageLength = 5, scrollX=T) )
With an unnumbered, unlisted and empty section header, `## {- .unlisted}`, we can end (‘cut off’) the tabset/pillset above and continue to write more paragraphs. This is the only way to escape this object.
And now that Tables section is over, I’d like to show you how to refer to any table or figure (or even equation, although I don’t deal with those yet) inside your document. The steps are:
- In your YAML, include `use_bookdown: true` beneath your `output:` element. Beware: use proper indentation or else this won’t work.
- Then go into your table/figure/equation code chunk and label it, e.g., `{r my-labeled-chunk}`.
- Then write the reference, e.g., Please see Table `\@ref(tab:my-labeled-chunk)`.
Live example: Refer to Table 4.1 above.
All set.
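The steps above can be sketched as follows. The chunk label is the one used in the text; the caption and the subset of `mtcars` are made up for illustration.

```r
# Chunk header in the .Rmd (label taken from the steps above):
#   {r my-labeled-chunk}
# A caption is needed: bookdown only numbers tables that have one.
tbl <- knitr::kable(head(mtcars[, 1:4]), caption = "First rows of mtcars")
tbl

# Elsewhere in the narrative, refer to the table with:
#   Please see Table \@ref(tab:my-labeled-chunk)
```

For figures the prefix changes from `tab:` to `fig:`, and the caption comes from the `fig.cap` chunk option instead.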
4.2.4 More on formatting of tables and figures
Moving on, in this figure, we play with out.width, alignment and captions (check original rmd!):
{r, out.width = '30%', fig.align='center', fig.cap='A beautiful plot from a newbie R coder!'}
> plot(cars, pch = 18)
Figure 4.1: A beautiful plot from a newbie R coder!
And a test in figure’s dimensions (width and height):
{r, fig.dim=c(5,3.2)}
> plot(cars, pch = 16)
Now, we’ll place multiple figures side-by-side from the same code chunk:
{r, fig.show='hold', out.width='50%'}
> par(mar = c(4, 4, .2, .1))
> plot(cars, pch = 19)
> plot(pressure, pch = 17)
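If most figures in a report share the same look, the options above can also be set once in the setup chunk instead of being repeated in every chunk header. A minimal sketch, with values chosen only for illustration:

```r
# Global defaults for every figure chunk that follows; a chunk header can
# still override any of them locally.
knitr::opts_chunk$set(
  fig.align = "center",
  out.width = "50%",
  fig.dim   = c(5, 3.2)   # width and height in inches
)

# Check what is currently set.
knitr::opts_chunk$get("fig.align")
```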
And now, tables (the table can break across pages):
> knitr::kable(iris[1:15, ], caption = 'A caption')
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | setosa |
4.9 | 3.0 | 1.4 | 0.2 | setosa |
4.7 | 3.2 | 1.3 | 0.2 | setosa |
4.6 | 3.1 | 1.5 | 0.2 | setosa |
5.0 | 3.6 | 1.4 | 0.2 | setosa |
5.4 | 3.9 | 1.7 | 0.4 | setosa |
4.6 | 3.4 | 1.4 | 0.3 | setosa |
5.0 | 3.4 | 1.5 | 0.2 | setosa |
4.4 | 2.9 | 1.4 | 0.2 | setosa |
4.9 | 3.1 | 1.5 | 0.1 | setosa |
5.4 | 3.7 | 1.5 | 0.2 | setosa |
4.8 | 3.4 | 1.6 | 0.2 | setosa |
4.8 | 3.0 | 1.4 | 0.1 | setosa |
4.3 | 3.0 | 1.1 | 0.1 | setosa |
5.8 | 4.0 | 1.2 | 0.2 | setosa |
The following pagination is due to the YAML element `df_print: paged`:
{r cols.print=3, rows.print=3}
> mtcars
4.3 Several columns
Since R Markdown uses the Bootstrap framework under the hood, it is possible to benefit from its powerful grid system. Basically, you can consider that each row is divided into 12 subunits of the same width. You can then choose to use only a few of these subunits.
Here, I use 3 columns of 4 subunits each (3 × 4 = 12). The last column is used for a plot. You can read more about the grid system here. I got this result using the following code in my R Markdown document.
> # annual flow on Nile River
> Nile %>% as.data.frame() %>% mutate(year=1871:1970) %>%
+ rename(flow=x) %>%
+ ggplot(.)+ geom_line(aes(x=year, y=flow),
+ color="darkblue", lwd=2) +
+ theme_minimal() + labs(x="", y="Flow",
+ title="Annual river flow on Nile River",
+ subtitle="(1871-1970)")
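The chunk above only draws the plot; the markup that creates the three columns is not visible in the rendered page. One way to sketch such a layout from R is with the `htmltools` package, using Bootstrap’s `row` and `col-md-4` classes. The paragraph texts are placeholders, and this is an assumption about the layout rather than the original code.

```r
library(htmltools)

# Three columns of 4 subunits each (3 x 4 = 12 Bootstrap subunits).
three_columns <- div(
  class = "row",
  div(class = "col-md-4", p("First column: some narrative.")),
  div(class = "col-md-4", p("Second column: more narrative.")),
  div(class = "col-md-4", p("Third column: the plot would go here."))
)

# Inspect the generated HTML as a string.
html <- as.character(three_columns)
```

Changing the class suffixes (for example `col-md-3` and `col-md-9`) changes the column widths, as long as they add up to at most 12.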
4.4 Diagrams
A diagram based on the `DiagrammeR` package:
> DiagrammeR::grViz("digraph {
+ graph [layout = dot, rankdir = TB]
+
+ node [shape = rectangle]
+ rec1 [label = 'Step 1. Wake up']
+ rec2 [label = 'Step 2. Write code']
+ rec3 [label = 'Step 3. ???']
+ rec4 [label = 'Step 4. PROFIT']
+
+ # edge definitions with the node IDs
+ rec1 -> rec2 -> rec3 -> rec4
+ }",
+ height = 500)
In the code chunk above, a special background formatting is used to change the appearance of the element. The same goes for this next example as well, which is one more diagram, this time with added parameters:
> #can also use bg-success for green
>
> DiagrammeR::grViz("
+ digraph graph2 {
+
+ graph [layout = dot, rankdir = LR]
+
+ # node definitions with substituted label text
+ node [shape = oval]
+ a [label = '@@1']
+ b [label = '@@2']
+ c [label = '@@3']
+ d [label = '@@4']
+
+ a -> b -> c -> d
+ }
+
+ [1]: names(iris)[1]
+ [2]: names(iris)[2]
+ [3]: names(iris)[3]
+ [4]: names(iris)[4]
+ ",
+ height = 100)
Chunk background formatting could be useful when presenting good/bad approaches in solving a problem or when you want to showcase worst/optimal coding choices.
5 Images
an image of myself
And below is another (better) method of inserting images that allows alignment and size handling, using the `knitr` package:
{r, echo=TRUE, fig.align='center', out.width='30%'}
> knitr::include_graphics("images/chess.jpg")
6 Maps
> library(sf)       # st_sfc(), st_point()
> library(mapview)  # mapview()
> buellton <- st_sfc(st_point(c(-120.1927, 34.6136)), crs=4326)
> m1 <- mapview(buellton, col.regions="orange") # make the map
> m1@map %>% leaflet::addMeasure(primaryLengthUnit = "meters")
7 Use of other languages
7.1 Python
Yes, you can run Python in RStudio! (More here; the `reticulate` package is needed.)
> x = 'hello, python world!'
+ print(x.split(' '))
## ['hello,', 'python', 'world!']
7.2 SQL
(more info):
> #create an in-memory RSQLite database of the mtcars dataset
> library(RSQLite)
> con <- dbConnect(RSQLite::SQLite(), dbname = ':memory:')
>
> dbListTables(con)
## character(0)
> dbWriteTable(con, "mtcars", mtcars)
> dbListTables(con)
## [1] "mtcars"
>
> dbListFields(con, "mtcars")
## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
## [11] "carb"
> dbReadTable(con, "mtcars")
> SELECT *
+ FROM mtcars
+ WHERE (`cyl` = 4.0)
> mt_cars_df
And now we take the SQL results which are now saved as an R dataframe and we throw them into ggplot (!!!):
> library(ggplot2)
> ggplot(data = mt_cars_df,
+ aes(x = disp, y = mpg)) +
+ geom_point() +
+ xlab("Engine Size") +
+ ylab("Miles Per Gallon") +
+ ggtitle("Fuel Efficiency Generally Decreases as Engine Size Increases")
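The header of the SQL chunk is not visible in the rendered page. Presumably it uses knitr’s `sql` engine with the `connection = con` and `output.var = "mt_cars_df"` chunk options, which is how a query result usually ends up as an R data frame. As an alternative sketch (an assumption, not the original code), the same data frame can be obtained directly from R with `DBI::dbGetQuery()`:

```r
library(DBI)

# In-memory SQLite database holding mtcars, as in the chunks above.
con <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# Same query as the SQL chunk, run from R; the result is an ordinary data frame.
mt_cars_df <- dbGetQuery(con, "SELECT * FROM mtcars WHERE cyl = 4")

# Close the connection once the data is in R.
dbDisconnect(con)
```

Either way, `mt_cars_df` is a plain data frame and can be passed to `ggplot()` like any other.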
8 Interactive documents
8.1 Via `htmlwidgets` package
There are two types of interactive R Markdown documents: you can use the HTML Widgets framework, or the Shiny framework (or both). The HTML Widgets framework is implemented in the R package htmlwidgets (Vaidyanathan et al. 2020), interfacing JavaScript libraries that create interactive applications, such as interactive graphics and tables. Several widget packages have been developed based on this framework, such as DT (Xie, Cheng, and Tan 2021), leaflet (Cheng, Karambelkar, and Xie 2021), and dygraphs (Vanderkam et al. 2018). Visit https://www.htmlwidgets.org to know more about widget packages as well as how to develop a widget package by yourself.
Below is a map that shows the location of the Department of Statistics, Iowa State University, following this procedure. For more HTML widgets, experiment with that collection.
> library(leaflet)
> leaflet() %>% addTiles() %>%
+ setView(-93.65, 42.0285, zoom = 17) %>%
+ addPopups(
+ -93.65, 42.0285,
+ 'Here is the <b>Department of Statistics</b>, ISU'
+ )
8.2 Via `shiny` package
A standard R plot can be made interactive by wrapping it in the Shiny `renderPlot()` function. The `selectInput()` function creates the input widget to drive the plot.
(This section is commented out due to conflicts between Shiny and other packages; I need to revisit it.)
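Since the live example is commented out, here is a minimal sketch of the two functions working together. It follows the familiar histogram example and assumes a document with `runtime: shiny` in its YAML header; the input id `n_breaks` and the object names are illustrative.

```r
library(shiny)

# Input widget: lets the reader pick the number of histogram bins.
bins_input <- selectInput(
  "n_breaks",
  label = "Number of bins:",
  choices = c(10, 20, 35, 50),
  selected = 20
)

# Reactive plot: re-drawn whenever input$n_breaks changes.
hist_output <- renderPlot({
  hist(
    faithful$eruptions,
    breaks = as.numeric(input$n_breaks),
    xlab = "Duration (minutes)",
    main = "Geyser eruption duration"
  )
})
```

Inside such a document you would normally leave both calls unassigned so that the widget and the plot are rendered in place; they are assigned here only so the snippet runs on its own.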
8.3 Dashboards
8.3.1 `flexdashboard` package
You can use `flexdashboard` to publish groups of related data visualizations as a dashboard. A `flexdashboard` can either be static (a standard web page) or dynamic (a Shiny interactive document). A wide variety of components can be included in `flexdashboard` layouts, including:
- Interactive JavaScript data visualizations based on htmlwidgets.
- R graphical output including base, lattice, and grid graphics.
- Tabular data (with optional sorting, filtering, and paging).
- Value boxes for highlighting important summary data.
- Gauges for displaying values on a meter within a specified range.
- Text annotations of various kinds.
8.3.2 A usage example
The following examples, while working well, are not aesthetically pleasing because the output format of this file does not come from the `flexdashboard` package. So the idea is to use this package alongside the main report, linking from the main report to a separate `flexdashboard` report.
8.3.2.1 Contact Rate
> gauge(91, min = 0, max = 100, symbol = '%', gaugeSectors(
+ success = c(80, 100), warning = c(40, 79), danger = c(0, 39)
+ ))
8.3.2.2 Average Rating
> gauge(37.4, min = 0, max = 50, gaugeSectors(
+ success = c(41, 50), warning = c(21, 40), danger = c(0, 20)
+ ))
8.3.2.3 Cancellations
> gauge(7, min = 0, max = 10, gaugeSectors(
+ success = c(0, 2), warning = c(3, 6), danger = c(7, 10)
+ ))
9 Secrets for effective coding
When starting a new project, start an RStudio ‘New Project’ and keep the file path clean and tidy!
- Use the `here` package to make the file really portable.
- Your working directory is where your *.rmd lives.
Finish your R Markdown with a `session-info` chunk.
Document your packages and include code for optional installation when sharing the *.rmd.
Fundamental mindset when using R Markdown:
- Reproducible research (recommended Coursera lesson)
- Literate programming (Donald Knuth)
The idea of literate programming shines some light on this dark area of science. This is an idea from Donald Knuth where you combine your text with your code output to create a document. This is a blend of your literature (text), and your programming (code), to create something that you can read from top to bottom. Imagine your paper - the introduction, methods, results, discussion, and conclusion, and all the bits of code that make each section. With `rmarkdown`, you can see the pieces of your data analysis all together (Nicolas Tierney, here).
Name (label) thy chunks!
Run code in chunks, run locally, knit, see how they work, move on.
Include TODO placeholders as notes for future revisits.
Avoid hard-coded info (numbers, text) if the use of inline R code is applicable and viable, e.g., There were 50 cars studied.
There were 50 cars studied.
Invest some quality time in reading packages’ documentation.
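A few of the habits above can be sketched in code. The package vector, the `install_missing` flag and the `n_cars` variable are hypothetical names chosen for illustration; adapt them to your own document.

```r
# Packages this (hypothetical) report needs.
pkgs <- c("knitr", "here")

# Find what is missing without attaching anything.
missing_pkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Optional installation: flip the flag to TRUE to install missing packages.
install_missing <- FALSE
if (install_missing && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

# Compute values once and reuse them inline instead of hard-coding them,
# e.g. in the narrative: There were `r n_cars` cars studied.
n_cars <- nrow(cars)

# Last chunk of the document: record the session for reproducibility.
sessionInfo()
```

Keeping the installation behind a flag means that knitting the document never changes a reader’s library unless they ask for it.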
10 Books and packages
In order to prepare this file I’ve read and taken notes from the following web references:
- Yihui Xie, (2021). R Markdown: The Definitive Guide
- Yihui Xie, Christophe Dervieux, Emily Riederer, (2021). R Markdown Cookbook
- John MacFarlane, (2021). PANDOC official documentation (yes, I went through all of it (but only once!!))
- Yan Holtz, (2018). Pimp my RMD
To knit and run the rmd file, the installation of the following packages (please excuse me if something’s missing!) is necessary: `htmltools`, `leaflet`, `flexdashboard`, `reticulate`, `knitr`, `shiny`, `pander`, `rmdformats`, `DiagrammeR`, `kableExtra`, `tidyverse`, `plotly`, `gapminder`, `DT`, `bookdown` and `RSQLite`.
11 How to…?
To customize the postamble (lower-left corner) section of this theme (the readthedown theme of the `rmdformats` package) in order to add more YAML elements beyond the default `author` and `date` ones, you need to:
- define in YAML the element you want, e.g., website, telephone, etc.
- tweak template.html [‘readthedown’ section], found in the `rmdformats` package installation folder.
- tweak the style.css for the ‘readthedown’ theme [various postamble sections].
- find the proper name for your glyphicon and include it in the template.html [‘readthedown’ section].
To knit with parameters:
- Updated: 07 Sept. 2021 (this was done with a custom parameter in my YAML)
- (check here for a fantastic way of knitting with parameters!)
- also here
12 Session Info
Make your readers’ life easier. Provide them with what’s necessary to understand and reproduce your work!
> devtools::session_info()
## - Session info ---------------------------------------------------------------
## setting value
## version R version 4.1.0 (2021-05-18)
## os Windows 10 x64
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate Greek_Greece.1253
## ctype Greek_Greece.1253
## tz Europe/Istanbul
## date 2021-09-17
##
## - Packages -------------------------------------------------------------------
## package * version date lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0)
## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0)
## base64enc 0.1-3 2015-07-28 [1] CRAN (R 4.1.0)
## bit 4.0.4 2020-08-04 [1] CRAN (R 4.1.0)
## bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.0)
## blob 1.2.2 2021-07-23 [1] CRAN (R 4.1.0)
## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.1)
## broom 0.7.8 2021-06-24 [1] CRAN (R 4.1.0)
## bslib 0.3.0 2021-09-02 [1] CRAN (R 4.1.1)
## bsplus 0.1.2 2020-06-25 [1] CRAN (R 4.1.1)
## cachem 1.0.5 2021-05-15 [1] CRAN (R 4.1.0)
## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0)
## class 7.3-19 2021-05-03 [2] CRAN (R 4.1.0)
## classInt 0.4-3 2020-04-07 [1] CRAN (R 4.1.1)
## cli 3.0.1 2021-07-17 [1] CRAN (R 4.1.0)
## codetools 0.2-18 2020-11-04 [2] CRAN (R 4.1.0)
## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0)
## crayon 1.4.1 2021-02-08 [1] CRAN (R 4.1.0)
## crosstalk 1.1.1 2021-01-12 [1] CRAN (R 4.1.1)
## data.table 1.14.0 2021-02-21 [1] CRAN (R 4.1.0)
## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.1)
## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0)
## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0)
## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0)
## DiagrammeR * 1.0.6.1 2020-05-08 [1] CRAN (R 4.1.1)
## digest 0.6.27 2020-10-24 [1] CRAN (R 4.1.0)
## downloadthis * 0.2.1 2020-09-17 [1] CRAN (R 4.1.1)
## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0)
## DT * 0.19 2021-09-02 [1] CRAN (R 4.1.1)
## e1071 1.7-8 2021-07-28 [1] CRAN (R 4.1.1)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0)
## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0)
## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0)
## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0)
## flexdashboard * 0.5.2 2020-06-24 [1] CRAN (R 4.1.1)
## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0)
## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0)
## gapminder * 0.3.0 2017-10-31 [1] CRAN (R 4.1.1)
## generics 0.1.0 2020-10-31 [1] CRAN (R 4.1.0)
## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0)
## glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0)
## haven 2.4.1 2021-04-23 [1] CRAN (R 4.1.0)
## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.1)
## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0)
## hms 1.1.0 2021-05-17 [1] CRAN (R 4.1.0)
## htmltools * 0.5.2 2021-08-25 [1] CRAN (R 4.1.1)
## htmlwidgets 1.5.3 2020-12-10 [1] CRAN (R 4.1.1)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0)
## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0)
## kableExtra * 1.3.4 2021-02-20 [1] CRAN (R 4.1.1)
## KernSmooth 2.23-20 2021-05-03 [2] CRAN (R 4.1.0)
## knitr * 1.33 2021-04-24 [1] CRAN (R 4.1.1)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0)
## lattice 0.20-44 2021-05-02 [2] CRAN (R 4.1.0)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.1.0)
## leafem 0.1.6 2021-05-24 [1] CRAN (R 4.1.1)
## leaflet * 2.0.4.1 2021-01-07 [1] CRAN (R 4.1.1)
## leaflet.providers 1.9.0 2019-11-09 [1] CRAN (R 4.1.1)
## lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.1.0)
## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0)
## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0)
## mapview * 2.10.0 2021-06-05 [1] CRAN (R 4.1.1)
## Matrix 1.3-3 2021-05-04 [2] CRAN (R 4.1.0)
## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0)
## mime 0.11 2021-06-23 [1] CRAN (R 4.1.0)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0)
## pander * 0.6.4 2021-06-13 [1] CRAN (R 4.1.1)
## pillar 1.6.2 2021-07-29 [1] CRAN (R 4.1.0)
## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0)
## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0)
## plotly * 4.9.4.1 2021-06-18 [1] CRAN (R 4.1.1)
## png 0.1-7 2013-12-03 [1] CRAN (R 4.1.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0)
## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0)
## proxy 0.4-26 2021-06-07 [1] CRAN (R 4.1.1)
## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0)
## rappdirs 0.3.3 2021-01-31 [1] CRAN (R 4.1.0)
## raster 3.4-13 2021-06-18 [1] CRAN (R 4.1.1)
## RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 4.1.0)
## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0)
## readr * 2.0.1 2021-08-10 [1] CRAN (R 4.1.1)
## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0)
## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0)
## reprex 2.0.0 2021-04-02 [1] CRAN (R 4.1.0)
## reticulate * 1.20 2021-05-03 [1] CRAN (R 4.1.1)
## rlang 0.4.11 2021-04-30 [1] CRAN (R 4.1.0)
## rmarkdown 2.10 2021-08-06 [1] CRAN (R 4.1.0)
## rmdformats * 1.0.2 2021-04-19 [1] CRAN (R 4.1.1)
## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0)
## RSQLite * 2.2.8 2021-08-21 [1] CRAN (R 4.1.1)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0)
## rvest 1.0.0 2021-03-09 [1] CRAN (R 4.1.0)
## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.1)
## satellite 1.0.2 2019-12-09 [1] CRAN (R 4.1.1)
## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0)
## sf * 1.0-2 2021-07-26 [1] CRAN (R 4.1.1)
## sp 1.4-5 2021-01-10 [1] CRAN (R 4.1.1)
## stringi 1.6.2 2021-05-17 [1] CRAN (R 4.1.0)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0)
## svglite 2.0.0 2021-02-20 [1] CRAN (R 4.1.1)
## systemfonts 1.0.2 2021-05-11 [1] CRAN (R 4.1.1)
## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0)
## tibble * 3.1.3 2021-07-23 [1] CRAN (R 4.1.0)
## tidyr * 1.1.3 2021-03-03 [1] CRAN (R 4.1.0)
## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0)
## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.1)
## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0)
## units 0.7-2 2021-06-08 [1] CRAN (R 4.1.1)
## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0)
## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0)
## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0)
## viridisLite 0.4.0 2021-04-13 [1] CRAN (R 4.1.0)
## visNetwork 2.0.9 2019-12-06 [1] CRAN (R 4.1.1)
## webshot 0.5.2 2019-11-22 [1] CRAN (R 4.1.1)
## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0)
## writexl 1.4.0 2021-04-20 [1] CRAN (R 4.1.1)
## xfun 0.25 2021-08-06 [1] CRAN (R 4.1.1)
## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0)
## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0)
##
## [1] C:/Users/George/Documents/R/win-library/4.1
## [2] C:/Program Files/R/R-4.1.0/library
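If devtools is not available, base R offers sessionInfo() as a lighter alternative. The sketch below also saves the output next to the report so that readers can compare it with their own setup; the file name is hypothetical and a temporary folder is used to keep the example free of side effects.

```r
# Base R alternative to devtools::session_info(); needs no extra packages.
info_lines <- capture.output(sessionInfo())

# Hypothetical file name, written to a temporary folder.
info_file <- file.path(tempdir(), "session_info.txt")
writeLines(info_lines, info_file)

# Show the first few lines as a quick check.
print(head(info_lines))
```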
13 Changelog
170921 Added a link to the ‘Tables Generator’ website.
John Gruber, ‘Daring Fireball’ blog.