This is a summary of the workshop on the same topic.


Why R Markdown

R Markdown is an authoring framework for data science that enables easy creation of dynamic documents, presentations, and reports from R. It provides a notebook interface to connect data and run code as well as to generate reports that can be shared with an audience. R Markdown documents are fully reproducible.

As a language, R Markdown is an extension of the markdown syntax that has embedded R code chunks; as an R package, it is a library to process and convert .Rmd files into a number of formats (see more discussions here).


Installing R Markdown

We can install the rmarkdown package from CRAN. Make sure you are running a recent version of R.

install.packages("rmarkdown")

Output formats

R Markdown supports a variety of static and dynamic output formats.

  • Documents
    • HTML
    • PDF
    • Word
  • Presentations
    • PowerPoint
    • reveal.js
  • Dashboards
  • Websites
  • Journals
  • Books
  • Interactive documents

See the full list of output formats supported by R Markdown here.

Figure 1. Output formats. Screenshot of RStudio introduction video (1:01).


.Rmd file

An R Markdown file is a simple plain text file that has the file extension .Rmd. It consists of three types of content: YAML metadata, text, and code chunks.

Figure 2. An Rmd file.


Rendering .Rmd file

A report can be generated from an .Rmd file by simply clicking the “Knit” button in RStudio. The default output format of a knitted file is HTML.

To generate PDF output from R Markdown, you need to have a LaTeX distribution installed.
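Knitting can also be triggered from the R console with rmarkdown::render(). The sketch below is only an illustration: it writes a minimal .Rmd file (YAML metadata, text, and one code chunk) to a temporary location and renders it to HTML. The file contents and variable names are made up for the example, and rendering is skipped when rmarkdown or Pandoc is not available.

```r
# Write a minimal .Rmd file to a temporary location (contents are illustrative)
fence <- strrep("`", 3)  # three backticks, built here to keep the example readable
rmd_lines <- c(
  "---",                          # YAML metadata starts
  "title: \"A minimal report\"",
  "output: html_document",
  "---",                          # YAML metadata ends
  "",
  "Some *formatted* text.",       # Markdown text
  "",
  paste0(fence, "{r}"),           # code chunk starts
  "summary(cars)",
  fence                           # code chunk ends
)
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_path)

# Render only when rmarkdown and Pandoc are both available
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()
if (can_render) {
  # output_format overrides the format named in the YAML header;
  # "pdf_document" would additionally need a LaTeX distribution
  out_path <- rmarkdown::render(rmd_path, output_format = "html_document",
                                quiet = TRUE)
  print(out_path)
}
```

With a real report, the temporary file would simply be replaced by the path to our own .Rmd file.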


Text

Text can be formatted with Pandoc’s Markdown, which we discuss below.


Code chunks

R code chunks, embedded with the Markdown syntax, can be executed independently and interactively. Code outputs are rendered immediately beneath the inputs. A variety of objects such as text, tables, and graphics can be produced in a code chunk.

Figure 4. Code chunks.


To insert a code chunk, we can:

  1. use the Insert button on the RStudio toolbar,
  2. use the keyboard shortcut Ctrl + Alt + I (Windows), or Cmd + Option + I (macOS), or
  3. type the chunk delimiters ```{r} and ```.

Below we discuss the code chunk in more detail.


Including code

Chunk options

In the upper right corner of a code chunk, we can see three little icons.

Figure 5. Chunk options.

The first icon can help us modify chunk options without typing code. Users have fine control over the outputs with the chunk options.

Figure 6. Modify Chunk Options.

There are a variety of chunk options for customizing components of a code chunk.

Figure 7. Chunk Options.

These chunk options include:

  • show output only: echo=FALSE
  • show code and output: echo=TRUE
  • show nothing (run code): include=FALSE
  • show nothing (don’t run code): eval=FALSE, include=FALSE
  • show warnings: warning=TRUE
  • show messages: message=TRUE
  • use custom figure size: fig.height=, fig.width=
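Typed by hand, these options go inside the braces of the chunk header, separated by commas. A minimal sketch, in which the chunk label cars-plot and the plot itself are only illustrative:

````markdown
```{r cars-plot, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4}
# Hypothetical chunk: only the figure appears in the report,
# because echo=FALSE hides the code
plot(cars)
```
````

Here the code runs and the 6 x 4 inch figure is shown, while the code, messages, and warnings are left out of the output.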

The second icon calls R to run all chunks above the current one.

Figure 8. Run All Chunks Above.


The third icon tells R to run the current chunk.

Figure 9. Run Current Chunk.


Inline code

To mark text as inline code, use a pair of backticks.

Input:

`code`

Output:

This is inline code.


Code block

To create a code block, put code in a pair of triple backticks ```.

Input:

```
code
```

Output:

This is a code block.

Including other code languages

In addition to R, an .Rmd file can execute code in many other languages, including:

  • Python
  • SQL
  • Bash
  • Rcpp
  • Stan
  • JavaScript
  • CSS

To process a code chunk in another language, we should replace the r at the start of the code chunk declaration with the name of that language. For instance,

```{python, echo=FALSE}
items = [1, 2, 45, 'Hello World!']
# Print each element of the list in turn
for item in items:
  print(item)
```
## 1
## 2
## 45
## Hello World!

Figures and images

In an .Rmd file, we can create figures with code and insert images.


Adjusting figure sizes

We often need to adjust the size of figures. For figures generated by code, there are several places to do that. To start with, we may set fig_width and fig_height for the output format in the YAML header.

Figure 10. Set figure size in header.
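For HTML output, the header keys are fig_width and fig_height, which correspond to arguments of rmarkdown::html_document(). A minimal sketch, assuming HTML output and the width of 6 and height of 4 used elsewhere in this section:

```r
# Equivalent YAML header (assuming html_document output):
# output:
#   html_document:
#     fig_width: 6
#     fig_height: 4

if (requireNamespace("rmarkdown", quietly = TRUE)) {
  # Build the same output format from R
  fmt <- rmarkdown::html_document(fig_width = 6, fig_height = 4)
  # The format passes these on to knitr as chunk defaults
  str(fmt$knitr$opts_chunk[c("fig.width", "fig.height")])
}
```

The same pattern should apply to other formats, such as pdf_document or word_document, which take the same two arguments.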


We may also set the figure height and the figure width as global options that apply to every chunk in the file by calling knitr::opts_chunk$set in a code chunk, usually put at the beginning of our file.

In the case below, every figure in the document will have a width of 6 and a height of 4.

knitr::opts_chunk$set(fig.width = 6, fig.height = 4)

Note that what we pass to knitr::opts_chunk$set can be overridden in individual chunk headers.


Lastly, we can also set the figure height and width as chunk options:

Figure 11. Set figure size as chunk options


Embedding images

We may insert an image in an R Markdown file in several ways.


We can use the Markdown syntax shown below, which takes a path, an optional caption, and an optional size. The syntax starts with an exclamation mark. The path can be a local path or a web URL. We can set the image size in curly braces {} at the end.

![optional caption text](url)
![optional caption text](path)
![optional caption text](path){width=20px}
![optional caption text](path){width=20%}
![optional caption text](path){width=20% height=40%}

Input:

![NYU Shanghai Library](https://i0.wp.com/oncenturyavenue.org/wp-content/uploads/2017/03/nyushlib.jpg?w=1280)

Output:

NYU Shanghai Library


We may also center the image with <center> </center>.

<center>

![](path)

NYU Shanghai Library

</center>

The other way to include an image is to use the knitr function knitr::include_graphics() in a code chunk. The code chunk options include out.width and out.height to set the image width and height, fig.align to set the alignment (center, left, right, and default), and fig.cap to set the caption.

Figure 12. Embed image in code chunk.
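The body of such a chunk might look like the sketch below. To keep the example self-contained, it first creates a small PNG as a stand-in for a real image; the file, the plot, and the option values in the comment are all illustrative.

```r
library(knitr)

# Create a small PNG so the example runs on its own (stand-in for a real image)
img_path <- tempfile(fileext = ".png")
png(img_path, width = 400, height = 300)
plot(cars)
dev.off()

# In an .Rmd file this call sits in a chunk whose header carries the options, e.g.
# {r, out.width="50%", fig.align="center", fig.cap="A caption"}
img <- include_graphics(img_path)
img
```

Because size, alignment, and caption live in the chunk header, the same call works unchanged across output formats.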


Tables

By default, R Markdown prints data frames and matrices as they would appear in the R console.

data("iris")
iris[1:6,]
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

If we need additional table formatting, we may use knitr::kable().

library(knitr)
kable(iris[1:6,], caption = 'This is a title')
This is a title
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa
4.6 3.1 1.5 0.2 setosa
5.0 3.6 1.4 0.2 setosa
5.4 3.9 1.7 0.4 setosa

To embellish tables with more advanced styling, we may use the package kableExtra, which provides a variety of functions to build LaTeX and HTML tables.
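As a rough illustration, the sketch below styles the kable() table from above with kableExtra::kable_styling(). It assumes HTML output and that kableExtra is installed; the particular styling options are just one possible choice.

```r
library(knitr)

if (requireNamespace("kableExtra", quietly = TRUE)) {
  # Start from a plain HTML table built by kable()
  tbl <- kable(iris[1:6, ], format = "html", caption = "This is a title")
  # Add striped rows and hover highlighting, and keep the table narrow
  styled <- kableExtra::kable_styling(
    tbl,
    bootstrap_options = c("striped", "hover"),
    full_width = FALSE
  )
  styled
}
```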

To create professional looking tables to summarize regression models, the package stargazer is recommended.
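A minimal sketch of a stargazer table, assuming the package is installed. The regression on the iris data is only a placeholder model chosen for the example.

```r
# Fit a placeholder linear model on the built-in iris data
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)

if (requireNamespace("stargazer", quietly = TRUE)) {
  # type = "text" prints a plain-text table to the console;
  # in a report, use type = "html" or "latex" in a chunk with results='asis'
  stargazer::stargazer(fit, type = "text", title = "Regression results")
}
```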


We can also create tables directly in Markdown by typing rows of dashes and, optionally, vertical bars (pipes) by hand.

Input:

First Header | Second Header
-------------|--------------
Content Cell | Content Cell
Content Cell | Content Cell

Output:

First Header Second Header
Content Cell Content Cell
Content Cell Content Cell

Input:

  Right  Left     Center     Default
-------  ------ ----------   -------
   Cell  Cell      cell       cell
   Cell  Cell      cell       cell

Output:

Right Left Center Default
Cell Cell cell cell
Cell Cell cell cell

Notes on column alignment:

  • If the dashed line is flush with the header text on the right side but extends beyond it on the left, the column is right-aligned.
  • If the dashed line is flush with the header text on the left side but extends beyond it on the right, the column is left-aligned.
  • If the dashed line extends beyond the header text on both sides, the column is centered.
  • If the dashed line is flush with the header text on both sides, the default alignment is used (in most cases, this will be left).

Formatting text with Markdown

We can format the text in an R Markdown file with Pandoc’s Markdown, a lightweight markup syntax for plain text. When we render an R Markdown file, knitr first runs the code chunks and compiles the file to Markdown, and Pandoc then converts the Markdown to an output document (e.g., PDF, HTML, or Word).
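The two stages can also be run by hand, which makes the pipeline easier to see. This is only a sketch under stated assumptions: the tiny .Rmd file is made up for the example, and the Pandoc step is skipped when rmarkdown or Pandoc is not available. In normal use, rmarkdown::render() performs both steps for us.

```r
library(knitr)

# A tiny .Rmd file with one chunk (contents are illustrative)
fence <- strrep("`", 3)
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c("A computed value:", "", paste0(fence, "{r}"), "1 + 1", fence),
           rmd_path)

# Step 1: knitr runs the chunk and writes plain Markdown
md_path <- knit(rmd_path, output = sub("\\.Rmd$", ".md", rmd_path),
                quiet = TRUE)

# Step 2: Pandoc converts the Markdown to the final format
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::pandoc_convert(md_path, to = "html",
                            output = sub("\\.md$", ".html", md_path))
}
```

Opening the intermediate .md file shows the chunk output already embedded as text, which is what Pandoc receives.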

Now let’s look at the markup for some common elements.


Headers

Section headers can be created on six levels, indicated by one to six pound signs.

Input:

# Header 1

## Header 2

### Header 3

#### Header 4

##### Header 5

###### Header 6

Output:


Lists: unordered

Unordered list items start with *, -, or +. We can nest one list within another by indenting the sub-list.

Input:

* Item 1
* Item 2
    + Item 2.1
    + Item 2.2
        - Item 2.21
        - Item 2.22

Output:

  • Item 1
  • Item 2
    • Item 2.1
    • Item 2.2
      • Item 2.21
      • Item 2.22

Lists: ordered

Ordered list items start with numbers. Ordered lists can also be nested.

Input:

1. Item 1
2. Item 2
3. Item 3

Output:

  1. Item 1
  2. Item 2
  3. Item 3

Block quotes

Blockquotes start with >.

Input:

Einstein once said

> I never said that.

Output:

Einstein once said

I never said that.


Horizontal rule

A horizontal rule is a line of three or more asterisks or dashes.

Input:

******
------

Output:



Footnotes

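Footnotes can be written inline, inside square brackets after a caret (^[note text]), or as numbered references ([^1]) that are defined elsewhere in the file. The example below uses numbered references.

Input:

Here is a footnote reference.[^1][^2] Check the notes at the bottom of the page.

[^1]: Here is the footnote.
[^2]: Here is another footnote.

Output:

Here is a footnote reference.¹² Check the notes at the bottom of the page.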


Italicized text

Italicized text can be created using a pair of asterisks or underscores.

Input:

*text*
_text_

Output:

text

text


Bold text

Bold text can be created using a pair of double asterisks or double underscores.

Input:

**text**
__text__

Output:

text

text


Superscripts

A pair of carets (^) produces a superscript.

Input:

2^10^

Output:

2¹⁰


Subscripts

A pair of tildes (~) turns text into a subscript.

Input:

H~2~O

Output:

H₂O


Math expressions

The mathematical typesetting in R Markdown is based on LaTeX, a powerful tool to write mathematical equations and display mathematical notations. Read more about LaTeX here.

Note that the expressions in the tables below are written in inline mode, with a pair of $, while most other examples use display mode, with a pair of $$.


Inline and display mode

Inline LaTeX equations can be written in a pair of $.


inline mode

Input:

This is a math $expression$ in inline mode.

Output:

This is a math \(expression\) in inline mode.


display mode

Math expressions of the display mode can be written in a pair of $$.

Input:

This is a math $$expression$$ in display mode.

Output:

This is a math \[expression\] in display mode.


Alternatively:

Input:

This is a math \[expression\] in display mode.

Output:

This is a math \[expression\] in display mode.


Math mode accents


hat

Input:

$$\hat{a}$$

Output:

\[\hat{a}\]


bar

Input:

$$\bar{a}$$

Output:

\[\bar{a}\]


tilde

Input:

$$\tilde{a}$$

Output:

\[\tilde{a}\]


Greek letters

Input Output
$\pi$ \(\pi\)
$\Pi$ \(\Pi\)
$h(\theta)$ \(h(\theta)\)
$\Delta$ \(\Delta\)
$\epsilon$ \(\epsilon\)
$\alpha$ \(\alpha\)

Subscript and superscript


subscript

Input Output
$\beta_0$ \(\beta_0\)
$\theta_1x_1$ \(\theta_1x_1\)

superscript

Input Output
$p^{k}$ \(p^{k}\)
$e^{-z}$ \(e^{-z}\)

Example:

Input:

$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \epsilon$
$h(\theta) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \theta_3x_3$

Output:

\(y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \epsilon\)

\(h(\theta) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \theta_3x_3\)


Sum

Input:

$$\sum_{i=1}^{m}$$

Output:

\[\sum_{i=1}^{m}\]


Example:

Input:

$\sum_{i=1}^{n}{(rating-\hat{rating})^2}$
$l(\theta)=\sum_{i=1}^{m}{[y^i\log(h_\theta(x^i)) + (1-y^i)\log(1-h_\theta(x^i))]}$

Output:

\(\sum_{i=1}^{n}{(rating-\hat{rating})^2}\)

\(l(\theta)=\sum_{i=1}^{m}{[y^i\log(h_\theta(x^i)) + (1-y^i)\log(1-h_\theta(x^i))]}\)


Fractions

Format:

$$\frac{numerator}{denominator}$$

Input:

$$\frac{a+b}{b}$$

Output:

\[\frac{a+b}{b}\]

Input:

$$1 + \frac{a}{b}$$

Output:

\[1 + \frac{a}{b}\]

Input:

$$g(z) = \frac{1}{1+e^{-z}}$$

Output:

\[g(z) = \frac{1}{1+e^{-z}}\]


Example:

Input:

$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}{(h(x^i)-y^i)^2}$

Output:

\(J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}{(h(x^i)-y^i)^2}\)


Roots

Format:

$$\sqrt[n]{expression}$$

Input:

$$\frac{-b + \sqrt{b^2 - 4ac}}{2a}$$

Output:

\[\frac{-b + \sqrt{b^2 - 4ac}}{2a}\]

Input:

$$\sqrt[3]{q + \sqrt{ q^2 - p^3 }}$$

Output:

\[\sqrt[3]{q + \sqrt{ q^2 - p^3 }}\]


Integral

Format:

$$\int^a_b$$

Input:

$$\int^a_b \frac{1}{3}x^3$$

Output:

\[\int^a_b \frac{1}{3}x^3\]


Partial derivative

Input:

$$\frac{\partial u}{\partial t}$$

Output:

\[\frac{\partial u}{\partial t}\]


Example:

Input:

$\frac{\partial}{\partial \theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}{(h(x^i)-y^i)x^i_j}$

Output:

\(\frac{\partial}{\partial \theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}{(h(x^i)-y^i)x^i_j}\)


Matrices

Input:

$$
\begin{matrix} 
a & b \\
c & d 
\end{matrix}
\quad
\begin{pmatrix} 
a & b \\
c & d 
\end{pmatrix}
\quad
\begin{bmatrix} 
a & b \\
c & d 
\end{bmatrix}
\quad
$$

Output:

\[ \begin{matrix} a & b \\ c & d \end{matrix} \quad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \]



  1. Here is the footnote.↩︎

  2. Here is another footnote.↩︎