Using R Markdown for Homework and More

This is a summary of the workshop on the same topic.

Why R Markdown

R Markdown is an authoring framework for data science that enables easy creation of dynamic documents, presentations, and reports from R. It provides a notebook interface to connect data and run code as well as to generate reports that can be shared with an audience. R Markdown documents are fully reproducible.

As a language, R Markdown is an extension of the markdown syntax that has embedded R code chunks; as an R package, it is a library to process and convert .Rmd files into a number of formats (see more discussions here).

Installing R Markdown

We can install the R Markdown package from CRAN. Make sure your copy of R is up to date.

install.packages("rmarkdown")

Output formats

R Markdown supports a variety of static and dynamic output formats.

- Documents
  - HTML
  - PDF
  - Word
- Presentations
  - PowerPoint
  - reveal.js
- Dashboards
- Websites
- Journals
- Books
- Interactive documents
- …

Check all the formats supported by R Markdown here.

Figure 1. Output formats. Screenshot of RStudio introduction video (1:01)

.Rmd file

An R Markdown file is a simple plain text file that has the file extension .Rmd. It consists of three types of content: YAML metadata, text, and code chunks.

Figure 2. An Rmd file.

Rendering .Rmd file

A report can be generated from an .Rmd file by simply clicking the “Knit” button in RStudio. The default output format of a knitted file is HTML.

To generate PDF output from R Markdown, you need to have a LaTeX distribution installed.
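Knitting can also be started from the R console instead of the button. The sketch below is a minimal illustration, assuming the rmarkdown package and Pandoc are installed (RStudio ships with Pandoc). The file name homework.Rmd and its contents are hypothetical.

```r
library(rmarkdown)

# Three backticks, built programmatically so the chunk delimiters
# do not clash with the fence around this example.
fence <- strrep("`", 3)

# Write a tiny, hypothetical .Rmd file to a temporary directory.
rmd_file <- file.path(tempdir(), "homework.Rmd")
writeLines(c(
  "---",
  "title: \"Homework 1\"",
  "output: html_document",
  "---",
  "",
  "The mean of 1 to 10 is computed below.",
  "",
  paste0(fence, "{r}"),
  "mean(1:10)",
  fence
), rmd_file)

# render() uses the output format named in the YAML header (HTML here)
# and returns the path of the generated file.
html_out <- NULL
if (pandoc_available()) {
  html_out <- render(rmd_file, quiet = TRUE)
}

# PDF output additionally needs a LaTeX distribution. One option is TinyTeX:
# install.packages("tinytex"); tinytex::install_tinytex()
# render(rmd_file, output_format = "pdf_document")
```

The rendered file is written next to the .Rmd file, so here it lands in the temporary directory.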

Header

R Markdown documents start with a metadata section, the YAML metadata header, which can include document metadata such as title, author, date and output format. We can also control the appearance and style of a document by including a custom CSS file, specifying a theme, and adjusting the organization of its sections, among other options.

In the example below, we have set the title, author, date and output of the document. We have also included several options for customization:

- theme: readable sets the readable HTML theme (check the theme gallery)
- highlight: textmate specifies the syntax highlighting style
- toc: true and toc_float: true add a floating table of contents
- css: contents.css applies a pre-defined style sheet to the document

Read more about customizing output here.

Figure 3. Metadata at the top of an .Rmd file.
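As a rough text version of such a header, the sketch below writes one to a temporary file and reads it back with rmarkdown::yaml_front_matter(). The title, author, and date values are assumptions borrowed from this page, not necessarily those in the figure, and contents.css is only named, not created.

```r
library(rmarkdown)

# A YAML header using the options discussed above.
# Title, author, and date are illustrative assumptions.
header <- c(
  "---",
  "title: \"Using R Markdown for Homework and More\"",
  "author: \"Yujie Xiang, Yun Dai\"",
  "date: \"12/2020\"",
  "output:",
  "  html_document:",
  "    theme: readable",
  "    highlight: textmate",
  "    toc: true",
  "    toc_float: true",
  "    css: contents.css",
  "---"
)

# Save the header plus one line of body text as a hypothetical .Rmd file.
rmd_file <- file.path(tempdir(), "header-demo.Rmd")
writeLines(c(header, "", "Body text."), rmd_file)

# Parse the metadata back into an R list and inspect the output options.
meta <- yaml_front_matter(rmd_file)
str(meta$output$html_document)
```

Note that the options nested under html_document are indented; YAML uses indentation to express that nesting.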

Text

Text can be formatted with Pandoc’s Markdown, which we discuss below.

Code chunks

R code chunks, embedded with the Markdown syntax, can be executed independently and interactively. Code outputs are rendered immediately beneath the inputs. A variety of objects such as text, tables, and graphics can be produced in a code chunk.

Figure 4. Code chunks.

To insert a code chunk, we can:

use the Insert button on the RStudio toolbar,
use the keyboard shortcut Ctrl + Alt + I (Windows), or Cmd + Option + I (macOS), or
type the chunk delimiters: ```{r} to open the chunk and ``` to close it.

Below we discuss the code chunk in more detail.

Including code

Chunk options

In the upper right corner of a code chunk, we can see three small icons.

Figure 5. Chunk options.

The first icon lets us modify chunk options without typing code. Chunk options give us fine control over what a chunk shows in the output.

Figure 6. Modify Chunk Options.

There are a variety of chunk options for customizing components of a code chunk.

Figure 7. Chunk Options.

These chunk options include:

- show output only: echo=FALSE
- show code and output: echo=TRUE
- show nothing (run code): include=FALSE
- show nothing (don’t run code): eval=FALSE, include=FALSE
- show warnings: warning=TRUE
- show messages: message=TRUE
- use custom figure size: fig.height=, fig.width=
…
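To see the effect of a few of these options outside RStudio, the sketch below knits three small chunks from a character vector with knitr::knit(text = ). It assumes the knitr package is installed; the chunk contents and variable names are made up for illustration.

```r
library(knitr)

# Three backticks, built programmatically so the chunk delimiters
# do not clash with the fence around this example.
fence <- strrep("`", 3)

src <- c(
  paste0(fence, "{r, echo=FALSE}"),               # output only
  "x <- 6 * 7",
  "x",
  fence,
  "",
  paste0(fence, "{r, include=FALSE}"),            # runs, shows nothing
  "y <- x + 1",
  fence,
  "",
  paste0(fence, "{r, eval=FALSE, include=FALSE}"), # neither runs nor shows
  "z <- y + 1",
  fence
)

# Evaluate the chunks in their own environment so we can inspect it afterwards.
env <- new.env()
md <- knit(text = src, envir = env, quiet = TRUE)

cat(md)                                  # only the printed value of x appears
exists("y", envir = env, inherits = FALSE)  # the include=FALSE chunk did run
exists("z", envir = env, inherits = FALSE)  # the eval=FALSE chunk did not
```

The knitted Markdown contains the result of the first chunk but none of the source code, which matches the descriptions in the list above.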

The second icon calls R to run all chunks above the current one.

Figure 8. Run All Chunks Above.

The third icon tells R to run the current chunk.

Figure 9. Run Current Chunk.

Inline code

To mark text as inline code, use a pair of backticks.

Input:

`code`

Output:

This is inline code.

Code block

To create a code block, put code in a pair of triple backticks ```.

Input:

```
code
```

Output:

This is a code block.

Including other code languages

In addition to R, an .Rmd file can execute code in many other languages, including:

Python
SQL
Bash
Rcpp
Stan
JavaScript
CSS
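The engine names that the installed version of knitr recognizes can be listed from R. This is a small sketch, assuming knitr is installed; note that the engine for JavaScript is registered under the name js.

```r
library(knitr)

# All language engines registered in this knitr installation.
engines <- names(knit_engines$get())

# Check that the languages listed above are among them.
wanted <- c("python", "sql", "bash", "Rcpp", "stan", "js", "css")
setNames(wanted %in% engines, wanted)
```

Having an engine registered does not mean the language itself is installed; a Python chunk, for example, still needs a working Python on the machine.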

To process a code chunk in another language, we should replace the r at the start of the code chunk declaration with the name of that language. For instance,

```{python}
l = [1, 2, 45, 'Hello World!']
for i in l:
  print(i)  # print each element in turn
```

l = [1, 2, 45, 'Hello World!']
for i in l:
  print(i)

## 1
## 2
## 45
## Hello World!

Figures and images

In an .Rmd file, we can create figures with code and insert images.

Adjusting figure sizes

One thing we often do with figures is adjust their sizes. For figures generated by code, there are several places to do that. To start with, we may include fig.height and fig.width in the header.

Figure 10. Set figure size in header.

We may also set the figure height and the figure width as global options that apply to every chunk in the file by calling knitr::opts_chunk$set in a code chunk, usually put at the beginning of our file.

In the case below, every figure in the document will have a width of 6 and a height of 4.

knitr::opts_chunk$set(fig.width = 6, fig.height = 4)

Note that what we pass to knitr::opts_chunk$set can be overridden in individual chunk headers.

Lastly, we can also set the figure height and width as chunk options:

Figure 11. Set figure size as chunk options

Embedding images

We may insert an image in an R Markdown file in several ways.

We can use the Markdown syntax, as shown below, to include a path, a caption (optional), and a size (optional). The syntax starts with an exclamation mark. The path can be a local path or a web URL. We can set the image size in curly braces {} at the end.

![optional caption text](url)
![optional caption text](path)
![optional caption text](path){width=20px}
![optional caption text](path){width=20%}
![optional caption text](path){width=20% height=40%}

Input:

![NYU Shanghai Library](https://i0.wp.com/oncenturyavenue.org/wp-content/uploads/2017/03/nyushlib.jpg?w=1280)

Output:

NYU Shanghai Library

We may also center the image with <center> </center>.

<center>

![](path)

NYU Shanghai Library

</center>

The other way to include an image is to use the knitr function knitr::include_graphics() in a code chunk. The code chunk options include out.width and out.height to set the image width and height, fig.align to set the alignment (center, left, right, and default), and fig.cap to set the caption.

Figure 12. Embed image in code chunk.
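A minimal sketch of this approach follows, assuming knitr is installed. So that it runs on its own, it first draws a small PNG into a temporary directory; the file name scatter.png is hypothetical, and in a real document a path relative to the .Rmd file is usually preferable.

```r
library(knitr)

# Create a small image file to embed (a stand-in for an existing image).
img <- file.path(tempdir(), "scatter.png")
png(img, width = 480, height = 360)
plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal length", ylab = "Petal length")
dev.off()

# In an .Rmd file this call would sit in a chunk whose header carries
# the sizing and caption options, for example:
#   {r, out.width="50%", fig.align="center", fig.cap="Iris scatter plot"}
include_graphics(img)
```

Unlike the Markdown image syntax, the same chunk options apply across output formats, since knitr writes the format-specific markup itself.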

Tables

By default, R Markdown displays data frames and matrices as they would appear in the R console.

data("iris")
iris[1:6,]

##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

If we need additional table formatting, we may use knitr::kable().

library(knitr)
kable(iris[1:6,], caption = 'This is a title')

This is a title
Sepal.Length	Sepal.Width	Petal.Length	Petal.Width	Species
5.1	3.5	1.4	0.2	setosa
4.9	3.0	1.4	0.2	setosa
4.7	3.2	1.3	0.2	setosa
4.6	3.1	1.5	0.2	setosa
5.0	3.6	1.4	0.2	setosa
5.4	3.9	1.7	0.4	setosa

To embellish tables with more advanced styling, we may use the package kableExtra, which provides a variety of functions to build LaTeX and HTML tables.
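As a small illustration, the sketch below styles the same iris rows for HTML output. It assumes the kableExtra package is installed; the particular styling options are just one possible choice.

```r
library(knitr)
library(kableExtra)

# Build a basic HTML table first, then pass it to kable_styling().
tbl <- kable(iris[1:6, ], format = "html", caption = "This is a title")

# Striped rows, hover highlighting, and a table no wider than its content.
styled <- kable_styling(
  tbl,
  bootstrap_options = c("striped", "hover"),
  full_width = FALSE
)
styled
```

For PDF output, kable() would be called with format = "latex", and kable_styling() takes latex_options instead of bootstrap_options.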

To create professional-looking tables that summarize regression models, the stargazer package is recommended.
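A minimal sketch with stargazer is shown below, assuming the package is installed. The regression on the iris data is only an illustrative model.

```r
library(stargazer)

# An illustrative linear model on the built-in iris data.
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)

# type = "text" prints a plain-text table to the console. In a knitted
# document one would typically use type = "html" or type = "latex"
# together with the chunk option results='asis'.
tab <- stargazer(fit, type = "text", title = "Regression of sepal length")
```

The function also returns the table lines invisibly as a character vector, which is why the result can be stored in tab.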

We can also create tables in plain Markdown by typing rows of dashes and, optionally, vertical bars by hand.

Input:

First Header | Second Header
-------------|--------------
Content Cell | Content Cell
Content Cell | Content Cell

Output:

First Header	Second Header
Content Cell	Content Cell
Content Cell	Content Cell

Input:

  Right  Left     Center     Default
-------  ------ ----------   -------
   Cell  Cell      cell       cell
   Cell  Cell      cell       cell

Output:

Right	Left	Center	Default
Cell	Cell	cell	cell
Cell	Cell	cell	cell

Notes on column alignment:

If the dashed line is flush with the header text on the right side but extends beyond it on the left, the column is right-aligned.
If the dashed line is flush with the header text on the left side but extends beyond it on the right, the column is left-aligned.
If the dashed line extends beyond the header text on both sides, the column is centered.
If the dashed line is flush with the header text on both sides, the default alignment is used (in most cases, this will be left).

Formatting text with Markdown

We can format the text in an R Markdown file with Pandoc’s Markdown, a set of ways to mark text to enable formatting. When we render an R Markdown file, it is first compiled to Markdown through the package knitr, and then converted to an output document (e.g., PDF, HTML, or Word) by Pandoc.
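The two stages can be run separately to see what each one produces. The sketch below assumes knitr and rmarkdown are installed, and it only runs the second stage when Pandoc is available; the file names are hypothetical.

```r
library(knitr)
library(rmarkdown)

# Three backticks, built programmatically so the chunk delimiters
# do not clash with the fence around this example.
fence <- strrep("`", 3)

rmd_file  <- file.path(tempdir(), "pipeline-demo.Rmd")
md_file   <- file.path(tempdir(), "pipeline-demo.md")
html_file <- file.path(tempdir(), "pipeline-demo.html")

writeLines(c(
  "# A tiny report",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# Stage 1: knitr runs the R chunks and writes plain Markdown.
knit(rmd_file, output = md_file, quiet = TRUE)
cat(readLines(md_file), sep = "\n")  # the code and its result, as Markdown

# Stage 2: Pandoc converts the Markdown to the final format.
if (pandoc_available()) {
  pandoc_convert(md_file, to = "html", output = html_file)
}
```

In everyday use rmarkdown::render() chains both stages, so calling them separately is mainly useful for debugging the intermediate Markdown.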

Now let’s take a look at how to mark up text into formatted text for some elements.

Headers

Section headers can be created on six levels, indicated by one to six pound signs.

Input:

# Header 1

## Header 2

### Header 3

#### Header 4

##### Header 5

###### Header 6

Output:

Lists: unordered

Unordered list items start with *, -, or +. We can nest one list within another by indenting the sub-list.

Input:

* Item 1
* Item 2
    + Item 2.1
    + Item 2.2
        - Item 2.21
        - Item 2.22

Output:

Item 1
Item 2
- Item 2.1
- Item 2.2
  - Item 2.21
  - Item 2.22

Lists: ordered

Ordered list items start with numbers. Ordered lists can also be nested.

Input:

1. Item 1
2. Item 2
3. Item 3

Output:

Item 1
Item 2
Item 3

Links

Hyperlinks are created using the syntax [text](link).

Input:

[R Markdown cheat sheet](https://shiny.rstudio.com/articles/rm-cheatsheet.html)

<https://shiny.rstudio.com/articles/rm-cheatsheet.html>

Output:

R Markdown cheat sheet

https://shiny.rstudio.com/articles/rm-cheatsheet.html

Block quotes

Blockquotes start with >.

Input:

Einstein once said

> I never said that.

Output:

Einstein once said

I never said that.

Horizontal rule

A horizontal line starts with three or more asterisks or dashes.

Input:

******
------

Output:

Footnotes

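A footnote reference is a caret and a label inside square brackets, such as [^1]; the note itself is defined on its own line as [^1]: text. A short note can also be written inline as ^[text].

Input:

Here is a footnote reference.[^1] [^2]
Check the notes at the bottom of this page.


[^1]: Here is the footnote.
[^2]: Here is another footnote.

Output:

Here is a footnote reference.¹ ² Check the notes at the bottom of this page.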

Italicized text

Italicized text can be created using a pair of asterisks or underscores.

Input:

*text*
_text_

Output:

text

Bold text

Bold text can be created using a pair of double asterisks or double underscores.

Input:

**text**
__text__

Output:

text

Superscripts

A pair of carets (^) produces a superscript.

Input:

2^10^

Output:

2¹⁰

Subscripts

A pair of tildes (~) turns text into a subscript.

Input:

H~2~O

Output:

H₂O

Math expressions

The mathematical typesetting in R Markdown is based on LaTeX, a powerful tool to write mathematical equations and display mathematical notations. Read more about LaTeX here.

Note that in the tables below, expressions are written in inline mode with a pair of $, while the other examples use display mode with a pair of $$.

Inline and display mode

Inline LaTeX equations can be written in a pair of $.

inline mode

Input:

This is a math $expression$ in inline mode.

Output:

This is a math $expression$ in inline mode.

display mode

Display-mode math expressions are written between a pair of $$.

Input:

This is a math $$expression$$ in display mode.

Output:

This is a math \[expression\] in display mode.

Alternatively:

Input:

This is a math \[expression\] in display mode.

Output:

This is a math \[expression\] in display mode.

Math mode accents

hat

Input:

$$\hat{a}$$

Output:

\[\hat{a}\]

bar

Input:

$$\bar{a}$$

Output:

\[\bar{a}\]

tilde

Input:

$$\tilde{a}$$

Output:

\[\tilde{a}\]

Greek letters

Input	Output
$\pi$	$\pi$
$\Pi$	$\Pi$
$h(\theta)$	$h(\theta)$
$\Delta$	$\Delta$
$\epsilon$	$\epsilon$
$\alpha$	$\alpha$

Subscript and superscript

subscript

Input	Output
$\beta_0$	$\beta_0$
$\theta_1x_1$	$\theta_1x_1$

superscript

Input	Output
$p^{k}$	$p^{k}$
$e^{-z}$	$e^{-z}$

Example:

Input:

$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \epsilon$
$h(\theta) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \theta_3x_3$

Output:

$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \epsilon$

$h(\theta) = \theta_0 + \theta_1x_1 + \theta_2x_2 + \theta_3x_3$

Sum

Input:

$$\sum_{i=1}^{m}$$

Output:

\[\sum_{i=1}^{m}\]

Example:

Input:

$\sum_{i=1}^{n}{(rating-\hat{rating})^2}$
$l(\theta)=\sum_{i=1}^{m}{[y^i\log(h_\theta(x^i)) + (1-y^i)\log(1-h_\theta(x^i))]}$

Output:

$\sum_{i=1}^{n}{(rating-\hat{rating})^2}$

$l(\theta)=\sum_{i=1}^{m}{[y^i\log(h_\theta(x^i)) + (1-y^i)\log(1-h_\theta(x^i))]}$

Fractions

Format:

$$\frac{numerator}{denominator}$$

Input:

$$\frac{a+b}{b}$$

Output:

\[\frac{a+b}{b}\]

Input:

$$1 + \frac{a}{b}$$

Output:

\[1 + \frac{a}{b}\]

Input:

$$g(z) = \frac{1}{1+e^{-z}}$$

Output:

\[g(z) = \frac{1}{1+e^{-z}}\]

Example:

Input:

$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}{(h(x^i)-y^i)^2}$

Output:

$J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}{(h(x^i)-y^i)^2}$

Roots

Format:

$$\sqrt[n]{expression}$$

Input:

$$\frac{-b + \sqrt{b^2 - 4ac}}{2a}$$

Output:

\[\frac{-b + \sqrt{b^2 - 4ac}}{2a}\]

Input:

$$\sqrt[3]{q + \sqrt{ q^2 - p^3 }}$$

Output:

\[\sqrt[3]{q + \sqrt{ q^2 - p^3 }}\]

Integral

Format:

$$\int^a_b$$

Input:

$$\int^a_b \frac{1}{3}x^3$$

Output:

\[\int^a_b \frac{1}{3}x^3\]

Partial derivative

Input:

$$\frac{\partial u}{\partial t}$$

Output:

\[\frac{\partial u}{\partial t}\]

Example:

Input:

$\frac{\partial}{\partial \theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}{(h(x^i)-y^i)x^i_j}$

Output:

$\frac{\partial}{\partial \theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}{(h(x^i)-y^i)x^i_j}$

Matrices

Input:

$$
\begin{matrix} 
a & b \\
c & d 
\end{matrix}
\quad
\begin{pmatrix} 
a & b \\
c & d 
\end{pmatrix}
\quad
\begin{bmatrix} 
a & b \\
c & d 
\end{bmatrix}
\quad
$$

Output:

\[ \begin{matrix} a & b \\ c & d \end{matrix} \quad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \]

Read more

Get Started, RStudio

Markdown Basics

R Markdown cheat sheet

R Markdown — Dynamic Documents for R

R Markdown: The Definitive Guide

Pandoc’s Markdown

Here is the footnote.↩︎
Here is another footnote.↩︎

Using R Markdown for Homework and More

Yujie Xiang, Yun Dai

12/2020
