STAT 39000: Project 2 — Fall 2020
Motivation: The ability to quickly reproduce an analysis is important. Other people often need to understand and reproduce an analysis. This concept is so important that there are classes devoted solely to reproducible research! In fact, there are papers that investigate and highlight the lack of reproducibility in various fields. If you are interested in reading about this topic, a good place to start is the paper titled "Why Most Published Research Findings Are False" by John Ioannidis (2005).
Context: Making your work reproducible is extremely important. We will focus on the computational part of reproducibility. You will learn RMarkdown to document your analyses so others can easily understand and reproduce the computations that led to your conclusions. Pay close attention, as future project templates will be RMarkdown templates.
Scope: Understand Markdown and RMarkdown, and how to use them to make your data analysis reproducible.
Questions
Question 1
Make the following text (including the asterisks) bold: This needs to be very bold. Make the following text (including the underscores) italicized: This needs to be very italicized.
Surround your answer in 4 backticks. This will allow you to display the markdown without having the markdown "take effect". For example:
Be sure to check out the Rmarkdown Cheatsheet and our section on Rmarkdown in the book.
Rmarkdown is essentially Markdown + the ability to run and display code chunks. In this question, we are actually using Markdown within Rmarkdown!
- 2 lines of markdown text, surrounded by 4 backticks. Note that when compiled, this text will be unmodified, regular text.
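The four-backtick wrapping described in Question 1 can be sketched as follows. The sentence inside the fence is a hypothetical placeholder, not the text from the question; the point is only that the markers stay visible instead of taking effect when the document is knitted.

`````markdown
````
This sentence keeps its **bold** and _italic_ markers visible.
````
`````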
Question 2
Create an unordered list of your top 3 academic interests (examples include machine learning, operating systems, and forensic accounting). Then create an ordered list that ranks those interests from most interested to least interested.
You can learn what ordered and unordered lists are [here](rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf).
Similar to (1), in this question we are dealing with Markdown. If we were to copy and paste the solution to this problem in a Markdown editor, it would be the same result as when we Knit it here.
- Create the lists; this time, don’t surround your code in backticks. Note that when compiled, this text will appear as nice, formatted lists.
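For reference, the two list types mentioned in Question 2 look roughly like this in Markdown. The items are hypothetical placeholders; leave a blank line before each list so it is recognized as a list.

```markdown
- apples
- bananas
- cherries

1. cherries
2. apples
3. bananas
```

Lines starting with `-` produce bullets, and lines starting with a number and a period produce a numbered list.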
Question 3
Browse www.linkedin.com/ and read some profiles. Pay special attention to accounts with an "About" section. Write your own personal "About" section using Markdown. Include the following:
- A header for this section (your choice of size) that says "About".
- The text of your personal "About" section that you would feel comfortable uploading to LinkedIn, including at least 1 link.
- Create the described profile; don’t surround your code in backticks.
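The two Markdown elements that Question 3 asks for, a header and a link, can be sketched like this. The header level, the sentence, and the link target are illustrative assumptions only.

```markdown
## About

I am a student who enjoys data analysis with [R](https://www.r-project.org/).
```

The number of `#` characters sets the header size, and a link is written as the visible text in square brackets followed by the URL in parentheses.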
Question 4
LaTeX is a powerful editing tool where you can create beautifully formatted equations and formulas. Replicate the equation found here as closely as possible.
Look up "latex mid" and "latex frac".
- Replicate the equation using LaTeX under the Question 4 header in your template.
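As a hedged illustration of the two commands named in the hint, the snippet below typesets Bayes' theorem using `\mid` and `\frac`. It is chosen only as a familiar example and is not necessarily the equation the question links to. In RMarkdown, display math goes between a pair of `$$` delimiters.

```latex
$$
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
$$
```

Here `\mid` draws the conditioning bar with proper spacing, and `\frac{numerator}{denominator}` builds the fraction.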
Question 5
Your co-worker wrote a report, and has asked you to beautify it. Knowing Rmarkdown, you agreed. Make improvements to this section. At a minimum:
- Make the title pronounced.
- Make all links appear as a word or words, rather than the long-form URL.
- Organize all code into code chunks where code and output are displayed. If the output is really long, just display the code.
- Make the calls to the `library` function be evaluated but not displayed.
- Make sure all warnings and errors that may eventually occur do not appear in the final document.
Feel free to make any other changes that make the report more visually pleasing.
````markdown
```{r my-load-packages}
library(ggplot2)
```

```{r declare-variable-390, eval=FALSE}
my_variable <- c(1,2,3)
```

All About the Iris Dataset
This paper goes into detail about the `iris` dataset that is built into r. You can find a list of built-in datasets by visiting stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html or by running the following code:
data()
The iris dataset has 5 columns. You can get the names of the columns by running the following code:
names(iris)
Alternatively, you could just run the following code:
iris
The second option provides more detail about the dataset.
According to stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html there is another dataset built-in to r called `iris3`. This dataset is 3 dimensional instead of 2 dimensional.
An iris is a really pretty flower. You can see a picture of one here:
In summary. I really like irises, and there is a dataset in r called `iris`.
````
- Make improvements to this section, and place it all under the Question 5 header in your template.
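Several of the Question 5 requirements come down to knitr chunk options. The sketch below shows how `echo`, `message`, `warning`, and `eval` behave; the chunk labels are hypothetical and do not appear in the report.

````markdown
```{r hypothetical-setup, echo=FALSE, message=FALSE, warning=FALSE}
# Runs when knitting, but the code, messages, and warnings are left out of the output.
library(ggplot2)
```

```{r hypothetical-preview, eval=FALSE}
# Displayed in the document but not executed, which suits very long output.
iris
```
````

Options are comma-separated after the chunk label, and each chunk label must be unique within a document.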
Question 6
Create a plot using a built-in dataset like `iris`, `mtcars`, or `Titanic`, and display the plot using a code chunk. Make sure the code used to generate the plot is hidden. Include a descriptive caption for the image. Make sure to use an RMarkdown chunk option to create the caption.
- Code chunk that creates and displays a plot using a built-in dataset like `iris`, `mtcars`, or `Titanic`.
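One way to combine a hidden-code plot with a caption is the `fig.cap` chunk option together with `echo=FALSE`. This is a minimal sketch under assumed choices: the chunk label, the caption text, and the pick of `wt` versus `mpg` from `mtcars` are all illustrative.

````markdown
```{r hypothetical-mtcars-plot, echo=FALSE, fig.cap="Fuel economy versus weight for the cars in mtcars."}
# Scatterplot of weight (in 1000 lbs) against miles per gallon.
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
```
````

With `echo=FALSE` the code is still run, so the figure appears while the code stays hidden.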
Question 7
Insert the following code chunk under the Question 7 header in your template. Try knitting the document. Two things will go wrong. What is the first problem? What is the second problem?
````markdown
plot(my_variable)
````
Take a close look at the name we give our code chunk.
Take a look at the code chunk where
- The modified version of the inserted code that fixes both problems.
- A sentence explaining what the first problem was.
- A sentence explaining what the second problem was.
For Project 2, please submit your .Rmd file and the resulting .pdf file. (For this project, you do not need to submit a .R file.)
OPTIONAL QUESTION
RMarkdown is also an excellent tool to create a slide deck. Use the information here or here to convert your solutions into a slide deck rather than the regular PDF. You may experiment with `slidy`, `ioslides`, or `beamer`; however, make your final set of solutions use `beamer`, as the output is a PDF. Make any needed modifications so the solutions knit into a well-organized slide deck (for example, include slide breaks and make sure the contents are shown completely). Modify (2) so the bullets are incrementally presented as the slides progress.
You do not need to submit the original PDF for this project, just the
- The modified version of the solutions in `beamer` slide form.
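A beamer slide deck in RMarkdown can be sketched roughly as below. The title, slide heading, and bullets are hypothetical; the parts that matter are the `beamer_presentation` output format, second-level headers as slide breaks, and the `>-` prefix that reveals bullets one at a time.

````markdown
---
title: "Hypothetical slide deck"
output: beamer_presentation
---

## First slide

>- first bullet
>- second bullet
````

An alternative, if every list in the deck should build incrementally, is to set `incremental: true` under `beamer_presentation` in the YAML header instead of prefixing individual bullets.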