Lesson 9 RMarkdown parameters & presentations

9.1 Lesson Contents

  • RMarkdown parameters (params in yaml header, params in chunks)
  • Using rmarkdown::render() to render one or multiple parameterized reports

Note that class time this week will be mostly spend on the practice job interviews.

9.2 Introduction

As we saw previously, RMarkdown is R’s answer to doing reproducible research. In this lesson we take the next step in customization of RMardown reports. We will see that RMarkdown can be used for a number of semi- and fully automated reporting work flows.

This is a much shorter lesson than the previous ones, as we will have the portfolio presentations today. The lesson consists mainly of a number of exercises.

9.3 RMarkdown parameterization

By now you must have a clear idea on the possibilities of writing a reproducible static analysis report in RMarkdown. It is time to make it a bit more flexible and robust. Most of the time an analysis will depend on several parameters that you might want to vary for an analysis. We can think of the following things (this list is not exclusive):

  • The input data: The data might vary according for example a version of the data of the date it was generated, or the experiment that it was derived from. When the formatting of the data has been standardized, this can be parameterized. Think for example about a transcriptome analysis for different input data.
  • The statistical method: If multiple statistical models can be applied to the data or is dependent on the data input, you could parameterize the statistical method
  • Visualizations: If different types of visualizations are generated from the Rmd, you could parameterize the type of visualisation. This could be meaningful when you want to produce a report for different target audiences
  • Groups in the data that represent a subset of the data. Imagine you have data of a transcriptome analysis for a large population of cancer patients. The type of cancer is a grouping variable that could be a meaningful parameter to use to generate a per cancer type report.
Exercise 9

Discuss the possibilities with your class mates and come up with at least 2 more uses for parameterized RMDs.

Exercise 9

Ocean Data

To learn how to build a parameterized RMarkdown, go over this tutorial. The Rmd file of this tutorial can be found in the course repo: ./Rmd/ocean_floor.Rmd

Go over the following steps to complete the exercise

  1. Knit the Rmd as is and see what happens. For now, don’t try to knit it from within your bookdown project folder, open it in a different session and folder. You will be prompted to install some packages (marmap, mapproj).
  2. It will not knit and give an error. Google your error and solve the problem.
  3. Now that it works, try adjusting the parameter ‘data’ according the available datasets in the {marmap} (the command data(package = "marmap"), will give you the available setting for the data parameter)
  4. Knit using the render() function as shown in the link but use rmarkdown::render().
  5. Paste the rmarkdown::render() command in a separate script (so not in the ocean_floor.Rmd) and build a forloop to generate 4 different reports automatically. You can use vcdExtra::datasets("marmap") to get all the available dataset names in a dataframe.

Portfolio assignment 9

First read the whole assignment before you begin!

Obviously, as you are presenting your portfolio’s to us today, this assignment does not need to be included in the presentation.

The most obvious use of paramterization is creating a parameter for different data inputs. Here we will use the European Center for Disease Control (ECDC) COVID-19 case data.

The aim of this assignment is to create an RMardown file containing a parameterized report for the COVID-19 cases as can be downloaded from the ECDC. The Rmd should include at least three parameters:

  1. The country to which the report applies to
  2. The year that the reported data applies to
  3. The period in months that the report applies to

Here is where you can find the ECDC data: https://www.ecdc.europa.eu/en/publications-data/data-daily-new-cases-covid-19-eueea-country

The minimum requirements for the report is that next to the three parameters mentioned above, it should include at least one graph for the COVID-19 cases and one graph for the COVID-19 related deaths. After writing the Rmd, you should be able to render it with parameters set to any value you like. 1. Think about a way to show in your portfolio that you can work with parameterized reports and to showcase your rendered report on COVID-19 related deaths!

Remember that your portfolio is an online CV. It needs to look like something you would send out while job hunting or to prospective internship supervisors.

Here are some pointers to get your started:

  1. Start in a new Rmd file in your portfolio bookdown project
  2. Start by downloading and exploring the data (you may pick multiple datasets)
  3. First create an non parameterized version of your analysis (you need to come up with a graph for cases and a graph for deaths)
  4. After the code is finished for a static report, you can start replacing parts of your filtering steps for parameters.
  5. One way to access the parameters while you are writing your report is by running the first code chunk in your Rmd interactively, this will make the parameters available as a list in your Global environment. You can than access the individual parameter values by using the $ (params$) operator in your code (as you have seen in the ocean_floor.Rmd example, and explained in the ocean floor tutorial).

This assignment does not have a solution available, you are on your own here

9.4 Resources


CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Unless it was borrowed (there will be a link), in which case, please use their license.