Shiny From Scratch Hands on tutorial

Tue, Aug 18, 2020 26-minute read

How-to

You can follow this tutorial on your computer by:

  1. Navigating to the GitHub page
  2. Click on the green button “Clone or download” -> “Download ZIP”
  3. Unzip locally
  4. Open the file “index.Rmd” in RStudio

The presentation slides are available here

Thanks…

to Martin Gustavsson and Xuan Li for helping out checking this material, and the SkåneR UseR! Group for arranging the workshop!

Introduction

The {shiny} R package allows users to rapidly set up interactive interfaces to their datasets, allowing them and others to explore the data directly.

Here, we will build a fully functional interface to a life expectancy dataset (or your own data!) in around fifty lines of code. This interface can then be rapidly deployed and shared.

The tutorial is designed to expect no prior knowledge with {shiny}, and limited knowledge in R. It aims to introduce the concepts gradually to give you an intuition for what is happening behind the scenes. If you have some prior experience, you can breeze through the initial sections and find intermediate exercises and topics towards the end.

A convenient cheat sheet covering various commonly used {shiny} functionality can be found here: https://shiny.rstudio.com/images/shiny-cheatsheet.pdf

Loading libraries

This tutorial assumes that you have the following things installed:

  • R Studio (and R)
  • The R packages: {shiny} and {tidyverse}

You can install the packages by running the chunk below (by clicking the tiny green arrow to the right). Note that this installation might take some time.

install.packages(c("shiny", "tidyverse"))

Once installed, load the packages.

library(shiny)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0     ✓ purrr   0.3.3
## ✓ tibble  2.1.3     ✓ dplyr   0.8.5
## ✓ tidyr   1.0.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
theme_set(theme_classic())  # Removes the gray ggplot2 background

A quick peek at a Shiny interface

Before going to the details, let’s take a look at the default {shiny} app.

In RStudio, go to File -> New File -> Shiny Web App, write in a name for the application and leave the application type as the default (single file). Click “Create”.

This will generate a skeleton for an app. Try running it by clicking “Run app”. This should start the default shiny application (looking like the image below). Try dragging the “number of bins” slider. The plot should change accordingly.

There are two parts of this application - the “UI” and the “server” part. The UI is that the user interacts with, and the server calculates things in the background.

In this case, the UI consists of the slider where you can determine the number of bins and an output point which specify where the figure should go. This is what is happening on the user’s side.

The server contains the code calculating the figure and delivers it to the UI. This is happening in the background, often on a computer somewhere in the ‘cloud’.

Here is an attempt to illustrate this.

The {shiny} app does, in principle, consist of the UI part and the server part.

The UI

The UI (“User Interface”) is used to specify what “elements” should be included in the R Shiny graphical interface. This produces HTML which is used to display your element in the browser. You can try printing the ui object (print(ui)), and you will see that it is simply HTML.

There are different types of page layouts - fluidPage() is a layout based on the Bootstrap web library (LINK) and allows designing your web page in rows, within which you can specify columns.

The server

The server part is what will happen in the background. If deploying as a web page, this part will be run on the computer hosting the code, while the UI will be run on the user’s computer. This part will generate data and figures which subsequently can be displayed in the interface.

The function(input, output) {} is the server function. It always takes the UI input and output elements as arguments. The input will contain all your input elements, such as the slider. The output will contain the output points, such as where the plots should be rendered. Right now, the body of it is empty. Soon we will add more code here!

A note on reactivity

In the default example, we saw an interactive histogram. The UI contains information about that we want a panel layout, one input element and one output point for the plot. The server will recalculate the plot based on the input it receives (in this case - from a single input slider) and sends it to the UI to be displayed.

You can see it in its very minimal form in the geyser dataset. There is one input with the ID “bins”. This input is used to calculate the plot which subsequently is sent to the output “distPlot”.

Basic reactivity

Basic reactivity

This is a lazy kind of evaluation - the server code will only run when it needs to. The term for this kind of programming is ‘reactive programming’. More on this later.

Exercises

  1. Take a look at the code for the default {shiny} application. Identify the “bins” and the “distPlot” ID in the UI part. Next, look into the server part. How are they used in this part?
  2. Make some adjustment to the UI. Try changing the title and the “Number of bins” label.
  3. Try adding a descriptive text above the plot using the p("text") element (within the main panel in the UI part).
  4. See if you can add a custom header to the histogram, allowing a user to change the header interactively. Use the ?hist to see how to change its header. You can add a text input to the interface using the code textInput(<id>, <label>), exchanging <id> and <label> with your own values.

Preparing a dataset

Retrieving the data

This tutorial uses a dataset from Gapminder - an organization promoting a factful view of the world. The dataset contains life expectancies in different countries since 1800. Gapminder has many useful datasets exploring different aspects of the world. Feel free to take a look and use whichever tickles your interest.

Other great pages for datasets:

  • Kaggle, has many great datasets but might require some more preparsingpreparsing before visualizing
  • If you want to visualize something related to Coronavirus, here is a source data home page

(Optional) Parsing the data from scratch

Note: If you don’t want to do this part - jump straight to the next section and load the data preparsed. It is here as a reference if you want to parse your own data.

I have prepared a comma-separated Gapminder dataset on life expectancy from the year 1800. If you have your own dataset, exchange the file path for that one instead and adjust the code accordingly.

First, we load the comma-delimited data using the read_csv command. Next, we use the head command to inspect the first few rows.

data_wide <- read_csv("gapminder_life_expectancy_years.csv")
## Parsed with column specification:
## cols(
##   .default = col_double(),
##   country = col_character()
## )
## See spec(...) for full column specifications.
head(data_wide)
## # A tibble: 6 x 302
##   country `1800` `1801` `1802` `1803` `1804` `1805` `1806` `1807` `1808` `1809`
##   <chr>    <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
## 1 Afghan…   28.2   28.2   28.2   28.2   28.2   28.2   28.1   28.1   28.1   28.1
## 2 Albania   35.4   35.4   35.4   35.4   35.4   35.4   35.4   35.4   35.4   35.4
## 3 Algeria   28.8   28.8   28.8   28.8   28.8   28.8   28.8   28.8   28.8   28.8
## 4 Andorra   NA     NA     NA     NA     NA     NA     NA     NA     NA     NA  
## 5 Angola    27     27     27     27     27     27     27     27     27     27  
## 6 Antigu…   33.5   33.5   33.5   33.5   33.5   33.5   33.5   33.5   33.5   33.5
## # … with 291 more variables: `1810` <dbl>, `1811` <dbl>, `1812` <dbl>,
## #   `1813` <dbl>, `1814` <dbl>, `1815` <dbl>, `1816` <dbl>, `1817` <dbl>,
## #   `1818` <dbl>, `1819` <dbl>, `1820` <dbl>, `1821` <dbl>, `1822` <dbl>,
## #   `1823` <dbl>, `1824` <dbl>, `1825` <dbl>, `1826` <dbl>, `1827` <dbl>,
## #   `1828` <dbl>, `1829` <dbl>, `1830` <dbl>, `1831` <dbl>, `1832` <dbl>,
## #   `1833` <dbl>, `1834` <dbl>, `1835` <dbl>, `1836` <dbl>, `1837` <dbl>,
## #   `1838` <dbl>, `1839` <dbl>, `1840` <dbl>, `1841` <dbl>, `1842` <dbl>,
## #   `1843` <dbl>, `1844` <dbl>, `1845` <dbl>, `1846` <dbl>, `1847` <dbl>,
## #   `1848` <dbl>, `1849` <dbl>, `1850` <dbl>, `1851` <dbl>, `1852` <dbl>,
## #   `1853` <dbl>, `1854` <dbl>, `1855` <dbl>, `1856` <dbl>, `1857` <dbl>,
## #   `1858` <dbl>, `1859` <dbl>, `1860` <dbl>, `1861` <dbl>, `1862` <dbl>,
## #   `1863` <dbl>, `1864` <dbl>, `1865` <dbl>, `1866` <dbl>, `1867` <dbl>,
## #   `1868` <dbl>, `1869` <dbl>, `1870` <dbl>, `1871` <dbl>, `1872` <dbl>,
## #   `1873` <dbl>, `1874` <dbl>, `1875` <dbl>, `1876` <dbl>, `1877` <dbl>,
## #   `1878` <dbl>, `1879` <dbl>, `1880` <dbl>, `1881` <dbl>, `1882` <dbl>,
## #   `1883` <dbl>, `1884` <dbl>, `1885` <dbl>, `1886` <dbl>, `1887` <dbl>,
## #   `1888` <dbl>, `1889` <dbl>, `1890` <dbl>, `1891` <dbl>, `1892` <dbl>,
## #   `1893` <dbl>, `1894` <dbl>, `1895` <dbl>, `1896` <dbl>, `1897` <dbl>,
## #   `1898` <dbl>, `1899` <dbl>, `1900` <dbl>, `1901` <dbl>, `1902` <dbl>,
## #   `1903` <dbl>, `1904` <dbl>, `1905` <dbl>, `1906` <dbl>, `1907` <dbl>,
## #   `1908` <dbl>, `1909` <dbl>, …

This data is in wide-format, meaning that we have one column per year. In the plotting library {ggplot2} (part of {tidyverse}) the data is expected to be in long format with one column showing the year and one column containing all the life expectancies.

For instance, the following matrix is in wide format - the measured data is spread over multiple columns.

country   1900   1950   2000
Sweden    60     77     80
Denmark   65     79     82

The same data is illustrated in the long format below. Now we have one column containing all measurements and one extra column specifying from which year the measurement was taken. This is harder to read, but easier for a computer program to manage.

country  year  life_expectancy
Sweden   1900  60
Sweden   1950  77
Sweden   2000  80
Denmark  1900  65
Denmark  1950  79
Denmark  2000  82

The transformation from wide to long format can be done using the command pivot_longer (part of the {tidyr} package, which is part of the {tidyverse}). Make sure to inspect the data with head to see what you are working with.

data_long <- data_wide %>% 
# The -1 means that we pivot all columns except the first to one column
    pivot_longer(-1, names_to = "year", values_to="life_expectancy") %>%
    mutate(year=as.numeric(year))

# Slicing out the first column headers
head(data_long)
## # A tibble: 6 x 3
##   country      year life_expectancy
##   <chr>       <dbl>           <dbl>
## 1 Afghanistan  1800            28.2
## 2 Afghanistan  1801            28.2
## 3 Afghanistan  1802            28.2
## 4 Afghanistan  1803            28.2
## 5 Afghanistan  1804            28.2
## 6 Afghanistan  1805            28.2

Loading the data preparsed

If you skipped the previous section - load the preparsedpreparsed data directly. This will load a three-column dataset specifying the country, measurement year and estimated life expectancy for that country and year.

Load the comma-delimited file using the read_csv command and inspect the first few lines with head.

data_long <- read_csv("gapminder_life_expectancy_years_parsed.csv")
## Parsed with column specification:
## cols(
##   country = col_character(),
##   year = col_double(),
##   life_expectancy = col_double()
## )
head(data_long)
## # A tibble: 6 x 3
##   country      year life_expectancy
##   <chr>       <dbl>           <dbl>
## 1 Afghanistan  1800            28.2
## 2 Afghanistan  1801            28.2
## 3 Afghanistan  1802            28.2
## 4 Afghanistan  1803            28.2
## 5 Afghanistan  1804            28.2
## 6 Afghanistan  1805            28.2

Now we are all set!

Shiny from scratch

We have loaded the libraries, and have a dataset ready. Now, let’s start building our {shiny} application!

A minimal application

We will step by step build a complete R Shiny application for the dataset we previously prepared. If successful, you will have enough knowledge at the end to start building your own Shiny applications.

You can run the code directly from this notebook by executing the chunks one by one or create an empty .R file (for instance named app.R) where you step by step, build the application. At this point, you will only get a blank page.

# library(shiny) <- Uncomment if running in a separate .R file, already loaded in the notebook
ui <- fluidPage()
server <- function(input, output) {}  
shinyApp(ui=ui, server=server)

Preparing the UI

Here we will add some different headers and input elements, as well as an output point where we later will show two figures.

Adding a minimal layout

Now we will add a title using the titlePanel function and a sidebar layout using the sidebarLayout function.

You can use many different types of layout for RShiny. Here, we stick with the default “sidebarLayout” which gives us a panel to the left for the settings, and a bigger panel to the right for showing plots. We insert both the titlePanel and the sidebarLayout within the fluidPage function we specified previously.

Note that the elements are separated by commas, and nested (similar to HTML). A common error is to miss one comma or use one too many. Often RStudio will give you useful hints on where to look to fix this.

Try running the code now. Doesn’t look so fancy yet, but now the foundation is there.

# Uncomment if running in a separate .R file
# library(shiny)
ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(),
        mainPanel()
    )
)
server <- function(input, output) {}
shinyApp(ui=ui, server=server)

You can directly inspect the HTML building the UI by printing the ui variable. The important point to note here is that at this point, we are simply creating HTML, just in a more convenient way. (If you are familiar with the Javascript Bootstrap library you probably recognize how {shiny} is using Bootstrap to specify rows and columns).

print(ui)

UI inputs and outputs

Some very useful examples:

  • sliderInput Interactive slider
  • checkboxInput A simple checkbox
  • numericInput Directly specify a numeric value
  • textInput Freely write text
  • selectInput Select one or more of a set of predefined choices

Let’s start with just a slider. You can check which arguments it requires by running ?sliderInput (this is a useful way to check the documentation for any function). All input elements require an ID (which we need to identify it in the server code later) and a label which will show the user what it does.

Beyond that, the required values differ with the element. You can always check the documentation using the ? syntax. For instance, the sliderInput needs a minimum and maximum value and a default value. Also, it is often useful to specify step argument, specifying its stepsize.

Next, we need to specify an output point. Here, we can render different things such as figures, images, text or tables. We start with an output for rendering a figure using the plotOutput. This only requires outputID as an argument, the unique ID for this output point.

Try running it now. Now you should be able to see the layout and interact with the UI. Still not so interesting though - because there is nothing going on in the server.

# Uncomment if running in a separate .R file
# library(shiny)
ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", min=1800, max=2030, step=1, value = c(1800, 2030)),
            selectInput(inputId="countries", label="Countries", choices = c("Sweden", "Denmark", "Norway", "Iceland", "Finland"), selected="Sweden")
        ),
        mainPanel(
            plotOutput(outputId="year_range_plot")
        )
    )
)
server <- function(input, output) {}
shinyApp(ui=ui, server=server)

Exercises

  • Try changing the title of the app to something you think fits good!
  • Try changing the default values of both the year range and the country names (for instance, include “Germany” as an option)
  • Try out one of the input elements checkboxInput, numericInput and textInput (you can check their required parameters by running for instance ?checkboxInput)
  • Use the p(...) element to write a description above the figure in the main panel (right above the plotOutput in the mainPanel)

Setting up the server

Up until now our server function have been completely empty. Now we will add code for generating a figure which subsequently will be sent to the UI and shown to the user.

Creating the plot

It is generally a good practice to make sure you understand the pieces you are working with before building them together. In RShiny I prefer to always make sure the data parsing and plotting work independently before plugging them into RShiny.

We loaded our dataset at the beginning of this tutorial. It is in three columns which specify a country, year and the life expectancy.

We can filter out certain entries from the data frame using the filter command. Let’s prepare a dataset only containing life expectancy values for Sweden.

data_long_sweden <- data_long %>% filter(country == "Sweden")
head(data_long_sweden)
## # A tibble: 6 x 3
##   country  year life_expectancy
##   <chr>   <dbl>           <dbl>
## 1 Sweden   1800            32.2
## 2 Sweden   1801            36.9
## 3 Sweden   1802            40.2
## 4 Sweden   1803            40.3
## 5 Sweden   1804            39.7
## 6 Sweden   1805            41

Next, we insert it into ggplot. There are three key points here:

  • The input data should be in long format - one column per type of data
  • We specify one or several aesthetics (such as position or colouring)
  • We specify one or several geometrics - how the data should be displayed (point-plot, boxplot, line-plot or others)
ggplot(data_long_sweden, aes(x=year, y=life_expectancy)) + 
    geom_line() +
    ggtitle("Swedish life expectancy")

Looking good!

Now, let’s see if we can focus on a smaller range by filtering the dataset further before plotting it.

data_long_sweden_range <- data_long %>% 
    filter(country == "Sweden") %>%
    filter(year >= 1900 & year <= 2000)
head(data_long_sweden_range)
## # A tibble: 6 x 3
##   country  year life_expectancy
##   <chr>   <dbl>           <dbl>
## 1 Sweden   1900            52.3
## 2 Sweden   1901            52.9
## 3 Sweden   1902            54.7
## 4 Sweden   1903            55.1
## 5 Sweden   1904            55.4
## 6 Sweden   1905            54.5
ggplot(data_long_sweden_range, aes(x=year, y=life_expectancy)) + 
    geom_line() +
    ggtitle("Swedish life expectancy")

We can clearly see the effect of the Spanish flu, which hit Sweden 1918 and infected a third of Swedens population. Other than that it looks like the 20ths went pretty well for us.

Now we have everything we need to make this interactive! ggplot2 is a whole workshop in itself. If this is new - no worries. You can start with these pieces of working code, and gradually figure out what you can change. Here is a cheat sheet if you want some further inspiration: https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf

Inserting the plot in RShiny

Now, we are ready to run the plot within R Shiny. We use the same UI as before, but now we add a server part which sends a static ggplot figure to the “distPlot” output point. (Remember that you specified this as a plotOutput point in the main panel).

Try it out! At this point, you should see the same plot as we previously produced using ggplot, but nothing will happen when you drag the sliders.

# Uncomment if running in a separate .R file
# library(shiny)
# library(tidyverse)
# data_long <- read_csv("data/gapminder_life_expectancy_years_parsed.csv")

ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", value=c(1800, 2030), min=1800, max=2030, step=1),
            selectInput(inputId="countries", label="Countries", choices = c("Sweden", "Denmark", "Norway", "Iceland", "Finland"), selected="Sweden")
        ),
        mainPanel(
            plotOutput(outputId="year_range_plot")
        )
    )
)
server <- function(input, output) {
    
    output$year_range_plot <- renderPlot({
        
        data_long_sweden_range <- data_long %>% 
            filter(country == "Sweden") %>%
            filter(year >= 1900 & year <= 2020)
        
        ggplot(data_long_sweden_range, aes(x=year, y=life_expectancy)) + 
            geom_line() +
            ggtitle("Swedish life expectancy")
    })
    
}
shinyApp(ui=ui, server=server)

Making our Shiny app interactive

We access the output point of the UI using output$distPlot. Similarly, we can access our input elements using input$bins and input$title. We specify these ones directly within the renderPlot.

By doing this, we link these parameters from the interface into the plot. RShiny will now automatically detect when any of these change, and when this happens, it will rerun the code.

This time our reactive diagram looks only slightly more complicated. We have one output point, but it depends on two different inputs. If either of these change, so will the plot.

Try it out! Change the number of bins and title. You should directly see these changes appear in the plot.

# Uncomment if running in a separate .R file
# library(shiny)
# library(tidyverse)
# data_long <- read_csv("data/gapminder_life_expectancy_years_parsed.csv")

all_countries <- unique(data_long$country)

ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", value=c(1800, 2030), min=1800, max=2030, step=10),
            selectInput(inputId="countries", label="Countries", choices = all_countries, multiple = TRUE, selected=c("Sweden", "Denmark", "Norway", "Iceland", "Finland"))
        ),
        mainPanel(
            plotOutput(outputId="year_range_plot")
        )
    )
)
server <- function(input, output) {
    
    output$year_range_plot <- renderPlot({
        
        data_long_filtered <- data_long %>% 
            filter(country %in% input$countries) %>%
            filter(year >= input$year_range[1] & year <= input$year_range[2])
        
        ggplot(data_long_filtered, aes(x=year, y=life_expectancy, color=country)) + 
            geom_line() +
            ggtitle("Life expectancies") +
            xlab("Year") +
            ylab("Life expectancy (years)")
    })
}
shinyApp(ui=ui, server=server)

If everything works fine, then you should be able to explore the data now! Try changing the selected year range to focus on different centuries.

Exercises

  • Which nordic country had the lowest overall life expectancy during 1900-1950?
  • Select a few countries - roughly one from each continent and look at the trends. Are they roughly as you expected?
  • Try adding an option for line width. The geom_line argument can take a size argument (geom_line(size=3) for example). See if you can add a slider where the user can decide the width.
  • Add an input text argument where the title of the plot can be specified
  • Add another numeric argument which can specify the size of the font (this can be done by adding a + theme(text=element_text(size=20)) argument after the existing ggplot arguments)
  • (Advanced) You can display individual data points by replacing the geom_line with the geom_point argument. Use either checkbox-inputs or a select-input to let the user decide how the data should be displayed. (Hint: You can save the ggplot in an object plt <- ggplot() ... and then apply geom_line or geom_point by plt <- plt + geom_line()).

Showing the data using multiple plots

One of the powerful things with RShiny is how easy it is to extend the interface to accommodate multiple figures showing different aspects of the same data.

Here, we produce a boxplot for the same dataset.

First, let’s try out the plot separately. We use code very similar to previously, but we exchange the geom_line with a geom_boxplot.

data_long_countries_range <- data_long %>% 
    filter(country %in% c("Sweden", "Denmark", "Norway", "Finland", "Iceland")) %>%
    filter(year >= 1920 & year <= 2020)
head(data_long_countries_range)
## # A tibble: 6 x 3
##   country  year life_expectancy
##   <chr>   <dbl>           <dbl>
## 1 Denmark  1920            57.5
## 2 Denmark  1921            61.7
## 3 Denmark  1922            60.6
## 4 Denmark  1923            61.1
## 5 Denmark  1924            61.1
## 6 Denmark  1925            61.9
ggplot(data_long_countries_range, aes(x=year, y=life_expectancy, color=country)) + 
    geom_boxplot() +
    ggtitle("Nordic countries life expectancy 1900 - 2020")

It seems like the better years are relatively similar, but there have been some tougher years in Finland and Iceland during the last 100 years.

Now we are ready to link this into the server part.

# Uncomment if running in a separate .R file
# library(shiny)
# library(tidyverse)
# data_long <- read_csv("data/gapminder_life_expectancy_years_parsed.csv")

all_countries <- unique(data_long$country)

ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", value=c(1800, 2030), min=1800, max=2030, step=10),
            selectInput(inputId="countries", label="Countries", choices = all_countries, multiple = TRUE, selected=c("Sweden", "Denmark", "Norway", "Iceland", "Finland"))
        ),
        mainPanel(
            plotOutput(outputId="year_range_plot"),
            plotOutput(outputId="year_range_boxplot")
        )
    )
)
server <- function(input, output) {
    
    output$year_range_plot <- renderPlot({
        
        data_long_filtered <- data_long %>% 
            filter(country %in% input$countries) %>%
            filter(year >= input$year_range[1] & year <= input$year_range[2])
        
        ggplot(data_long_filtered, aes(x=year, y=life_expectancy, color=country)) + 
            geom_line() +
            ggtitle("Life expectancies") +
            xlab("Year") +
            ylab("Life expectancy (years)")
    })
    
    output$year_range_boxplot <- renderPlot({
        data_long_filtered <- data_long %>% 
            filter(country %in% input$countries) %>%
            filter(year >= input$year_range[1] & year <= input$year_range[2])
        
        ggplot(data_long_filtered, aes(x=year, y=life_expectancy, color=country)) + 
            geom_boxplot() +
            ggtitle("Life expectancies, boxplot")
    })
    
}
shinyApp(ui=ui, server=server)

Great! One ugly part here though is that we needed to duplicate the parsing part of the dataset (converting data_long into data_long_filtered). This is something we want to avoid as this kind of coding over time will make our code less maintainable and less readable. The way to do this in R Shiny is by using a reactive variable (here named data_long_filtered).

Dual plot reactivity

Dual plot reactivity

We use the reactive keyword and put in an expression which here is dependent on the reactive input variables input$countries and input$year_range. This expression will be rerun every time any of these input variables change, which in turn will trigger the plots to rerender (which now both depend on this reactive element). Note that when you use this one, you need to call it with parenthesizes similar to how you use a function (i.e. data_long_filtered()).

# Uncomment if running in a separate .R file
# library(shiny)
# library(tidyverse)
# data_long <- read_csv("data/gapminder_life_expectancy_years_parsed.csv")

all_countries <- unique(data_long$country)

ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", value=c(1800, 2030), min=1800, max=2030, step=10),
            selectInput(inputId="countries", label="Countries", choices = all_countries, multiple = TRUE, selected=c("Sweden", "Denmark", "Norway", "Iceland", "Finland"))
        ),
        mainPanel(
            plotOutput(outputId="year_range_plot"),
            plotOutput(outputId="year_range_boxplot")
        )
    )
)
server <- function(input, output) {
    
    data_long_filtered <- reactive({
        data_long %>% 
            filter(country %in% input$countries) %>%
            filter(year >= input$year_range[1] & year <= input$year_range[2])
    })
    
    output$year_range_plot <- renderPlot({
        
        ggplot(data_long_filtered(), aes(x=year, y=life_expectancy, color=country)) +
            geom_line() +
            ggtitle("Life expectancies") +
            xlab("Year") +
            ylab("Life expectancy (years)")
    })
    
    output$year_range_boxplot <- renderPlot({
        ggplot(data_long_filtered(), aes(x=year, y=life_expectancy, color=country)) + 
            geom_boxplot() +
            ggtitle("Life expectancies, boxplot")
    })
    
}
shinyApp(ui=ui, server=server)

Exercises

  • Let the user customize the plot titles separately by adding two textInput UI fields. Make sure to have useful default values.
  • Add a third output point and display the data using some other ggplot geom. You could, for instance, display it using geom_density or geom_point.
  • (Advanced) If you are familiar with statistics - calculate the ANOVA across the different countries for the selected datasets and indicate the resulting p-value in the plot title.

Extended reading

These sections are intended to showcase some features I have found to be very useful in regards to RShiny, but are not strictly needed to get started. Feel free to read it, run it, skim it or skip it.

To complete this part you need three additional packages: {DT} for interactive tables, {plotly} for interactive plots (interactive such that you can use your mouse and interact directly with the plot) and {rsconnect} for rapidly deploying your {shiny} applications online.

install.packages(c("DT", "plotly", "rsconnect"))

When they are installed - load them.

library(DT)
## 
## Attaching package: 'DT'
## The following objects are masked from 'package:shiny':
## 
##     dataTableOutput, renderDataTable
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(rsconnect)
## 
## Attaching package: 'rsconnect'
## The following object is masked from 'package:shiny':
## 
##     serverInfo

Showing interactive tables using DT

We might want to show the displayed raw data directly. There is a built-in way of displaying tables, but using {DT} it looks much more attractive. The output point is called DTOutput, and for rendering, we use the command DT::renderDataTable.

The reactive table obtained from data_long_filtered can be rendered directly, but it is often easier for humans to read in wide format. For this, we can use the {tidyr} (part of Tidyverse) command pivot_wider where we specify which column contains what will become column headers, and what column contains the values.

# library(shiny) <- Uncomment if running in a separate .R file
# library(tidyverse)
# library(DT)
# data_long <- read_csv("data/gapminder_life_expectancy_years_parsed.csv")

all_countries <- unique(data_long$country)

ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", value=c(1800, 2030), min=1800, max=2030, step=10),
            selectInput(inputId="countries", label="Countries", choices = all_countries, multiple = TRUE, selected=c("Sweden", "Denmark", "Norway", "Iceland", "Finland"))
        ),
        mainPanel(
            plotOutput(outputId="year_range_plot"),
            plotOutput(outputId="year_range_boxplot"),
            DT::dataTableOutput(outputId="table_display")
        )
    )
)
server <- function(input, output) {
    
    data_long_filtered <- reactive({
        data_long %>% 
            filter(country %in% input$countries) %>%
            filter(year >= input$year_range[1] & year <= input$year_range[2])
    })
    
    output$year_range_plot <- renderPlot({
        
        ggplot(data_long_filtered(), aes(x=year, y=life_expectancy, color=country)) +
            geom_line()
    })
    
    output$year_range_boxplot <- renderPlot({
        ggplot(data_long_filtered(), aes(x=country, y=life_expectancy, color=country, text=year)) + 
            geom_boxplot()
    })
        
    output$table_display <- DT::renderDataTable({
        data_long_filtered() %>% pivot_wider(names_from="year", values_from="life_expectancy")
    })
}
shinyApp(ui=ui, server=server)

Further interactivity with Plotly

Plotly is a library giving us the chance to make the plots directly interactive - meaning that we can hover and click elements directly, and use a range of tools such as zooming. It is possible to use it to do more advanced tasks such as executing code when clicking or marking certain parts of the graph, but that is beyond the scope of this tutorial.

To use Plotly, you need to load (and install) the library {plotly}. It has its own custom syntax for defining plots and contains a range of plots (it can also be used in Python). It is also compatible with ggplot-plots using the ggplotly function.

To incorporate Plotly, we need to update our plotOutput, and renderPlot points to plotlyOutput and renderPlotly. Next, we convert our ggplot outputs by sending the plot object into the ggplotly function. Then we are ready to go!

Some things do not work out of the box here, unfortunately. The x- and y-labels get cut off. They can be assigned the Plotly-way but adding the code %>% layout(xaxis=list(title="Year"), yaxis=list(title="Life expectancy (years)")) after the ggplotly command.

# Uncomment these lines if running in a separate .R file
# library(shiny)
# library(tidyverse)
# library(plotly)
# library(DT)
# data_long <- read_csv("data/gapminder_life_expectancy_years_parsed.csv")

all_countries <- unique(data_long$country)

ui <- fluidPage(
    titlePanel("Life Expectancy Shiny App"),
    sidebarLayout(
        sidebarPanel(
            sliderInput(inputId="year_range", label="Year range", value=c(1800, 2030), min=1800, max=2030, step=10),
            selectInput(inputId="countries", label="Countries", choices = all_countries, multiple = TRUE, selected=c("Sweden", "Denmark", "Norway", "Iceland", "Finland"))
        ),
        mainPanel(
            plotlyOutput(outputId="year_range_plot"),
            plotlyOutput(outputId="year_range_boxplot"),
            DTOutput(outputId="table_display")
        )
    )
)
server <- function(input, output) {
    
    data_long_filtered <- reactive({
        data_long %>% 
            filter(country %in% input$countries) %>%
            filter(year >= input$year_range[1] & year <= input$year_range[2])
    })
    
    output$year_range_plot <- renderPlotly({
        
        plt <- ggplot(data_long_filtered(), aes(x=year, y=life_expectancy, color=country)) +
            geom_line()
        ggplotly(plt) %>% layout(xaxis=list(title="Year"), yaxis=list(title="Life expectancy (years)"))
    })
    
    output$year_range_boxplot <- renderPlotly({
        plt <- ggplot(data_long_filtered(), aes(x=country, y=life_expectancy, fill=country, text=year)) + 
            geom_boxplot()
        ggplotly(plt, tooltip="text") %>% layout(xaxis=list(title="Country"), yaxis=list(title="Life expectancy (years)"))
    })
    
    output$table_display <- DT::renderDataTable({
        data_long_filtered() %>% pivot_wider(names_from="year", values_from="life_expectancy")
    })
}
shinyApp(ui=ui, server=server)

Deploying the app to shinyapps.io

You already know how to run your app locally. If you have access to a server, you can deploy it yourself using Shiny server (https://rstudio.com/products/shiny/shiny-server/). This requires some Linux-knowledge, but is comparably much easier than deploying a server yourself from scratch!

An even easier way (if you are fine with deploying your app publicly) is to deploy it to shinyapps.io. This can be done in a matter of a minute. You get up to 5 applications and 25 active hours for free.

All instructions below are outlined in more details at https://docs.rstudio.com/shinyapps.io/getting-started.html.

First:

  1. Go to shinyapps.io and create a new account - Note that your account name will be used as part of the URL for your apps
  2. Install the R-package {rsconnect}

Next, you need to retrieve tokens used to connect. Go to the Account/Tokens page on shinyapps.io and click “Show”. This will give you a line of code calling the following function - replacing TOKEN and SECRET with your token and secret.

rsconnect::setAccountInfo(name='accountname', token="<TOKEN>", secret="<SECRET>"

Now you are all set to deploy your app! Either by clicking the publish button in the top right (the blue symbol looking like an eye) or by executing the command:

rsconnect::deployApp()

Have some patience while it prepares it. At the end, it will give you an URL. Now you can show off your app to friends and family!

Summary

That was it. I hope it was useful!

If you have any trouble with this tutorial, feel free to throw me an email at jakob [dot] willforss [at] hotmail [dot] com. You can also find me on: GitHub and Linkedin.

Good luck with your visualizations!

Session info

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=sv_SE.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=sv_SE.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=sv_SE.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=sv_SE.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] rsconnect_0.8.16 plotly_4.9.2.1   DT_0.13          forcats_0.5.0   
##  [5] stringr_1.4.0    dplyr_0.8.5      purrr_0.3.3      readr_1.3.1     
##  [9] tidyr_1.0.2      tibble_2.1.3     ggplot2_3.3.0    tidyverse_1.3.0 
## [13] shiny_1.4.0.2   
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.4        lubridate_1.7.4   lattice_0.20-38   assertthat_0.2.1 
##  [5] digest_0.6.25     utf8_1.1.4        mime_0.9          R6_2.4.1         
##  [9] cellranger_1.1.0  backports_1.1.5   reprex_0.3.0      evaluate_0.14    
## [13] httr_1.4.1        blogdown_0.20     pillar_1.4.3      rlang_0.4.5      
## [17] lazyeval_0.2.2    readxl_1.3.1      data.table_1.12.8 rstudioapi_0.11  
## [21] rmarkdown_2.3     labeling_0.3      htmlwidgets_1.5.1 munsell_0.5.0    
## [25] broom_0.5.5       compiler_3.6.1    httpuv_1.5.2      modelr_0.1.6     
## [29] xfun_0.12         pkgconfig_2.0.3   htmltools_0.4.0   tidyselect_1.0.0 
## [33] bookdown_0.20     viridisLite_0.3.0 fansi_0.4.1       crayon_1.3.4     
## [37] dbplyr_1.4.2      withr_2.1.2       later_1.0.0       grid_3.6.1       
## [41] nlme_3.1-145      jsonlite_1.6.1    xtable_1.8-4      gtable_0.3.0     
## [45] lifecycle_0.2.0   DBI_1.1.0         magrittr_1.5      scales_1.1.0     
## [49] cli_2.0.2         stringi_1.4.6     farver_2.0.3      fs_1.3.2         
## [53] promises_1.1.0    xml2_1.2.5        vctrs_0.2.4       generics_0.0.2   
## [57] tools_3.6.1       glue_1.3.2        hms_0.5.3         fastmap_1.0.1    
## [61] yaml_2.2.1        colorspace_1.4-1  rvest_0.3.5       knitr_1.28       
## [65] haven_2.2.0