Things I learnt at rstudio::conf 2019

by Michael Harper

Mountain View
Cover photo by atmtx

rstudio::conf 2019 is sadly over. I can honestly say it was one of the most interesting and educational conferences that I have been to, and I have come back with a long list of ideas that I want to start using right away! Here are some of the key points that I learnt this week, with links to useful packages you may want to check out!

Shiny is maturing into a serious tool for production

There is often a perception that Shiny is good for prototyping, whilst deployed apps should be written in other programming languages. This is definitely not the case, as discussed by Joe Cheng in his keynote speech. To assist in the process of making production-ready Shiny apps, some great packages were showcased:

  • shinytest: does your process of testing your shiny app currently involve launching the app and clicking around until you break something? This package makes it easy to record tests which can be run automatically, allowing you to know if you have accidentally broken your app with your latest changes.

  • shinyloadtest: once you have built your app, this package can help assess how the app will perform under multiple users. It simulates traffic and provides useful graphs which can show you where you may be facing performance constraints in the app. Be prepared to watch your Shiny app fail to scale!

  • ipc: this targets a smaller group of users who may want to develop apps which contain long computations. This package provides tools for launching child processes, which means that the app can keep running while the calculation is being done in the background.

R Markdown continues to get even better!

I pretty much do all my work in R Markdown, and one of my most common frustrations has been making pretty tables. While kableExtra does a great job of allowing extensive customisation, it can be a bit tricky to get set up and will often force the user to select a single output format.

The gt package does away with a lot of the difficulty, and allows you to create beautiful tables for HTML and PDF outputs. The syntax is really simple, and users can format data, modify columns, and change colours with a few lines of code. Extra mention should be given to the author, Richard Iannone, for how beautiful the documentation is!
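To give a flavour of that syntax, here is a minimal sketch using the built-in mtcars dataset. The choice of columns, the title and the labels are my own illustrative assumptions, and it assumes the gt package is installed; the gt documentation remains the place to look for the full set of options.

```r
library(gt)

# Illustrative data (an assumption): a few rows and columns of mtcars
cars <- head(mtcars[, c("mpg", "hp", "wt")], 5)
cars$model <- rownames(cars)

# Build the table, using the model name as the row label
cars_table <- gt(cars, rowname_col = "model")

# Add a title and subtitle above the table
cars_table <- tab_header(
  cars_table,
  title = "Motor Trend cars",
  subtitle = "First five rows of mtcars"
)

# Format two numeric columns to one decimal place
cars_table <- fmt_number(cars_table, columns = c("mpg", "wt"), decimals = 1)

# Replace the raw column names with readable labels
cars_table <- cols_label(
  cars_table,
  mpg = "Miles per gallon",
  hp = "Horsepower",
  wt = "Weight (1000 lbs)"
)

# Printing the object renders the table
cars_table
```

Each step takes the table object and returns a modified copy, so the same calls can also be chained together with a pipe if you prefer.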

pagedown is another useful addition to the family of R Markdown packages, which includes bookdown and blogdown. It allows users to create paged HTML documents, and provides some great templates. If you have ever wanted to write your CV in R Markdown, it is now easier than ever!

It was great to be able to speak to many of the R users and see how interested others are in R Markdown. There was a lot of interest in ideas for taking R Markdown documents to the next level with customisation, something that I am trying to address with the R Markdown Cookbook. Feedback would be greatly appreciated if there is any topic you would like to see covered in the book.

Spatial modelling in R is getting easier

A lot of my work deals with spatial analysis, and it has always been slightly fiddly to handle this data in R. I found this to be a particularly big barrier for new users of R who were familiar with spatial modelling, but were trying to program for the first time. There has been some great progress in the past year, as demonstrated at the conference:

  • sf: I started doing geospatial analysis in R over two years ago, and it has always been quite fiddly to interact with this data. The sf package uses more familiar terms to query the data, and with recent versions of ggplot2 (3.0.0 onwards) it is possible to plot spatial objects directly using the geom_sf() function.

  • stars: similar to the sf package, this package aims to make the handling of raster data more efficient. One of the areas this seeks to improve is the handling of large datasets, something which can be quite a frustrating and slow experience at the moment!

  • rayshader: making pretty maps is always important for communicating results. This package can help produce 2D and 3D hillshaded maps.
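As a rough illustration of plotting spatial data with sf and ggplot2, the sketch below reads the North Carolina counties shapefile that ships with the sf package and maps one of its columns. The choice of dataset and fill variable is my own assumption for the example, and it assumes both packages are installed.

```r
library(sf)
library(ggplot2)

# Example data bundled with sf: North Carolina county boundaries
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# geom_sf() draws the geometry column directly, so no manual
# conversion to a plain data frame of coordinates is needed
nc_map <- ggplot(nc) +
  geom_sf(aes(fill = AREA)) +
  labs(title = "North Carolina counties", fill = "Area")

print(nc_map)
```

Because an sf object is a data frame with a geometry column, the usual ggplot2 aesthetics work as they would for any other data.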

RStudio keep making great utilities for package development

I spent a lot of the week in the programming sessions, and was amazed at the number of new tools being developed to make life easier for package developers. Some of the tools include:

  • pkgman: package installation will typically be done with the install.packages command, but this package seems to offer many improvements, making package installation quicker and more reliable. I can see this quickly becoming my default package for installing and loading packages.

  • itdepends: not all dependencies in your code should be treated equally. If you write code which utilises functions from lots of different packages, you can leave yourself vulnerable to things breaking in the future. This package is designed to help assess the dependencies within your code, and can be a useful tool for working out whether you can reduce the dependencies in your package. It is still in early development, but it is definitely one to watch.

  • vctrs: if a package is made by Hadley Wickham, you can rest assured that it is both useful and well designed. vctrs provides tools for handling vectors in a more consistent manner, and helps eliminate some of the more quirky behaviour of base R.
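To illustrate the kind of quirk that vctrs addresses, here is a small sketch comparing base R's c() with vec_c() from vctrs. The values are arbitrary examples of my own, and it assumes the vctrs package is installed.

```r
library(vctrs)

# Base R silently coerces the number to a character string
base_result <- c(1, "a")

# vctrs combines compatible types using explicit rules:
# an integer and a double combine to a double
numeric_result <- vec_c(1L, 2.5)

# For incompatible types, vctrs raises an error instead of coercing.
# The error is caught here so that the script keeps running.
mixed_result <- tryCatch(
  vec_c(1, "a"),
  error = function(e) "error: incompatible types"
)

print(base_result)
print(numeric_result)
print(mixed_result)
```

Failing loudly on incompatible types makes it harder for a stray character value to quietly turn a numeric column into text.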

Embrace the tidyverse

It is 2019, so I hope that most people are on board with this by now, but you should definitely be using the tidyverse! The oversight of this area from RStudio means that the experience is seamless, and more and more packages are being developed within it.

As an example of why you should use the tidyverse, I did the Big Data in R Workshop, which showed how R can connect with SQL databases and use Spark. Normally, connecting to an SQL database would probably require you to learn some SQL, but no, dplyr (through its dbplyr backend) can translate your commands automatically! If you ever decide to migrate your data to a database, you can do so without having to rewrite all your code.

You can check out the Big Data course, which explains this behaviour in more detail, here: https://github.com/dr-harper/bigdataclass
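The sketch below shows roughly what this translation looks like. It is not taken from the workshop materials: the in-memory SQLite database and the mtcars table are stand-ins of my own for a real database, and it assumes the dplyr, dbplyr, DBI and RSQLite packages are installed.

```r
library(dplyr)

# An in-memory SQLite database stands in for a real SQL server (assumption)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a built-in dataset into the database so there is something to query
copy_to(con, mtcars, "mtcars")

# A reference to the remote table; no data is pulled into R yet
cars_db <- tbl(con, "mtcars")

# Ordinary dplyr verbs, written exactly as for a local data frame
query <- cars_db %>%
  filter(cyl == 4) %>%
  group_by(gear) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE))

# Print the SQL that dplyr generated on our behalf
show_query(query)

# Run the query in the database and bring the result back into R
result <- collect(query)
print(result)

DBI::dbDisconnect(con)
```

The work is only done when collect() is called, so the filtering and aggregation happen inside the database rather than in R.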

The R community is incredible

Finally, I wanted to reflect on how awesome the whole R community is! I lost count of the number of great people I met, and the whole conference was incredibly welcoming throughout. The breadth of the areas and expertise in R created really interesting conversations. I am really looking forward to working with people online and meeting again at future conferences.

I think it is also important to recognise the positive impact which RStudio have on the R community. Their development within the tidyverse and beyond is astonishing, and their coordination across the packages makes using the software so seamless. Long may they prosper for the good of the R community.

Viewing the resources

Keep an eye on the RStudio conference resources for the upcoming videos. Talks from previous years are available here: https://resources.rstudio.com/rstudio-conf-2018

Looking forward to 2020 and San Francisco!