David Robinson

Data Scientist at Stack Overflow, works in R and Python.

Tidy data analysis in R with dplyr, ggplot2, and broom

Welcome to the webpage for the UP-STAT 2016 Tutorial: Tidy data analysis in R with dplyr, ggplot2, and broom.

Setup

You’ll need to install:

  • The latest version of R (3.2.4), which can be downloaded here. If you already have R installed, make sure your version is at least 3.2.0; if it’s not, upgrade!
  • RStudio, which can be found here.
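
If you are not sure which version of R you have, a quick check like the following may help. This is only a minimal sketch: the variable names `minimum_version` and `version_ok` are illustrative, and the 3.2.0 threshold is the one stated above.

```r
# Minimum R version required for the tutorial (from the setup notes above)
minimum_version <- "3.2.0"

# getRversion() returns the running R version in a form that can be compared
version_ok <- getRversion() >= minimum_version

if (version_ok) {
  message("R ", as.character(getRversion()), " is recent enough.")
} else {
  message("R ", as.character(getRversion()), " is older than ",
          minimum_version, "; please upgrade.")
}
```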

You’ll need to install several R packages, which you can do in R with:

install.packages(c("ggplot2", "dplyr", "tidyr", "broom"))
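
To confirm that the installation worked, you could run something like the sketch below. It is not part of the tutorial code; the names `required_packages`, `is_installed`, and `missing_packages` are illustrative.

```r
# The four packages installed by the command above
required_packages <- c("ggplot2", "dplyr", "tidyr", "broom")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(required_packages, requireNamespace,
                       logical(1), quietly = TRUE)

missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Still missing: ", paste(missing_packages, collapse = ", "))
} else {
  message("All tutorial packages are installed.")
}
```

If any packages are reported as missing, rerunning install.packages() for just those names is usually the simplest fix.
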
  • Live Code Feed
  • R Error Message Cheat Sheet: contains some common R error messages. If you get an error while running a line that you expect to work (perhaps because you saw it on the screen), you can check these examples. In particular, check your spelling and capitalization carefully.
  • Data Wrangling Cheat Sheet: this sheet is a great summary of dplyr and tidyr operations, two packages used today.

Other Resources

Relevant Code

We’ll be studying a set of United Nations voting data that can be found here:

  • Anton Strezhnev; Erik Voeten, 2013, “United Nations General Assembly Voting Data”, hdl:1902.1/12379 UNF:5:s7mORKL1ZZ6/P3AR5Fokkw== Erik Voeten [Distributor] V7 [Version]

You can load it directly into your R session with the following line of code:

load(url("http://varianceexplained.org/courses/upstat/RawVotingdata.RData"))
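
Because load() does not say which objects it created, it can be useful to capture its return value, which is a character vector of the loaded object names. The sketch below is one way to do that, assuming the same URL as above. The names `votes_url` and `loaded_names` are illustrative, and the tryCatch() wrapper is an addition so that a network failure produces a message rather than an error.

```r
votes_url <- "http://varianceexplained.org/courses/upstat/RawVotingdata.RData"

# load() returns (invisibly) the names of the objects it creates;
# fall back to an empty vector if the file cannot be reached
loaded_names <- tryCatch(
  load(url(votes_url)),
  error = function(e) {
    message("Could not load the data: ", conditionMessage(e))
    character(0)
  }
)

# Report the class and dimensions of each loaded object
for (object_name in loaded_names) {
  loaded_object <- get(object_name)
  cat(object_name, ":", class(loaded_object)[1],
      "with dimensions", paste(dim(loaded_object), collapse = " x "), "\n")
}
```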