About Me

I’m a data scientist at Heap. My interests include statistics, data analysis, education, and programming in R.

With Julia Silge, I’m the co-author of the tidytext package and the O’Reilly book Text Mining with R. I’m also the author of the broom and fuzzyjoin packages and of the e-book Introduction to Empirical Bayes.

I previously worked as Chief Data Scientist at DataCamp and as a data scientist at Stack Overflow, and received a PhD in Quantitative and Computational Biology from Princeton University.



  • broom: Convert messy model outputs to a tidy format, for use with tools such as dplyr and tidyr
  • fuzzyjoin: Join tables based on inexact matching of columns
  • tidytext: Analyze text using tidy packages such as dplyr, ggplot2, and tidyr
  • stackr: R package for connecting to the Stack Exchange API


All opinions and views are my own and do not represent my employer.