About Me
I’m a data scientist at Heap. My interests include statistics, data analysis, education, and programming in R.
I’m the co-author with Julia Silge of the tidytext package and the O’Reilly book Text Mining with R. I’m also the author of the broom and fuzzyjoin packages, and of the e-book Introduction to Empirical Bayes.
I previously worked as Chief Data Scientist at DataCamp and as a data scientist at Stack Overflow, and received a PhD in Quantitative and Computational Biology from Princeton University.
Books
- Text Mining with R: A Tidy Approach: a guide to drawing insights from text using the tidytext package in R. Co-authored with Julia Silge, and published by O’Reilly in July 2017. Also available for free online.
- Introduction to Empirical Bayes: Examples from Baseball Statistics: an e-book demonstrating the statistical method of empirical Bayes, based on the example of estimating baseball batting averages.
Software
- broom: Convert messy model outputs to a tidy format, for use with tools such as dplyr and tidyr.
- fuzzyjoin: Join tables based on inexact matching of columns.
- tidytext: Analyze text using tidy packages such as dplyr, ggplot2, and tidyr.
- stackr: R package for connecting to the Stack Exchange API.
Publications
- Johnson, E.L., Robinson, D.G., and Coller, H.A. (2017) Widespread changes in mRNA stability contribute to quiescence-specific gene expression patterns in a fibroblast model of quiescence. BMC Genomics, 18(1):123.
- Robinson, D.G. (2015) broom: An R package for converting statistical analysis objects into tidy data frames. arXiv preprint. arXiv:1412.3565 [stat.CO].
- Robinson, D.G., Wang, J., and Storey, J.D. (2014) A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays. bioRxiv preprint. doi:10.1101/013342.
- Robinson, D.G. and Storey, J.D. (2014) subSeq: Determining appropriate sequencing depth through efficient read subsampling. Bioinformatics, 30 (23): 3424-3426. doi: 10.1093/bioinformatics/btu552.
- Robinson, D.G., Chen, W., Storey, J.D., and Gresham, D. (2014) Design and Analysis of Bar-seq Experiments. G3: Genes/Genomes/Genetics, 4(1), 11-18.
- Robinson, D.G., Lee, M.C., and Marx, C.J. (2012) OASIS: an automated program for global investigation of bacterial and archaeal insertion sequences. Nucleic Acids Research, doi: 10.1093/nar/gks778.
About This Site
This site is powered by Jekyll using the Minimal Mistakes theme. All blog posts are released under a Creative Commons Attribution-ShareAlike 4.0 International License. The favicon and logo were created by Thomas Lin Pedersen.
All blog posts are compiled from R Markdown with knitr using this script. You can find the reproducible sources of each blog post here.
All opinions and views are my own and do not represent my employer.