David Robinson

Director of Data Science at Heap, works in R.

In April I attended the 2017 New York R conference, hosted by Lander Analytics and Work-Bench. It was both the third time the conference was held and the third time I’ve attended, and it gets more fun each year, especially because this year eight of us attended from Stack Overflow (including all five of us on the Data Team).

Now that the videos from the conference are all posted, I’m sharing some thoughts on the conference, and the slides and video from my talk, below. As is my habit, I’m also sharing some of my favorite tweets from the conference.

We R What We Ask

I’ve been at Stack Overflow for almost two years now, which has granted me access to a lot of interesting data about how people code, including in my favorite programming language R. I got the chance to share these insights with the R community in my talk We R What We Ask: The Landscape of R Users on Stack Overflow.

(I’m going to give a similar talk at the useR 2017 conference this summer, so if you’re attending you can also see it then!)

I started by showing that Stack Overflow data can track the rise and decline of programming languages and technologies, by charting the growth of R questions in the last decade, along with several other technologies used in the expanding data science field.

(You can make your own plots of technologies using our new Stack Overflow Trends tool.)

I also examined the growth and decline of particular R packages, by parsing the code from questions and answers.
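As a rough illustration of what that parsing step can look like, here is a minimal sketch in Python. It is not the code behind the talk: the function name `packages_in_code` and both regular expressions are assumptions made for this example, and they only catch the most common ways an R snippet names a package (`library(pkg)`, `require(pkg)`, and `pkg::function`).

```python
import re

# Hypothetical patterns for this sketch; real-world R code has more variants
# (e.g. pacman::p_load, requireNamespace) that are deliberately ignored here.
LIBRARY_CALL = re.compile(
    r"\b(?:library|require)\(\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?\s*[,)]"
)
NAMESPACE_USE = re.compile(r"\b([A-Za-z][A-Za-z0-9.]*):::?[A-Za-z_.]")


def packages_in_code(code):
    """Return the set of R package names referenced in a code snippet."""
    return set(LIBRARY_CALL.findall(code)) | set(NAMESPACE_USE.findall(code))


snippet = 'library(dplyr)\nrequire("ggplot2")\ntidyr::gather(df)\nx <- 1'
found = packages_in_code(snippet)
```

Counting how many posts mention each package per year would then give a growth-and-decline curve for each package.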

One of my favorite results is a “bird’s eye view” of the R package ecosystem, built from correlations of packages that tended to appear in answers to the same questions. You can see how it breaks the packages out into particular problem domains.
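To make "correlations of packages" concrete, here is a small sketch that computes the phi coefficient (the Pearson correlation of two yes/no variables) for every pair of packages, given which packages appear in each question. The function name `pairwise_phi` and the toy data are assumptions for illustration, and the choice of phi is itself an assumption: the talk does not say which correlation measure was used.

```python
from itertools import combinations
from math import sqrt


def pairwise_phi(packages_by_question):
    """Phi coefficient for each package pair, from per-question package sets."""
    n = len(packages_by_question)
    all_packages = sorted(set().union(*packages_by_question.values()))
    # For each package, the set of questions it appears in
    questions_with = {
        p: {q for q, pkgs in packages_by_question.items() if p in pkgs}
        for p in all_packages
    }
    result = {}
    for a, b in combinations(all_packages, 2):
        n_a, n_b = len(questions_with[a]), len(questions_with[b])
        n_ab = len(questions_with[a] & questions_with[b])
        denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
        if denom > 0:  # undefined when a package is in every question
            result[(a, b)] = (n * n_ab - n_a * n_b) / denom
    return result


# Toy data (made up for this example): question id -> packages in its answers
toy = {
    1: {"dplyr", "tidyr"},
    2: {"dplyr", "tidyr"},
    3: {"ggplot2"},
    4: {"ggplot2", "dplyr"},
}
phi = pairwise_phi(toy)
```

Keeping only the strongly positive pairs and drawing them as edges of a graph is one way to arrive at a network in which related packages cluster together.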

Since I’m into live-tweeting conferences, host Jared Lander surprised me by asking me to give a second, four-minute talk (video) to share my advice about live-tweeting. That was fun!

Talks

Ricardo Bion started the conference (video) by talking about how Airbnb built an internal R ecosystem, including packages, classes, custom themes and even stickers. I was a great fan of his blog post on the topic and I am always interested in hearing more about this philosophy.

Serge Belongie gave a fascinating talk (video) on some of the newest challenges in image recognition, including developing a training set from users even when the training data isn’t common knowledge (e.g. telling the difference between a sparrow and a robin).

Sandy Griffith gave a brilliant talk on how her team went from a research question to a completed manuscript in just two days. I enjoy hackathons (like this recent one) so I’ve often dreamed of being part of an effort like that.

I also really enjoyed Friederike Schuur’s talk (video) on the history and philosophy of data science, and how it compares to the history of software engineering.

JD Long gave a popular talk (video) about the role of empathy (with colleagues, with users, or with the subjects of data) in a technical career.

I’d been looking forward to Ramnath Vaidyanathan’s talk (video) on HTML widgets, and he didn’t disappoint, showing us the details of creating a widget in real time.

There were many other excellent speakers: make sure you check out the full list of videos!

Stack Overflow Data Team

I was excited to see Julia Silge (only the third time we’ve met in person!) and data team members Nick Larsen and Jason Punyon, who normally work remotely but came to New York for the week.

Julia gave a terrific talk (video) on the tidytext package, as well as our upcoming book, Text Mining with R.

Jason got a data-related souvenir from the trip.

And all around, our live-tweeting game was on point.

I’ll be going to a few other R and statistics conferences this year (including useR and JSM), but I’m particularly glad I got to share this one with my team.