Key Visual DW Innovation
DDJ

You Will Not Believe How Easy It Is To Learn Statistics With SWIRL

According to various sources, 'data scientist' is the sexiest job of the 21st century.

Note: This is part of a series of interviews and short profiles on topics related to data journalism.

R language skills in particular attract the highest salaries ($100,000–$125,000 in the United States), according to recently published salary surveys. There is undeniably high demand for people who can analyze and interpret data for the sake of knowledge discovery. Even media organizations are fighting for talented professionals like Nate Silver, who ran the FiveThirtyEight blog at the New York Times and has now moved on to ESPN.

One of the first steps into this field is an educational background in statistics. But learning statistics is cumbersome and, so far, not very entertaining. Nick Carchedi, a graduate student in biostatistics, is about to change that.

I spoke to Nick Carchedi, the developer of swirl, a software package for the R statistical programming language. He surprised the statistics community (and many others) with a very smart concept: swirl lets users pick up statistics and R skills simultaneously and interactively.

What was your motivation for developing swirl?

Nick Carchedi: I began working on swirl in the summer of 2013. The project was motivated by my frustration with the lack of existing resources for learning R and statistics interactively. At the time, interactive resources for learning other programming languages were abundant (see Codecademy, Code School etc.) and I thought the R and statistics communities deserved the same.

I strongly believe that the best way to learn anything is by doing it. Traditional academic resources such as textbooks, journal articles, and static websites may always have their place, but there's no substitute for rolling up your sleeves and getting your hands dirty. The textbooks that I've found most helpful over the years are the ones with plenty of practical examples that I can follow along with, either on the computer or with pencil and paper.

swirl simply allows this process of acquiring knowledge and practicing what you've learned to happen all in one place, with the benefit of immediate feedback. Additionally, the swirl learning environment is as authentic as possible, since the user learns statistics and R in the very same place he or she will do data analysis in when not using swirl: the R console.

How did you learn statistics?

Nick Carchedi: I was first exposed to statistics as an undergraduate at the University of Maryland. I was a math major and I became disenchanted with the highly abstract and seemingly impractical nature of many of my classes. I was drawn to statistics because of its obvious and useful application to so many real world problems.

Upon graduation, I took a job in financial services that didn't require me to apply my knowledge of maths and statistics. By the time I decided to return to graduate school a few years later, I was a bit rusty. My struggles to relearn the finer points of maths, statistics and programming have been a powerful motivator for the development of swirl.

How can swirl help (data) journalists overcome their fear of learning statistics?

Nick Carchedi: swirl is designed to make learning statistics and R programming more accessible to people from a variety of backgrounds. A majority of the first courses we've developed assume no prior knowledge of either topic. The first time someone sits down in front of an R console, staring at the blinking cursor, can be an intimidating experience. We aim to make that learning process less intimidating by providing a helping hand to the user, while still maintaining the authenticity of the learning environment.

Also, we've developed swirl in such a way that anyone can write their own interactive content and make it available to anyone they choose. swirl is still young, and the R and statistics communities are just beginning to take up its cause. However, I imagine that in the future, data-driven journalists will write interactive content to help other journalists overcome their fear of statistics and programming.

Interview by Cosmin Cabulea (@pushthings4ward on Twitter)

More information

swirl
Swirl on GitHub
R-Project
RStudio
Beginner's guide to R
R-Cook book

Author
DW Innovation