Last year, as I embarked on my NFL sports statistics work, I attended the Sloan Sports Analytics Conference for the first time. A year later, after a very successful draft, I was invited to present an R workshop at the conference.

My time slot was up against Nate Silver, so I didn’t expect many people to attend. Much to my surprise, when I entered the room every seat was taken, and people were lining the walls and sitting in the aisles.

My presentation, which was unrelated to the work I did, analyzed the Giants’ probability of passing versus rushing and the probability of each receiver being targeted. It is available in the talks section of my site.

After the talk I spent the rest of the day fielding questions and gave away copies of *R for Everyone* and an NYC Data Mafia shirt.

Jared Lander is the Chief Data Scientist of Lander Analytics, a New York data science firm; Adjunct Professor at Columbia University; Organizer of the New York Open Statistical Programming meetup and the New York R Conference; and author of *R for Everyone*.

Jared,

I was part of the standing audience for your excellent R workshop at the MIT Sports Analytics Conference. During your workshop you said that ANOVA was antiquated and that we should never use it again. What should we use instead?

I am analyzing data from a clinical trial including a control leg and 2 experimental legs. If not ANOVA, what statistical analysis tool should I use to determine if the experimental treatments were more efficacious than the control and whether one experimental treatment was more effective than the other?

Thanks,

Elliot Schwartz

Elliot,

I hope you enjoyed the talk!

I HIGHLY recommend using a regression with a factor variable for group. The control leg is the baseline, and each of the two experimental legs gets an indicator variable. The coefficient estimated for each indicator variable tells you how effective that treatment is relative to control. You can also account for other variables in the regression, so you don’t have as much confounding.
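
To make the indicator-variable setup concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the outcome numbers are invented, the group names `treatment_a` and `treatment_b` and the helper functions are hypothetical, and it computes point estimates only, with no standard errors or p-values. In practice you would fit this with a regression routine in your statistics package rather than by hand.

```python
# Hypothetical outcomes, invented purely for illustration (not real trial data).
outcomes = {
    "control": [10.0, 12.0, 11.0, 13.0],
    "treatment_a": [14.0, 15.0, 13.0, 16.0],
    "treatment_b": [17.0, 18.0, 16.0, 19.0],
}


def design_row(group):
    """Intercept plus one indicator per experimental leg; control is the baseline."""
    return [
        1.0,
        1.0 if group == "treatment_a" else 0.0,
        1.0 if group == "treatment_b" else 0.0,
    ]


def fit_ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# One design row and one outcome per subject, in matching order.
X = [design_row(group) for group, values in outcomes.items() for _ in values]
y = [value for values in outcomes.values() for value in values]

intercept, effect_a, effect_b = fit_ols(X, y)

print("control baseline:", intercept)
print("treatment A vs. control:", effect_a)
print("treatment B vs. control:", effect_b)
print("treatment B vs. treatment A:", effect_b - effect_a)
```

With no other variables in the model, the intercept is the control group mean and each coefficient is that treatment's mean minus the control mean. The difference between the two coefficients speaks to the question of whether one experimental treatment outperformed the other. Adding covariates would mean appending more columns to each design row.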