Last week Slice ran a post about a tomato taste test they conducted with Scott Wiener (of Scott’s NYC Pizza Tours), Brooks Jones, Jason Feirman, Nick Sherman and Roberto Caporuscio from Keste. While the methods used may not be rigorous enough for definitive results, I took the summary data that was in the post and performed some simple analyses.
The first thing to note is that there are only 16 data points, so multiple regression is not an option: with that few observations and several candidate predictors, there are too few degrees of freedom left to estimate anything reliably. We can all thank the Curse of Dimensionality for that. So I stuck to simpler methods and visualizations. If I can get the raw data from Slice, I can get a little more advanced.
For the sake of simplicity I removed the tomatoes from Eataly because their price was such an outlier that it made visualizing the data difficult. As usual, most of the graphics were made using ggplot2 by Hadley Wickham. The coefficient plots were made using a little function I wrote. Here is the code. Any suggestions for improvement are greatly appreciated, especially if you can help with increasing the left-hand margin of the plot. And as always, all the work was done in R.
The most obvious relationship we want to test is Overall Quality vs. Price. As can be seen from the scatterplot below with a fitted loess curve, there is not a linear relationship between price and quality.
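For anyone who wants to try this kind of plot themselves, here is a minimal sketch of a scatterplot with a fitted loess curve in ggplot2. The data frame `tomato`, its column names `Price` and `Overall`, and every number in it are placeholders invented for illustration. They are not the Slice results and this is not necessarily the exact code behind the graphic.

```r
library(ggplot2)

# Placeholder values for illustration only: NOT the Slice taste-test data.
# Fifteen rows, mirroring 16 tomatoes minus the one dropped as an outlier.
tomato <- data.frame(
  Price   = c(0.06, 0.08, 0.09, 0.11, 0.12, 0.14, 0.15, 0.17,
              0.18, 0.20, 0.22, 0.24, 0.26, 0.29, 0.32),
  Overall = c(4.1, 5.3, 6.0, 5.2, 6.8, 6.1, 7.0, 5.9,
              6.4, 5.5, 6.9, 6.2, 5.8, 6.6, 6.0)
)

# Scatterplot of quality against price with a loess smoother on top.
# se = FALSE hides the confidence band, which is very wide with so few points.
p <- ggplot(tomato, aes(x = Price, y = Overall)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "Price", y = "Overall Quality")

print(p)
```

If the smoothed curve bends rather than tracking a straight line, that is a hint that a simple linear fit of quality on price would be a poor summary.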
Jared Lander is the Chief Data Scientist of Lander Analytics, a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences, and author of R for Everyone.