Earlier this week, my company, Lander Analytics, organized our first public Bayesian short course, taught by Andrew Gelman, Bob Carpenter and Daniel Lee. Needless to say the class sold out very quickly and left a long wait list. So we will schedule another public training (exactly when tbd) and will make the same course available for private training.
This was the first time we utilized three instructors (as opposed to a main instructor and assistants which we often use for large classes) and it led to an amazing dynamic. Bob laid the theoretical foundation for Markov chain Monte Carlo (MCMC), explaining both with math and geometry, and discussed the computational considerations of performing simulation draws. Daniel led the participants through hands-on examples with Stan, covering everything from how to describe a model, to efficient computation to debugging. Andrew gave his usual, crowd dazzling performance use previous work as case studies of when and how to use Bayesian methods.
It was an intensive three days of training with an incredible amount of information. Everyone walked away knowing a lot more about Bayes, MCMC and Stan and eager to try out their new skills, and an autographed copy of Andrew’s book, BDA3.
A big help, as always was Daniel Chen who put in so much effort making the class run smoothly from securing the space, physically moving furniture and running all the technology.
Attending this week’s Strata conference it was easy to see quite how prolific the NYC Data Mafia is when it comes to writing. Some of the found books:
And, of course, my book will be out soon to join them.
Last week Slice ran a post about a tomato taste test they conducted with Scott Wiener (of Scott’s NYC Pizza Tours), Brooks Jones, Jason Feirman, Nick Sherman and Roberto Caporuscio from Keste. While the methods used may not be rigorous enough for definitive results, I took the summary data that was in the post and performed some simple analyses.
The first thing to note is that there are only 16 data points, so multiple regression is not an option. We can all thank the Curse of Dimensionality for that. So I stuck to simpler methods and visualizations. If I can get the raw data from Slice, I can get a little more advanced.
For the sake of simplicity I removed the tomatoes from Eataly because their price was such an outlier that it made visualizing the data difficult. As usual, most of the graphics were made using ggplot2 by Hadley Wickham. The coefficient plots were made using a little function I wrote. Here is the code. Any suggestions for improvement are greatly appreciated, especially if you can help with increasing the left hand margin of the plot. And as always, all the work was done in R.
The most obvious relationship we want to test is Overall Quality vs. Price. As can be seen from the scatterplot below with a fitted loess curve, there is not a linear relationship between price and quality.
More after the break. Continue reading