Earlier this week, my company, Lander Analytics, organized our first public Bayesian short course, taught by Andrew Gelman, Bob Carpenter and Daniel Lee.  Needless to say, the class sold out very quickly and left a long wait list.  So we will schedule another public training (date to be determined) and will make the same course available for private training.

This was the first time we used three instructors (as opposed to a main instructor and assistants, which we often use for large classes) and it led to an amazing dynamic.  Bob laid the theoretical foundation for Markov chain Monte Carlo (MCMC), explaining it with both math and geometry, and discussed the computational considerations of performing simulation draws.  Daniel led the participants through hands-on examples with Stan, covering everything from how to describe a model to efficient computation to debugging.  Andrew gave his usual crowd-dazzling performance, using previous work as case studies of when and how to use Bayesian methods.

It was an intensive three days of training with an incredible amount of information.  Everyone walked away knowing a lot more about Bayes, MCMC and Stan, eager to try out their new skills, and holding an autographed copy of Andrew’s book, BDA3.

A big help, as always, was Daniel Chen, who put in so much effort making the class run smoothly, from securing the space to physically moving furniture and running all the technology.


Attending this week’s Strata conference, it was easy to see just how prolific the NYC Data Mafia is when it comes to writing.  Some of the books I found:

And, of course, my book will be out soon to join them.

For the past few weeks Time Out New York’s dating columnist, Jamie Bufalino, has been fielding letters about the ratio of homosexual to heterosexual questions he answers.  The readers suggested that disproportionate attention is paid to gay and lesbian issues compared to the gay and lesbian proportion of the general population.

Jamie rudely called his readers “ass-wipes” and repeatedly told them to “remove your head from your ass.”  He also professed to have “no idea what the percentage is of gay/bi versus straight issues that end up in the column.”  

One question and response:  

Q I see statistics that show NYC to be 6 percent gay, lesbian and bi, and yet in “Get Naked” you feature letters from them almost to the exclusion of heteros. Why the preoccupation with them in your column? It doesn’t seem right or logical. As one of the other 94 percent, I am disappointed and offended weekly.   

A All I can say is: You’ve got your head up your butt. Just in the past month or so, I’ve answered letters from a straight guy with a weird fetish that suddenly stopped delivering the jollies it used to, a straight guy who was juggling a woman from the Ukraine and a woman from Jersey, a woman who had an issue with sticking her finger up her boyfriend’s butt, a 19-year-old woman who was getting pressured to have sex with her boyfriend, and on and on. If, for some reason, you happen to be obsessing over the gay and bi questions and not acknowledging the straight ones, that’s your issue, not mine.  

And another:  

Q I always read your column to see if I can learn something and just for shits and giggles. The one thing that has always bothered me is your preoccupation with gay and bi problems. Gays and lesbians get their own special section of three to four pages!  

A First of all, dude, you sound like one of those total ass-wipes who believes that gay people somehow have all these special privileges that straight people aren’t entitled to. Honestly, I have no idea what the percentage is of gay/bi versus straight issues that end up in the column, because it doesn’t matter. If you removed your head from your ass, you’d realize that so many sexual issues are universal and that you can learn something from all sorts of people who don’t fit into your specific demographic.  

When confronted with the data, he once again referred to a “head lodged up [a] rectum” and suggested the reader was “paranoid.”

Q As a statistician I am disappointed by your response to a question in the November 4 issue [TONY 788]. The reader wrote, “I see statistics that show NYC to be 6 percent gay, lesbian and bi, and yet in ‘Get Naked’ you feature letters from them almost to the exclusion of heteros. Why the preoccupation with them in your column? … As one of the other 94 percent, I am disappointed and offended weekly.” You responded by citing individual examples of heterosexual questions you’ve fielded, which is not a valid form of proof. I went through about ten months’ worth of “Get Naked” columns on the TONY website and found that approximately 19 percent of the questions were from gay (15 percent) or lesbian (4 percent) readers. Whether or not that percentage is representative of the general population is not my concern. I just feel that Jamie should have his data correct and not write, “You’ve got your head up your butt.”  

A I seriously cannot believe I am still getting letters about this. Okay, Mr. Disappointed Statistician: If you don’t want to come off as someone who has his head lodged up his rectum, it would be an awesome idea not to leap to the defense of some jackass who claims I cater to homo letters “almost to the exclusion of heteros” and then point out that straight issues actually make up a full 81 percent of the subject matter here in “Get Naked.” What I want to know is, why are you even keeping score? Are you really that insecure about the amount of attention heterosexual sex gets in the media? If so, that’s both laughable and sad. This is the last time I’m addressing this, so here’s my final bit of advice to you (and your like-minded brethren): Stop being so paranoid.  

Since Jamie is so rude to his readers and clearly doesn’t have any sense of the data, I thought I’d take a look at the numbers.  Results after the break.   
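As a quick sanity check on the percentages quoted in the letters above, here is a minimal sketch in Python.  It uses only the figures from the letters (15 percent gay, 4 percent lesbian, and the first reader's 6 percent population figure); the variable names are mine, and because the letters do not report how many questions were counted, this says nothing about statistical significance.

```python
# Percentages exactly as quoted in the letters; nothing here is new data.
gay_pct = 15          # share of questions from gay readers
lesbian_pct = 4       # share of questions from lesbian readers
population_pct = 6    # the reader's figure for NYC (gay, lesbian and bi)

column_pct = gay_pct + lesbian_pct      # combined gay/lesbian share of questions
straight_pct = 100 - column_pct         # the remainder, the figure Jamie cites
ratio = column_pct / population_pct     # column share relative to population figure

print(f"gay/lesbian share of questions: {column_pct}%")
print(f"straight share of questions: {straight_pct}%")
print(f"column share is about {ratio:.1f} times the population figure")
```

The arithmetic agrees with both letters: the 19 percent and 81 percent figures are consistent with each other, and 19 percent is roughly three times the 6 percent the first reader cited.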


Last week Slice ran a post about a tomato taste test they conducted with Scott Wiener (of Scott’s NYC Pizza Tours), Brooks Jones, Jason Feirman, Nick Sherman and Roberto Caporuscio from Keste.  While the methods used may not be rigorous enough for definitive results, I took the summary data that was in the post and performed some simple analyses.

The first thing to note is that there are only 16 data points, so multiple regression is not an option.  We can all thank the Curse of Dimensionality for that.  So I stuck to simpler methods and visualizations.  If I can get the raw data from Slice, I can get a little more advanced.
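One rough way to see the problem is to count residual degrees of freedom.  The sketch below is in Python rather than R purely for illustration; the function name and the predictor counts are hypothetical, since the number of candidate variables is not stated here.  Only the 16 comes from the data.

```python
def residual_df(n_obs: int, n_predictors: int) -> int:
    """Residual degrees of freedom for a linear model with an intercept."""
    return n_obs - n_predictors - 1


n_obs = 16  # number of data points in the taste test summary

# Hypothetical predictor counts, for illustration only.
for n_predictors in (1, 2, 4, 8):
    df = residual_df(n_obs, n_predictors)
    per_predictor = n_obs / n_predictors
    print(f"{n_predictors} predictors: {df} residual df, "
          f"{per_predictor:.1f} observations per predictor")
```

Each added predictor leaves fewer degrees of freedom for estimating the error, so with 16 points the estimates get shaky quickly.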

For the sake of simplicity I removed the tomatoes from Eataly because their price was such an outlier that it made visualizing the data difficult.  As usual, most of the graphics were made using ggplot2 by Hadley Wickham.  The coefficient plots were made using a little function I wrote.  Here is the code.  Any suggestions for improvement are greatly appreciated, especially if you can help with increasing the left-hand margin of the plot.  And as always, all the work was done in R.

The most obvious relationship we want to test is Overall Quality vs. Price.  As can be seen from the scatterplot below with a fitted loess curve, there is not a linear relationship between price and quality.

More after the break.