Last night we celebrated Rounded Pi Day by rounding at the ten-thousandths digit to get 3.1416, which works nicely with the date 3/14/16. This was great after Mega Pi Day worked out so perfectly last year. And this all built upon previous years’ celebrations.

We ate a large quantity of pizza at Lombardi’s, and for the second year in a row we got the Pi Cake from Empire Cakes with peanut butter and chocolate flavors. The base was inscribed with historic approximations of Pi: 25/8, 256/81, 339/108, 223/71, 377/120, 3927/1250, 355/113, 62832/20000, 22/7.
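For the curious, the approximations on the cake base can be ranked by how close they come to Pi. This is just a quick sketch in base R; the object names (`approx`, `result`) are my own and not from any original code.

```r
# Historic approximations of Pi as listed on the cake base
approx <- c(
  "25/8"        = 25 / 8,
  "256/81"      = 256 / 81,
  "339/108"     = 339 / 108,
  "223/71"      = 223 / 71,
  "377/120"     = 377 / 120,
  "3927/1250"   = 3927 / 1250,
  "355/113"     = 355 / 113,
  "62832/20000" = 62832 / 20000,
  "22/7"        = 22 / 7
)

# Absolute error of each approximation against R's built-in pi
err <- abs(approx - pi)

# One row per approximation, sorted from most to least accurate
result <- data.frame(
  value = approx,
  abs_error = err,
  row.names = names(approx)
)[order(err), ]

print(result, digits = 8)
```

355/113 comes out on top, off by less than one millionth, while 3927/1250 and 62832/20000 both reduce to exactly 3.1416, the Rounded Pi Day value.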

This was the first time we used three instructors (as opposed to a main instructor and assistants, which we often use for large classes) and it led to an amazing dynamic. Bob laid the theoretical foundation for Markov chain Monte Carlo (MCMC), explaining it with both math and geometry, and discussed the computational considerations of performing simulation draws. Daniel led the participants through hands-on examples with Stan, covering everything from how to describe a model to efficient computation to debugging. Andrew gave his usual crowd-dazzling performance, using previous work as case studies of when and how to use Bayesian methods.

It was an intensive three days of training with an incredible amount of information. Everyone walked away with an autographed copy of Andrew’s book, BDA3, knowing a lot more about Bayes, MCMC and Stan and eager to try out their new skills.

A big help, as always, was Daniel Chen, who put in so much effort making the class run smoothly, from securing the space to physically moving furniture to running all the technology.

On April 24th and 25th Lander Analytics and Work-Bench co-organized the (sold-out) inaugural New York R Conference. It was an amazing weekend of nerding out over R and data, with a little Python and Julia mixed in for good measure. People from all across the R community gathered to see rockstars discuss their latest and greatest efforts.

This year we celebrated Mega Pi Day with the date (3/14/15) covering the first four digits of Pi. And of course, we unveiled the Pi Cake at 9:26 to get the next three digits. This year the cake came from Empire Cakes and was peanut butter flavored. We even had the bakery put as many digits as would fit around the cake.

A large group from the NYC Data Mafia came out and Scott Wiener of Scott’s Pizza Tours ensured we had the perfect assortment and quantity of pizza.

However, he wondered if the preponderance of dollar slice shops has dropped the price of a slice below that of the subway and playfully joked that he wished there was a statistician in the audience.

Naturally, that night I set off to calculate the current price of a slice in New York City using listings from MenuPages. I used R’s XML package to pull the menus for over 1,800 places tagged as “Pizza” in Manhattan, Brooklyn and Queens (there was no data for Staten Island or The Bronx) and find the price of a cheese slice.

After cleaning up the data and doing my best to find prices for just cheese/plain/regular slices I found that the mean price was $2.33 with a standard deviation of $0.52 and a median price of $2.45. The base subway fare is $2.50 but is actually $2.38 after the 5% bonus for putting at least $5 on a MetroCard.
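The $2.38 figure can be reproduced with a line of arithmetic. This sketch assumes the bonus works as 5% extra credit on the amount paid, so each dollar buys $1.05 of fare; the variable names are mine.

```r
# Base fare and MetroCard bonus as quoted in the text
base_fare <- 2.50
bonus     <- 0.05

# With 5% extra credit, a $2.50 ride costs 2.50 / 1.05 out of pocket
effective_fare <- base_fare / (1 + bonus)
round(effective_fare, 2)

# Gap between the effective fare and the reported mean slice price ($2.33)
mean_slice <- 2.33
fare_gap   <- effective_fare - mean_slice
round(fare_gap, 2)
```

Under that assumption the effective fare rounds to $2.38, about a nickel above the mean slice.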

So, even with the proliferation of dollar slice joints, the average slice of pizza ($2.33) lines up pretty nicely with the cost of a subway ride ($2.38).

Taking it a step further, I broke down the price of a slice in Manhattan, Queens and Brooklyn. The vertical lines represent the price of a subway ride with and without the bonus. We see that the price of a slice in Manhattan lines up almost exactly with the subway fare.

MenuPages even breaks down Queens by neighborhood so we can have a more specific plot.

The code for downloading the menus and the calculations is after the break.

##   polla_qid      Answer Votes pollq_id                Question
## 1         2   Excellent     0        2  How was Pizza Mercato?
## 2         2        Good     6        2  How was Pizza Mercato?
## 3         2     Average     4        2  How was Pizza Mercato?
## 4         2        Poor     1        2  How was Pizza Mercato?
## 5         2 Never Again     2        2  How was Pizza Mercato?
## 6         3   Excellent     1        3 How was Maffei's Pizza?
##            Place      Time TotalVotes Percent
## 1  Pizza Mercato 1.344e+09         13  0.0000
## 2  Pizza Mercato 1.344e+09         13  0.4615
## 3  Pizza Mercato 1.344e+09         13  0.3077
## 4  Pizza Mercato 1.344e+09         13  0.0769
## 5  Pizza Mercato 1.344e+09         13  0.1538
## 6 Maffei's Pizza 1.348e+09          7  0.1429
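The Percent column in the output above looks like each answer's votes divided by the total votes for that place. Here is a minimal base R sketch of that calculation using only the five Pizza Mercato rows shown; the data frame name `mercato` is my own, and this is not necessarily how the original columns were built.

```r
# The five Pizza Mercato rows from the printed output
mercato <- data.frame(
  Place  = rep("Pizza Mercato", 5),
  Answer = c("Excellent", "Good", "Average", "Poor", "Never Again"),
  Votes  = c(0, 6, 4, 1, 2)
)

# Total votes per place, repeated on every row of that place
mercato$TotalVotes <- ave(mercato$Votes, mercato$Place, FUN = sum)

# Share of the place's votes that went to each answer
mercato$Percent <- round(mercato$Votes / mercato$TotalVotes, 4)

mercato
```

Using `ave()` keeps the result aligned row by row with the original data, so the same two lines would work unchanged once more places are added.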

library(ggplot2)
# one line per answer, showing its share of the vote at each pizza place
ggplot(pizza, aes(x = Place, y = Percent, group = Answer, color = Answer)) +
    geom_line() +
    theme(axis.text.x = element_text(angle = 46, hjust = 1), legend.position = "bottom") +
    labs(x = "Pizza Place", title = "Pizza Poll Results")

But given this is live data that will change as more polls are added, I thought it best to use a plot that automatically updates and is interactive. So this gave me my first chance to need rCharts by Ramnath Vaidyanathan, as seen at October’s meetup.

There are still a lot of things I am learning, including how to use a categorical x-axis natively on line charts and how to insert chart titles. I found a workaround for the categorical x-axis by using tickFormat but that is not pretty. I also would like to find a way to quickly switch between a line chart and a bar chart. Fitting more labels onto the x-axis or perhaps adding a scroll bar would be nice too.

The class starts with the very basics such as variable types, vectors, data.frames and matrices. After that we explore munging data with aggregate, plyr and reshape2. Once the data is prepared we will use ggplot2 to visualize it and then fit models using lm, glm and decision trees.
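To give a flavor of the munging and modeling steps, here is a tiny sketch using `aggregate` and `lm` on the `mtcars` data that ships with R. The dataset and the object names are my choice for illustration and are not taken from the class material.

```r
# mtcars ships with base R: fuel economy and specs for 32 cars
data(mtcars)

# Munging: average miles per gallon for each cylinder count
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
mpg_by_cyl

# Modeling: simple linear regression of mpg on weight
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)
```

The formula interface is shared by `aggregate`, `lm` and `glm`, which makes it easy to move from summarizing data to fitting a model.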

Most of the material comes from my upcoming book R for Everyone.

Participants are encouraged to bring computers so they can code along with the live examples. They should also have R and RStudio preinstalled.

Continuing the annual tradition of Pi Cakes from Chrissie Cook we have gotten another Pi Cake! This year we let Drew Conway’s wife pick the flavors and she went with vanilla and red velvet (the blue color is to cause some cognitive dissonance). Looking forward to enjoying this tonight after some pizza.