Thanks to Drew Conway for posting this video of me discussing my thesis (pdf) on NYC pizza.  It was part of the New York R User Meetup on Applications of R.


Jared Lander is the Chief Data Scientist of Lander Analytics, a New York data science firm; Adjunct Professor at Columbia University; Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences; and author of R for Everyone.

Slice recently reported that Fark user “Certainly You Jest” tabulated a list of the 25 most mentioned pizzerias.  Naturally, I decided to play with the numbers.  Rather than write up another formal paper, I did some quick ad hoc analysis for posting on this blog and I will skip some of the more technical aspects.

First, I augmented the data with the price of a typical plain pie that could feed two to four people and the pizzeria’s distance from New York City.  Adding the distance meant I had to remove the multi-state chains, like Monical’s, from the data.

While the number of times a pizzeria is mentioned is count data, it doesn’t quite fit a Poisson distribution, and the Poisson regression didn’t seem to be a good fit.  This makes sense since I have three predictors (distance from New York, price and their interaction).  You can see this in the two histograms below.
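One quick way to see why count data might not fit a Poisson distribution is to compare the sample variance to the sample mean, since a Poisson variable has the two equal. The sketch below is only an illustration: it is written in Python rather than R, the `dispersion_index` helper is a hypothetical name, and the counts are made up rather than taken from the Slice list. It also checks the raw counts only, not the fit of a regression with predictors.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean.

    A value near 1 is consistent with a Poisson distribution;
    a value well above 1 suggests overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean of counts is zero")
    return variance(counts) / m


# Made-up mention counts, for illustration only (not the Slice data).
mentions = [2, 3, 3, 4, 5, 7, 9, 14, 22, 41]

ratio = dispersion_index(mentions)
print(f"mean = {mean(mentions):.1f}, variance/mean = {ratio:.1f}")
```

A ratio far above 1, as with these made-up numbers, is the kind of signal that would point toward an alternative such as a negative binomial model.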

This Thursday, April 8th, I’ll be giving two brief talks (5 to 10 minutes) about statistical methods at the New York R User Meetup.  The first will be applying multilevel models to World Health Organization data to study noncommunicable diseases.  The second, and probably more fun, will be a presentation of my pizza paper (pdf) that was featured on Slice.

I just filled out my Census form and I have to say it was fairly painless and simple.  The short form (pdf) really only asks about age, ethnicity and other residences.  If anyone has a long form (now called the American Community Survey), please let me know your experiences filling that out.

The question concerning residence can be a bit tricky these days, with so many people having multiple residences, children who live on their own but visit home frequently, and couples who live together but also maintain separate residences.

Drew Conway has a piece on his Zero Intelligence Agents blog about how well informed Tea Party protesters are about tax policy.  His analysis is pretty technical, and he even offers up the R code he used to analyze the data and build the graphs, which were made with ggplot2, a package by Hadley Wickham at Rice University.

Being a stats junkie, I’m probably more excited about filling out my Census forms than most people.  That said, a lot of my friends have expressed glee at receiving their Census forms.  Perhaps that says something about my social group.

So you can imagine my delight when I came across this giant, inflatable Census form in Union Square last Saturday night.

I’m not sure about other markets, but there has been a huge advertising blitz for the Census in New York, including commercials featuring actors from the “Best in Show” and “A Mighty Wind” movies.

One more closeup of the moon-bounce inspired Census form after the break.

I don’t mean to shamelessly self-promote here, but I wanted to note that the Slice story on my pizza paper (pdf) has also been picked up by NBC New York’s food blog, Feast, and by Revolution Computing’s blog.  For people who don’t know, Revolution Computing optimizes R, the language used by a large number of statisticians for computations.

This article from the New York Times about gridlock in New York is from two nights ago, but I think it’s worth a glance.  The article is a great look at how slowly cars move.  I especially like the line, “Weekday traffic in the district moved at an average of 9.5 miles per hour — about the speed of a farmyard chicken at full gallop.”

This goes to show how we often misperceive reality regardless of the underlying data.  I know there have been plenty of times that I felt I made much faster progress during midday traffic, but the numbers don’t lie.

I wonder if they account for the different driving patterns between taxis and private cars and if that would make a difference.  I wish the Times had posted a link to the original study so I could see the methods they used.  I would guess they use spatial statistics that can track autocorrelation in time and space; there is a lot of power in those kinds of tools.

The Slice article got picked up by MidtownLunch.

A lot of people have been asking for my favorite pizza places.  The answer depends on what type of pizza I’m looking for, but Maffei’s grandma slice and Keste are two places that pop into my head a lot.

Someone at the ML forum asked about New York Pizza Suprema, and yes, I love their upside down slice.

Slice has a nice writeup of a paper I wrote performing a statistical analysis of New York City pizza.  The article is nicely written and distills the analysis to the parts people will care about.  See here for the corresponding PowerPoint presentation.
