This Monday I’ll be talking at the Amsterdam R meetup, better known as amst-R-dam. At their request I’ll discuss the differences between the New York and Silicon Valley data scenes. Time permitting, I’ll also cover a topic chosen by the audience.

# Ringing Bells in Myanmar

While President Obama made big news with his trip to Myanmar, I would like to point out that I rang the same bell he did (picture above) three years earlier.

# More Lotto Statistics

Thanks to Rachel Schutt, who I’m teaching with at Columbia, and Cathy O’Neil from MathBabe I had the opportunity to go on TV and talk about the statistics of tonight’s Powerball lottery.

There’s an article with a brief quote from me and a video where I make a very quick appearance at the 1:14 mark. My interview during the live broadcast actually went on for about three minutes, but I can’t find that online. If I can transfer the video from my DVR, I’ll post that too.

In the longer interview I discussed the probability of winning, the expected value of a given ticket and other such statistical nuggets. In particular I explained that choosing numbers based on birthdays rules out any number higher than 31, which means you are missing out on 28 of the 59 possible numbers, even though every number is equally likely to be drawn. Hopefully I’ll find that longer cut.
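To put a rough number on the birthday point, here is a minimal sketch in R. The 59 comes from the post; the assumption that five white balls are drawn without replacement is not stated there, and the variable names are purely illustrative.

```r
# Assumption: five white balls are drawn without replacement from 1 through 59.
numbers_total   <- 59
birthday_max    <- 31   # largest number a birthday can supply
balls_drawn     <- 5    # assumed, not stated in the post

# Numbers a birthday-only player never picks
numbers_skipped <- numbers_total - birthday_max

# Share of all possible white-ball draws that stay within 1 through 31
share_birthday <- choose(birthday_max, balls_drawn) /
  choose(numbers_total, balls_drawn)

numbers_skipped
round(share_birthday, 4)
```

Under that assumption only about 3.4% of the possible white-ball draws fall entirely within the birthday range.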

The video can be found here: Video

# Pizza Polls

**How was Prince Street Pizza?**

- Good (37%, 11 Votes)
- Average (33%, 10 Votes)
- Poor (23%, 7 Votes)
- Excellent (3%, 1 Votes)
- Never Again (3%, 1 Votes)

Total Voters: **30**

Aggregated results.

Results from individual previous polls are below.

# Yankees and Republicans

A friend of mine has told me on numerous occasions that since 1960 the Yankees have not won a World Series while a Republican was President. Upon hearing this my Republican friends (both Yankee and Red Sox fans) become incredulous and say that this is ridiculous. So I decided to investigate. To be clear, this in no way shows causality; it just checks the numbers.

The data was easily obtainable so it really came down to plotting.

The plot above shows every Yankee win (and loss) since 1960 and the party of the President at the time. It is clear that all nine Yankees World Series wins came while a Democrat occupied the White House. The fluctuation plot below shows Yankee wins both before and after 1960, and the complete lack of a block for Republican/Post-1960 makes the case.
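For anyone who wants to check the tally without the plots, here is a rough sketch in R. It is not the dataset or code behind the plots: the years and parties are typed in by hand from public record, the 1960 to 2011 range is an assumption, and all names are illustrative.

```r
# Hand-entered public facts, not the data used for the plots.
# Years the Yankees won the World Series, 1960 onward.
yankee_wins <- c(1961, 1962, 1977, 1978, 1996, 1998, 1999, 2000, 2009)

# First year of each change of party in the White House. Inaugurations are in
# January, so the party in office each October matches the calendar year.
party_start <- c(1953, 1961, 1969, 1977, 1981, 1993, 2001, 2009)
party_name  <- c("Republican", "Democrat", "Republican", "Democrat",
                 "Republican", "Democrat", "Republican", "Democrat")

# Assumed range of seasons; there was no World Series in 1994.
seasons <- data.frame(Year = setdiff(1960:2011, 1994))
seasons$Party <- party_name[findInterval(seasons$Year, party_start)]
seasons$Won   <- seasons$Year %in% yankee_wins

# Cross-tabulate party in office against whether the Yankees won
table(Party = seasons$Party, YankeesWon = seasons$Won)
```

The Republican row of the resulting table has no wins, which is the empty block in the fluctuation plot.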

There are similar plots for the American League after the jump.

# How was Pizza Mercato?

**How was Pizza Mercato?**

- Good (46%, 6 Votes)
- Average (31%, 4 Votes)
- Never Again (15%, 2 Votes)
- Poor (8%, 1 Votes)
- Excellent (0%, 0 Votes)

Total Voters: **13**

# EDA, Visualization and Collaboration on the Web

Wes McKinney and I are hosting our first ever Open Statistical Programming meetup tomorrow night after taking over for Drew Conway. Please attend, have some pizza, enjoy the talk then come out for some beer.

This meetup is about EDA, Visualization and Collaboration on the Web and will be presented by Carlos Scheidegger from AT&T Labs.

This month’s pizza will be from Pizza Mercato in the Village.

# Quote in Wired Magazine

Couldn’t resist showing off this article in Wired Magazine that quotes me. It’s a good take on the new, semi-corporate hacking culture, but then again, I may be a bit biased.

http://www.wired.com/business/2012/06/hackathons-arent-just-for-hacking/

# Lotto Odds

With tonight’s Mega Millions jackpot estimated to be over $640 million there are long lines of people waiting to buy tickets. Of course you always hear about the probability of winning, which is easy enough to calculate: five numbers ranging from 1 through 56 are drawn (without replacement), then a sixth ball is pulled from a set of 1 through 46. That means there are choose(56, 5) * 46 = 175,711,536 possible combinations. That is why people are constantly reminded of how unlikely they are to win.
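That count is easy to verify in R. This is just a quick check of the arithmetic described above, with illustrative variable names.

```r
# Ways to choose 5 distinct numbers from 1 through 56 (order does not matter)
white_combos <- choose(56, 5)

# Each of those can pair with any of the 46 possible sixth balls
total_combos <- white_combos * 46

# Probability that a single ticket hits the jackpot
p_win <- 1 / total_combos

format(total_combos, big.mark = ",")
p_win
```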

But I want to see how likely it is that SOMEONE will win tonight. So let’s break out R and ggplot!

As of this afternoon it was reported (sorry no source) that two tickets were sold for every American. So let’s assume that each of these tickets is an independent Bernoulli trial with probability of success of 1/175,711,536.

Running 1,000 simulations we see the distribution of the number of winners in the histogram above.

So we shouldn’t be surprised if there are multiple winners tonight.

The R code:

```r
library(ggplot2)
# 1,000 simulated drawings of roughly 600 million independent tickets
winners <- rbinom(n=1000, size=600000000, prob=1/175711536)
qplot(winners, geom="histogram", binwidth=1, xlab="Number of Winners")
```
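The simulation can also be sanity-checked analytically. The sketch below reuses the same assumptions (about 600 million independent tickets, each with a 1 in 175,711,536 chance); the variable names are illustrative.

```r
# Assumptions carried over from the post: about 600 million independent
# tickets, each winning with probability 1 in 175,711,536.
tickets <- 600000000
p_win   <- 1 / 175711536

expected_winners <- tickets * p_win      # mean of the binomial
p_no_winner      <- (1 - p_win)^tickets  # every single ticket loses
p_some_winner    <- 1 - p_no_winner      # at least one jackpot ticket

round(c(expected = expected_winners, at_least_one = p_some_winner), 3)
```

With these inputs the expected count is a little over three winners and the chance of at least one winner is roughly 97%, so multiple winners would be no surprise.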

# Pi Cake 2012

This year’s Pi Cake courtesy of Chrissie Cook:

Side View:

And don’t forget this is Albert Einstein’s birthday as well.

How are you celebrating this fantastic geek holiday?

Like last year, the NYC Data Mafia will be out celebrating with (round) pizza.