Today, Google announced two new services that are sure to be loved by data geeks.  First is their BigQuery which lets you analyze “Terabytes of data, trillions of records.”  This is great for people with large datasets.  I wonder if a program like R(my favorite statistical analysis package) can read it?  If so would R just pull down the data like it would from any other database?  That would most likely result in a data.frame that is far too large for a standard computer to handle.  Maybe R can be ran in a way that it hits the BigQuery service and leaves the data in there.  Maybe even the processing can be done on Google’s end, allowing for much better computation time.  This is something I’ve been dreaming of for a while now.

Further, can BigQuery produce graphics?  If so, this might be a real shot at Business Intelligence tools like QlikView or Cognosthat specialize in handling LARGE datasets. Continue reading

Related Posts



Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.

Thanks to Drew Conway for posting this video of me dicussing my thesis (pdf) on NYC pizza.  It was part of the New York R User Meetup on Applications of R.

Related Posts



Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.

Drew Conway has a piece on his Zero Intelligence Agents blog about how well informed Tea Party protesters are about tax policy.  His analysis is pretty technical and he even offers up the R code he used to analyze the data and build the graphs which were made with a package called ggplot2 by Hadley Wickham at Rice University.

More after the break. Continue reading

Related Posts



Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.

I don’t mean to shamelessly self-promote here, but I wanted to note that the Slice story on my pizza paper (pdf) has also been picked up by NBC New York’s food blog, Feast, and by Revolution Computing’s blog.  For people who don’t know, Revolution Computing optimizes R, the language used by a large number of statisticians for computations.

Related Posts



Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.