Drew Conway has a piece on his Zero Intelligence Agents blog about how well informed Tea Party protesters are about tax policy. His analysis is pretty technical, and he even offers up the R code he used to analyze the data and build the graphs. The graphs were made with ggplot2, an R package by Hadley Wickham at Rice University.
Drew readily offers up that the sample was NOT scientifically collected, so these results are less than valid. The cool thing he did was to use a method called bootstrapping, a statistical technique created by Bradley Efron at Stanford. Bootstrapping repeatedly resamples the existing data, with replacement, to generate many new samples, which gives a better sense of how much an estimate could vary.
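To make the idea of bootstrapping a little more concrete, here is a minimal sketch in Python. This is not Drew's R code, and the survey answers below are invented purely for illustration. The function name `bootstrap_means` and the choice of the mean as the statistic are my own assumptions.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=1000, seed=0):
    """Return the mean of each resample drawn with replacement from data."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Each resample has the same size as the original data.
    return [statistics.mean(rng.choices(data, k=n)) for _ in range(n_resamples)]

# Hypothetical survey answers (in percent), invented for illustration only.
responses = [20, 25, 30, 30, 35, 40, 40, 45, 50, 60]

means = sorted(bootstrap_means(responses))

# Rough 95% percentile interval for the mean response.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]

print(f"sample mean: {statistics.mean(responses):.1f}")
print(f"approx. 95% bootstrap interval: {lower:.1f} to {upper:.1f}")
```

The spread of the resampled means is what gives a feel for the uncertainty. Bootstrapping cannot fix a sample that was collected in a biased way, so the caveat about the poll still applies.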
There is a red line in each graph that represents the true value for the question asked, which in this case is how much in taxes is taken out of the national economy and the average taxes paid by a family earning $50,000. The dark shape (curve) is the distribution of answers from the sample (and from the data generated through bootstrapping). The highest point is the most frequent response and lower points are less frequent.
The closer the red line is to the center and tallest part of the curve, the closer the respondents are to reality. Because the red lines are to the far left of the curves, the respondents generally overestimated on these tax questions and thought Americans carried a heavier tax burden than they actually do. However, Drew points out that the red lines are not extremely far from the center of the curves, meaning that the respondents were not too far off as a whole.
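The same reading of the graph can be put into numbers. The sketch below, again in Python with invented answers and an invented "true" value, counts what share of respondents guessed too high and how far the typical answer sits from the truth. The helper `compare_to_truth` is hypothetical and is not part of Drew's analysis.

```python
import statistics

def compare_to_truth(responses, true_value):
    """Summarize how a set of answers sits relative to the true value."""
    over = sum(1 for r in responses if r > true_value)
    return {
        # Fraction of respondents whose answer is above the true value.
        "share_overestimating": over / len(responses),
        # Positive gap means the typical answer is too high.
        "median_gap": statistics.median(responses) - true_value,
    }

# Hypothetical answers and true value (in percent), invented for illustration.
responses = [20, 25, 30, 30, 35, 40, 40, 45, 50, 60]
true_value = 28

summary = compare_to_truth(responses, true_value)
print(summary)
```

A large share of overestimates together with a small median gap would match what the graphs show: the red line sits left of the peak, but not by much.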
It is important to remember that the data came from a non-scientific poll and therefore should not be viewed as authoritative.
It would be interesting to see a similar analysis done for the population as a whole and compare that to this one to see whether or not the Tea Party protesters fall in line with the national average.
Jared Lander is the Chief Data Scientist of Lander Analytics, a New York data science firm; Adjunct Professor at Columbia University; Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences; and author of R for Everyone.