<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jared Lander</title>
	<atom:link href="http://www.jaredlander.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jaredlander.com</link>
	<description></description>
	<lastBuildDate>Wed, 08 Feb 2012 22:02:02 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Another Kind of Super Bowl Pool</title>
		<link>http://www.jaredlander.com/2012/02/another-kind-of-super-bowl-pool/</link>
		<comments>http://www.jaredlander.com/2012/02/another-kind-of-super-bowl-pool/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 22:02:02 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[New York]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Sports]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=791</guid>
		<description><![CDATA[
Shortly after the Giants&#8217; fantastic defeat of the Patriots in Super Bowl XLVI (I was a little disappointed that Eli, Coughlin and the Vince Lombardi Trophy all got off the parade route early and the views of City Hall were obstructed by construction trailers, but Steve Weatherford was awesome as always) a friend asked me ]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Giants-Parade.jpg"><img class="aligncenter size-full wp-image-799" title="Giants Parade" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Giants-Parade.jpg" alt="" width="480" height="360" /></a></p>
<p>Shortly after the <a href="http://www.giants.com/">Giants</a>&#8217; <a href="http://www.myfoxny.com/dpp/news/new-york-giants-super-bowl-victory-parade-2012">fantastic</a> <a href="http://pleasantville.patch.com/articles/photos-big-blue-fan-demonium-at-giants-parade#photo-9061098">defeat</a> of the <a href="http://twitpic.com/8gox3t">Patriots</a> in <a href="http://www.jaredlander.com/2012/02/football-score-distributions/">Super Bowl XLVI</a> (I was a little disappointed that Eli, Coughlin and the Vince Lombardi Trophy all got off the parade route early and the views of City Hall were obstructed by construction trailers, but Steve Weatherford was <a href="http://www.sportsgrid.com/nfl/steve-weatherford-giants-parade/">awesome</a> as always) a friend asked me to settle a debate amongst some people in a Super Bowl pool.</p>
<p>He writes:</p>
<blockquote><p>We have 10 participants in a superbowl pool.  The pool is a “pick the player who scores first” type pool.  In a hat, there are 10 Giants players.  Each participant picks 1 player out of the hat (in no particular order) until the hat is emptied.  Then 10 Patriots players go in the hat and each participant picks again.</p>
<p>In the end, each of the 10 participants has 1 Giants player and 1 Patriots player.  No one has any duplicate players as 10 different players from each team were selected.  Pool looks as follows:</p></blockquote>
<table>
<tbody>
<tr>
<td>Participant 1</td>
<td>Giant A</td>
<td>Patriot Q</td>
</tr>
<tr>
<td>Participant 2</td>
<td>Giant B</td>
<td>Patriot R</td>
</tr>
<tr>
<td>Participant 3</td>
<td>Giant C</td>
<td>Patriot S</td>
</tr>
<tr>
<td>Participant 4</td>
<td>Giant D</td>
<td>Patriot T</td>
</tr>
<tr>
<td>Participant 5</td>
<td>Giant E</td>
<td>Patriot U</td>
</tr>
<tr>
<td>Participant 6</td>
<td>Giant F</td>
<td>Patriot V</td>
</tr>
<tr>
<td>Participant 7</td>
<td>Giant G</td>
<td>Patriot W</td>
</tr>
<tr>
<td>Participant 8</td>
<td>Giant H</td>
<td>Patriot X</td>
</tr>
<tr>
<td>Participant 9</td>
<td>Giant I</td>
<td>Patriot Y</td>
</tr>
<tr>
<td>Participant 10</td>
<td>Giant J</td>
<td>Patriot Z</td>
</tr>
</tbody>
</table>
<blockquote><p>Winners = First Player to score wins half the pot.  First player to score in 2nd half wins the remaining half of the pot.</p>
<p>The question is, what are the odds that someone wins <strong>Both </strong>the 1st and 2nd half.  Remember, the picks were random.</p></blockquote>
<p>Before anyone asks about the <a href="http://content.usatoday.com/communities/gameon/post/2012/02/tom-bradys-super-bowl-safety-paid-50000-for-gambler/1">safety</a>, one of the slots was for Special Teams/Defense.</p>
<p>There are two probabilistic ways of thinking about this.  Both hinge on two facts: who scores first in one half is independent of who scores first in the other, and the two events are not mutually exclusive, so the same player can score first in both halves.  Both also assume, for simplicity, that each of the 20 players is equally likely to score first.</p>
<p>First, let&#8217;s look at the two halves individually.  In a given half any of 20 players can score first (10 from the Giants and 10 from the Patriots) and an individual participant can win with two of those.  So a participant has a 2/20 = 1/10 chance of winning a half.  Thus that participant has a (1/10) * (1/10) = 1/100 chance of winning both halves.  Since there are 10 participants, and no two of them can sweep the same game, the individual probabilities add up to an overall probability of 10 * (1/100) = 1/10 that some participant wins both halves.</p>
<p>The other way is to think a little more combinatorially.  There are 20 * 20 = 400 different combinations of players scoring first in each half.  A participant has two players, each of which is valid for each half, giving that participant four of the possible combinations and a 4 / 400 = 1/100 probability of winning both halves.  Again, there are 10 participants, giving an overall 10% chance that some participant wins both halves.</p>
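<p>As a quick check on both arguments, the 400 equally likely outcomes can be enumerated directly.  This is only a minimal sketch: the player numbering (1 through 10 for the Giants, 11 through 20 for the Patriots) and the fixed assignment of participant k to players k and k + 10 are assumptions made purely for illustration.  Because the draw from the hat is random, any fixed assignment gives the same count.</p>
<pre># hypothetical numbering: players 1-10 are Giants, 11-20 are Patriots
# participant k owns player k and player k + 10
owner &lt;- rep(1:10, times=2)
# every ordered pair of (first-half scorer, second-half scorer): 20 * 20 = 400 rows
outcomes &lt;- expand.grid(first=1:20, second=1:20)
# TRUE when one participant owns both scorers
sameOwner &lt;- owner[outcomes$first] == owner[outcomes$second]
sum(sameOwner)     # 40 of the 400 outcomes
mean(sameOwner)    # 0.1</pre>
<p>Each of the 10 participants accounts for four of the 40 matching outcomes, which is the same 4 / 400 counted above.</p>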
<p>Since both methods agreed I am pretty confident in the results, but just in case I ran some simulations in <a href="http://www.jaredlander.com/tag/r/">R</a> which you can find after the break.</p>
<p><span id="more-791"></span>For the simulation I built a function that randomly assigned a player from the Giants and a player from the Patriots to each participant, then randomly chose&#8211;from those 20 players&#8211;someone to score first and someone to score second, allowing the two to be the same.  If the two scorers belonged to the same participant the function returned 1, otherwise 0.  The code is below.</p>
<pre>runGame &lt;- function(participants, pick1, pick2)
{
    # build data.frame assigning a random selection from each team to each participant
    choices &lt;- data.frame(Participants=participants, Pick1=sample(x=pick1, size=length(pick1), replace=FALSE), Pick2=sample(x=pick2, size=length(pick2), replace=FALSE))
    # randomly pick a player to score first
    firstScore &lt;- sample(c(pick1, pick2), 1)
    # randomly pick a player to score second
    secondScore &lt;- sample(c(pick1, pick2), 1)
    # see if the players who scored were both "owned" by the same participant
    if(max(which(choices$Pick1 == firstScore), which(choices$Pick2 == firstScore)) == max(which(choices$Pick1 == secondScore), which(choices$Pick2 == secondScore)))
    {
        return(1)
    }else
    {
        return(0)
    }
}</pre>
<p>A second function runs the simulation as many times as desired.</p>
<pre>runSim &lt;- function(participants, pick1, pick2, n=10000)
{
    # vector to hold the result (1 or 0) of each of the n simulated games
    results &lt;- rep(NA, n)

    for(i in 1:n)
    {
        results[i] &lt;- runGame(participants=participants, pick1=pick1, pick2=pick2)
    }
    # return the raw results; mean(results) gives the share of games swept by one participant
    return(results)
}</pre>
<p>Running this simulation 10,000 times, I found that the average rate for the same participant to win both parts of the pool was .1041 with a standard error of .003.</p>
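<p>The post does not show how the rate and its standard error were computed, so the following is a hedged sketch rather than the exact calculation used.  It assumes the standard error is the usual one for a proportion, sqrt(p * (1 - p) / n), and it replaces the call to runSim with a self-contained stand-in that draws the two scorers directly under a hypothetical fixed assignment (participant k owns players k and k + 10).  The seed is arbitrary, so the numbers it produces will differ slightly from those reported.</p>
<pre>set.seed(46)    # arbitrary seed, only for reproducibility
n &lt;- 10000
# hypothetical fixed assignment: participant k owns players k and k + 10
owner &lt;- rep(1:10, times=2)
# draw the first scorer of each half for all n games at once
firstScore &lt;- sample(20, size=n, replace=TRUE)
secondScore &lt;- sample(20, size=n, replace=TRUE)
# 1 if one participant owns both scorers, 0 otherwise (stand-in for the runSim output)
results &lt;- as.integer(owner[firstScore] == owner[secondScore])

# estimated rate and its standard error, treating each game as a Bernoulli trial
pHat &lt;- mean(results)
stdErr &lt;- sqrt(pHat * (1 - pHat) / n)
c(rate=pHat, se=stdErr)</pre>
<p>Plugging the reported rate of .1041 and n = 10,000 into the same formula gives sqrt(.1041 * .8959 / 10000), or roughly .003, which is consistent with the standard error quoted above.</p>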
<p>So, the simulations confirm the probability theory that about 10% of the time the same participant will win both sections of the pool, pissing off all of his friends.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2012/02/another-kind-of-super-bowl-pool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Football Score Distributions</title>
		<link>http://www.jaredlander.com/2012/02/football-score-distributions/</link>
		<comments>http://www.jaredlander.com/2012/02/football-score-distributions/#comments</comments>
		<pubDate>Sun, 05 Feb 2012 05:54:16 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[New York]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Football]]></category>
		<category><![CDATA[ggplot]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Sports]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=766</guid>
		<description><![CDATA[
With the Super Bowl only hours away now is your last chance to buy your boxes.  Assuming the last digits are not assigned randomly you can maximize your chances with a little analysis.  While I&#8217;ve seen plenty of sites giving the raw numbers, I thought a little visualization was in order.
In the graph above (made ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Last-Digit-Distribution1.png"><img class="aligncenter size-full wp-image-783" title="Last Digit Distribution" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Last-Digit-Distribution1.png" alt="" width="473" height="474" /></a></p>
<p style="text-align: left;"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Last-Digit-Distribution1.png"></a>With the <a href="http://www.nfl.com/superbowl/46">Super Bowl</a> only hours away now is your last chance to buy your <a href="http://www.printyourbrackets.com/superbowl100squares.html">boxes</a>.  Assuming the last digits are not assigned randomly you can maximize your chances with a <a href="http://statsinthewild.wordpress.com/2012/02/02/super-bowl-squares/">little analysis</a>.  While I&#8217;ve seen plenty of sites giving the raw numbers, I thought a little visualization was in order.</p>
<p style="text-align: left;">In the graph above (made using <a href="http://www.jaredlander.com/tag/ggplot/">ggplot2</a> in <a href="http://www.jaredlander.com/tag/r/">R</a>, of course) the bigger squares represent greater frequency.  The axes are labelled &#8220;Home&#8221; and &#8220;Away&#8221; for orientation, but in the Super Bowl that probably doesn&#8217;t matter too much, especially considering that Indianapolis is <a href="http://profootballtalk.nbcsports.com/2012/01/30/tom-coughlin-thinks-colts-fans-will-become-giants-fans-this-week/">(Peyton) Manning</a> territory so the locals will most likely be rooting for the Giants.  Further, I believe <a href="http://www.nfl.com/superbowl/42">Super Bowl XLII</a>, featuring the same two teams, had a disproportionate number of Giants fans.  Bias disclaimer:  <a href="http://www.giants.com/">GO BIG BLUE!!!</a></p>
<p style="text-align: left;"><a href="http://www.giants.com/"></a>Below is the same graph broken down by year to see how the distribution has changed over the past 20 years.<br />
<a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Last-Digit-Distribution-by-Year5.png"><img class="aligncenter size-full wp-image-781" title="Last Digit Distribution by Year" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2012/02/Last-Digit-Distribution-by-Year5.png" alt="" width="473" height="474" /></a></p>
<p style="text-align: left;">All the data was scraped from <a href="http://www.pro-football-reference.com/">Pro Football Reference</a>.  All of my code and other graphs that didn&#8217;t make the cut are at my <a href="https://github.com/jaredlander/FootballScores">github site</a>.</p>
<p style="text-align: left;">As always, send any questions my way.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2012/02/football-score-distributions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Plumbing</title>
		<link>http://www.jaredlander.com/2011/12/plumbing/</link>
		<comments>http://www.jaredlander.com/2011/12/plumbing/#comments</comments>
		<pubDate>Mon, 05 Dec 2011 17:51:29 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Food]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=754</guid>
		<description><![CDATA[Taking a break from my normal exposition on stats, New York or pizza I&#8217;d like to espouse the wonders of baking soda and vinegar!
My sink was clogged, not with anything specific, but just years worth of gunk.  So after scraping out what I could with my hands and a wire hanger&#8211;and wanting to avoid caustic ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/12/IMG_6278.jpg"><img class="aligncenter size-medium wp-image-758" title="IMG_6278" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/12/IMG_6278-300x225.jpg" alt="" width="300" height="225" /></a>Taking a break from my normal exposition on <a href="http://www.jaredlander.com/category/statistics/" target="_self">stats</a>, <a href="http://www.jaredlander.com/category/new-york/" target="_self">New York</a> or <a href="http://www.jaredlander.com/category/pizza/" target="_self">pizza</a> I&#8217;d like to espouse the wonders of baking soda and vinegar!</p>
<p>My sink was clogged, not with anything specific, but just years&#8217; worth of gunk.  So after scraping out what I could with my hands and a wire hanger&#8211;and wanting to avoid caustic chemicals like <a href="http://www.drano.com/en-US/Pages/Home.aspx" target="_self">Drano</a>&#8211;I searched the Internet to see if <a href="http://www.listerine.com/" target="_self">Listerine</a> or <a href="http://www.coca-cola.com/" target="_self">Coca-Cola</a> might do the trick.  But <a href="http://home.howstuffworks.com/home-improvement/plumbing/how-to-unclog-a-drain.htm" target="_self">extensive</a> <a href="http://www.motherearthnews.com/Do-It-Yourself/2007-12-01/How-to-Unclog-Drains-Without-Chemicals.aspx?page=2" target="_self">searching</a> <a href="http://bonzaiaphrodite.com/2009/06/natural-homemade-drain-o-or-how-to-unclog-without-chemicals/" target="_self">led</a> me to <a href="http://en.wikipedia.org/wiki/Sodium_bicarbonate" target="_self">baking soda</a> and <a href="http://en.wikipedia.org/wiki/Vinegar" target="_self">vinegar</a>.</p>
<p>It&#8217;s very simple:  Stuff a half cup of <a href="http://www.armandhammer.com/deodorization/baking-soda/landing.aspx" target="_self">baking soda</a> into the drain then pour a half cup of <a href="http://www.heinzvinegar.com/products-distilled-white-vinegar.aspx">vinegar</a> down it, return the sink stopper and wait 15 minutes.  Then pour down another half cup of vinegar, close the stopper and wait another 15 minutes.  After that pour a gallon (a tea kettle&#8217;s worth) of boiling water down the drain and you&#8217;re done!  Not only will it unclog your drain, it leaves all the chrome shining like new!</p>
<p>For those of us who never got to make a <a href="http://www.volcanolive.com/model.html">model volcano</a> in science class it was really awesome watching the baking soda and vinegar react.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/12/plumbing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cell Phone Tracking for Disaster Relief</title>
		<link>http://www.jaredlander.com/2011/09/cell-phone-tracking-for-disaster-relief/</link>
		<comments>http://www.jaredlander.com/2011/09/cell-phone-tracking-for-disaster-relief/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 03:11:29 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[cell phone]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[NYC Data Mafia]]></category>
		<category><![CDATA[phone]]></category>
		<category><![CDATA[Richard Garfield]]></category>
		<category><![CDATA[spatial]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=739</guid>
		<description><![CDATA[A new study, reported in the New York Times, tracked population movements in post-earthquake Haiti using cell phone data.  The article grabbed my attention because one of the authors, Richard Garfield (whom I have done numerous projects with and who has his own Wikipedia entry!), had told me about this very study just a few ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/09/Nokia-1100.jpg"><img class="alignleft size-medium wp-image-748" title="Nokia 1100" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/09/Nokia-1100-135x300.jpg" alt="" width="135" height="300" /></a>A new <a href="http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.1001083;jsessionid=BA6AEC2DC467EFD46AC4488256EA1829.ambra02">study</a>, reported in the <a href="http://www.nytimes.com/2011/09/06/health/06global.html?_r=1&amp;src=rechp">New York Times</a>, tracked population movements in <a href="http://www.msnbc.msn.com/id/34835478/ns/world_news-haiti/t/haiti-earthquake-how-help/#.TmgsF2iF_ug">post-earthquake Haiti</a> using cell phone data.  The article grabbed my attention because one of the authors, <a href="http://sklad.cumc.columbia.edu/nursing/newFacProfiles/profile2.php?uni=rmg3">Richard Garfield</a> (whom I have done numerous <a href="http://www.jaredlander.com/projects/">projects</a> with and who has his own <a href="http://en.wikipedia.org/wiki/Richard_Garfield_(nursing_professor)">Wikipedia entry</a>!), had told me about this very study just a few months ago.</p>
<p>Over dinner in New York&#8217;s <a href="http://chowhound.chow.com/topics/257037">Little India</a> he explained how the largest cell phone company in Haiti provided him with anonymized cell tower records.  As <a href="http://www.tuaw.com/2011/04/25/a-roundup-of-todays-locationgate-news/">many</a> <a href="http://nymag.com/daily/intel/2011/04/how_you_feel_about_smartphones.html">people</a> are <a href="http://www.bgr.com/2011/04/22/google-our-smartphone-location-tracking-is-opt-in/">aware</a>, cell phones&#8211;even those <a href="http://www.gsmarena.com/nokia_6230-566.php">without GPS</a>&#8211;report their locations back to cell towers at regular intervals.  By tracking the daily position of the phones before and after the earthquake they were able to determine that 20% of Port-au-Prince&#8217;s population had left the capital within 19 days of the disaster.</p>
<p>They used plenty of solid math in the analysis and amazingly did it all without resorting to spatial statistics.  They have some nice map-based visualizations but I&#8217;ve been meaning to get the data from Dr. Garfield so I can attempt something similar to the amazing work done by the <a href="http://www.jaredlander.com/?s=data+mafia&amp;x=0&amp;y=0">NYC Data Mafia</a> on the <a href="http://bits.blogs.nytimes.com/2010/08/18/visualizing-the-wikileaks-war-logs/">WikiLeaks Afghanistan data</a>.  Though I don&#8217;t promise anything nearly as good.</p>
<p>It is also worth noting that they did this at a fraction of the cost and time of an extensive <a href="http://www.nytimes.com/2011/09/06/health/06global.html?_r=1&amp;src=rechp">UN</a> survey.  That survey only had about 2,500 respondents whereas the cell phone project incorporated around 1.9 million people without them spending valuable time with an interviewer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/09/cell-phone-tracking-for-disaster-relief/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to Succeed in Scrabble</title>
		<link>http://www.jaredlander.com/2011/06/how-to-succeed-in-scrabble/</link>
		<comments>http://www.jaredlander.com/2011/06/how-to-succeed-in-scrabble/#comments</comments>
		<pubDate>Mon, 27 Jun 2011 13:00:40 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[brute force]]></category>
		<category><![CDATA[games]]></category>
		<category><![CDATA[plyr]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[scrabble]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=693</guid>
		<description><![CDATA[

While playing Words with Friends my randomly chosen opponent played &#8220;radiale&#8221; as her first word.  Since that used up all of her tiles, she received a bonus on top of all the points the word itself got, resulting in a one-move score of 53 points!  Rather than being impressed I was upset at the large ]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/06/Scrabble-2.png"></a><br />
<a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/06/Scrabble-11.png"><img class="aligncenter size-medium wp-image-699" title="First Scrabble Move" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/06/Scrabble-11-e1309141834282-252x300.png" alt="" width="252" height="300" /></a></p>
<p>While playing <a href="http://www.wordswithfriends.com/">Words with Friends</a> my randomly chosen opponent played &#8220;radiale&#8221; as her first word.  Since that used up all of her tiles, she received a bonus on top of all the points the word itself got, resulting in a one-move score of 53 points!  Rather than being impressed I was upset at the large deficit I would have to overcome.</p>
<p>To combat this I did what comes naturally:  I wrote an <a href="http://www.jaredlander.com/tag/r/">R</a> script to find the perfect word!</p>
<p>Needing to combine my seven letters with one of her letters, there were two routes I could take.  The first would be, for each combination of my seven letters and one of hers, to find all 40,320 (8!) permutations and then hit dictionary.com to see whether each one is a real word, for a total of 282,240 (8! * 7) HTTP calls.  That seemed a bit excessive and impractical so I moved on to the next idea.</p>
<p>So the first thing I did was pull a list of common <a href="http://www.poslarchive.com/math/scrabble/lists/common-8.html">eight-letter words</a>.  Then for each combination of my letters and one of hers (only 7 iterations) I checked if those letters (in any order) matched the letters in any of the possible words.  Once a match was found, the counts of each letter were compared, and if those agreed the word was recorded as a true match.</p>
<p>The algorithm took about 17 seconds to run and found me one possible word for my letters combined with one of hers:  &#8220;headrace&#8221;, for 63 points!  Perhaps I should have been able to figure that out on my own, but where would be the fun in that?  Find the code after the break.</p>
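<p>The same matching idea can be sketched more compactly by sorting letters.  This is not the script used here (which compares per-letter counts with plyr); it swaps in a different technique, where two strings are anagrams exactly when their sorted letters are identical.  The helper name sortLetters and the three-word candidate list are hypothetical stand-ins for illustration; the real run used the full eight-letter word list.</p>
<pre><code lang="r"># sort a word's letters into a canonical signature; anagrams share the same signature
sortLetters &lt;- function(word)
{
    paste(sort(strsplit(word, split="")[[1]]), collapse="")
}

myLetters &lt;- "ercaehd"
theirLetters &lt;- "radiale"
# hypothetical three-word stand-in for the full eight-letter word list
candidates &lt;- c("headrace", "charades", "radiates")
signatures &lt;- vapply(candidates, sortLetters, character(1), USE.NAMES=FALSE)

# try each distinct letter of their word as the eighth tile
extraLetters &lt;- unique(strsplit(theirLetters, split="")[[1]])
matches &lt;- unlist(lapply(extraLetters, function(extra)
{
    # keep the candidates that are an exact anagram of my seven letters plus this one
    candidates[signatures == sortLetters(paste(myLetters, extra, sep=""))]
}))
matches    # "headrace"</code></pre>
<p>Sorting handles the letter counts automatically, so no separate count comparison is needed, and taking the distinct letters of the opponent&#8217;s word avoids testing the repeated &#8220;a&#8221; twice.</p>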
<p style="text-align: center;"><img class="aligncenter" title="Second Scrabble Move" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/06/Scrabble-2-e1309142224266-256x300.png" alt="" width="256" height="300" /></p>
<div>
<p><span style="color: #0000ee; -webkit-text-decorations-in-effect: underline;"><span id="more-693"></span></span></p>
<pre><code lang="r">require(RCurl)
require(XML)
require(plyr)

myLetters &lt;- "ercaehd"
theirLetters &lt;- "radiale"

wordSite &lt;- getURL("http://www.poslarchive.com/math/scrabble/lists/common-8.html")  # get page with words

# parse out the page
wordsParsed &lt;- htmlTreeParse(wordSite, handlers=list("pre"=function(x, ...){ x }, "p"=function(x, ...){ NULL },
                                                    "head"=function(x, ...){ NULL }), asTree=TRUE)

# get just the list of words
words &lt;- as.character(wordsParsed$children$html[["body"]][["pre"]][["text"]])
words &lt;- gsub("\n", " ", words)  # sub out newline characters for spaces

## Turn the block of words into a nice list, each element of which is a vector of characters
words &lt;- strsplit(words, split=" ")[[6]]    # split the long string into a vector separating by spaces
words &lt;- as.list(words)     # change from vector to list
words &lt;- lapply(words, strsplit, split="")  # split each element of the list into a vector of its characters

findAllWords &lt;- function(baseLetters, theirWord, listOfWords)
{
    extraLetters &lt;- strsplit(theirWord, split="")[[1]]   # split their word into letters, probably should have removed redundant letters first for a slight efficiency gain

    wordSuccess &lt;- vector("list", length(extraLetters))

    for(a in 1:length(extraLetters))
    {
        seedLetters &lt;- paste(baseLetters, extraLetters[a], sep="")
        #seedLetters &lt;- strsplit(seedLetters, split="")[[1]]

        wordSuccess[[a]] &lt;- checkForWord(words=listOfWords, letters=seedLetters)
    }

    return(wordSuccess)
}

checkForWord &lt;- function(words, letters)
{
    holder &lt;- vector("list", length(words)) # to hold results
    letters &lt;- strsplit(letters, split="")[[1]]  # split up letters
    letterFrame &lt;- data.frame(Letters=letters, stringsAsFactors=FALSE)  # to facilitate counting
    letterCount &lt;- ddply(letterFrame, .(Letters), nrow)   # counts for each letter
    rm(letterFrame); gc()
    names(letterCount)[2] &lt;- "LetterCount"

    ## loop through every word in the list and see if our letters are found in there
    for(i in 1:length(words))
    {
        if(all(letters %in% words[[i]][[1]]))   # if all the letters are found in the word
        {
            # check counts
            wordFrame &lt;- data.frame(Letters=words[[i]][[1]], stringsAsFactors=FALSE)    # put letters from word into DF
            wordCount &lt;- ddply(wordFrame, .(Letters), nrow) # get counts
            rm(wordFrame); gc()
            names(wordCount)[2] &lt;- "WordLetterCount"

            joinedLetters &lt;- join(letterCount, wordCount, by="Letters") # join the tables for a comparison

            holder[[i]] &lt;- all(with(joinedLetters, LetterCount == WordLetterCount)) # holder gets the result of the comparison
        }else
        {
            holder[[i]] &lt;- FALSE
        }
    }

    holder &lt;- laply(holder, function(x) x) # convert to vector

    if(sum(holder) == 0)
    {
        return(sum(holder))
    }else
    {
        return(which(holder == TRUE))
    }
}

system.time(result &lt;- findAllWords(baseLetters=myLetters, theirWord=theirLetters, listOfWords=words))</code></pre>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/06/how-to-succeed-in-scrabble/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The FBI Needs Your Help</title>
		<link>http://www.jaredlander.com/2011/03/the-fbi-needs-your-help/</link>
		<comments>http://www.jaredlander.com/2011/03/the-fbi-needs-your-help/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 03:06:48 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Cryptography]]></category>
		<category><![CDATA[FBI]]></category>
		<category><![CDATA[NSA]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=689</guid>
		<description><![CDATA[The FBI has put out a public request for help cracking a code.  The code above was found in the pants of a murder victim over 10 years ago.  Despite some of the best code breakers in the world giving it a shot, they have not been able to break the code.  I wonder if ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/FBI-Puzzle.bmp"><img class="aligncenter size-full wp-image-690" title="FBI Puzzle" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/FBI-Puzzle.bmp" alt="" /></a>The <a href="http://www.fbi.gov">FBI</a> has put out a public request for <a href="http://www.fbi.gov/news/stories/2011/march/cryptanalysis_032911">help cracking a code</a>.  The code above was found in the pants of a murder victim over 10 years ago.  Despite some of the best code breakers in the world giving it a shot, they have not been able to break the code.  I wonder if the <a href="http://www.nsa.gov/">NSA</a> had a go at it.  Couldn&#8217;t they try brute force like in <a href="http://www.danbrown.com">Dan Brown&#8217;s</a> <em><a href="http://www.amazon.com/Digital-Fortress-Thriller-Dan-Brown/dp/0312263120">Digital Fortress</a></em>?  Yes, I referenced Dan Brown in the same paragraph as the NSA; deal with it.</p>
<p>If you think you can help, send a letter to:</p>
<p>FBI Laboratory<br />
Cryptanalysis and Racketeering Records Unit<br />
2501 Investigation Parkway<br />
Quantico, VA 22135<br />
Attn: Ricky McCormick Case</p>
<p>There&#8217;s no reward but you&#8217;d be helping your country.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/03/the-fbi-needs-your-help/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Pi Day Photos</title>
		<link>http://www.jaredlander.com/2011/03/pi-day-photos/</link>
		<comments>http://www.jaredlander.com/2011/03/pi-day-photos/#comments</comments>
		<pubDate>Tue, 15 Mar 2011 16:53:17 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Food]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[New York]]></category>
		<category><![CDATA[Pizza]]></category>
		<category><![CDATA[Artichoke]]></category>
		<category><![CDATA[Drew Conway]]></category>
		<category><![CDATA[Harlan Harris]]></category>
		<category><![CDATA[High Line]]></category>
		<category><![CDATA[John]]></category>
		<category><![CDATA[Mike Dewar]]></category>
		<category><![CDATA[NYC Data Mafia]]></category>
		<category><![CDATA[Pi Day]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=675</guid>
		<description><![CDATA[As mentioned earlier, yesterday was Pi Day so a bunch of statisticians and other such nerds celebrated at the new(ish) Artichoke Basille near the High Line.  We had three pies:  the signature Artichoke, the Margherita and the Anchovy, which was delicious but only some of us ate.  And of course we had our custom cake ]]></description>
			<content:encoded><![CDATA[<div id="attachment_676" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/pi_day_2011_big-01.jpg"><img class="size-medium wp-image-676" title="pi_day_2011_big 01" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/pi_day_2011_big-01-300x200.jpg" alt="" width="300" height="200" /></a>
<p class="wp-caption-text">Pi Day Celebrants</p>
</div>
<p style="text-align: left;"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/pi_day_2011_big-02.jpg"></a>As mentioned earlier, yesterday was <a href="http://www.jaredlander.com/2011/03/happy-pi-day/">Pi Day</a> so a bunch of statisticians and other such nerds celebrated at the new(ish) <a href="http://www.jaredlander.com/2010/09/the-new-artichoke/">Artichoke Basille</a> near the <a href="http://www.thehighline.org/">High Line</a>.  We had three pies:  the signature Artichoke, the Margherita and the Anchovy, which was delicious but only some of us ate.  And of course we had our custom cake from <a href="https://www.facebook.com/ChrissieCookCakes">Chrissie Cook</a>.</p>
<p>The photos were taken by <a href="http://www.jaredlander.com/tag/john/">John</a>.</p>
<p style="text-align: center;">
<div class="wp-caption aligncenter" style="width: 210px"><img title="pi_day_2011_big 02" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/pi_day_2011_big-02-200x300.jpg" alt="" width="200" height="300" />
<p class="wp-caption-text">Pi Cake 2011</p>
</div>
<div id="attachment_683" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/pi_day_2011_big-031.jpg"><img class="size-medium wp-image-683" title="pi_day_2011_big 03" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/pi_day_2011_big-031-300x200.jpg" alt="NYC Data Mafia" width="300" height="200" /></a>
<p class="wp-caption-text">NYC Data Mafia</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/03/pi-day-photos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Happy Pi Day! (Update: Sounds of Pi Video)</title>
		<link>http://www.jaredlander.com/2011/03/happy-pi-day/</link>
		<comments>http://www.jaredlander.com/2011/03/happy-pi-day/#comments</comments>
		<pubDate>Mon, 14 Mar 2011 21:10:19 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Food]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Pizza]]></category>
		<category><![CDATA[Albert Einstein]]></category>
		<category><![CDATA[Cake]]></category>
		<category><![CDATA[Chrissie Cook]]></category>
		<category><![CDATA[Drew Conway]]></category>
		<category><![CDATA[NYC Data Mafia]]></category>
		<category><![CDATA[Pi]]></category>
		<category><![CDATA[Pi Day]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=665</guid>
		<description><![CDATA[Happy Pi Day everybody!  I&#8217;ll be out celebrating with the rest of the NYC Data Mafia eating pizza and devouring the above Pi Cake, custom baked by Chrissie Cook.
Today is also Albert Einstein&#8217;s birthday so there are plenty of reasons to have fun.
The cake below was my first ever Pi Cake in what is sure ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/Pi-Cake-2011.jpg"><img class="aligncenter size-medium wp-image-666" title="Pi Cake 2011" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/Pi-Cake-2011-300x225.jpg" alt="Pi Cake" width="300" height="225" /></a>Happy Pi Day everybody!  I&#8217;ll be out celebrating with the rest of the <a href="http://www.jaredlander.com/2010/12/nyc-data-mafia-t-shirts/">NYC Data Mafia</a> eating pizza and devouring the above Pi Cake, custom baked by <a href="https://www.facebook.com/ChrissieCookCakes">Chrissie Cook</a>.</p>
<p>Today is also <a href="http://nobelprize.org/nobel_prizes/physics/laureates/1921/einstein-bio.html">Albert Einstein&#8217;s</a> birthday so there are plenty of reasons to <a href="http://news.cnet.com/8301-17938_105-20043006-1.html">have</a> <a href="http://www.google.com/hostednews/afp/article/ALeqM5gzaRyXRtY8cf0Z5jEl4JtKFlyRHQ?docId=CNG.9cac656ee218c88029a4490458898142.791">fun</a>.</p>
<p>The cake below was my first ever Pi Cake in what is sure to become an annual tradition.</p>
<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/IMG_1325.jpg"><img class="aligncenter size-medium wp-image-667" title="Pi Cake 2009" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/03/IMG_1325-300x225.jpg" alt="Pi Cake 2009" width="300" height="225" /></a></p>
<p><strong>Update:</strong>  Drew Conway does far more <a href="http://www.drewconway.com/zia/?p=2667">justice</a> to our fair, irrational, transcendental number.</p>
<p><strong>Update 2:</strong>  Engadget posted this awesome video of &#8220;<a href="http://www.engadget.com/2011/03/14/its-pi-day-do-you-know-what-3-1415926535897932384626433832795/">What Pi Sounds Like.</a>&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/03/happy-pi-day/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Texting Patterns</title>
		<link>http://www.jaredlander.com/2011/02/texting-patterns/</link>
		<comments>http://www.jaredlander.com/2011/02/texting-patterns/#comments</comments>
		<pubDate>Tue, 22 Feb 2011 15:37:44 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[Statistics]]></category>
		<category><![CDATA[ggplot]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[Poisson]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Regression]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=568</guid>
		<description><![CDATA[


Fig. 1: This graph shows received and sent text messages by month. Notice the spike in July 2010.


A few weeks ago my iPhone for some reason erased ALL of my previous text messages (SMS and MMS) and it was as if I was starting with a new phone. After doing some digging I discovered that ]]></description>
			<content:encoded><![CDATA[
<dl id="attachment_582" class="wp-caption aligncenter" style="width: 310px;">
<dt class="wp-caption-dt"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Texts-by-Month.png"><img class="size-medium wp-image-582  " title="Texts by Month" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Texts-by-Month-300x299.png" alt="This graph shows received and sent texts by month.  Notice the spike in July 2010." width="300" height="299" /></a></dt>
<dd class="wp-caption-dd">Fig. 1: This graph shows received and sent text messages by month. Notice the spike in July 2010.</dd>
</dl>
<p style="text-align: left;">
<p style="text-align: left;">A few weeks ago my iPhone for some reason erased ALL of my previous text messages (SMS and MMS) and it was as if I was starting with a new phone. After doing some digging I discovered that each time you <a href="http://support.apple.com/kb/ht1766">sync</a> your iPhone a copy of its text message database is <a href="http://osxdaily.com/2009/09/11/iphone-backup-location/">saved</a> on your computer which can be <a href="iphone-backup-encoded-names">accessed</a> without jailbreaking.    </p>
<p style="text-align: left;">My original intent was to take the old database and union it with the new database for all the texting I had done since then, thus restoring all of my text messages. But once I got into the <a href="http://www.ch-werner.de/sqliteodbc/">SQLite</a> database I realized that I had a ton of information on my hands that was begging to be analyzed. It also didn&#8217;t hurt that I was in a lovely but small Vermont town for the week without much else to do at night.    </p>
<p style="text-align: left;">My first finding, as seen above, is that my text messaging spiked after my girlfriend and I broke up around July of last year. Notice that for both years there is a dip in December. That&#8217;s because in 2009 I was in <a href="http://www.aseanhtf.org/periodicreview3_report.html">Burma</a> during December and for 2010 the data stopped on December 6th when the last backup was made. A simple t-test confirmed that my texting did indeed increase after the breakup.    </p>
<div id="attachment_575" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Texts-by-Month-and-Gender.png"><img class="size-medium wp-image-575  " title="Texts by Month and Gender" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Texts-by-Month-and-Gender-300x299.png" alt="" width="300" height="299" /></a>
<p class="wp-caption-text">Fig. 2: This graph shows my text messaging pattern over time for both men and women. Notice the crossover around August 2010.</p>
</div>
<p>  </p>
<p style="text-align: left;">More interestingly, before my girlfriend and I broke up last year I texted more men than women, but shortly after we broke up that flipped. I don&#8217;t think that needs much of an explanation. The above graph and further analysis exclude her and family members because they would bias the gender effect. Being a good statistician I ran a Poisson regression to see if there really was a significant change. The <a href="http://www.jaredlander.com/2010/10/coefficient-plot/">coefficient plot</a> below (which is on the logarithmic scale) shows that my texting with males increased after the breakup (or Epoch) by 74% (calculated by summing the coefficients for &#8220;Epoch&#8221;, &#8220;Male&#8221; and &#8220;Male:Epoch&#8221; and then exponentiating) while my texting with females increased 127%.    </p>
<p style="text-align: center;">
<div id="attachment_585" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Coefficient-Plot-for-Gender-and-Epoch.png"><img class="size-medium wp-image-585   " title="Coefficient Plot for Gender and Epoch" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Coefficient-Plot-for-Gender-and-Epoch-300x299.png" alt="" width="300" height="299" /></a>
<p class="wp-caption-text">Fig. 3: Here the &quot;Male&quot; coefficient seems statistically insignificant but its direction makes sense so it is left in the model. The &quot;Intercept&quot; is interpreted as the texting rate with females before the breakup, &quot;Epoch&quot; is the increase with females after the breakup, &quot;Intercept&quot; plus &quot;Male&quot; is the rate with males before the breakup. &quot;Epoch&quot; combined with &quot;Male:Epoch&quot; is the change in rate for texts with males after the breakup.</p>
</div>
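<p style="text-align: left;">For anyone unsure how a log-scale coefficient turns into a percent change, here is a minimal sketch on simulated counts. The data, the seed and the names toy, fit and pctChange are made up for illustration and are not the texting data. It reads the coefficients the way the Fig. 3 caption does: &#8220;Epoch&#8221; alone for females, &#8220;Epoch&#8221; plus &#8220;Male:Epoch&#8221; for males.</p>
<pre class="qoate-code"># Simulated daily counts with made-up rates, NOT the real texting data
set.seed(42)
toy &lt;- expand.grid(Day=1:500, Gender=c("Female", "Male"), Epoch=0:1)
rate &lt;- with(toy, exp(1 + 0.2*(Gender == "Male") + 0.8*Epoch - 0.3*(Gender == "Male")*Epoch))
toy$Texts &lt;- rpois(nrow(toy), lambda=rate)
fit &lt;- glm(Texts ~ Gender*Epoch, data=toy, family="poisson")
b &lt;- coef(fit)
# Percent change after the epoch: sum the log-scale effects, exponentiate, subtract 1
pctChange &lt;- c(Female=exp(b[["Epoch"]]) - 1,
               Male=exp(b[["Epoch"]] + b[["GenderMale:Epoch"]]) - 1) * 100
round(pctChange, 1)</pre>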
<p>  </p>
<p style="text-align: left;">
<p style="text-align: left;">Further analysis and a how-to after the break.    </p>
<p style="text-align: left;"><span id="more-568"></span>    </p>
<p style="text-align: left;">Next I saw that my text messages&#8211;both incoming and outgoing&#8211;surge, unsurprisingly, toward the end of the week. I ran a regression using the day of the week as an unordered factor and saw that texts do trend up toward the end of the week, particularly Sunday which I can&#8217;t explain. Then again, it&#8217;s probably counting late night Saturday, after midnight, when I&#8217;m texting people while enjoying New York&#8217;s nightlife.    </p>
<div id="attachment_573" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Usage-by-Day-of-Week1.png"><img class="size-medium wp-image-573  " title="Usage by Day of Week" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Usage-by-Day-of-Week1-300x299.png" alt="" width="300" height="299" /></a>
<p class="wp-caption-text">Fig. 4: This graph shows a clear upward trend toward the end of the week with a bump on Thursday as well.</p>
</div>
<p>  </p>
<p style="text-align: left;">
<div id="attachment_590" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Avg-Texts-by-Day.png"><img class="size-medium wp-image-590 " title="Avg Texts by Day" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Avg-Texts-by-Day-300x299.png" alt="" width="300" height="299" /></a>
<p class="wp-caption-text">Fig. 5: This coefficient plot shows that Sunday clearly stands out far above the other days in terms of text messages sent and received. Day 1 is Monday and Day 7 is Sunday. Putting the days in their natural order isn&#39;t built into my coefficient plot yet.</p>
</div>
<p>  </p>
<p style="text-align: left;"><strong>Update</strong>:  As suspected, the uptick on Sunday is really due to late night Saturday texts as seen in the graph below.  I didn&#8217;t run any test on it; the graph was good enough for me.  </p>
<p style="text-align: center;">
<div id="attachment_661" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Texts-by-Day-AM-PM2.png"><img class="size-medium wp-image-661" title="Texts by Day AM PM" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Texts-by-Day-AM-PM2-300x299.png" alt="" width="300" height="299" /></a>
<p class="wp-caption-text">Fig. 6: Now the lines represent texts in the AM or PM. PM texts steadily increase toward the end of the week and AM texts jump dramatically on Saturday and Sunday. This is due to late night texting on Friday and Saturday when I am out well past midnight.</p>
</div>
<p style="text-align: center;"><strong>And now for code and instructions</strong>  </p>
<p style="text-align: left;">First, you have to locate the two necessary databases in your backup folder. These are in different places depending on your OS, so please see these <a href="http://osxdaily.com/2009/09/11/iphone-backup-location/">instructions</a>. The two databases you want are 3d0d7e5fb2ce288813306e4d4636395e047a3d28 (the messages) and 31bb7ba8914766d4ba40d6dfb6113c8b614be442 (your contacts). Save those databases to a new folder (you don&#8217;t want to mess with originals, just in case) and rename them &#8220;sms.db&#8221; and &#8220;AddressBook sqlitedb&#8221; respectively. I don&#8217;t know if you need to rename them, but I did and it worked. A complete list of databases is available <a href="http://iqlik.wordpress.com/2010/09/26/reading-iphones-database-2/">here</a>.    </p>
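<p style="text-align: left;">The copy-and-rename step can also be scripted. This is a minimal sketch, assuming you fill in your own backup folder: the folder paths are placeholders and copyBackup is a made-up helper name. The two long file names are the ones given above.</p>
<pre class="qoate-code"># Copy the two backup files to a working folder under friendlier names.
# The folder paths passed in are placeholders; only the hashed file names come from the post.
copyBackup &lt;- function(backupDir, workDir) {
    dir.create(workDir, showWarnings=FALSE, recursive=TRUE)
    files &lt;- c("sms.db"="3d0d7e5fb2ce288813306e4d4636395e047a3d28",
               "AddressBook sqlitedb"="31bb7ba8914766d4ba40d6dfb6113c8b614be442")
    # file.copy() returns TRUE or FALSE per file and leaves the originals untouched
    copied &lt;- file.copy(from=file.path(backupDir, files), to=file.path(workDir, names(files)))
    setNames(copied, names(files))
}
# copyBackup(backupDir="path/to/your/backup/folder", workDir="iPhone SMS Backup")</pre>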
<p style="text-align: left;">Next you must download and install the appropriate driver for SQLite from <a href="http://www.ch-werner.de/sqliteodbc/">this site</a>. Then you need to use that driver to make connections to both databases under the &#8220;File DSN&#8221; section of ODBC Data Source Administrator. Sorry, I don&#8217;t know where to do this on Mac or Linux.    </p>
<p style="text-align: center;"><img class="aligncenter" title="Data Sources" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Data-Sources-300x248.png" alt="" width="300" height="248" />    </p>
<p style="text-align: left;">Click &#8220;Add&#8221; and select SQLite3 ODBC Driver.    </p>
<p style="text-align: center;"><img class="aligncenter" title="Create Data Source" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Create-Data-Source-300x224.png" alt="" width="300" height="224" />    </p>
<p style="text-align: left;">Click &#8220;Next&#8221; then give a name for the connection such as &#8220;sms.db.dsn&#8221; in the appropriate folder.    </p>
<p style="text-align: center;"><img class="aligncenter" title="Source Name" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Source-Name-300x224.png" alt="" width="300" height="224" />    </p>
<p style="text-align: left;">Click &#8220;Next&#8221; then &#8220;Finish&#8221; which brings you to following screen where you should click &#8220;Browse&#8221; then select the &#8220;sms.db&#8221; database. Do the same for the &#8220;AddressBook sqlitedb&#8221; database calling the connection &#8220;AddressBook sqlitedb.dsn&#8221;.    </p>
<p style="text-align: center;"><img class="aligncenter" title="Driver Connect" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Driver-Connect-300x223.png" alt="" width="300" height="223" />    </p>
<p style="text-align: left;">As usual, all of my analysis was done in <a href="http://www.jaredlander.com/tag/r/">R</a> and the graphs were made with <a href="http://www.jaredlander.com/?s=ggplot">ggplot</a>.    </p>
<p style="text-align: left;">First up, load in the necessary libraries.    </p>
<p style="text-align: left;">
<pre class="qoate-code">## Load needed libraries
library(RODBC) # to read the database
library(ggplot2) # to make the nice charts
library(plyr) # provides join() for the table joins (older ggplot2 versions attached it automatically)
library(lmtest) # to test how well the models fit the data
source("http://www.jaredlander.com/code/plotCoef.r") # grab the coefficient plot function
source("http://www.jaredlander.com/code/overdispersion.r") # grab the overdispersion test function</pre>
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">Now we connect to the databases.    </p>
<p style="text-align: left;">
<pre class="qoate-code"># Connect to messages DB
DB &lt;- odbcDriverConnect()
# Connect to the Contacts DB
DB2 &lt;- odbcDriverConnect()</pre>
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">Each of those commands pulls up a dialog like the one below. First choose &#8220;sms.db.dsn&#8221; and then for the second dialog choose &#8220;AddressBook sqlitedb.dsn.&#8221; The order is important.    </p>
<p style="text-align: center;"><img class="aligncenter" title="Select DB" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/02/Select-DB-300x262.png" alt="" width="300" height="262" />    </p>
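<p style="text-align: left;">If the ODBC setup is a hurdle (on Mac or Linux, say), one possible alternative is to read the SQLite files directly with the DBI and RSQLite packages. This is a swapped-in technique, not what the rest of this post uses, and readTable is a hypothetical helper name. The file names assume the renaming described earlier.</p>
<pre class="qoate-code">library(DBI)     # generic database interface
library(RSQLite) # SQLite driver, no ODBC data source needed
# Hypothetical helper: open a SQLite file, read one whole table, then close the connection
readTable &lt;- function(dbPath, tableName) {
    con &lt;- dbConnect(SQLite(), dbPath)
    on.exit(dbDisconnect(con))
    dbReadTable(con, tableName)
}
# messages &lt;- readTable("sms.db", "message")
# addresses &lt;- readTable("AddressBook sqlitedb", "ABPerson")</pre>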
<p style="text-align: left;">
<p style="text-align: left;">This next bunch of code cleans up the data. Sorry for the text wrapping.    </p>
<p style="text-align: left;">
<pre class="qoate-code">Days &lt;- data.frame(DayNum=1:7, DayOfWeek=c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))
Days$DayOfWeek &lt;- as.character(Days$DayOfWeek) # change to character
messages &lt;- sqlQuery(DB,paste("SELECT * FROM message")) # grab messages data
messages$Type &lt;- ifelse(messages$flags == 2, 'Received', ifelse(messages$flags == 3, 'Sent', ifelse(messages$flags == 33, 'Message send failure', ifelse(messages$flags == 129, 'Deleted', ifelse(messages$flags == 35, 'Other', 'Other'))))) ## Denote message type
messages$Date &lt;- as.Date(floor(messages$date/60/60/24), origin="1970-01-01") ## Set dates
messages$DayOfWeek &lt;- weekdays(messages$Date) ## Find out what day of the week that day fell on
messages$Year &lt;- substr(messages$Date, 1, 4) # get the Year
messages &lt;- within(messages, Month &lt;- droplevels(factor(strftime(Date, format = "%B"),levels = month.name))) # Month as a factor in calendar order, dropping months with no data
messages &lt;- messages[messages$Type %in% c("Sent", "Received"), ] # only keep data on messages successfully sent or received
messages &lt;- join(x=messages, y=Days, by="DayOfWeek") # join the days of week information to the messages table
members &lt;- sqlQuery(DB,paste("SELECT * FROM group_member")) # This table is used as a link from the messages table to the contacts table
members$TextLink &lt;- members$address # Creates the key field to the contacts table
members$Number &lt;- members$address # Get the cleanest version of phone number available
personNumbers &lt;- sqlQuery(DB2,paste("SELECT * FROM ABMultiValue")) # The table that joins between the members table and the contacts table
personNumbers$TextLink &lt;- personNumbers$value # Creates the key field to the members table
personNumbers$ContactLink &lt;- personNumbers$record_id # Creates the key field to the contacts table    

addresses &lt;- sqlQuery(DB2, paste("SELECT * FROM ABPerson")) # The Contacts table
addresses$ContactLink &lt;- addresses$ROWID # Creates the key field to the personNumbers table
addresses$AddressRow &lt;- addresses$ROWID # For indexing reasons
addresses$Name &lt;- paste(addresses$First, addresses$Last, sep=" ") # Make a combined name</pre>
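<p style="text-align: left;">The nested ifelse() above can be easier to follow as a lookup. Here is a minimal sketch on a made-up five-row table. The names toy, flagLabels and decodeFlags are illustrative, the timestamps are invented, and the flag codes are the ones used above.</p>
<pre class="qoate-code"># Lookup from flag code to message type; anything not listed becomes "Other"
flagLabels &lt;- c("2"="Received", "3"="Sent", "33"="Message send failure", "129"="Deleted")
decodeFlags &lt;- function(flags) {
    labels &lt;- unname(flagLabels[as.character(flags)])
    labels[is.na(labels)] &lt;- "Other"
    labels
}
# Made-up rows: flags plus dates stored as seconds since 1970-01-01
toy &lt;- data.frame(flags=c(2, 3, 33, 129, 35), date=c(1262304000, 1262390400, 1293840000, 1293840000, 1293926400))
toy$Type &lt;- decodeFlags(toy$flags)
toy$Date &lt;- as.Date(as.POSIXct(toy$date, origin="1970-01-01", tz="UTC")) # seconds to calendar date (UTC)
toy</pre>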
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">I have gone through and labelled my contacts male or female. If you don&#8217;t want to go through that work, you can skip the gender analysis or scrape a site like <a href="http://www.gpeters.com/names/baby-names.php?name=jared">Baby Name Guesser</a>. It&#8217;s fairly simple: you put the name of interest in the URL then use readLines() to get what the site guesses.    </p>
<p style="text-align: left;">
<pre class="qoate-code">gender &lt;- read.csv("C:\\Users\\Jared\\Documents\\iPhone SMS Backup\\Gender.csv", sep=",", header=T, stringsAsFactors=F)</pre>
<p>    </p>
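<p style="text-align: left;">If you go the scraping route, the URL-building half might look like the sketch below. The URL pattern is taken from the Baby Name Guesser link above and guessUrl is a made-up helper name. Pulling the guess out of the returned HTML is left out because it depends on the page&#8217;s markup.</p>
<pre class="qoate-code"># Build the Baby Name Guesser URL for a first name (hypothetical helper)
guessUrl &lt;- function(name) {
    paste0("http://www.gpeters.com/names/baby-names.php?name=", URLencode(tolower(name), reserved=TRUE))
}
guessUrl("Jared")
# page &lt;- readLines(guessUrl("Jared"), warn=FALSE) # needs a network connection</pre>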
<p style="text-align: left;">
<p style="text-align: left;">Now join all those tables together.    </p>
<p style="text-align: left;">
<pre class="qoate-code">## Join all of our tables
master &lt;- join(x=messages, y=members, by="group_id")
master &lt;- join(x=master, y=personNumbers, by="TextLink")
master &lt;- join(x=master, y=addresses, by="ContactLink")
master &lt;- join(x=master, y=gender, by="Name")</pre>
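<p style="text-align: left;">To see what each join() is doing, here is a toy version with two made-up tables (the names and values are invented). plyr&#8217;s join() defaults to a left join, so every message row is kept and rows without a match get NA.</p>
<pre class="qoate-code">library(plyr) # provides join()
# Made-up stand-ins for the messages and group_member tables
toyMessages &lt;- data.frame(ROWID=1:4, group_id=c(1, 1, 2, 3))
toyMembers &lt;- data.frame(group_id=c(1, 2), address=c("555-0100", "555-0101"), stringsAsFactors=FALSE)
toyMaster &lt;- join(x=toyMessages, y=toyMembers, by="group_id") # type="left" by default
toyMaster # the message in group 3 has no member row, so its address is NA</pre>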
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">Build the data for plotting texts by month and year in Figure 1.    </p>
<p style="text-align: left;">
<pre class="qoate-code"># Count per month, year, type
textMonth &lt;- aggregate(ROWID ~ Year + Month + Type, data=master, length)
# Plot by month, year, type
qplot(Month, ROWID, data=textMonth, group=Type, colour=Type, geom="line", ylab="# Texts", main="Texts by Month") + facet_grid(Year ~ .) + theme(axis.text.x=element_text(angle=45, hjust=1)) # Figure 1; theme()/element_text() replace the older opts()/theme_text()</pre>
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">The following t-test (and aggregation) tests to see if texting increased after a certain date.    </p>
<p style="text-align: left;">
<pre class="qoate-code">textByDay &lt;- aggregate(ROWID ~ Date + DayOfWeek + DayNum + Year + Month, data=master, length)
textByDay$Epoch &lt;- ifelse(textByDay$Date &gt;= "YYYY-MM-DD", 1, 0)
t.test(ROWID ~ Epoch, data=textByDay[textByDay$Month != "December", ], alternative="less")</pre>
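<p style="text-align: left;">A note on direction: with a formula, t.test() compares the first level of the grouping variable to the second, so the &quot;less&quot; alternative asks whether the mean for Epoch 0 is below the mean for Epoch 1. Here is a quick check on simulated counts (made-up numbers and names, not the texting data).</p>
<pre class="qoate-code">set.seed(1)
# Made-up daily counts: 100 days before (Epoch 0) and 100 days after (Epoch 1)
toyDay &lt;- data.frame(Epoch=rep(0:1, each=100), Texts=c(rpois(100, lambda=10), rpois(100, lambda=20)))
result &lt;- t.test(Texts ~ Epoch, data=toyDay, alternative="less")
result$p.value # small when the Epoch 0 mean really is lower</pre>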
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">Now to prepare the data for the gender crossover, plot it (Figure 2) and test it (Figure 3).    </p>
<p style="text-align: left;">
<pre class="qoate-code"># Count per month, year, gender
textGenderMonth &lt;- aggregate(ROWID ~ Year + Month + Gender, data=master[!master$Number %in% c("(xxx) xxx-xxxx", "(xxx) xxx-xxxx", "(xxx) xxx-xxxx", "(xxx) xxx-xxxx"), ], length)
textGenderMonth$Gender &lt;- ifelse(textGenderMonth$Gender == "f", "Female", "Male")
# Plot by month, year, gender
# Shows crossover point for texting girls vs guys
qplot(Month, ROWID, data=textGenderMonth, group=Gender, colour=Gender, geom="line", ylab="# Texts") + facet_grid(Year ~ .) + theme(axis.text.x=element_text(angle=45, hjust=1)) # Figure 2; theme()/element_text() replace the older opts()/theme_text()
textByDayGender &lt;- aggregate(ROWID ~ Date + DayOfWeek + DayNum + Year + Month + Gender, data=master[!master$Number %in% c("(xxx) xxx-xxxx", "(xxx) xxx-xxxx", "(xxx) xxx-xxxx", "(xxx) xxx-xxxx"), ], length)
textByDayGender$Epoch &lt;- ifelse(textByDayGender$Date &gt;= "YYYY-MM-DD", 1, 0)
textByDayGender$Gender &lt;- ifelse(textByDayGender$Gender == "f", "Female", "Male")
textByDayGender$Gender &lt;- factor(textByDayGender$Gender)
femaleLM &lt;- lm(ROWID ~ Gender*Epoch, data = textByDayGender[textByDayGender$Month != "December", ])
femaleGLM &lt;- glm(ROWID ~ Gender*Epoch, data = textByDayGender[textByDayGender$Month != "December", ], family="quasipoisson")
overdispersionTest(femaleGLM)
bptest(femaleGLM)
resettest(femaleGLM)
plotCoef(femaleGLM, Main="Coefficient Plot for Poisson Regression", YLab="Variable", XLab="Coefficient") # Figure 3</pre>
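<p style="text-align: left;">The sourced overdispersionTest() is not reproduced here. A common rule-of-thumb check, which may or may not match what that function does, is the Pearson chi-square divided by the residual degrees of freedom. Values well above 1 suggest overdispersion, which is the situation the quasipoisson family is meant for. Here is a sketch on simulated data (names and numbers invented).</p>
<pre class="qoate-code">set.seed(7)
# Negative binomial draws are overdispersed relative to a Poisson
toyCounts &lt;- data.frame(Epoch=rep(0:1, each=200))
toyCounts$Texts &lt;- rnbinom(400, mu=exp(1 + 0.5*toyCounts$Epoch), size=1)
poisFit &lt;- glm(Texts ~ Epoch, data=toyCounts, family="poisson")
# Pearson chi-square over residual degrees of freedom
dispersion &lt;- sum(residuals(poisFit, type="pearson")^2) / df.residual(poisFit)
dispersion # well above 1 for these simulated counts</pre>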
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">And now we check for the trend in texts over the course of a week (Figures 4 and 5).    </p>
<p style="text-align: left;">
<pre class="qoate-code">textDayNoYear &lt;- aggregate(ROWID ~ DayOfWeek + DayNum + Type, data=master, length)
qplot(reorder(DayOfWeek, DayNum), ROWID, data=textDayNoYear, group=Type, colour=Type, geom="line", xlab="Day of Week", ylab="# Texts") + theme(axis.text.x=element_text(angle=45, hjust=1)) # Figure 4; theme()/element_text() replace the older opts()/theme_text()
textByDay &lt;- aggregate(ROWID ~ Date + DayOfWeek + DayNum + Year + Month, data=master, length)
textByDay$Epoch &lt;- ifelse(textByDay$Date &gt;= "YYYY-MM-DD", 1, 0)
dayOfWeekLM &lt;- lm(ROWID ~ factor(DayNum) - 1, data=textByDay)
plotCoef(dayOfWeekLM, Main="Average Texts by Day", YLab="Day", XLab="Texts") #Figure 5</pre>
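<p style="text-align: left;">Dropping the intercept with - 1 makes each coefficient the average for that day instead of a difference from a baseline day. Here is a sketch on made-up counts (names invented for illustration).</p>
<pre class="qoate-code">set.seed(3)
# Made-up data: 20 weeks of daily counts, Day 1 through Day 7
toyWeek &lt;- data.frame(DayNum=rep(1:7, times=20))
toyWeek$Texts &lt;- rpois(nrow(toyWeek), lambda=5 + toyWeek$DayNum)
toyLM &lt;- lm(Texts ~ factor(DayNum) - 1, data=toyWeek)
dayMeans &lt;- tapply(toyWeek$Texts, toyWeek$DayNum, mean)
all.equal(unname(coef(toyLM)), as.vector(dayMeans)) # the coefficients are the per-day means</pre>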
<p>    </p>
<p style="text-align: left;">
<p style="text-align: left;">And that&#8217;s the basic idea. I left out a lot of the nitty gritty of model checking and all the troubleshooting. I hope everyone enjoyed it and please feel free to <a href="http://www.jaredlander.com/contact/">message</a> (or @jaredlander on Twitter) me with any questions.    </p>
<p style="text-align: left;">
<p style="text-align: left;">
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/02/texting-patterns/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Supreme Court Justice Rules for NY Pizza</title>
		<link>http://www.jaredlander.com/2011/01/supreme-court-justice-rules-for-ny-pizza/</link>
		<comments>http://www.jaredlander.com/2011/01/supreme-court-justice-rules-for-ny-pizza/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 23:26:55 +0000</pubDate>
		<dc:creator>Jared</dc:creator>
				<category><![CDATA[New York]]></category>
		<category><![CDATA[Pizza]]></category>
		<category><![CDATA[Chicago]]></category>
		<category><![CDATA[Food]]></category>
		<category><![CDATA[IL]]></category>
		<category><![CDATA[Illinois]]></category>
		<category><![CDATA[New Jersey]]></category>
		<category><![CDATA[NJ]]></category>
		<category><![CDATA[NY]]></category>
		<category><![CDATA[Slice]]></category>
		<category><![CDATA[Trenton]]></category>

		<guid isPermaLink="false">http://www.jaredlander.com/?p=546</guid>
		<description><![CDATA[Daily Intel caught wind of a California Lawyer interview with US Supreme Court Justice Antonin Scalia where he proclaims New York pizza &#8220;is infinitely better than Washington pizza, and infinitely better than Chicago pizza.&#8221;  I may be biased to New York pizza as well, but that is a debate I&#8217;ll save for another day.
It gets really interesting ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/01/New-York-slice.jpg"></a><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/01/Tomato-Pie.bmp"></a><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/01/Tomato-Pie.bmp"></a><a href="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/01/Antonin-Scalia.jpg"><img class="alignleft size-medium wp-image-547" title="Antonin Scalia" src="http://www.jaredlander.com/wordpress/wordpress-2.9.2/wordpress/wp-content/uploads/2011/01/Antonin-Scalia-235x300.jpg" alt="Supreme Court Justice Antonin Scalia" width="235" height="300" /></a><a href="http://nymag.com/daily/intel/2011/01/antonin_scalia_rules_that_chic.html">Daily Intel</a> caught wind of a <a href="http://www.callawyer.com/story.cfm?eid=913358&amp;evid=1">California Lawyer interview</a> with US <a href="http://www.supremecourt.gov/about/biographies.aspx">Supreme Court Justice</a> Antonin Scalia where he <a href="http://www.jaredlander.com/?s=new+york">proclaims</a> New York <a href="http://www.jaredlander.com/tag/pizza/">pizza</a> &#8220;is infinitely better than Washington pizza, and infinitely better than Chicago pizza.&#8221;  I may be biased to <a href="http://slice.seriouseats.com/archives/2008/01/a-list-of-regional-pizza-styles-slideshow.html#show-85723">New York</a> <a href="http://slice.seriouseats.com/archives/2008/01/a-list-of-regional-pizza-styles-slideshow.html#show-85722">pizza</a> as well, but that is a debate I&#8217;ll save for another day.</p>
<p>It gets really interesting when he says, <strong>&#8220;You know these deep-dish pizzas&#8212;it&#8217;s not pizza. It&#8217;s very good, but &#8230; call it tomato pie or something.&#8221;</strong>  While an argument can certainly be made that <a href="http://slice.seriouseats.com/archives/2008/01/a-list-of-regional-pizza-styles-slideshow.html#show-85732">deep-dish</a> pizza is almost a casserole, I think the folks down in Trenton (where Scalia was born) have already claimed the name <a href="http://slice.seriouseats.com/archives/2008/01/a-list-of-regional-pizza-styles-slideshow.html#show-85729">tomato pie</a>, referring to a round pie with the sauce on top.</p>
<p>Hopefully <a href="http://slice.seriouseats.com/">Slice</a> will chime in on this.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jaredlander.com/2011/01/supreme-court-justice-rules-for-ny-pizza/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

