While playing Words with Friends my randomly chosen opponent played “radiale” as her first word.  Since that used up all of her tiles, she received a bonus on top of all the points the word itself got, resulting in a one-move score of 53 points!  Rather than being impressed I was upset at the large deficit I would have to overcome.

To combat this I did what comes naturally:  Write an R script to find the perfect word!

Needing to combine my seven letters with one of her letters there were two routes I could take.  The first would be for each combination of my seven letters and one of hers, find all 40,320 (8!) permutations then hit dictionary.com to see if it is a real word for a total of 282,240 (8!*7) http calls.  That seemed a bit excessive and impractical so I moved on to the next idea.

So, first thing I did was pull a list of common eight-letter words. Then for each combination of my letters and one of hers (only 7 iterations) I checked if those letters (in any order) matched the letters in any of the possible words.  Once a match was found there was a check for the counts of the letters and if that passed then the word was recorded as a true match.

The algorithm took about 17 seconds to run and found me one possible word for my letters combined with one of hers:  “headrace”, for 63 points!  Perhaps I should have been able to figure that out on my own, but where would be the fun in that.  Find the code after the break.

``````require(RCurl)
require(XML)
require(plyr)

myLetters <- "ercaehd"

wordSite <- getURL("http://www.poslarchive.com/math/scrabble/lists/common-8.html")  # get page with words

# parse out the page
wordsParsed <- htmlTreeParse(wordSite, handlers=list("pre"=function(x, ...){ x }, "p"=function(x, ...){ NULL },
"head"=function(x, ...){ NULL }), asTree=TRUE)

# get just the list of words
words <- as.character(wordsParsed\$children\$html[["body"]][["pre"]][["text"]])
words <-gsub("\n", " ", words)  # sub out carriage return characters for spaces

## Turn the block of words into a nice list, each element of which is a vector of characters
words <- strsplit(words, split=" ")[[6]]    # split the long string into a vector seperating by spaces
words <- as.list(words)     # change from vector to list
words <- lapply(words, strsplit, split="")  # split each element of the list into a vector of it's characters

findAllWords <- function(baseLetters, theirWord, listOfWords)
{
extraLetters <- strsplit(theirWord, split="")[[1]]   # split their word into letters, probably should have removed redundant letters first for a slight efficiency gain

wordSuccess <- vector("list", length(extraLetters))

for(a in 1:length(extraLetters))
{
seedLetters <- paste(baseLetters, extraLetters[a], sep="")
#seedLetters <- strsplit(seedLetters, split="")[[1]]

wordSuccess[[a]] <- checkForWord(words=listOfWords, letters=seedLetters)
}

return(wordSuccess)
}

checkForWord <- function(words, letters)
{
holder <- vector("list", length(words)) # to hold results
letters <- strsplit(letters, split="")[[1]]  # split up letters
letterFrame <- data.frame(Letters=letters, stringsAsFactors=FALSE)  # to facilitate counting
letterCount <- ddply(letterFrame, .(Letters), nrow)   # counts for each letter
rm(letterFrame); gc()
names(letterCount)[2] <- "LetterCount"

## loop through every word in the list and see if our letters are found in there
for(i in 1:length(words))
{
if(all(letters %in% words[[i]][[1]]))   # if all the letters are found in the word
{
# check counts
wordFrame <- data.frame(Letters=words[[i]][[1]], stringsAsFactors=FALSE)    # put letters from word into DF
wordCount <- ddply(wordFrame, .(Letters), nrow) # get counts
rm(wordFrame); gc()
names(wordCount)[2] <- "WordLetterCount"

joinedLetters <- join(letterCount, wordCount, by="Letters") # join the tables for a comparison

holder[[i]] <- all(with(joinedLetters, LetterCount == WordLetterCount)) # holder gets the result of the comparison
}else
{
holder[[i]] <- FALSE
}
}

holder <- laply(holder, function(x) x) # convert to vector

if(sum(holder) == 0)
{
return(sum(holder))
}else
{
return(which(holder == TRUE))
}
}

system.time(result <- findAllWords(baseLetters=myLetters, theirWord=theirLetters, listOfWords=words))``````

Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Washington DC R Conferences and author of R for Everyone.