While playing Words with Friends my randomly chosen opponent played “radiale” as her first word.  Since that used up all of her tiles, she received a bonus on top of all the points the word itself got, resulting in a one-move score of 53 points!  Rather than being impressed I was upset at the large deficit I would have to overcome.

To combat this I did what comes naturally:  Write an R script to find the perfect word!

Needing to combine my seven letters with one of her letters there were two routes I could take.  The first would be for each combination of my seven letters and one of hers, find all 40,320 (8!) permutations then hit dictionary.com to see if it is a real word for a total of 282,240 (8!*7) http calls.  That seemed a bit excessive and impractical so I moved on to the next idea.

So, first thing I did was pull a list of common eight-letter words. Then for each combination of my letters and one of hers (only 7 iterations) I checked if those letters (in any order) matched the letters in any of the possible words.  Once a match was found there was a check for the counts of the letters and if that passed then the word was recorded as a true match.

The algorithm took about 17 seconds to run and found me one possible word for my letters combined with one of hers:  “headrace”, for 63 points!  Perhaps I should have been able to figure that out on my own, but where would be the fun in that.  Find the code after the break.

require(RCurl)
require(XML)
require(plyr)

myLetters <- "ercaehd"
theirLetters <- "radiale"

wordSite <- getURL("http://www.poslarchive.com/math/scrabble/lists/common-8.html")  # get page with words

# parse out the page
wordsParsed <- htmlTreeParse(wordSite, handlers=list("pre"=function(x, ...){ x }, "p"=function(x, ...){ NULL },
                                                    "head"=function(x, ...){ NULL }), asTree=TRUE)

# get just the list of words
words <- as.character(wordsParsed$children$html[["body"]][["pre"]][["text"]])
words <-gsub("\n", " ", words)  # sub out carriage return characters for spaces

## Turn the block of words into a nice list, each element of which is a vector of characters
words <- strsplit(words, split=" ")[[6]]    # split the long string into a vector seperating by spaces
words <- as.list(words)     # change from vector to list
words <- lapply(words, strsplit, split="")  # split each element of the list into a vector of it's characters

findAllWords <- function(baseLetters, theirWord, listOfWords)
{
    extraLetters <- strsplit(theirWord, split="")[[1]]   # split their word into letters, probably should have removed redundant letters first for a slight efficiency gain

    wordSuccess <- vector("list", length(extraLetters))

    for(a in 1:length(extraLetters))
    {
        seedLetters <- paste(baseLetters, extraLetters[a], sep="")
        #seedLetters <- strsplit(seedLetters, split="")[[1]]

        wordSuccess[[a]] <- checkForWord(words=listOfWords, letters=seedLetters)
    }

    return(wordSuccess)
}

checkForWord <- function(words, letters)
{
    holder <- vector("list", length(words)) # to hold results
    letters <- strsplit(letters, split="")[[1]]  # split up letters
    letterFrame <- data.frame(Letters=letters, stringsAsFactors=FALSE)  # to facilitate counting
    letterCount <- ddply(letterFrame, .(Letters), nrow)   # counts for each letter
    rm(letterFrame); gc()
    names(letterCount)[2] <- "LetterCount"

    ## loop through every word in the list and see if our letters are found in there
    for(i in 1:length(words))
    {
        if(all(letters %in% words[[i]][[1]]))   # if all the letters are found in the word
        {
            # check counts
            wordFrame <- data.frame(Letters=words[[i]][[1]], stringsAsFactors=FALSE)    # put letters from word into DF
            wordCount <- ddply(wordFrame, .(Letters), nrow) # get counts
            rm(wordFrame); gc()
            names(wordCount)[2] <- "WordLetterCount"

            joinedLetters <- join(letterCount, wordCount, by="Letters") # join the tables for a comparison

            holder[[i]] <- all(with(joinedLetters, LetterCount == WordLetterCount)) # holder gets the result of the comparison
        }else
        {
            holder[[i]] <- FALSE
        }
    }

    holder <- laply(holder, function(x) x) # convert to vector

    if(sum(holder) == 0)
    {
        return(sum(holder))
    }else
    {
        return(which(holder == TRUE))
    }
}

system.time(result <- findAllWords(baseLetters=myLetters, theirWord=theirLetters, listOfWords=words))

Related Posts

Jared Lander is the Chief Data Scientist of Lander Analytics a New York data science and AI firm, Adjunct Professor at Columbia University, Organizer of the New York Open Statistical Programming meetup and the New York and Government Data Science and AI Conferences and author of R for Everyone.

Leave a Reply