In this project we write an anagram-solving tool, for solving crossword puzzles and other word games.
It lets you type in a word or set of letters:
and then displays all the anagrams for those letters:
Our starting point is a free dictionary of British English words. Several of these are available on the Web; the one I've chosen contains 60388 words, including plurals, but not abbreviations or proper nouns, such as place names.
The dictionary starts:
("aah" "aardvark" "aardvarks" "abacus" "abacuses" "abalone" ...
and ends with:
... "zooms" "zoos" "zucchini" "zucchinis" "zydeco" "zygote" "zygotes")
You can download it here:
We read the dictionary into a variable wordlist using the following routine:
(defparameter wordlist (with-open-file (stream (capi:prompt-for-file "Wordlist:" :operation :open)) (read stream)))
Making a hash table - make-hash-table
The anagram program will use a new type of Lisp object, called a hash table. This is an efficient way of storing data, indexed by a value known as a key. The key can be any type of Lisp object, such as a number or a string, and you specify the test to be used to compare keys in the hash table.
In this application the keys will be strings, and we define the hash table called dict using:
(defparameter dict (make-hash-table :test 'string=))
This specifies that the keys should be compared using the string= function.
Reading and changing a hash table entry - gethash
We can read the value of the entry for "cat" from the hash table by writing:
CL-USER > (gethash "cat" dict) NIL NIL
The two NIL values are to distinguish an empty entry from an entry that has been set to NIL.
To change the hash-table entry for "cat" we would write:
CL-USER > (setf (gethash "cat" dict) "maiou") "maiou"
CL-USER > (gethash "cat" dict) "maiou" T
How the anagram program works
The anagram program will work by first sorting each word so that its letters are in alphabetical order, using the sort-word procedure:
(defun sorted (word) (sort (copy-seq word) 'char<))
Note that the sort procedure is destructive, so we have to take a copy of the word we want to sort before sorting it, using copy-seq. The comparison we use is char<, the character equivalent of string<.
The reason for doing this is that two anagrams will have the same sorted representation:
CL-USER > (sorted "orchestra") "acehorrst"
CL-USER > (sorted "carthorse") "acehorrst"
We will store all the words from the wordlist into the hash table, using the sorted version of each word as the key. So under the key "acehorrst" will be a list of the words with that sorted representation:
CL-USER > (gethash "acehorrst" dict) ("orchestra" "carthorse") T
(defun add-word (word) (let* ((sword (sorted word))) (setf (gethash sword dict) (cons word (gethash sword dict)))))
It uses cons to add the word onto any words already stored under the sorted key.
We call it as follows to create the dictionary:
(defun add-words-to-dict (wlist) (if (null wlist) nil (let ((word (first wlist))) (add-word word) (add-words-to-dict (rest wlist)))))
This is like the routine sum-list we wrote in Defining Procedures. It goes along the list wlist, adding each of the words to the hash table dict. To call it we execute:
Using the anagram-solving tool
Now we're ready to use the anagram dictionary to solve anagrams. First a procedure to solve anagrams:
(defun find-anagram (word) (gethash (sorted word) dict))
and finally, an interactive tool to prompt for a string, and then display a list of the anagrams:
(defun anagram () (capi:prompt-with-list (find-anagram (capi:prompt-for-string "Find anagram:")) "Anagrams"))
We run this with:
Extending the project
You could also use the anagram dictionary to program word games, like Countdown, find the words with the most anagrams, or find anagrams of whole phrases.
blog comments powered by Disqus