
Delivering bad news in O(N) time

I’ve invented a way to break bad news gradually, instead of all at once. Suppose there is a big question—“Are you breaking up with me?” or “Do I have hepatitis, Doc?” The traditional algorithm is decidedly O(1); the girlfriend or the doctor simply says “yes” or “no,” and the news is broken. It would be nice if we could delay the news, so that the answer became gradually more clear as time passed. Here’s a procedure to do just that.

At each timestep, the doctor (say) flips a coin and hides the outcome from the patient. If it is heads, he simply says “heads.” If it is tails and the patient has hepatitis, he says “heads.” If it is tails and the patient does not have hepatitis, he says “tails.”

Let’s analyze this from the patient’s point of view, supposing that both answers start out equally likely in his mind. That is, \mathrm{P(hep)}=\mathrm{P(no\ hep)}. Suppose there have been N timesteps. If the doctor ever says “tails,” then the patient knows he’s in the clear. So the interesting question is how the patient’s degree of belief changes when the doctor has said “heads” every time for N timesteps.

Using Bayes’s theorem and some algebra, you can show that \mathrm{P}(\mathrm{hep}|N) =\Big[\mathrm{P}(N|\textrm{no hep})+1\Big]^{-1}. (The key fact is that a patient with hepatitis hears “heads” with certainty, so \mathrm{P}(N|\mathrm{hep})=1.) In order to get N “heads” responses given no hepatitis, the coin would have to land heads-up N times, and we know that has probability (1/2)^N. After a line of algebra, we get

\mathrm{P}(\mathrm{hep}|N) =\frac{2^N}{2^N+1}.

This approaches 100% as N tends toward infinity, which is what we expected. On the other hand, if the patient doesn’t have hepatitis, we expect a “tails” to come up after only 2 timesteps on average.
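The posterior above is easy to tabulate as a sanity check. Here is a minimal Java sketch; the class and method names are mine, not from any code in this post, and equal priors are assumed as in the derivation.

```java
// Tabulates the patient's posterior after N consecutive "heads" reports,
// assuming equal priors P(hep) = P(no hep) = 1/2.
public class BadNews {
    // P(hep | N consecutive "heads") = 2^N / (2^N + 1)
    static double posterior(int n) {
        double pow = Math.pow(2, n);
        return pow / (pow + 1);
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.printf("N = %2d  P(hep|N) = %.4f%n", n, posterior(n));
        }
    }
}
```

At N = 0 the patient is back at his 50/50 prior, and each additional “heads” halves the remaining doubt.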

How to make people angry about prime numbers

I discovered a new way to make people upset. Tell them, at an opportune moment, that the prime numbers are completely arbitrary because they would be different if we used a number base other than base-10. (Not just different representations, but different amounts as well.) This is, of course, completely wrong. Amounts themselves multiply just the same in any representation you want to use.

When your target objects, give them the following example:
Pick a prime number like 57, and put it in base 5. 5 squared is 25, so put 2 in the 25’s place to get 50, then put 1 in the 5’s place to get 55, then put 2 in the 1’s place to get 57 base 10 or 212 base 5. But look what happens if you multiply 34 and 3 in base-5:


Fig. 1 Definitely a prime number, I swear.

4 times 3 is 12 base 10, which is 22 base 5—carry the 2—then do 3 times 3 plus 2, which is 11 base 10 or 21 base 5, so all together you get 212.
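If you don’t trust the hand arithmetic, Java’s built-in radix handling will confirm it. This quick sketch is mine, not from the post:

```java
// Check the base-5 arithmetic: 34 (base 5) times 3 is 212 (base 5),
// which is 19 * 3 = 57 in base 10 -- the "prime" in question.
public class GrothendieckCheck {
    public static void main(String[] args) {
        int a = Integer.parseInt("34", 5);        // 19 in base 10
        int b = Integer.parseInt("3", 5);         // 3
        int product = a * b;                      // 57 in base 10
        System.out.println(Integer.toString(product, 5));  // prints "212"
        System.out.println(product);                       // prints "57"
    }
}
```

Same amount either way; only the digits change.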

My sincere apologies for this post go to Tom Lehrer, who wrote the song “New Math,” and Alexander Grothendieck, who invented the Grothendieck prime.

Word Morphs in Java

You may remember “word morphs” from worksheets in elementary school. They give you a starting word and an ending word. The goal is to change one letter at a time to get from the start to the end. The key is that you have to make a word at every step. So, for example, you might have:



Usually they tell you how many steps it takes, and usually it’s the same as the number of letters. Examples like that are not very interesting because you only have to change each position (1st letter, 2nd letter, etc.) once, from the starting letter to the ending letter. If I know I have to go from JAR to BOT in three steps, I can start blindly swapping in letters at each step. If one position doesn’t work, try another. Only try to swap positions you haven’t tried yet. As a graph, the thought process looks like this:


If you know the number of steps is equal to the number of letters, there’s only one interesting thing that can happen. Sometimes you hit a dead end and have to retrace your steps. For example, if I want to go from TIME to PUNK in four steps, I get this:


But it’s no big deal—if you run into a dead end you just go back and try another position. And sometimes you’ll get lucky and try the correct path in the first place. Too easy, very boring.

Some word morphs require more steps than the number of letters. For example, try going from PUTTY to EASED. There are two dead-ends when you use the method we’ve been using, and there’s no connection to EASED.


There is a solution, however. Here is one:


We had to use a C in position 1, an R in position 3, and an S in position 5 even though none of these appears in PUTTY or in EASED. These three detours bring the total number of steps from five up to 5 + 3 = 8. (By “steps” I mean “→’s” or “leaps,” not total words. You wouldn’t count A → I as two steps; you’d count it as one.)

Any step can be taken either way. (That is to say, → = ←.) So of the nine words in the chain above, each can be morphed into any of the others. This is true in general for any morph chain. Thus,

the ability to morph from one word to another is symmetric and transitive.

Symmetry means that if I can morph from W_1 to W_k, then I can also morph from W_k to W_1. This is because the arrows go both ways: Given that I can write

W_1\to W_2\to \cdots\to W_k,

then I know that I can also write

W_1\leftarrow W_2\leftarrow \cdots\leftarrow W_k.

Transitivity means that if I can morph from W_1 to W_j, and I can also morph from W_j to W_k, then I can morph from W_1 to W_k. This is easy to show. Given that I can write

W_1\to W_2\to\cdots\to W_j and W_j\to W_{j+1}\to\cdots\to W_k,

I can join up the two chains as follows:

W_1\to W_2\to\cdots\to W_j\to W_{j+1}\to\cdots\to W_k.

Morphing is also reflexive: any word can be morphed into itself in 0 steps. This is going to sound dumb, but for any word W I can always write

W.


Taken together, symmetry, transitivity, and reflexivity mean that morphability is an equivalence relation on words: two words are said to be equivalent if you can morph from one to the other, and each word’s equivalence class is the set of all words it can reach. I want to write a Java program that does two things:

  • Given a word, find its equivalence class.
  • Given two words in the same equivalence class, find the shortest path between them.

This brings me to a data structure that I thought of in the shower, only to discover later that somebody else had already come up with it. It’s a special type of tree called a trie. (Clever.) If I wanted to store the words “car,” “cat,” “he,” “heat,” “hear,” and “heart,” then I could do it like this:



This sort of structure lets me do fast queries on partial word patterns. For example, I might want to look up words fitting the pattern “HEA_,” where the last letter can be anything. For the dictionary trie above, you’d find “hear” and “heat.”

The algorithm would do an exhaustive search (like DFS or BFS). At each height (distance from the root node), you look at the corresponding character in your query. If it’s a letter, look for an edge labeled by that letter. If it’s a blank (wildcard), then follow every edge coming out of the current node. Only some nodes are marked as representing a word. At the end of the search, return every word you found.

So a search for “BE_R_” in a more complete dictionary might return “beard,” “bears,” “beers,” and “berry.” I found a crossword help website that lets you play with this. (I don’t know how they implemented their search, but it’s probably similar.)
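Here is a minimal sketch of such a trie in Java, with “_” as the wildcard. The class and method names are my own invention; the actual Dictionary class in the repo may be organized differently.

```java
import java.util.*;

// A minimal trie supporting single-character wildcard ('_') queries.
public class Trie {
    private final Map<Character, Trie> children = new HashMap<>();
    private boolean isWord = false;

    public void insert(String word) {
        Trie node = this;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Trie());
        }
        node.isWord = true;
    }

    // Collect every stored word matching the pattern; '_' matches any letter.
    public List<String> match(String pattern) {
        List<String> results = new ArrayList<>();
        match(pattern, 0, new StringBuilder(), results);
        return results;
    }

    private void match(String pattern, int i, StringBuilder prefix, List<String> out) {
        if (i == pattern.length()) {
            if (isWord) out.add(prefix.toString());
            return;
        }
        char c = pattern.charAt(i);
        for (Map.Entry<Character, Trie> e : children.entrySet()) {
            if (c == '_' || c == e.getKey()) {
                prefix.append(e.getKey());
                e.getValue().match(pattern, i + 1, prefix, out);
                prefix.deleteCharAt(prefix.length() - 1);
            }
        }
    }

    public static void main(String[] args) {
        Trie dict = new Trie();
        for (String w : new String[]{"car", "cat", "he", "heat", "hear", "heart"}) {
            dict.insert(w);
        }
        System.out.println(dict.match("hea_"));  // "hear" and "heat", in some order
    }
}
```

A query with k wildcards fans out at k levels of the tree and follows single edges everywhere else, which is why these lookups stay fast.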

To find the equivalence class of a word, I only need queries with one blank at a time. Here’s the process:

  1. Start with the seed word. Add it to an empty set (probably a HashSet).
  2. For every position in the word, replace that letter with a wildcard and do a search.
  3. Add all the words you get to the equivalence class (if not already present).
  4. For each of those words, repeat steps 2–4.
  5. At the end, the set contains the equivalence class of the seed word.

That’s it! It’s a nice example of recursion, which could be a post of its own. One small thing, though—in order to find the shortest path between words, we need to keep track of one-step connections. For every possible → where only one letter changes, we want to maintain a graph edge between the two words. So at the end we’ll have a graph where words are nodes and their neighbors are all the words with one letter different.

There’s a small gotcha that can creep up on you in implementation if you’re not careful. When searching for all of a word’s neighbors, you may still need to add an edge even if a neighbor is already present in the graph. For example, maybe we started with BAT and added BET and BOT. When we look for all of BET’s neighbors, we will come across BOT. Even though it’s already in the set representing the equivalence class, BET and BOT may not be connected. So you have to check.

Note that a word’s neighbors aren’t always connected to each other. BAT is also neighbor to HAT, but HAT is not neighbor to BOT.
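The steps above, including the edge-tracking gotcha, can be sketched as follows. To keep the sketch self-contained it finds neighbors by letter substitution against a plain HashSet rather than by trie wildcard queries, but the resulting graph is the same; all names here are illustrative, not from the repo.

```java
import java.util.*;

// Breadth-first expansion from a seed word, recording a graph edge for
// every one-letter difference between dictionary words.
public class EquivClass {
    // Returns a map from each reachable word to its one-step neighbors.
    static Map<String, Set<String>> buildGraph(String seed, Set<String> dict) {
        Map<String, Set<String>> graph = new HashMap<>();
        Deque<String> frontier = new ArrayDeque<>();
        graph.put(seed, new HashSet<>());
        frontier.add(seed);
        while (!frontier.isEmpty()) {
            String word = frontier.poll();
            for (int i = 0; i < word.length(); i++) {
                for (char c = 'a'; c <= 'z'; c++) {
                    if (c == word.charAt(i)) continue;
                    String candidate = word.substring(0, i) + c + word.substring(i + 1);
                    if (!dict.contains(candidate)) continue;
                    if (!graph.containsKey(candidate)) {
                        graph.put(candidate, new HashSet<>());
                        frontier.add(candidate);
                    }
                    // The gotcha: add the edge even when the neighbor was
                    // already discovered from some other word.
                    graph.get(word).add(candidate);
                    graph.get(candidate).add(word);
                }
            }
        }
        return graph;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("bat", "bet", "bot", "hat", "dog"));
        Map<String, Set<String>> graph = buildGraph("bat", dict);
        System.out.println(graph.keySet());    // the equivalence class of "bat"
        System.out.println(graph.get("bet"));  // bet's neighbors: bat and bot
    }
}
```

In the toy dictionary, BOT is discovered while expanding BAT, but the BET–BOT edge still gets added when BET is expanded, while DOG stays outside the class entirely.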

Anyway, once you have the graph then you can simply run Dijkstra’s algorithm to get the shortest path between any two words. (I’m starting to notice that this whole post has a lot of freshman- and sophomore-level CS concepts in it…equivalence classes, trees, hash sets, Dijkstra’s algorithm, etc. Not a terrible idea for a final project.)
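Since every edge in this graph has weight 1, Dijkstra’s algorithm degenerates into plain breadth-first search, so the shortest-path step can be sketched very simply (illustrative names again, not the repo’s API):

```java
import java.util.*;

// Shortest path in an unweighted word graph via BFS with parent pointers.
public class ShortestPath {
    static List<String> shortestPath(Map<String, Set<String>> graph, String from, String to) {
        Map<String, String> parent = new HashMap<>();  // word -> predecessor
        Deque<String> queue = new ArrayDeque<>();
        parent.put(from, null);
        queue.add(from);
        while (!queue.isEmpty()) {
            String word = queue.poll();
            if (word.equals(to)) break;
            for (String next : graph.getOrDefault(word, Set.of())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, word);
                    queue.add(next);
                }
            }
        }
        if (!parent.containsKey(to)) return null;  // different equivalence classes
        LinkedList<String> path = new LinkedList<>();
        for (String w = to; w != null; w = parent.get(w)) path.addFirst(w);
        return path;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = new HashMap<>();
        graph.put("jar", Set.of("bar"));
        graph.put("bar", Set.of("jar", "bat"));
        graph.put("bat", Set.of("bar", "bot"));
        graph.put("bot", Set.of("bat"));
        System.out.println(shortestPath(graph, "jar", "bot"));  // [jar, bar, bat, bot]
    }
}
```

Full Dijkstra only earns its keep if you ever want weighted edges (say, penalizing obscure words); until then BFS gives the same answer with less machinery.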

I’m using wlist_match10.txt from this page for my word list. It’s pretty good.

Some fun results

“Jazz” is in its own equivalence class all by itself. “Love” is in a class of almost two thousand elements, including “alto,” “hunk,” and “whiz.” (I suspect that the “extraneous” words in the list tend to bloat the equivalence classes pretty heavily, so take this figure with a grain of salt.) The path from WIDE to CLUE is surprisingly long:


Let me know if you can find a shorter one. (Preferably without proper nouns. It’s more challenging if you only use common words.)

The two relevant classes are Dictionary and WordMorphs. You can find them both on my Wordplay repo. Your main may look something like this:

Dictionary dict = new Dictionary("English");
WordMorphs wm = new WordMorphs("time", dict);

System.out.println("Total: " + wm.getSize());
System.out.println(wm.shortestPath("time", "punk"));