Saturday, December 11, 2010

Phononyms in Python

If you use a mobile phone to send SMSs, and you use predictive text, you will probably notice that sometimes you type a word, e.g. 'HOME' and the predictive text thinks you are trying to type 'GOOD' since both HOME and GOOD are represented by 4663.

Recently a friend at work was suggesting a good name for these words that are equivalent on a mobile phone would be Nokianym. I think that I came up with phononym and found that other people on the internet had already thought of the same term.

Anyway, for me, coming up with a Python program to find all such words in a dictionary was a more interesting problem. So, here's my Python script that, given a dictionary and the layout of a keyboard, will find all words that have an equivalent numerical representation.

dictionary = '/usr/share/dict/words'

nokia = {'a':2, 'b':2, 'c':2,
    'd':3, 'e':3, 'f':3,
    'g':4, 'h':4, 'i':4,
    'j':5, 'k':5, 'l':5,
    'm':6, 'n':6, 'o':6,
    'p':7, 'q':7, 'r':7, 's':7,
    't':8, 'u':8, 'v':8,
    'w':9, 'x':9, 'y':9, 'z':9

def get_val(word):
    result = 0
    word = word.lower()
    for c in word:
        if nokia.has_key(c):
            result = result * 10 + nokia[c]
    return result        

phononyms = {}
for word in open(dictionary):
    word = word.rstrip('\n\r')
    val = get_val(word)
    if (not phononyms.has_key(val)):
        phononyms[val] = [] 

print [phononyms[val] for val in phononyms.keys() if len(phononyms[val]) > 1]


  1. You seem to have an extra line in the code that causes an error in a straight cut and paste.

    word = word.rstrip('\n\r')

    val = get_val(word)

  2. You probably need to expand your nokia array to cope with the ['over-greedy', 'overgreedy'] matches.

  3. @Griff: I tried to copy and paste on my iMac. I use Vim and it appears to work. I'll remove the blank line, though - thanks.