Monday, December 13, 2010

Python - Batteries included

It has often been said that Python comes with batteries included. This refers to the fact that it has a rich set of utilities included in the standard library. In the following program, the main program is a single line of code. It was made possible partly by the functions included in the random module. If we had made the specifications a bit tighter, we could have reduced the number of lines significantly.

Here is what the program does: We would like to generate passwords (or some other random string), but we want to be able to specify a template. For example, we may want to specify that each password is 7 lowercase characters. In this case we would use the template 'aaaaaaa'. Suppose we wanted a password that looks like a Victorian (Victoria, Australia) number plate, we would specify the template 'AAA-999', in other words, three random letters followed by a hyphen followed by three random digits. Finally, we may want to generate a simple password that is easy to remember by making it pronouncable - so we generate it using the following template: 'Cvcvc' The 'c' will be replaced by a consonant and the 'v' by a vowel, giving us something like, maybe, 'Gamir'.

Here is the program:

import string
import random

vowels_upper = 'AEIOU'
vowels_lower = 'aeiou'
consonants_upper = 'BCDFGHJKLMNPQRSTVWXYZ'
consonants_lower = 'bcdfghjklmnpqrstvwxyz'
alphanumeric = string.letters + string.digits
recognize = {
'A': string.ascii_uppercase, 'a': string.ascii_lowercase,
'C': consonants_upper, 'c': consonants_lower, 
'V': vowels_upper, 'v': vowels_lower, 
'd': string.digits, 'x': alphanumeric}

def create_password(template):
return ''.join([random.choice(recognize[ch]) 
if ch in recognize else ch for ch in template])

if __name__ == '__main__':
for i in range(5):
print create_password('AAA-ddd')

for i in range(5):
print create_password('Cvcvc')



When we run the above program, we get the following output:

MOI-138
KMQ-882
UCH-536
LBK-734
MGA-798
Wewof
Roqug
Colud
Gokix
Berid


That's quite a lot of power for what is essentially one line of code.

Saturday, December 11, 2010

Phononyms in Python

If you use a mobile phone to send SMSs, and you use predictive text, you will probably notice that sometimes you type a word, e.g. 'HOME' and the predictive text thinks you are trying to type 'GOOD' since both HOME and GOOD are represented by 4663.

Recently a friend at work was suggesting a good name for these words that are equivalent on a mobile phone would be Nokianym. I think that I came up with phononym and found that other people on the internet had already thought of the same term.

Anyway, for me, coming up with a Python program to find all such words in a dictionary was a more interesting problem. So, here's my Python script that, given a dictionary and the layout of a keyboard, will find all words that have an equivalent numerical representation.


dictionary = '/usr/share/dict/words'

nokia = {'a':2, 'b':2, 'c':2,
    'd':3, 'e':3, 'f':3,
    'g':4, 'h':4, 'i':4,
    'j':5, 'k':5, 'l':5,
    'm':6, 'n':6, 'o':6,
    'p':7, 'q':7, 'r':7, 's':7,
    't':8, 'u':8, 'v':8,
    'w':9, 'x':9, 'y':9, 'z':9
}

def get_val(word):
    result = 0
    word = word.lower()
    for c in word:
        if nokia.has_key(c):
            result = result * 10 + nokia[c]
    return result        

phononyms = {}
for word in open(dictionary):
    word = word.rstrip('\n\r')
    val = get_val(word)
    if (not phononyms.has_key(val)):
        phononyms[val] = [] 
        phononyms[val].append(word)

print [phononyms[val] for val in phononyms.keys() if len(phononyms[val]) > 1]

UNIX and Python - a marriage of convenience

Recently we were having a discussion about how many Friday 13ths there were in a particular year (or which year had the most, or something).

I thought it should be easy to write a solution in Python, so I started with the following code:



import datetime
from datetime import timedelta

start = datetime.date(2009, 1, 1)
end = datetime.date(2013, 1, 1)

while start < end:
print start.strftime('%Y-%m-%d %A')
start = start + timedelta(1)




This little program prints out all the dates from 2009-01-01 to 2012-12-31 in the following format:


2009-01-01 Thursday
2009-01-02 Friday
2009-01-03 Saturday
2009-01-04 Sunday
2009-01-05 Monday
...


I could then have added some more code to determine if a date was a Friday 13th and printed it - but decided not to. I had a general purpose tool that printed a calendar, so all I needed to do was the following:


python printcal | grep '13 Friday'


Which then gives the following:


2009-02-13 Friday
2009-03-13 Friday
2009-11-13 Friday
2010-08-13 Friday
2011-05-13 Friday
2012-01-13 Friday
2012-04-13 Friday
2012-07-13 Friday


So, why did I prefer to do this rather than modify my Python script. Let's look at the Unix Philosophy

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Doug McIlroy

A second part of the philosophy from Mike Gancarz is "Look for the 90-percent solution."

So, I have created a small tool that prints all the dates in a period (this is the 90-percent solution). It is easy to verify that it is doing what I want to do. I use the output of this program as the input to the standard unix program grep that finds all the output that contains Friday 13. Suppose I now want to find all years that have at 3 months where the first of the month is a Monday, I can do it easily by modifying the grep part of the pipeline. If I want to see how many Monday 5ths we had in 2008, I can pipe the output of grep to wc -l which will print the number of lines output.