GFX::Monk Home

Posts tagged python:

Recursively Default Dictionaries

Today I was asked if I knew how to make a recursively default dictionary (although not in so many words). What that means is that it’s a dictionary (or hash) which is defaulted to an empty version of itself for every item access. That way, you can throw data into a multi-dimensional dictionary without regard for whether keys already exist, like so:

h["a"]["b"]["c"] = 5

Without having to first initialise h[“a”] and h[“a”][“b”].

A dictionary with a default value of an empty hash sprang to mind, but after trying it out I realised that this only works for one level. Recursion was evidently required.

So, here’s the python solution:

from collections import defaultdict
new_dict = lambda: defaultdict(new_dict)
h = defaultdict(new_dict)

And the ruby, which seems overly noisy:

new_hash = lambda { |hash, key| hash[key] = Hash.new &new_hash }
h = Hash.new(&new_hash)

Autonose: continuous test runner for python's nosetests

Today I've put up the first “releaseable” version of autonose. Basically, it analyses your code’s imports, and determines exactly which tests rely on which code. So whenever you change a file, it’ll automatically run the tests that depend on the changed file (be it directly or transitively). Give it a go, and please let me know your feedback.

All you need do is:

$ easy_install autonose
$ cd [project_with_some_tests_in_it]
$ autonose

See the github project page for further information (and a screenshot).

The joys of functional programming with side-effects

Quick quiz: what will the following pieces of ostensibly identical python and ruby programs print out?

If you answered “1 2 3 4” or something similar, you’re half right. That’s what you’ll get from ruby. Surprisingly, python will give you “4 4 4 4”. I'm not the first to discover this, but that doesn’t make it any less startling.

The “fix” in python is to replace the lambda line with q.append(lambda i=i: puts(i)), because default function paramaters are evaluated at definition-time, not evaluation-time (another difference from ruby, potentially even related?). It makes sense once it’s explained what’s going on, but it’s hardly obvious.

I don’t know enough about how closures are done to say that python is wrong, but it’s certainly less desirable and more surprising…

(for those keeping score, this makes ruby: 1, python: still heaps ;P)

P.S. I couldn’t resist mentioning the title of a proposed solution to this problem: For-loop variable scope: simultaneous possession and ingestion of cake

update: As matt just pointed out, the ruby port is different in that it uses i as an argument to a block. A more faithful translation would be:

for i in (0..9)
  q << (lambda {puts i})
end

Which has exactly the same outcome as python.

So there you go. It turns out closures are confusing. Who knew? ;P

rednose: coloured output formatting plugin for nosetests

I recently wrote a plugin for nosetests which greatly (imho) improves the output for failed and errored tests. The screenshot explains it best.

To install, just run easy_install rednose. Then you can run nosetests with the —rednose option.

See github.com/gfxmonk/rednose for code and more information.

ruby - longing for some discipline

More and more, I am wishing that there was some sort of strict mode I could enable in ruby to say “you know what? I'm careful with my code. Please don’t assume things behind my back”. And to be honest, this mode would pretty much be synonymous with “just do what python would do”

By default, python is strict. If you index a dict (hash) with a nonexistant key, you get a keyError. If you don’t want to have to deal with that, you can use the get method and provide a default for if the key doesn’t exist. In ruby, if you want to be strict about anything, you generally have to write your own checks to guard against the core library’s forgivingness. Forgivingness sounds nice at first, but goes completely against the idea of failing fast, and frequently delays the manifestation of bugs, making them that much harder to actually track down.

Two examples that I came across within minutes of each other the other day:

Struct.new(:a,:b,:c).new('a','b')

that should NOT go without an exception

"a|b||c".split("|")
=> ["a", "b", "", "c"]

good…. so now:

"a|b||".split("|")
=> ["a", "b"]

argh! what have you done to my third field?

Pretty Decorators

Python decorators are cool, but they can become very messy if you sart taking arguments:

def decorate_with_args(arg):
    def the_decorator(func):
        def run_it():
            func(arg)
        return run_it
    return the_decorator

@decorate_with_args('some string')
def messy(s):
    print s

ugh. Three levels of function definitions for a single decorator. And heaven forbid you want the decorator to be useable without supplying any arguments (not even empty brackets).

So then, I present a much cleaner decorator helper class:

class ParamDecorator(object):
    def __init__(self, decorator_function):
        self.func = decorator_function

    def __call__(self, *args, **kwargs):
        if len(args) == 1 and len(kwargs) == 0 and callable(args[0]):
            # we're being called without paramaters
            # (just the decorated function)
            return self.func(args[0])
        return self.decorate_with(args, kwargs)

    def decorate_with(self, args, kwargs):
        def decorator(callable_):
            return self.func(callable_, *args, **kwargs)
        return decorator

All that’s required of you is to take the decorated function as your first argument, and then any additional (optional) arguments. For example, here’s how you might implement a “pending” and “trace” decorator:

@ParamDecorator
def pending(decorated, reason='no reason given'):
    def run(*args, **kwargs):
        print "function '%s' is pending (%s)" % (decorated.__name__, reason)
    return run

@ParamDecorator
def trace(decorated, label=None):
    if label is None:
        label = decorated.__name__
    else:
        label = "%s (%s)" % (decorated.__name__, label)
    def run(*args, **kwargs):
        print "%s: started (args=%s, kwargs=%s)" % (label, args, kwargs)
        ret = decorated(*args, **kwargs)
        print "%s: returning: %s" % (label, ret)
        return ret
    return run

Which can then be used as either standard or paramaterised decorators:

@pending
def a():
    pass

@pending("I haven't done it yet!")
def b():
    pass

@trace
def foo():
    return "blah"

@trace("important function")
def bar():
    return "blech!"

And just to show what this all amounts to:

if __name__ == '__main__':
    a()
    b()
    foo()
    bar()

reveals the following output:

function 'a' is pending (no reason given)
function 'b' is pending (I haven't done it yet!)
foo: started (args=(), kwargs={})
foo: returning: blah
bar (important function): started (args=(), kwargs={})
bar (important function): returning: blech!

Fairly simple stuff, but hopefully useful for anyone who finds themselves tripped up by decorators – particularly when trying to allow for both naked and paramaterised decorators.

P.S: I've made a pastie of the code in this post, because my weblog engine is not cool enough to colour-code python ;)

mocktest

I've been working on this a little while now. At work, I use ruby. It has its good points, but my language of choice is still python. I am, however, blown away by rspec. So I've tried to bring some of its features into python (namely and should_receive matchers).

Thus was born mocktest. The readme has all the examples you should need to start using it in your own testing code.

It’s also available from the cheese shop, which means you can just run easy_install mocktest to install it on your system.

I should note that this code builds upon the excellent Mock library by Michael Foord.

(view link)

HTML to PDF converter

Given the integration of PDF into Mac OS X, I was surprised to find that there didn’t seem to be any tool to convert HTML files into PDFs. So, like any frustrated coder I wrote my own little script to do it: html2pdf.py

Requires python and OSX 10.5 (Leopard). It uses a WebKit view for the rendering, so in theory it should work with any URL that safari can handle – but I haven’t exactly tested it thoroughly…