I’ve migrated to Octopress and moved my blog to http://kakar.ca. Please update your links.


Earlier this year I was working with Canonical’s design team. We were putting our heads together to develop a fantastic cloud user experience in Landscape. I created a concept map to help clarify the various artefacts involved in the EC2 API. It’s not entirely complete, but it does capture the most important aspects of the model. I find it useful as a high-level diagram, and hopefully you will too.


UPDATE: I’ve moved my emacs configuration to GitHub.

Some time ago during a sprint, I noticed that Free Ekanayaka, one of my Landscape teammates, had excellent pyflakes/flymake integration that dramatically improved the experience of editing Python files in emacs. When I asked him about it he kindly offered to share his configuration, and I was excited to pick out the flymake settings he had. When I actually looked at what he’d sent me I was blown away. He has the most elegant emacs configuration I’ve ever seen. He uses a simple pattern composed of individual configuration files, each focused on a particular set of customizations, such as appearance, interaction and programming, that are all wired up to create the final result. It’s very clean and avoids the blob-of-crap problem that most emacs configurations suffer from.

This evening I finally took his ideas and refactored them to match my emacs setup. The structure is the same, but the details are, in some cases, quite different. I’m managing the configuration with Bazaar and have pushed it to Launchpad at lp:~jkakar/+junk/emacs. Hopefully others will find this as inspiring as I have.


A number of folks at Canonical have been writing about what we do and how our work contributes to the goal of getting free software into the mainstream. Canonical and Ubuntu have brought free software to the general public in a way that hasn’t been done before, in a very short span of time.

I’ve been working on the Landscape Team since early 2006. I was one of the founding members of the project along with Gustavo and Chris. Landscape is a web-based tool that eases the task of managing large deployments of Ubuntu. It is one of the many services that Canonical provides, as a value add to Ubuntu, and is proprietary software. At this point you’re probably thinking, “How can proprietary software help the goal of spreading free software? That’s crazy talk.” Well, in at least two ways.

Landscape makes it possible for people that could never have chosen Ubuntu to consider it, and in many cases to adopt it. Our enterprise customers look at Ubuntu and they like what it offers, but that’s only the beginning of the story. Management is a big concern for organizations that roll out tens, hundreds or thousands of Ubuntu machines. Without a management solution, Ubuntu is a non-starter no matter how good the user experience.

Landscape is a revenue generator for Canonical and, though we’d all love to release it as free software, that revenue is very important. It contributes to the sustainability of the company and thus to the sustainability of Ubuntu itself. We do as much work as we can in the open. For example, Storm, an object relational mapper for Python, was developed by the Landscape Team and released as free software. We’ve also sent patches to other projects, including Twisted and txAWS.

Although my work may be controversial in the wider free software community, I’m a free software developer at heart. The effort I put into Landscape is good for free software in general. It helps Ubuntu gain adoption in places where it simply wouldn’t be considered without a tool like Landscape. Canonical has been, and continues to be, an amazing place to work. It is full of passionate people all doing their part to get free software out into the world.


Yesterday I released version 0.17 of Storm. This release fixes a checkpointing bug that could cause coherency issues in certain situations involving triggers. It introduces a handful of new features and optimizations including the ability to get a Select expression from a ResultSet, which is useful when building a query with a subselect. It also includes safety checks to ensure that a Store is only used when database access is expected and only from the thread in which it was created. The release notes have detailed information about the changes along with links to the MD5 sum and GPG signature for the tarball. I’ve also built official packages for Ubuntu users, available in the Storm PPA.

We’re always interested in hearing about your experiences with Storm. If you have comments or questions please feel free to hop into #storm on Freenode and talk to us, or post a message on the mailing list.


I’ve been using the EC2 API quite a lot lately, while working on the Landscape project. The API documentation is excellent, but I want to know how it behaves when things go wrong. What kind of failure is produced when I forget to pass a parameter? What happens when I pass a bogus parameter? How does it behave when I pass numbered parameters that aren’t in sequence, like Name.7=foo&Name.42=bar? How does Amazon’s implementation differ from Eucalyptus or Nova?

I couldn’t find the answers I wanted in the documentation so I wrote a simple tool to invoke a method with an arbitrary set of parameters against an EC2 API endpoint. After sending the request, it prints the HTTP status code and response to the screen. It’s called txaws-discover and is part of the txAWS project. For example, to run the DescribeRegions method simply provide credentials, an endpoint, the method name and whatever parameters you want to pass with it:

$ txaws-discover --key NUJOZS2V6RWQUJ3G8JUT \
    --secret +Ki9Wk5Y4kudFh1EcnB3hthC9PtVU+CcfMJZ4DVl \
    --endpoint https://ec2.us-east-1.amazonaws.com \
    --action DescribeRegions \
    --RegionName.0 us-west-1

This command produces the following output:

HTTP status code: 200


<DescribeRegionsResponse xmlns="http://ec2.amazonaws.com/doc/2008-12-01/">
    <requestId>1e9beacc-0044-4ad6-bd2c-c615351ae46d</requestId>
    <regionInfo>
        <item>
            <regionName>us-west-1</regionName>
            <regionEndpoint>ec2.us-west-1.amazonaws.com</regionEndpoint>
        </item>
    </regionInfo>
</DescribeRegionsResponse>

The credentials and API endpoint can be defined in environment variables to make txaws-discover easier to use:

export AWS_ACCESS_KEY_ID=NUJOZS2V6RWQUJ3G8JUT
export AWS_SECRET_ACCESS_KEY=+Ki9Wk5Y4kudFh1EcnB3hthC9PtVU+CcfMJZ4DVl
export AWS_ENDPOINT=https://ec2.us-east-1.amazonaws.com

With those defined, the command above can be shortened:

$ txaws-discover --action DescribeRegions --RegionName.0 us-west-1

It’s in lp:txaws and has proven to be very helpful as a learning tool.
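As a rough illustration of what a request like the one above carries, the sketch below builds the query string for an action and its numbered parameters using only the Python standard library. The build_query helper is hypothetical and is not part of txAWS. The sketch also leaves out the authentication and signing fields that a real EC2 request needs, so treat it as a picture of the parameter encoding and nothing more.

```python
from urllib.parse import urlencode


def build_query(action, params):
    """Build an unsigned EC2-style query string.

    Hypothetical helper for illustration only: a real request also
    carries credentials and a signature, which are omitted here.
    """
    fields = {"Action": action}
    fields.update(params)
    # Sort the fields so the output is stable and easy to compare.
    return urlencode(sorted(fields.items()))


# The DescribeRegions call from the example above.
in_sequence = build_query("DescribeRegions", {"RegionName.0": "us-west-1"})
print(in_sequence)

# Numbered parameters that aren't in sequence encode just as easily.
# How the server reacts to them is the question txaws-discover answers.
out_of_sequence = build_query("DescribeRegions",
                              {"Name.7": "foo", "Name.42": "bar"})
print(out_of_sequence)
```

Nothing on the client side stops you from sending odd parameters, which is why a tool that shows the server's actual response is handy.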


In Storm, the Store.find method runs a query and returns matching objects. Finding all accounts in the database, the equivalent of SELECT * FROM account, is simple:

result = store.find(Account)

Clauses to limit the scope of the query can be passed to find. Finding all accounts owned by Vince Offer, the equivalent of SELECT * FROM account WHERE owner = 'Vince Offer', is also simple:

result = store.find(Account, Account.owner == "Vince Offer")

Another way to achieve the same thing is to take the result from the first find operation and refine it:

result1 = store.find(Account)
result2 = result1.find(Account.owner == "Vince Offer")

Storm doesn’t run a query until you do something with the result, so the impact on the database is exactly the same as in the previous example. This is the simplest form of what Jonathan and I have been calling the collection pattern. The basic idea is that you start with a collection of all objects and then refine it until you have the collection you want. This pattern is possible using pure Storm, as shown above, but we’ve found it useful to implement objects that hide this logic and provide a more user-friendly API. For example, an AccountCollection would provide filtering logic as named methods:

class AccountCollection(object):

    def owned_by(self, salesman):
        """
        Get a new collection with accounts owned by
        salesman.
        """

    def has_unpaid_invoices(self):
        """
        Get a new collection with accounts that have
        unpaid invoices.
        """

    def find(self):
        """Get a result set of matching accounts."""

Each filtering method, such as owned_by and has_unpaid_invoices, returns a new AccountCollection instance. This isn’t strictly necessary, but creating a new instance is easier both to understand and to implement. Finding all accounts owned by Vince Offer that have unpaid invoices is quite easy:

collection1 = AccountCollection()
collection2 = collection1.owned_by("Vince Offer")
collection3 = collection2.has_unpaid_invoices()
result = collection3.find()

This pattern provides several benefits:

  • It names filtering options which makes them easy to understand and use.
  • Query building logic is in one place which eliminates duplication.
  • All the criteria used to build a query are available when the query is generated, which makes it possible to generate optimized queries. All users of the collection benefit from optimizations.
  • It creates a clean separation between finding data and using it. This helps keep application logic more focused, because it isn’t intermingled with the particulars of generating queries.
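To make the shape of the pattern concrete, here is a minimal, self-contained sketch. It is an assumption-laden stand-in rather than the Landscape or Launchpad implementation: an in-memory list plays the role of the Storm store, the Account fields (owner, unpaid_invoices) are invented for the example, and the filters are plain Python predicates where a Storm-backed version would presumably accumulate expressions to pass to store.find.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    # Hypothetical fields, chosen only to support the two filters below.
    owner: str
    unpaid_invoices: int = 0


class AccountCollection:
    """Chainable collection of accounts (illustrative only)."""

    def __init__(self, accounts, filters=()):
        self._accounts = accounts
        self._filters = tuple(filters)

    def _refine(self, predicate):
        # Each refinement returns a new collection; this one is untouched.
        return AccountCollection(self._accounts,
                                 self._filters + (predicate,))

    def owned_by(self, salesman):
        """Get a new collection with accounts owned by salesman."""
        return self._refine(lambda account: account.owner == salesman)

    def has_unpaid_invoices(self):
        """Get a new collection with accounts that have unpaid invoices."""
        return self._refine(lambda account: account.unpaid_invoices > 0)

    def find(self):
        """Apply every accumulated filter and return matching accounts."""
        return [account for account in self._accounts
                if all(check(account) for check in self._filters)]


accounts = [
    Account("Vince Offer", unpaid_invoices=2),
    Account("Vince Offer"),
    Account("Another Salesman", unpaid_invoices=1),
]
collection = AccountCollection(accounts)
result = collection.owned_by("Vince Offer").has_unpaid_invoices().find()
print(result)
```

Nothing is evaluated until find is called, so every criterion is known at the point where the work happens. That is the property that lets a database-backed version generate a single optimized query.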

We’ve used this pattern to good effect in both Landscape and Launchpad. In a future post I’ll talk about some different strategies we’ve used to implement the pattern.


I’ve been meaning to write about Commandant for ages. I started it last year because I wanted to write command-driven applications with a user experience just like Bazaar’s. There are several things that I like about Bazaar’s user interface:

  • You only have to know that the program is called bzr to start discovering it, because its default behaviour is to teach you how to use the built-in help system.
  • Each command has a clear name that makes its purpose easy to discern.
  • Each command has a succinct summary and a well-written long description, along with clear information about the arguments and options the command accepts.
  • Help topics provide general information about the program that isn’t command-specific. The topics describe conceptual and technical aspects of the program that help the user better understand how to use it.
  • The interface conventions are very simple. It doesn’t take long to figure out how things work.

I wanted, as much as possible, to start writing application logic without having to do much work to get a working user interface. With that in mind, the basic goal for Commandant is to provide a Bazaar-like user interface with the least effort possible. I started by trying to extend Bazaar but I ran into tricky integration problems that made it hard to make progress, so I switched gears and started implementing the logic from scratch. Reinventing the wheel got me further than trying to integrate, but I quickly reached a point where progress slowed down because I had to deal with tricky problems that Bazaar had already solved. In the meantime, Robert landed changes in Bazaar that removed the integration hurdles that I’d run into.

With the integration problems out of the way, I stopped wasting time reinventing the wheel, and Commandant is now a thin layer on top of bzrlib itself. Being able to take advantage of the work that has been done in Bazaar has been a blessing. There’s more that can be done to improve Commandant, but it is very usable in its current state. If you’re interested in learning more, the README file should get you started.


People ask me questions about Storm several times a week, and I’m happy to help where I can, but it would be better if the information our users need was easier to find. Our documentation story is weak and, as a result, Storm is harder to discover than it ought to be. If you do run into problems or have questions, please hop into #storm on Freenode and we’ll do our best to help you. I want to improve the documentation situation, but I haven’t found the time to do so yet. In the meantime, the next best thing is to post some helpful hints here.

First, some advice: if you don’t know what SQL query you want to run, using Storm will probably be hard. When I get asked a question about writing a query using Storm I almost always respond with, “What does the query you want to run look like?” The querying part of Storm is essentially just an expression compiler that takes a list of expressions and converts them to SQL. Once you know what end result you want, working backwards to figure out the Storm parlance is relatively straightforward.

Today I was asked about using COALESCE. The person asking the question found the storm.expr.Coalesce expression but it wasn’t obvious how to use it. He knew he wanted to run a query like:

SELECT * FROM table WHERE COALESCE(col1, col2) = 'foo'

With that knowledge, figuring out how to use Storm was quite easy:

from storm.expr import Coalesce
store.find(Table, Coalesce(Table.col1, Table.col2) == "foo")

Furthermore, Coalesce is just a helper expression to aid usability and readability. If we didn’t have a built-in Coalesce expression the query could have been written using the generic storm.expr.Func expression:

from storm.expr import Func
store.find(Table,
           Func("COALESCE", Table.col1, Table.col2) == "foo")

If you want to use a SQL function in a Storm query, and there is no built-in expression for it, you can use Func to call it.
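Since knowing the SQL comes first, it can help to try the query against a scratch database before translating it to Storm. The sketch below does that with Python’s standard sqlite3 module. It doesn’t use Storm at all, and the table name example and the sample rows are made up for the illustration (table itself is a reserved word in SQL, so it can’t be used unquoted).

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE example (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
connection.executemany(
    "INSERT INTO example (col1, col2) VALUES (?, ?)",
    [("foo", "bar"),   # id 1: col1 matches directly
     (None, "foo"),    # id 2: col1 is NULL, so COALESCE falls back to col2
     ("bar", "foo"),   # id 3: col1 is not NULL, so col2 is ignored
     (None, None)])    # id 4: COALESCE yields NULL, which never equals 'foo'

rows = connection.execute(
    "SELECT id FROM example "
    "WHERE COALESCE(col1, col2) = 'foo' ORDER BY id").fetchall()
print(rows)
```

Only the first two rows match. Once the SQL behaves the way you expect, expressing it with Coalesce or Func is mostly mechanical.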


Today I gave a talk about Storm to the fine folks that make up the local Python users group. I think the talk went alright. It wasn’t awesome but I don’t think it was terrible either. It’s the third public talk I’ve given, pretty much ever, and so I have a lot to learn about the process. The group is friendly and I felt comfortable standing up in front of everyone and yammering on about Storm.

In retrospect, I should have come up with much better example code. I didn’t make enough time to prepare and, due to rushing at the end, the topics I covered were rather basic. I did an experiment and wrote a pseudo-doctest to use as slides. The doctest and a script to run it are available at lp:~jkakar/+junk/storm-talk. I used a slightly hacked version of Michael Hudson-Doyle’s very awesome console-presenter, which is available at lp:~jkakar/console-presenter/two-line-separator.

The format was great for getting me to focus on content, even though I didn’t have enough time to do a great job preparing it, but the slides were a bit lackluster, being grey text on a black background. Also, a common problem with doctests is that you need to keep track of the state of the program as you read the document. I think this problem made the format awkward for the audience. It was a good experiment, but in the future I’ll use traditional slides and try to avoid a situation that requires the audience to keep program state in their heads as the talk progresses. Funny how these things seem so obvious in hindsight.

Doug Latornell gave a nice talk about his experiences using PyYAML, flickrapi and Tkinter to build an application that displays random photos on his living room TV, both from his personal collection and from his favourite groups on Flickr. I enjoyed the talk and the discussion it spurred.

All in all it was a good experience, and I’ve learnt some lessons from it that I can hopefully use to make the next talk better.