I’ve migrated to Octopress and moved my blog to http://kakar.ca. Please update your links.


Earlier this year I was working with Canonical’s design team. We were putting our heads together to develop a fantastic cloud user experience in Landscape. I created a concept map to help clarify the various artefacts involved in the EC2 API. It’s not entirely complete, but it does capture the most important aspects of the model. I find it useful as a high-level diagram; hopefully you will too.

UPDATE: I’ve moved my emacs configuration to GitHub.

Some time ago during a sprint, I noticed that Free Ekanayaka, one of my Landscape teammates, had excellent pyflakes/flymake integration that dramatically improved the experience of editing Python files in emacs. When I asked him about it, he kindly offered to share his configuration, and I was excited to pick out the flymake settings he had. When I actually looked at what he’d sent me I was blown away. He has the most elegant emacs configuration I’ve ever seen. He uses a simple pattern composed of individual configuration files, each focused on a particular set of customizations (appearance, interaction, programming and so on), that are all wired up to create the final result. It’s very clean and avoids the blob-of-crap problem that most emacs configurations suffer from.

This evening I finally took his ideas and refactored them to match my emacs setup. The structure is the same, but the details are, in some cases, quite different. I’m managing the configuration with Bazaar and have pushed it to Launchpad at lp:~jkakar/+junk/emacs. Hopefully others will find this as inspiring as I have.

A number of folks at Canonical have been writing about what we do and how our work contributes to the goal of getting free software into the mainstream. Canonical and Ubuntu have brought free software to the general public in a way that hasn’t been done before, in a very short span of time.

I’ve been working on the Landscape Team since early 2006. I was one of the founding members of the project along with Gustavo and Chris. Landscape is a web-based tool that eases the task of managing large deployments of Ubuntu. It is one of the many services that Canonical provides, as a value add to Ubuntu, and is proprietary software. At this point you’re probably thinking, “How can proprietary software help the goal of spreading free software? That’s crazy talk.” Well, in at least two ways.

Landscape makes it possible for people who could never have chosen Ubuntu to consider it, and in many cases to adopt it. Our enterprise customers look at Ubuntu and they like what it offers, but that’s only the beginning of the story. Management is a big concern for organizations that roll out tens, hundreds or thousands of Ubuntu machines. Without a management solution, Ubuntu is a non-starter no matter how good the user experience is.

Landscape is a revenue generator for Canonical and, though we’d all love to release it as free software, that revenue is very important. It contributes to the sustainability of the company and thus to the sustainability of Ubuntu itself. We do as much work as we can in the open. For example, Storm, an object-relational mapper for Python, was developed by the Landscape Team and released as free software. We’ve also sent patches to projects such as Twisted and txAWS.

Although my work may be controversial in the wider free software community, I’m a free software developer at heart. The effort I put into Landscape is good for free software in general. It helps Ubuntu gain adoption in places where it simply wouldn’t be considered without a tool like Landscape. Canonical has been, and continues to be, an amazing place to work. It is full of passionate people all doing their part to get free software out into the world.

Yesterday I released version 0.17 of Storm. This release fixes a checkpointing bug that could cause coherency issues in certain situations involving triggers. It introduces a handful of new features and optimizations including the ability to get a Select expression from a ResultSet, which is useful when building a query with a subselect. It also includes safety checks to ensure that a Store is only used when database access is expected and only from the thread in which it was created. The release notes have detailed information about the changes along with links to the MD5 sum and GPG signature for the tarball. I’ve also built official packages for Ubuntu users, available in the Storm PPA.

We’re always interested in hearing about your experiences with Storm. If you have comments or questions please feel free to hop into #storm on Freenode and talk to us, or post a message on the mailing list.

I’ve been using the EC2 API quite a lot lately, while working on the Landscape project. The API documentation is excellent, but I want to know how it behaves when things go wrong. What kind of failure is produced when I forget to pass a parameter? What happens when I pass a bogus parameter? How does it behave when I pass numbered parameters that aren’t in sequence, like Name.7=foo&Name.42=bar? How does Amazon’s implementation differ from Eucalyptus or Nova?

I couldn’t find the answers I wanted in the documentation so I wrote a simple tool to invoke a method with an arbitrary set of parameters against an EC2 API endpoint. After sending the request, it prints the HTTP status code and response to the screen. It’s called txaws-discover and is part of the txAWS project. For example, to run the DescribeRegions method simply provide credentials, an endpoint, the method name and whatever parameters you want to pass with it:

$ txaws-discover --key NUJOZS2V6RWQUJ3G8JUT \
    --secret +Ki9Wk5Y4kudFh1EcnB3hthC9PtVU+CcfMJZ4DVl \
    --endpoint https://ec2.us-east-1.amazonaws.com \
    --action DescribeRegions \
    --RegionName.0 us-west-1

This command produces the following output:

HTTP status code: 200

<DescribeRegionsResponse xmlns="http://ec2.amazonaws.com/doc/2008-12-01/">

The credentials and API endpoint can be defined in environment variables to make txaws-discover easier to use:

export AWS_ACCESS_KEY_ID=NUJOZS2V6RWQUJ3G8JUT
export AWS_SECRET_ACCESS_KEY=+Ki9Wk5Y4kudFh1EcnB3hthC9PtVU+CcfMJZ4DVl
export AWS_ENDPOINT=https://ec2.us-east-1.amazonaws.com

With those defined, the command above can be shortened:

$ txaws-discover --action DescribeRegions --RegionName.0 us-west-1

It’s in lp:txaws and has proven to be very helpful as a learning tool.
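To give a sense of what out-of-sequence numbered parameters look like on the wire, here is a minimal sketch that encodes an action and its parameters into a query string using only the Python standard library. The build_query_string helper is hypothetical and isn't part of txAWS. It also leaves out the signature, version and timestamp parameters a real request needs, so the result illustrates the format only and can't be sent as is.

```python
from urllib.parse import parse_qsl, urlencode


def build_query_string(action, parameters):
    """Encode an action and its parameters as a query string.

    Hypothetical helper for illustration only: it doesn't sign the
    request or add the version and timestamp a real call requires.
    """
    pairs = [("Action", action)] + sorted(parameters.items())
    return urlencode(pairs)


# Numbered parameters that deliberately aren't in sequence.
query = build_query_string(
    "DescribeRegions",
    {"RegionName.7": "us-west-1", "RegionName.42": "us-east-1"})
print(query)

# Decoding shows the server receives plain name/value pairs; any meaning
# attached to the numbers is up to the implementation handling the request.
print(parse_qsl(query))
```

Whether an endpoint accepts, reorders or rejects a request like this is exactly the kind of question the tool is meant to answer.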

In Storm, the Store.find method runs a query and returns matching objects. Finding all accounts in the database, the equivalent of SELECT * FROM account, is simple:

result = store.find(Account)

Clauses to limit the scope of the query can be passed to find. Finding all accounts owned by Vince Offer, the equivalent of SELECT * FROM account WHERE owner = 'Vince Offer', is also simple:

result = store.find(Account, Account.owner == "Vince Offer")

Another way to achieve the same thing is to take the result from the first find operation and refine it:

result1 = store.find(Account)
result2 = result1.find(Account.owner == "Vince Offer")

Storm doesn’t run a query until you do something with the result, so the impact on the database is exactly the same as in the previous example. This is the simplest form of what Jonathan and I have been calling the collection pattern. The basic idea is that you start with a collection of all objects and then refine it until you have the collection you want. This pattern is possible using pure Storm, as shown above, but we’ve found it useful to implement objects that hide this logic and provide a more user-friendly API. For example, an AccountCollection would provide filtering logic as named methods:

class AccountCollection(object):

    def owned_by(self, salesman):
        """Get a new collection with accounts owned by salesman."""

    def has_unpaid_invoices(self):
        """Get a new collection with accounts that have
        unpaid invoices.
        """

    def find(self):
        """Get a result set of matching accounts."""

Each filtering method, such as owned_by and has_unpaid_invoices, returns a new AccountCollection instance. This isn’t strictly necessary, but creating a new instance is easier both to understand and to implement. Finding all accounts owned by Vince Offer that have unpaid invoices is quite easy:

collection1 = AccountCollection()
collection2 = collection1.owned_by("Vince Offer")
collection3 = collection2.has_unpaid_invoices()
result = collection3.find()

This pattern provides several benefits:

  • It names filtering options, which makes them easy to understand and use.
  • Query building logic is in one place, which eliminates duplication.
  • All the criteria used to build a query are available when the query is generated, which makes it possible to generate optimized queries. All users of the collection benefit from optimizations.
  • It creates a clean separation between finding data and using it. This helps keep application logic more focused, because it isn’t intermingled with the particulars of generating queries.
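As a rough illustration of these points, here is a minimal sketch of one possible implementation. It is not the code used in Landscape or Launchpad, and it makes several assumptions: it uses the standard library's sqlite3 module in place of Storm so that it runs on its own, the table layout and the build_query method are made up for the example, the collection takes a connection argument, and find returns a plain list of rows, not a Storm result set.

```python
import sqlite3


class AccountCollection(object):
    """Hypothetical sketch of the collection pattern on top of sqlite3."""

    def __init__(self, connection, owner=None, unpaid_only=False):
        self._connection = connection
        self._owner = owner
        self._unpaid_only = unpaid_only

    def owned_by(self, salesman):
        """Get a new collection with accounts owned by salesman."""
        return AccountCollection(
            self._connection, salesman, self._unpaid_only)

    def has_unpaid_invoices(self):
        """Get a new collection with accounts that have unpaid invoices."""
        return AccountCollection(self._connection, self._owner, True)

    def build_query(self):
        """Build the SQL and its parameters from all criteria at once."""
        clauses, params = [], []
        if self._owner is not None:
            clauses.append("account.owner = ?")
            params.append(self._owner)
        if self._unpaid_only:
            # The invoice table is only touched when this filter is used.
            clauses.append(
                "EXISTS (SELECT 1 FROM invoice"
                " WHERE invoice.account_id = account.id"
                " AND invoice.paid = 0)")
        sql = "SELECT account.id, account.owner FROM account"
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        return sql + " ORDER BY account.id", params

    def find(self):
        """Run the query and get a list of matching (id, owner) rows."""
        sql, params = self.build_query()
        return self._connection.execute(sql, params).fetchall()


# Made-up schema and data, just enough to exercise the collection.
connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY, account_id INTEGER, paid INTEGER);
    INSERT INTO account VALUES
        (1, 'Vince Offer'), (2, 'Vince Offer'), (3, 'Billy Mays');
    INSERT INTO invoice VALUES (1, 1, 0), (2, 2, 1), (3, 3, 0);
""")

everything = AccountCollection(connection)
unpaid = everything.owned_by("Vince Offer").has_unpaid_invoices()
print(unpaid.find())
```

Nothing reaches the database until find is called, and because every criterion is known by then, build_query is the one place where the SQL could be tuned for every caller.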

We’ve used this pattern to good effect in both Landscape and Launchpad. In a future post I’ll talk about some different strategies we’ve used to implement the pattern.