dave peck :: software, music, nerdery

Papertrail sells to SolarWinds

My friends Troy and Eric just sold their bootstrapped Seattle startup, Papertrail, to SolarWinds in an all-cash deal. Says Troy:

By not trying to build a large business — and simply trying to deliver an amazing experience to every customer and sweat the details — we built a large business. No shortcuts, just hard work.

For me, Papertrail is an ideal startup success story. Troy and Eric built a product that elegantly solved an important infrastructure problem. They found great customers and scaled to meet demand. They never had to raise money and today they’ve had a well-deserved exit.

They did it their way. Cheers to that, guys!

Hashicorp's Vault

I’m consistently impressed by the work that Mitchell Hashimoto and team are doing at Hashicorp.

Hashicorp builds tools that solve tightly focused real-world DevOps problems. Their solutions nearly always match my own mental model of the problems’ shape and extent. The Unix philosophy shines through: Hashicorp’s tools are independent but composable; developers can mix and match to meet their unique needs.

We use a couple Hashicorp tools at Cloak. If we had the bandwidth to improve our process and infrastructure, we’d quickly use a couple more. Vault solves a problem we’ve already had to solve ourselves, but it appears to do so in a much more complete and elegant way.

The golden rules of tech support

In response to my previous post, François Joseph proposes six interesting and actionable rules for better tech support. These seem to have been hard-won by François at a very large company.

There’s a catch:

None of the above can help if your product works poorly or your interface is atrocious. Dave’s case is fascinating because the inexpertly designed interface is not his to change.

Yes. The profile install process is fraught with peril. I think there’s much more we can do, both in our current iOS apps and in our support efforts, to wrap this process in softer blankets. The process itself, however, will forever be out of our hands.

We tend to think of Cloak for Mac as asymptotically approaching our ideal VPN client. Our iOS apps are another matter entirely, precisely because of this speed bump.

What's My Passcode?

We’ve been providing Cloak customer support for a couple years now. We know Cloak’s most problematic pitfalls like the back of our hands.

There’s one remarkably common confusion that I feel is very telling. It usually starts with an innocent request for help:

What’s my four-digit passcode? I never got one from Cloak.

Cloak doesn’t hand out passcodes, of course. Instead, on iDevices, it installs a VPN configuration profile. As part of this process, users must re-enter their lock-screen PIN.

For most users, installing profiles is terra incognita. There are several steps, comprising several poorly worded dialogs. This is unfortunate, but not necessarily a roadblock. Many users immediately ascertain from visual cues that they’re being asked for the passcode they always use to unlock their device. A few taps later and they’re on their merry way.

However, a surprising fraction of users simply don’t make the connection. Often, even after we reply, customers tell us that they “still don’t know the right passcode.” No amount of wordsmithing to our default reply has yet reduced our initial rate of failure.

I know several people who teach math at the high school and collegiate levels. One common refrain I hear is that students often bucket into two groups: those who identify general concepts, and those who learn recipes to solve specific problems. Both are valid learning strategies, but they can lead to quite different outcomes.

This may be a leap, but it feels like Cloak’s passcode confusion is much the same. On the one hand, some customers have a general notion that they have a passcode, and that this passcode protects their device. On the other, some customers have a specific notion that when their device wakes up from sleep, they need to enter a specific code. Installing a configuration profile is far from waking up from sleep; there’s no recipe for how to proceed.

I’d love to hear further examples of this kind of customer support conundrum in the wild.

Django Storage Minutiae

I recently found myself peering under Django’s hood, trying to better understand how it manages static files and file uploads. I’ve been a Django user for years, yet I’ve never felt that I understood its storage layer.

What I found is a story told again and again in code: incremental change, organic growth, and strong path dependence. The storage layer is fantastically useful and flexible; for most people, it just works. On the other hand, if you’re actively building something rich and strange with it, then perhaps this historical perspective (and kibitzing!) will be of interest.

In the beginning

The very first version of Django shipped with support for file uploads. (Support for static files would have to wait.) To handle a variety of production scenarios, Django 1.0 introduced the notion of a Storage. Django storages are lowest-common-denominator abstract filesystems, but with a twist: they also surface a mapping between (private) filesystem paths and the (public) URLs where one can actually request those files.
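To make the "twist" concrete, here is a minimal sketch of a storage that exposes both sides of the mapping. It is a toy model, not Django's code: the class name ToyStorage, the constructor arguments, and the example paths are all assumptions made for illustration. Only the method names path() and url() are borrowed from Django's Storage API.

```python
import posixpath
from urllib.parse import quote


class ToyStorage:
    """Illustrative stand-in for a storage (hypothetical, not Django's class).

    Files live under a private root, and every stored name also maps
    to a public URL.
    """

    def __init__(self, location, base_url):
        self.location = location  # private filesystem root
        self.base_url = base_url  # public URL prefix

    def path(self, name):
        # Private side of the mapping: where the file lives on disk.
        return posixpath.join(self.location, name)

    def url(self, name):
        # Public side of the mapping: where a browser can request it.
        return self.base_url + quote(name)


uploads = ToyStorage("/srv/app/media", "/media/")
print(uploads.path("avatars/dave.png"))  # /srv/app/media/avatars/dave.png
print(uploads.url("avatars/dave.png"))   # /media/avatars/dave.png
```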

Django 1.0 also shipped with a single concrete Storage implementation, FileSystemStorage, which simply wrapped the local filesystem. Since all of this code was strictly intended to be used with file uploads, the class defaulted to using MEDIA_ROOT and MEDIA_URL as the base of its path-to-URL mapping — a default that lives on in Django even today.

Static files

A couple years later, Django 1.3 shipped with its star feature: support for static files. The new staticfiles package became the Storage layer’s second real customer.

A small issue must have been apparent to Django’s developers at the time. It made sense to use FileSystemStorage to store static files locally, but the class defaults were no good: static files might need entirely different paths than uploaded media.

Oddly, instead of resolving this small problem by removing references to MEDIA_* in the storage layer, thereby clarifying its layering in the ecosystem, staticfiles instead opted to introduce a new derived class, StaticFilesStorage, whose sole purpose was to override the defaults to STATIC_ROOT and STATIC_URL. I’m not sure what the motivation was: it may have been historical, since staticfiles was originally a third-party package. Regardless, it seems to cause developer confusion even today.
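The pattern is easy to see in miniature. The sketch below is a simplified model under stated assumptions, not Django's source: ToySettings, ToyFileSystemStorage, and ToyStaticFilesStorage are hypothetical names, and the paths are made up. It only shows a general-purpose class with upload-flavored defaults, plus a subclass that exists solely to swap those defaults.

```python
class ToySettings:
    # Hypothetical values standing in for a project's settings module.
    MEDIA_ROOT = "/srv/app/media"
    MEDIA_URL = "/media/"
    STATIC_ROOT = "/srv/app/static"
    STATIC_URL = "/static/"


settings = ToySettings()


class ToyFileSystemStorage:
    def __init__(self, location=None, base_url=None):
        # Upload-oriented defaults baked into a general-purpose class.
        self.location = settings.MEDIA_ROOT if location is None else location
        self.base_url = settings.MEDIA_URL if base_url is None else base_url


class ToyStaticFilesStorage(ToyFileSystemStorage):
    def __init__(self, location=None, base_url=None):
        # The subclass adds no behavior; it only changes the defaults.
        if location is None:
            location = settings.STATIC_ROOT
        if base_url is None:
            base_url = settings.STATIC_URL
        super().__init__(location, base_url)
```

Had the base class required its location and base URL explicitly, the subclass would have nothing left to do.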

Other small sins were committed with Django 1.3. The staticfiles package had the task of finding static files and collecting them into a final location, defined by STATICFILES_STORAGE and STATIC_ROOT. But where were the static files to be found? Enter Django 1.3’s Finder abstraction. Django 1.3 shipped with several finders, including the AppDirectoriesFinder, which looks for content in the static subdirectories of Django apps. Curiously, 1.3 also shipped with both a FileSystemFinder, which wraps (multiple) FileSystemStorage instances under the hood, and a BaseStorageFinder, which wraps an arbitrary Storage instance. I think the motivation for FileSystemFinder was to support Django’s convenient new STATICFILES_DIRS setting “out of the box”, but the partial functional overlap between these new finders also led to confusion.
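As a rough illustration of the finder idea, here is a toy version that searches an ordered list of directories, plus a variant that looks in the static subdirectory of each app. The class and function names (ToyDirsFinder, ToyAppDirsFinder, find_static) are hypothetical, and the real finders do considerably more; this is a sketch of the lookup order only.

```python
import os


class ToyDirsFinder:
    """Searches an ordered list of directories (hypothetical class)."""

    def __init__(self, dirs):
        self.dirs = list(dirs)

    def find(self, name):
        # Return the first match, or None if no directory has the file.
        for root in self.dirs:
            candidate = os.path.join(root, name)
            if os.path.exists(candidate):
                return candidate
        return None


class ToyAppDirsFinder(ToyDirsFinder):
    """Searches the 'static' subdirectory of each app directory."""

    def __init__(self, app_dirs):
        super().__init__(os.path.join(d, "static") for d in app_dirs)


def find_static(name, finders):
    # Ask each finder in turn; the first one with an answer wins.
    for finder in finders:
        found = finder.find(name)
        if found is not None:
            return found
    return None
```

Calling find_static("site.css", [ToyDirsFinder(extra_dirs), ToyAppDirsFinder(app_dirs)]) would consult the explicit directories first and the per-app static directories second, where extra_dirs and app_dirs are whatever lists of paths you supply.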

Another strangeness shipped with Django 1.3: there were now real-world storages where the special “twist” of having to map between paths and URLs no longer made sense. The mapping continued to make sense for file uploads and collected static files, but for the storages used in finders, the URL side of the mapping was meaningless. No effort was made to clarify or refactor the API.

Cached static files

Django 1.4 included a key new staticfiles feature: cached static files. Caching gave developers the ability to automatically generate and append content hashes to filenames during collection (like style-91a0.css), permitting them to leverage far-future Expires headers for static content.
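The naming trick itself is small. The sketch below derives a filename from a file's content, in the style of the style-91a0.css example above. The function name hashed_name, the choice of SHA-256, the hyphen separator, and the four-character digest are all assumptions for illustration; Django's own algorithm and format may differ.

```python
import hashlib
import posixpath


def hashed_name(name, content, digest_chars=4, separator="-"):
    """Insert a short content digest before the file extension.

    Hypothetical helper: hash function, separator, and digest length
    are illustrative choices, not Django's.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    root, ext = posixpath.splitext(name)
    return f"{root}{separator}{digest}{ext}"


print(hashed_name("css/style.css", b"body { color: black; }"))
```

Because the name changes whenever the content does, a browser can cache each hashed file indefinitely; an edited stylesheet simply shows up under a new name.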

Responsibility for hashing content is split in two. The first interested party is the collectstatic management command. After finishing collection, it looks for a magic method, post_process(), on the underlying Storage and calls it if present. This method is intended to be generic, performing arbitrary work and returning a list of impacted static files.

The post_process() method is apparently not widely used: after a search across all public Python repositories on both GitHub and Bitbucket, the only implementation I found was Django’s own content hash generator. Tellingly, Django’s implementation is completely generic with respect to the underlying storage, living as it does in a mixin; it’s not clear to me it belongs on Storage at all.
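The shape of this hook can be sketched in a few lines. Everything here is a toy model under stated assumptions: InMemoryStorage, HashingMixin, HashingStorage, and collect are hypothetical names, the return value is simplified to a list of (original, hashed) pairs, and the real collectstatic command passes different arguments. The point is only that the collector probes for post_process() and that the hashing logic needs nothing storage-specific.

```python
import hashlib
import posixpath


class InMemoryStorage:
    """Hypothetical minimal storage: a dict of name -> bytes."""

    def __init__(self):
        self.files = {}

    def save(self, name, content):
        self.files[name] = content
        return name

    def read(self, name):
        return self.files[name]


class HashingMixin:
    """Uses only save() and read(), so it works over any such storage."""

    def post_process(self, names):
        results = []
        for name in names:
            content = self.read(name)
            digest = hashlib.sha256(content).hexdigest()[:8]
            root, ext = posixpath.splitext(name)
            hashed = f"{root}-{digest}{ext}"
            self.save(hashed, content)  # store a copy under the hashed name
            results.append((name, hashed))
        return results


class HashingStorage(HashingMixin, InMemoryStorage):
    pass


def collect(storage, found):
    """Copy files into storage, then call post_process() if it exists."""
    names = [storage.save(name, content) for name, content in found.items()]
    hook = getattr(storage, "post_process", None)
    return hook(names) if callable(hook) else []
```

With a plain InMemoryStorage, collect() returns an empty list; with HashingStorage, it returns one pair per collected file.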

Modern day Django

Fast forward to today, and the fantastic Django 1.7.4 release. Aside from a small refactoring to introduce the new ManifestFilesMixin (a slight variant on the previous CachedFilesMixin), and the introduction of deconstructibility to support using storages with 1.7’s new migrations, things have largely remained the same in this corner of core Django.

The Django community hasn’t stood still, however. The Django Storages project has implemented several commonly used storages, including for Amazon S3, Azure, and other well-known cloud providers. And packages like Django Compressor have filled in the critical gap between static files, which are intended to be served directly, and the assets from which they are generated [1].

I think the fact that the ecosystem has flourished demonstrates that the original design, while imperfect, is still quite sound. I do think there is an opportunity for a beneficial (if backwards-incompatible) refactoring.

There’s an opportunity to clarify layering. Django’s storage abstractions should be independent of any specific use. For example, they should not refer back to MEDIA_* settings; media and static files should be strict consumers of the storage layer. It might also be worth reconsidering the restriction that storages must be constructible without any parameters; this has led to a flourishing of storage classes whose only purpose is to override defaults.

Then there’s the question of the precise responsibilities of Storage implementations. Path-to-URL mapping, so fundamental to storages in all cases in Django 1.0, is only sometimes needed today. In addition, there are plenty of real-world storages where common operations (directory listings, reading back written files) are either expensive in the underlying filesystem, or simply impossible. There is currently little clarity around which Storage methods are required in derived classes, and which are optional. The bottom line today seems to be: if you use an exotic Storage, and it blows up in your use case, then you’re out of luck.
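One way to picture the ambiguity: a base class whose methods all raise NotImplementedError gives no hint about which ones a given backend actually supports until something calls them. The sketch below uses hypothetical names throughout (ToyBaseStorage, WriteOnlyBucket, missing_methods) and is not a proposal for Django's API; it just shows how a caller could ask up front which operations a storage has left unimplemented.

```python
class ToyBaseStorage:
    """Hypothetical base class: every operation is nominally 'optional'."""

    def save(self, name, content):
        raise NotImplementedError

    def open(self, name):
        raise NotImplementedError

    def listdir(self, path):
        raise NotImplementedError

    def url(self, name):
        raise NotImplementedError


class WriteOnlyBucket(ToyBaseStorage):
    """An 'exotic' storage that can write and link, but not list or read."""

    def __init__(self):
        self.saved = []

    def save(self, name, content):
        self.saved.append(name)
        return name

    def url(self, name):
        return "/bucket/" + name


def missing_methods(storage, base=ToyBaseStorage):
    # A method is 'missing' if the class still inherits the base stub.
    names = [n for n, v in vars(base).items()
             if callable(v) and not n.startswith("_")]
    return sorted(n for n in names
                  if getattr(type(storage), n) is getattr(base, n))


print(missing_methods(WriteOnlyBucket()))  # ['listdir', 'open']
```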

Finally, the sheer number of third-party asset pipelines for Django shows that there’s a lot more room to grow. I suspect that, much like they did with migrations, the core Django team will take their time before finally deciding on the one true path forward.

[1] Asset pipelines are my secret reason for spending time here. After evaluating the big two, Compressor and Pipeline, Peter and I rolled our own for Cloak. It’s something we’re considering shipping publicly. I’m more closely aligned with Compressor in spirit, but it was primarily designed with runtime in mind; its “offline” compression feels like somewhat of an afterthought.