Overview of Container Orchestration Tools

I’ve been experimenting with Docker and its ecosystem for a while, and my setup has become a bit of a mess: different machines using various old versions of Docker and various generations of custom scripts to manage them. It was time for an overhaul, and I set out to have a closer look at the tools out there.

It’s kind of a mess. Everyone wants to release an orchestration tool, and often their places in the stack are all over each other.

So let’s consider different parts an orchestration system might cover:

  1. A container engine. Often this is docker, but there are alternatives, including just talking to the Linux plumbing directly. Because the container engine is ultimately replaceable plumbing, and Docker Inc. is a highly funded business, what will continue to happen is that Docker tries to become a full-stack orchestration tool, and competitors will support other container engines.
  2. A scheduler that will run the containers. The most bare-bones version is a CLI script that runs imperatively; it might be upstart or systemd on single-host systems. Or it’s a networked cluster scheduler like Swarm. A cluster scheduler essentially needs a process that runs on every host. A lot of full-stack orchestrators have their own scheduler, say Tutum (now Docker Cloud). There are also standalone schedulers like fleet that you can harness.

  3. A networking solution to let containers talk to each other. On multi-node clusters this will be some sort of overlay network; your cloud provider might provide it. Even on a single host, you want to let services talk to each other, but only expose some particular services to the public (say the HTTP router). In the early days of Docker, the answer was host mapping.

  4. A service discovery solution. You want your web app container to talk to the MySQL container, so you have to know its address. These days, pretty much everyone seems to use a) a cluster-internal network that every service has an address on, b) DNS-based resolving, usually built right into the networking layer, and c) ‘links’ in the sense that the DNS name to connect to is injected via environment variables. In the early days of Docker, this was messy. We used various tools to register the services when they start (sdutil, registrator), and to query the service discovery and link one service to another (sdutil, ambassadord); frequently, host mapping to random ports was involved. See also my previous post on this.

  5. A proxy to route to the services you want to expose. In simple cases, you can just publish your webapp on port 80 directly, but if you have two apps, you need to route to them based on the respective domain. Because the router needs to know the address of the backend, this router might integrate with service discovery.

  6. Developer tools, for example deploying an app on every push to the repo.

  7. Cluster-wide persistent storage. Unless you’re on a cloud provider, I consider this to be still unsolved, despite various Docker volume plugins. It’s just very hard to set up.
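
To make the service discovery item (4) a bit more concrete, here is a minimal sketch of how an app container might find its database: first look for injected environment variables, then fall back to the cluster-internal DNS name. The NAME_HOST / NAME_PORT naming scheme and the function name are assumptions for illustration; every orchestrator uses its own variable names.

```python
import os
import socket


def resolve_service(name, default_port, env=os.environ, resolver=socket.gethostbyname):
    """Return (host, port) for a linked service.

    Checks NAME_HOST / NAME_PORT environment variables first (a hypothetical
    naming scheme), then falls back to resolving the service name via DNS.
    """
    prefix = name.upper().replace("-", "_")
    host = env.get(prefix + "_HOST")
    port = int(env.get(prefix + "_PORT", default_port))
    if host is None:
        # Cluster-internal DNS: the service name itself is the hostname.
        host = resolver(name)
    return host, port


if __name__ == "__main__":
    fake_env = {"MYSQL_HOST": "10.0.0.5", "MYSQL_PORT": "3307"}
    print(resolve_service("mysql", 3306, env=fake_env))
```

Passing `env` and `resolver` as parameters keeps the sketch testable without a real cluster.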

Before I look at the full-stack tools, here are some of the implementations that focus on particular layers in that stack:


swarm (old)
The Swarm scheduler that you ran as a Docker container; presumably deprecated as of 1.12.

Docker Swarm Mode
The new swarm mode built directly into docker. It is incredibly simple to use.

fleet
A networked systemd. CoreOS-specific.

Networking solutions

The network overlay that gives every container their own private ip is clearly the winner here. A lot of orchestrators have their own solution. Generic ones include weave and flannel.

Dev experience

A lot of orchestration tools naturally target ops and don’t deal with this part. There are basically two approaches:

A service handles git pushes, runs the code through slugbuilder, stores the slug as a tar.gz somewhere. To run it, the blob is given to a slugrunner image. In other words, your build artifacts use the Heroku slug format, and there is a custom system to hold the version history.

You build every version of your app into a Docker image directly. You use your Docker registry for version management. In the simplest case, you just set up a GitHub webhook and let Docker Hub build.


A distributed filesystem.

Docker volume plugin; too enterprisey for me.

Convoy
Written as part of Rancher. Integrates nicely there. Outside of it, the instructions are bad. Does support NFS and block devices.

Docker NetShare
Supports NFS, CIFS.

Now, let’s look at some full-stack approaches and where they fall in the stack:


Flynn literally implements the whole stack by themselves, and exposes everything with a very limited, thin Heroku-like API.

  • They have their own container engine, their own scheduler, their own network overlay, their own service discovery, their own router, and their own dev UI to create apps, and “git push” release.
  • You wouldn’t run MySQL yourself like you would in a docker-compose setup. MySQL, as in Heroku, is a backing service serving multiple apps.
  • But your app can be a Dockerfile, and apps can find each other via service discovery.
  • So you could set up your own MySQL server as an app, but structurally, you now have two apps: myblog-app and myblog-mysql.
  • The UX of Flynn is meant to give devs Heroku-like apps, not to let ops spin up various containers that interact.

What do I think?

  • I like that the whole stack is lightweight Go. Its limited API surface has a certain beauty.
  • I worry they have too much work for a small team.
  • Installation is difficult; only on Ubuntu now.
  • Web UI is still very basic.

Tutum/Docker Cloud

Tutum was bought by Docker and rebranded.

  • Container engine is docker
  • Scheduler: Its own.
  • Networking: weave
  • Service Discovery: DNS, injecting env vars based on links.
  • Router: They offer a haproxy image that talks to their scheduler to know about services, and the services define env vars.
  • Dev-Tools: Let the Docker Hub auto-build after a Github Push, and can auto-deploy after a docker image is published.

Essentially, Docker Cloud is:

a) a scheduler.
b) assembles some tech for you (weave).
c) a UI that is a thin interface on top of Docker itself (you still interact with containers and their config a lot), including the “stack” abstraction (a collection of multiple services).

Regarding the proxy: The idea of using a battle-tested haproxy is nice, but in practice, I continuously ran into issues. Often it required a restart when updating or changing services. It’s also limited in that it requires defining https + http URLs, and cannot do redirects. It requires manually linking the proxy container to all services, and if a reload fails (say, because of an issue with an SSL cert), all of the sites will be down.
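
For illustration, this is roughly what such a router has to do with the env vars that services define: collect them into a host-to-backend table and look up the Host header of each request. This is a sketch, not the haproxy image’s actual logic; the variable name VIRTUAL_HOST and the shape of the `services` list are assumptions.

```python
def build_routes(services):
    """Map each virtual host to the backend addresses that serve it.

    `services` is a list of dicts with a hypothetical shape:
    {"address": "10.0.0.2:80", "env": {"VIRTUAL_HOST": "a.example.com,b.example.com"}}
    """
    routes = {}
    for service in services:
        hosts = service["env"].get("VIRTUAL_HOST", "")
        for host in filter(None, (h.strip().lower() for h in hosts.split(","))):
            routes.setdefault(host, []).append(service["address"])
    return routes


def pick_backends(routes, host_header):
    """Return the backends for a request's Host header, ignoring any port."""
    host = host_header.split(":")[0].lower()
    return routes.get(host, [])
```

Services without the variable are simply never exposed, which matches the idea that only some services should be public.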

Also I wonder what will happen to Docker Cloud now that the Docker daemon itself now implements essentially everything that Docker Cloud offers, but with different tech (stacks and services as an abstraction, a network overlay, a scheduler). It might end up being just a UI on top of the docker daemon.


Rancher

  • Container engine and scheduler: The default is their own scheduler running Docker containers. But they have backends for Swarm, Kubernetes and Mesos.
  • Networking: Custom L3 IPSec tunnel. It seems this is encrypted by default and doesn’t require any user-space implementation.
  • Service Discovery: DNS and env vars.
  • Dev-Tools: None.

The idea of different backends is nice, but in practice, Rancher doesn’t paint over the differences. In other words, whatever backend you choose, the frontend you work with will be different, too. The “app catalogs” they offer are separate too. So it’s basically four different products, and not all of them have the same quality. I see a lack of focus here.

What do I think?

  • UI is a little less polished than Docker Cloud, but I like it more in some ways, plus it’s speedier.
  • I easily ran into a bunch of bugs and issues on deeper use.
  • Does not support v2 of docker-compose.
  • The abstraction of an external DNS service that they have is very neat. Currently only supports a single root domain, and adds exposed services as subdomain. But still good enough to use with multiple root domains too if CNAME is used.
  • Rescheduling only happens if you define a health check (https://github.com/rancher/rancher/issues/1877). See also https://github.com/rancher/rancher/issues/2195
  • Storage pool integration via their Convoy service; this worked quite well; the key here is that they wrote the docker volume plugin + they show the pool in the UI. Maybe they execute some docker register plugin command, too. Nice helpers, but independent of the rest of the system, really.

The native docker stack

Docker used to be just the engine. Then they added Swarm as a separate scheduler, a native network overlay, and docker-compose as a dev tool. I already talked about Docker Cloud.

Now with 1.12, Docker itself has the swarm scheduler built in, and understands a “service” abstraction. Just everything.


Kubernetes

  • Container engine: Docker, later likely supporting others.
  • Scheduler: Their own.
  • Networking: This is up to you, which makes it so difficult to install. Solved for you if using Google Container Engine.
  • Service discovery: Their own, based on DNS.
  • Storage: Various volume implementations.
  • Proxy: Up to you, but provides an “Ingress resource” abstraction that you can build on.
  • Dev tools: Up to you.

While it uses Docker as a basic container runner, contrary to other tools it doesn’t expose it at all. You are dealing with a custom CLI and custom abstractions, and there are a lot of them: Ingress resources, secrets, its own volume system. For example, a “service” in Kubernetes doesn’t actually need to run on Kubernetes. Other apps can refer to the service without knowing whether it runs inside or outside of the cluster. Or think about the fact that for scaling, you don’t say replicas=3. It’s abstracted inside a “replication controller”.
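
To show what that abstraction looks like, here is a small sketch that builds a replication controller manifest as a plain dict and prints it as JSON. The replica count lives on the controller object, which wraps a pod template. The app name "myblog" and the image "wordpress" are placeholders, and the helper function is my own, not part of any Kubernetes client.

```python
import json


def replication_controller(name, image, replicas):
    """Build a ReplicationController manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": name},
        "spec": {
            # The desired number of pods is a property of the controller.
            "replicas": replicas,
            # The selector must match the labels of the pod template.
            "selector": labels,
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(replication_controller("myblog", "wordpress", 3), indent=2))
```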

What do I think?

  • Looking at Kubernetes/Helm and the config files around it, you get the impression that it has a strong backing/ecosystem/architecture.
  • It seems well thought out in that it seems to have a resource type for every conceivable problem space.
  • But Kubernetes is so far removed from Docker, and in fact from the host itself, that it very much feels like a black box.
  • If you look on your host, there are so many management and side-car containers running for each actual container that it’s not something I want to interact with; I only have the Kubernetes CLI.
  • The config files feel complicated.
  • I wish it could be installed more easily.

How does it work inside?

  • Random proxy port on each host necessary (service port is routed to random proxy port which is routed to pod ip). Apparently because if the pause pod is restarted, it gets a new netns and then the user containers have to be restarted too (is this right?)
  • Services get a fake ip. The proxy on every node picks up traffic to that fake subnet, probably looks up the ip to map it to a pod, then can forward to the pod ip.
  • The kube-proxy on each host is essentially the global load balancer; in our own architecture the load balancing would be done via DNS/service discovery. Here DNS gives you only the (more stable) service IP, which load balances to all the pods/containers.
  • Similarly, if you give a service an external IP, it will simply create routes on each node so that a request for this IP is routed like one for an internal IP. As such, you can use as an external address either any minion IP that is stable, or an external load balancer IP that routes to any of your minions. There might still be an extra hop until it goes to the real pod. It seems internal services are protected by a firewall and/or the network setup (you cannot get your packets routed to one of these internal service IPs).
  • This series goes into extensive detail.
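
As a mental model of the points above, here is a toy version of a per-node proxy that maps a fake service IP to the pod IPs behind it. This is only a sketch of my understanding: the class and method names are made up, and round-robin selection is an assumption, not a statement about how the real proxy picks a pod.

```python
import itertools


class ServiceProxy:
    """Toy model of a per-node proxy mapping virtual service IPs to pod IPs."""

    def __init__(self):
        self._pods = {}     # service IP -> list of pod IPs
        self._cursors = {}  # service IP -> round-robin iterator

    def set_endpoints(self, service_ip, pod_ips):
        """Register (or replace) the pods that back a service IP."""
        self._pods[service_ip] = list(pod_ips)
        self._cursors[service_ip] = itertools.cycle(self._pods[service_ip])

    def route(self, dest_ip):
        """Return the pod IP a packet for dest_ip should be forwarded to."""
        if not self._pods.get(dest_ip):
            raise LookupError("no endpoints for %s" % dest_ip)
        return next(self._cursors[dest_ip])
```

Clients only ever see the stable service IP; pods can come and go behind it by calling `set_endpoints` again.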

Deis Workflow

This basically adds a Heroku-like “git push for deploy”-based Workflow on top of Kubernetes.

I like the idea. Build on top of an existing scheduler, provide all the pieces for development.

  • Setup was a pleasant experience (on gcloud). Well documented, works…
  • Like in Flynn, the Heroku workflow is just a different beast than spinning up your own MySQL image for every app.


I didn’t look into this one too closely, because I somehow couldn’t communicate with it. It seems that:

  • It is not built on Docker, and maybe does not actually containerize.
  • It seems like a solution to deploy and manage multiple apps on a single host.
  • The WordPress example basically has you write the commands to install MySQL manually.
  • It seems the benefit is running CI, and builds for you, creating release artifacts, that it will then push to the servers for you.

Other notes:

Thoughts on a Let’s Encrypt docker workflow

Getting the cert

  • Assuming that there is a proxy that handles a bunch of different domains.
  • That proxy should support Let’s Encrypt either directly (there are some haproxy images on the Hub).
  • Or it can be configured to redirect Let’s Encrypt challenges to the Let’s Encrypt container (described in the Forum).

Basically: Run Let’s Encrypt with ‘--standalone’. For validation, Let’s Encrypt will try to fetch a file from your domain (under /.well-known/acme-challenge). The proxy redirects that to the container you just started (which might need to have a fixed address/IP, or the proxy needs to find it via the regular service discovery mechanism you are using).
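
The routing rule the proxy needs is tiny. As a sketch (the backend names are placeholders, and a real proxy would express this in its own config language rather than in Python):

```python
ACME_PREFIX = "/.well-known/acme-challenge/"


def choose_backend(path, app_backend, acme_backend):
    """Send ACME HTTP challenge requests to the Let's Encrypt container."""
    if path.startswith(ACME_PREFIX):
        return acme_backend
    return app_backend
```

Everything else keeps going to the regular app backend, so the challenge can run without taking the site down.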

Installing a cert

  • Manually.
  • A custom installer plugin, for Tutum for example, could use the Tutum API to redeploy the app container with a new SSL_CERT environment variable.

Renew a certificate

  • Let’s Encrypt certs are only valid for 3 months, so renewing is an issue.
  • A cronjob could re-run the above process every x months.
  • A Let’s Encrypt service & web ui where all the domains can be managed.
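
A cronjob could run more often than every x months and simply check whether renewal is due. A minimal sketch of that check, assuming the 90-day lifetime and an arbitrary renew-30-days-before-expiry policy (the threshold and the function names are my own choices):

```python
from datetime import datetime, timedelta


def next_expiry(issued_at, lifetime=timedelta(days=90)):
    """Expiry date of a cert issued at `issued_at`, assuming a 90-day lifetime."""
    return issued_at + lifetime


def needs_renewal(not_after, now, renew_before=timedelta(days=30)):
    """Return True once the cert is within `renew_before` of expiring."""
    return now >= not_after - renew_before


if __name__ == "__main__":
    expiry = next_expiry(datetime(2016, 1, 1))
    print(expiry, needs_renewal(expiry, datetime.now()))
```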

Tree extensions for SQLAlchemy

I recently needed a performant way to manage a tree, meaning using MPTT, nested sets etc.

I looked at the following libraries, which integrate with SQLAlchemy. These are my notes, some of which seem to be outdated now.


  • No features for querying
  • Injects fields like parent_id, children into the model by itself, without allowing customization.
  • Its rebuild() is not able to init a tree from scratch.


  • Integrating into the class-based ORM mapper is messy (pr accepted)
  • No rebuild() feature at all (pr accepted)
  • Has bugs when ORM attribute names diverge from table column names.
  • BIG PLUS: can detect and handle tree moves if attributes on an existing node change.
  • Does not support keeping a sort order around.


  • Seems to have a working rebuild()
  • Can sort of integrate with the class-based ORM mapper setup (although it seems it would need to be globally)
  • tree_recursive_iterator() for querying a tree.
  • To move a subtree, we need to call special move() methods.

sqlamp and sqlalchemy-orm-tree seem to share a lot of ideas/APIs.

sqlamp seems to be the most stable / actively developed, although sqlalchemy-orm-tree makes a good impression, too, and the tree move detection is so helpful that I decided to go with it.
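
For context on what a rebuild() has to do: in the nested set model, every node stores a left and a right number, and a subtree query becomes a simple range check. Here is a library-independent sketch that computes those numbers from plain parent pointers. It does not use the API of any of the libraries above; the function names and the sorted sibling order are my own choices.

```python
from collections import defaultdict


def rebuild(parents):
    """Compute nested-set (left, right) values from {node: parent} pointers.

    Root nodes have parent None. Siblings are visited in sorted order.
    """
    children = defaultdict(list)
    for node, parent in parents.items():
        children[parent].append(node)

    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in sorted(children[node]):
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    for root in sorted(children[None]):
        visit(root)
    return bounds


def descendants(bounds, node):
    """All nodes whose interval lies strictly inside `node`'s interval."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)
```

Moving a subtree means renumbering a whole range of nodes, which is why automatic move detection is such a convenience.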

Deleted Emails – Imap Sync issue with Thunderbird (and Dovecot)

I’ve run into this Thunderbird issue a number of times, always with a Dovecot IMAP server:

  1. You have multiple connections open to an IMAP account.
  2. You delete an email in one client.
  3. The email is not removed from Thunderbird until you restart Thunderbird, or do other sorts of fiddling around.
  4. Meanwhile, marking messages as read syncs just fine immediately.

There are a lot of people complaining about this, and various workarounds have been suggested (1, 2), but I’ve never seen a clear explanation of why it doesn’t work, who is really at fault and, more importantly, why it isn’t fixed.

The best answer I found is that Thunderbird’s implementation of the CONDSTORE IMAP feature (which allows more optimized syncing) is broken, isn’t getting fixed since Thunderbird development is basically dead, and that disabling CONDSTORE, either on the server (e.g. in Dovecot) or in Thunderbird (disable the “use_condstore” setting in the advanced config editor), is the best way to fix it.
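
To check whether a server offers the extension at all, you can look at its CAPABILITY response. A small sketch of that check (the helper name is mine; the sample line in the usage note is made up):

```python
def advertises_condstore(capability_line):
    """Check an IMAP CAPABILITY response line for the CONDSTORE extension."""
    tokens = capability_line.upper().split()
    return "CONDSTORE" in tokens
```

For example, `advertises_condstore("* CAPABILITY IMAP4rev1 IDLE CONDSTORE")` is true. Note that servers may advertise a different capability list after login than before, so it is worth checking on an authenticated connection.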

Options to do CSS in React

  • Suit CSS, BEM et al.: separate, regular stylesheets, but with a naming convention for components.
  • Just manually with inline style attributes. To merge multiple style maps, use helpers like react-style, which enables an array syntax for the style attribute, use a custom function like m() to do the same, or use object spread: style={ {...a, ...b, ...c} }.
  • Radium supports a [] syntax like react-style, but also adds support for things like :hover and media queries at the expense of wrapping the component.
  • CSS Modules is a CSS-like syntax that gets parsed into a dict of class names, and the CSS extracted into a file.




  • Quite complex, but with correspondingly many features.
  • Has to be installed (some Adobe Air crap), sluggish UI.
  • Expensive (fixed assets only in the Enterprise version!)
  • Probably an improvement over Lexware Buchhalter.
  • The rule-based account assignment feature is very nice.


  • Looks great, not slow.
  • The presentation of the chart of accounts (“Kontorahmen”) is very nice.
  • Not really an accounting program per se; without a previously created invoice, you cannot reconcile an incoming payment.
  • Advanced features cannot be tested without paying.
  • I don’t like the pricing in other respects either. With more than 2 accounts you already need the top package.


  • The individual editions have different UIs. In the accountant edition, the UI really revolves around accounts and bookings; in the standard edition, around invoices and receipts.
  • Can assign HBCI bank statement lines to “receipts/invoices”, or directly to the SKG chart of accounts.
  • Predefined chart of accounts, no custom general ledger accounts.
  • You cannot display the general ledger accounts either.
  • Can submit to Elster.
  • Has CSV export.


  • More of a website than an “app”.
  • I like the mode where, when booking, it first asks for the type of income/expense, and so more or less hides the account.
  • No custom general ledger accounts, only the chart of accounts.
  • Not really any features.
  • Sluggish.


  • Extremely unsexy, including in how you operate it.
  • But it works and has everything I might need, including Elster.
  • It is one of the faster apps; I have the feeling that despite the ugly UI you can work faster with it than with scopeVisio.
  • The URL contains cgi.exe

In the end, I would mainly consider lexOffice and collmex.

Moving to a new todo app

I’ve been a GetFlow customer since the beginning of 2012. But I’ve been growing increasingly dissatisfied with the experience, especially since it’s not the cheapest of apps (I was paying 10 dollars per month, and it has since become more expensive).


  • It’s too slow. Loading the app takes a couple seconds; loading the Android app takes a couple seconds.
  • On Android specifically there is no offline capability. It has to load the whole thing every time, cannot add or check items without a connection, might even have to reload after a screen lock. Terrible if you want to use your shopping list, and the supermarket has bad reception.
  • The Android app doesn’t even let you access subfolders.
  • The search feature is slow, too, and doesn’t seem to find everything.
  • They used to have a section to add long-form text, which I used a lot. But some years back they changed their system to have uneditable comments only, which has been bothering me ever since. It’s just not at all helpful for a single user.

I looked at both Wunderlist and Todoist.

  • Wunderlist has a better Android app; though Todoist is not particularly bad.
  • Todoist’s web interface just looks more attractive to me. Wunderlist does not seem good on a big screen.
  • Both have their UI warts: I hate having to double click in Wunderlist to see the task details, which seems completely unnecessary. Neither do I like how a single click in Todoist enters into a disruptive edit-mode.
  • Both have editable notes, and both don’t implement those in a great way. It’s too uncomfortable to use this for more than short notes. But at least it’s there. In Wunderlist it is better than in Todoist. Hint: The notes should be easy to edit with a single click, ideally support formatting, and not be in a tiny window.
  • Wunderlist has a more traditional approach, with subtasks being an explicit feature. Todoist has a very cool indentation system. However, it has its downsides, because “subtasks” are really just their own tasks: They don’t repeat with the parent.
  • Todoist’s “after” repetition mode is pretty cool.

I wrote two scripts to export my GetFlow data and import it into Todoist.

Here is what I found lacking in Todoist so far:

  • GetFlow could move a task to a different project by typing the name. This was often easier than using drag&drop.
  • When drag&dropping tasks, the project list on the left doesn’t scroll up and down as necessary.
  • It doesn’t really want you to keep old tasks around.
  • Todoist does not seem to expose obvious metadata like “when was this task created”. This seems like a small thing, but feels deeper. In Todoist, tasks have a more ephemeral feel, less like real “objects” that are a matter of record. The fluid indentation, the easily accessible permanent delete feature, and most importantly the fact that the UI at times seems to encourage you to use the delete feature to clean up contribute to that.
  • In particular, the fact that completing a subtask does not make that subtask disappear until the parent task is completed bothers me a lot, and seems clearly not to be a design decision, but a limitation of the data structure, namely that it really is not a subtask, just a task with an indent value assigned.
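
To illustrate what I mean by “just a task with an indent value”: with such a model, the parent of a task is not stored anywhere and has to be derived from the order and indent of the flat list. This is a sketch of my guess at the data structure, not Todoist’s actual schema; the tuple layout and function name are hypothetical.

```python
def parents_from_indent(tasks):
    """Derive each task's parent from a flat, ordered list of (title, indent).

    Indent 1 is top level. Returns {title: parent_title_or_None}.
    """
    parents = {}
    stack = []  # (indent, title) of the current chain of ancestors
    for title, indent in tasks:
        # Drop ancestors that are at the same or a deeper level.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parents[title] = stack[-1][1] if stack else None
        stack.append((indent, title))
    return parents
```

Because the relationship only exists implicitly, reordering or re-indenting a single task silently changes which task is whose “subtask”.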