Marc's Blog

About Me

My name is Marc Brooker. I've been writing code, reading code, and living vicariously through computers for as long as I can remember. I like to build things that work. I also dabble in machining, welding, cooking and skiing.

I'm currently an engineer at Amazon Web Services (AWS) in Seattle, where I work on databases, serverless, and serverless databases. Before that, I worked on EC2 and EBS.
All opinions are my own.

What is a container?

What are words, anyway?

A common cause of confusion and miscommunication I see is different people using different definitions of words. Sometimes the definitions are subtly different (as with availability). Sometimes they’re completely different, and we’re just talking about different things entirely. A common example is the word container, a popular term for a popular technology that means at least four different things.

  1. An approach to packaging an application along with its dependencies (sometimes a whole operating system user space) that can then run on a minimal runtime environment with a clear contract[4].
  2. A set of development, deployment, architectural, and operational approaches built around applications packaged this way.
  3. A set of operational, security, and performance isolation tools that allow multiple applications to share an operating system without interfering with each other. On Linux, these tools include chroot, cgroups, namespaces, seccomp, and others.
  4. A set of implementations of these practices (the proper nouns: Docker, Kubernetes, ECS, etc.).

These four definitions are surprisingly independent. The idea of packaging applications this way predates the other three, and will likely be around after they are gone. The practices and approaches are enabled by the tools, but don’t really require them. The Linux kernel-level interfaces, and the semantics and security they provide, are a basis for many of the implementations today, but most of those semantics can be provided in other ways[1].
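To make the third definition a little more concrete: on Linux, the namespaces a process belongs to show up as symlinks under /proc/<pid>/ns. Here is a minimal sketch that lists them, assuming a Linux system with /proc mounted. The function name list_namespaces is my own for illustration, not part of any container tool.

```python
import os


def list_namespaces(ns_dir="/proc/self/ns"):
    """Return {namespace_name: link_target} for the symlinks under ns_dir.

    On Linux each entry is a symlink whose target looks like
    'pid:[4026531836]'. Returns an empty dict if ns_dir cannot be read
    (for example on a non-Linux system).
    """
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        # Directory missing or unreadable: nothing to report.
        return result
    for name in names:
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry is not a symlink (or is unreadable): skip it.
            continue
    return result


if __name__ == "__main__":
    for name, target in list_namespaces().items():
        print(f"{name:20s} {target}")
```

Two processes that print the same target for an entry share that namespace. Run this inside and outside a typical container and some of the targets will usually differ. That difference is the isolation tooling at work, independent of how the application was packaged.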

To pick an example, when we talk about container image support in AWS Lambda[3] we mostly mean the first definition: enabling customers to get the advantages of packaging their code that way. There is a small overlap with the second (some practices become easier to use with this support available) and the fourth (some of these tools can be used to create the images in ways that fit into a broader ecosystem).

Or, to pick another example, when people say containers are not a security boundary[2], they are mostly talking about the third category (with some overlap into the fourth). That claim barely touches the first and second categories, which are generally a big win for security. The full conversation is subtle, so I won’t go into it here.

When you use the word container, consider whether your audience is using the same definition as you.

Footnotes

  1. For example, with MicroVMs like Firecracker.
  2. Those people include me.
  3. If you’d like to dive into the details, check out our paper about adding container support to AWS Lambda or my blog post summary of it.
  4. This question of reducing the size of the contract between the container and the runtime is an interesting one. In most typical container implementations, this contract still includes hundreds of APIs, and other complex interaction points like filesystems. Only at the more extreme end, with things like the virtio interfaces of MicroVMs (see the Firecracker paper for our approach there) and SECCOMP_SET_MODE_STRICT, do these APIs become truly small. However, across the whole container spectrum they’re smaller and simpler than those presented by libc, openssl, and the thousands of other libraries you’ll commonly find in a default Linux user space.