Marc's Blog

About Me

My name is Marc Brooker. I've been writing code, reading code, and living vicariously through computers for as long as I can remember. I like to build things that work. I also dabble in machining, welding, cooking and skiing.

I'm currently an engineer at Amazon Web Services (AWS) in Seattle, where I work on databases, serverless, and serverless databases. Before that, I worked on EC2 and EBS.
All opinions are my own.

Links

My Publications and Videos
@marcbrooker on Mastodon @MarcJBrooker on Twitter

Reading Research: A Guide for Software Engineers

Don't be afraid.

One thing I’m known for at work is reading research papers, and referring to results in technical conversations. People ask me if, and how, they should read papers themselves. This post is a long-form answer to that question. The intended audience is working software engineers.

Why read research?

I read research in one of three mental modes.

The first mode is solution finding: I’m faced with a particular problem, and am looking for solutions. This isn’t too different from the way that you probably use Stack Overflow, but for more esoteric or systemic problems. Solution finding can work directly from papers, but I tend to find books more useful in this mode, unless I know an area well and am looking for something specific.

A more productive mode is what I call discovery. In this case, I’ve been working on a problem or in a space, and know something about it. In discovery mode, I want to explore around the space I know and see if there are better solutions. For example, when I was building a system using Paxos, I read a lot of literature about consensus protocols in general (including classics like Viewstamped Replication1, and newer papers like Raft). The goal in discovery mode is to find alternative solutions, opportunities for optimization, or new ways to think about a problem.

The most intellectually gratifying mode for me is curiosity mode. Here, I’ll read papers that just seem interesting to me, but aren’t related to anything I’m currently working on. I’m constantly surprised by how reading broadly has helped me solve problems, or just informed by approach. For example, reading about misuse-resistant cryptography primitives like GCM-SIV has deeply informed my approach to API design. Similarly, reading about erasure codes around 2005 helped me solve an important problem for my team just this year.

I’ve found reading for discovery and curiosity very helpful to my career. It has also given me tools that makes reading for solution finding more efficient. Sometimes, reading for curiosity leads to new paths. About five years ago I completely changed what I was working on after reading Latency lags bandwidth, which I believe is one of the most important trends in computing.

Do I need a degree to read research papers?

No. Don’t expect to be able to pick up every paper and understand it completely. You do need a certain amount of background knowledge, but no credentials. Try to avoid being discouraged when you don’t understand a paper, or sections of a paper. I’m often surprised when I revisit something after a couple years and find I now understand it.

Learning a new field from primary research can be very difficult. When tackling a new area, books, blogs, talks, and courses are better options.

How do I find papers worth reading?

That depends on the mode you’re in. In solution finding and discovery modes, search engines like Google Scholar are a great place to start. One challenge with searching is that you might not even know the right things to search for: it’s not unusual for researchers to use different terms from the ones you are used to. If you run into this problem, picking up a book on the topic can often help bridge the gap, and the references in books are a great way to discover papers.

Following particular authors and researchers can be great for discovery and curiosity modes. If there’s a researcher who’s working in a space I’m interested in, I’ll follow them on Twitter or add search alerts to see when they’ve published something new.

Conferences and journals are another great place to go. Most of the computer science research you’ll read is probably published at conferences. There are some exceptions. For example, I followed ACM Transactions on Storage when I was working in that area. Pick a couple of conferences in areas that you’re interested in, and read through their programs when they come out. In my area, NSDI and Eurosys happened earlier this year, and OSDI is coming up. Jeff Huang has a nice list of best paper winners at a wide range of CS conferences.

A lot of research involves going through the graph of references. Most papers include a list of references, and as I read I note down which ones I’d like to follow up on and add them to my reading list. References form a directed (mostly) acyclic graph of research going into the past.

Finally, some research bloggers are worth following. Adrian Colyer’s blog is worth its weight in gold. I’ve written about research from researchers from Leslie Lamport, Nancy Lynch and others, too.

That’s quite a fire hose! How do I avoid drowning?

You don’t have to drink that whole fire hose. I know I can’t. Titles and abstracts can be a good way to filter out papers you want to read. Don’t be afraid to scan down a list of titles and pick out one or two papers to read.

Another approach is to avoid reading new papers at all. Focus on the classics, and let time filter out papers that are worth reading. For example, I often find myself recommending Jim Gray’s 1986 paper on The 5 Minute Rule and Lisanne Bainbridge’s 1983 paper on Ironies of Automation2.

Who writes research papers?

Research papers in the areas of computer science I work in are generally written by one of three groups. First, researchers at universities, including professors, post docs, and graduate students. These are people who’s job it is to do research. They have a lot of freedom to explore quite broadly, and do foundational and theoretical work.

Second, engineering teams at companies publish their work. Amazon’s Dynamo, Firecracker, Aurora and Physalia papers are examples. Here, work is typically more directly aimed at a problem to be solved in a particular context. The strength of industry research is that it’s often been proven in the real world, at scale.

In the middle are industrial research labs. Bell Labs was home to some of the foundational work in computing and communications. Microsoft Research do a great deal of impressive work. Industry labs, as a broad generalization, also tend to focus on concrete problems, but can operate over longer time horizons.

Should I trust the results in research papers?

The right answer to this question is no. Nothing about being in a research paper guarantees that a result is right. Results can range from simply wrong, to flawed in more subtle ways3.

On the other hand, the process of peer review does help set a bar of quality for published results, and results published in reputable conferences and journals are generally trustworthy. Reviewers and editors put a great deal of effort into this, and it’s a real strength of scientific papers over informal publishing.

My general advice is to read methods carefully, and verify results for yourself if you’re going to make critical decisions based on them. A common mistake is to apply a correct result too broadly, and assume it applies to contexts or systems it wasn’t tested on.

Should I distrust results that aren’t in research papers?

No. The process of peer review is helpful, but not magical. Results that haven’t been peer reviewed, or rejected from peer review aren’t necessarily wrong. Some important papers have been rejected from traditional publishing, and were published in other ways. This happened to Leslie Lamport’s classic paper introducing Paxos:

I submitted the paper to TOCS in 1990. All three referees said that the paper was mildly interesting, though not very important, but that all the Paxos stuff had to be removed. I was quite annoyed at how humorless everyone working in the field seemed to be, so I did nothing with the paper.

It was eventually published 8 years later, and quite well received:

This paper won an ACM SIGOPS Hall of Fame Award in 2012.

There’s a certain dance one needs to know, and follow, to get published in a top conference or journal. Some of the steps are necessary, and lead to better research and better communities. Others are just for show.

What should I look out for in the methods section?

That depends on the field. In distributed systems, one thing to look out for is scale. Due to the constraints of research, systems may be tested and validated at a scale below what you’ll need to run in production. Think carefully about how the scale assumptions in the paper might impact the results. Both academic and industry authors have an incentive to talk up the strengths of their approach, and avoid highlighting the weaknesses. This is very seldom done to the point of dishonesty, but worth paying attention to as you read.

How do I get time to read?

This is going to depend on your personal circumstances, and your job. It’s not always easy. Long-term learning is one of the keys to a sustainable and successful career, so it’s worth making time to learn. One of the ways I like to learn is by reading research papers. You might find other ways more efficient, effective or enjoyable. That’s OK too.

Updates

Pekka Enberg pointed me at How to Read a Paper by Srinivasan Keshav. It describes a three-pass approach to reading a paper that I like very much:

The first pass gives you a general idea about the paper. The second pass lets you grasp the paper’s content, but not its details. The third pass helps you understand the paper in depth.

Murat Demirbas shared his post How I Read a Research Paper which contains a lot of great advice. Like Murat, I like to read on paper, although I have taken to doing my lighter-weight reading using LiquidText

Footnotes

  1. I wrote a blog post about Viewstamped Replication back in 2014. It’s a pity VR isn’t more famous, because it’s an interestingly different framing that helped me make sense of a lot of what Paxos does.
  2. Obviously stuff like maths is timeless, but even in fast-moving fields like systems there are papers worth reading from the 50s and 60s. I think about Sayre’s 1969 paper Is automatic “folding” of programs efficient enough to displace manual? when people talk about how modern programmers don’t care about efficiency.
  3. There’s a lot of research that looks at the methods and evidence of other research. For a start, and to learn interesting things about your own benchmarking, take a look at Is Big Data Performance Reproducible in Modern Cloud Networks? and A Nine Year Study of File System and Storage Benchmarking