Marc's Blog

About Me

My name is Marc Brooker. I've been writing code, reading code, and living vicariously through computers for as long as I can remember. I like to build things that work. I also dabble in machining, welding, cooking and skiing.

I'm currently an engineer at Amazon Web Services (AWS) in Seattle, where I work on databases, serverless, and serverless databases. Before that, I worked on EC2 and EBS.
All opinions are my own.

Links

My Publications and Videos
@marcbrooker on Mastodon @MarcJBrooker on Twitter

What Fekete’s Anomaly Can Teach Us About Isolation

Is it just fancy write skew?

In the first draft of yesterday’s post, the example I used was one that showed Fekete’s anomaly. After drafting, I realized the example distracted too much from the story. But there’s still something I want to say about the anomaly, and so now we’re here.

What is Fekete’s anomaly? It’s an example of a snapshot isolation behavior first described in Fekete, O’Neil, and O’Neil’s paper A Read-Only Transaction Anomaly Under Snapshot Isolation. The first time I read about it, I found it spooky. As in five stages of grief spooky. But the more I’ve thought about it, the more I think it’s just a great teaching example.

To understand the anomaly, let’s talk about two people. We’ll call the Pat and Betty. Pat and Betty share a pair of bank accounts - a savings account and a current account. They bank at Alan’s bank, which charges a $1 overdraft fee any time a withdrawal will reduce the total value of their accounts below $0.

One day, Pat and Betty are running separate errands. Pat goes to the ATM, checks the savings balance and sees $0, then deposits $20. After completing his transaction, Pat comes back to the ATM, checks their balance, and sees $20 in savings, and $0 in current. Around the same time, Betty goes to the ATM and withdraws $10 from the current account. Checking their account later, they notice a balance of -$11 in the current account, and $20 in savings.

But that’s impossible! Pat saw $0 and $20, so Alan’s bank shouldn’t have charged them that $1. Did they get ripped off?

In SQL

Let’s tell the same story again, in SQL. Starting with some setup:

begin; -- T0
create table test (id int primary key, balance int); -- T0
insert into test (id, value) values (0, 0), (1, 0); -- T0
commit; -- T0

Then we get to the anomaly itself, showing Pat’s two transactions (P1 and P2), and Betty’s one (B1):

begin; -- P1
begin; -- B1
select balance from test where id in (0, 1); -- B1
select balance from test where id = 0; -- P1
update test set balance = balance + 20 where id = 0; -- P1
commit; -- P1

begin; -- P2
select * from test where id in (0, 1); -- P2
commit; -- P2

update test set balance = balance - 11 where id = 1; -- B1
commit; -- B1

Under snapshot isolation (SI), and other some other implementations of weaker-than-serializable isolation levels, this SQL can run as-is. Under serializable isolation, at least one of these transactions would be rejected (standard Postgres would reject the commit of B1).

What’s interesting about this anomaly is that, if it wasn’t for P2, we could simply say that B1 happens before P1 and Alan’s Bank’s behavior was justified. But, by doing a read-only transaction, P2 caught them in a weak isolation anomaly.

But why, and what can we learn from this?

In a Picture

Before we answer that question, let’s draw a picture of the transactions and the database state:

The interesting part, fittingly, is the bit marked the interesting part is here: the decision whether to commit B1.

Why does B1 commit under snapshot isolation? We can answer that in three ways:

The OCC view: B1 is allowed to commit under SI, because it has no write-write conflicts with the transactions that committed between it’s start and end. B1’s write set is {1}, P1’s is {0}, and P2’s is {}. No conflict there. B1 would not be allowed to commit under serializability because of read-write conflict with P1: B1’s read set is {0, 1} which intersects with P1’s write set {0}.

The 2PL MVCC view: Under SI, B1 and P1 read from different MVCC snapshots, and the write lock taken by P1 on row 0 doesn’t conflict with the write lock taken by B1 on row 1.

The theory view: To quote from my favorite transaction theory paper, Crooks et al’s Seeing is Believing:

Like serializability, SI prevents transactions T from seeing the effects of concurrently running transactions. The commit test enforces this requirement by having all operations in \(T\) read from the same state \(s\), produced by a transaction that precedes \(T\) …

You can see that in the diagram: B1 reads the results from our setup transaction T0, which directly precedes it in the history.

However, SI no longer insists on that state \(s\) being \(T\)’s parent state \(s_p\): other transactions, whose operations T will not observe, may commit in between \(s\) and \(s_p\).

Here, \(s_p\) is the state that B1 applies its changes to, which isn’t the same state as the one it reads.

The commit test only forbids \(T\) from modifying any of the keys that changed value as the system’s state progressed from \(s\) to \(s_p\).

Which it hasn’t: only row 0 has changed, and B1 only needs to change 0.

But What Does It Mean?

Fekete’s anomaly sure is weird: by introducing a third read-only transaction, we get the database to show an anomalous behavior that would otherwise appear serializable. On the other hand, it also seems like a relatively straightforward case of constraint violation caused by write skew. To quote A Critique of ANSI SQL Isolation Levels

Transactions must preserve the constraint predicate to maintain consistency: if the database is consistent when the transaction starts, the database will be consistent when the transaction commits. If a transaction reads a database state that violates the constraint predicate, then the transaction suffers from a constraint violation concurrency anomaly.

From the perspective of the database system builder it’s a direct consequence of what we wrote about yesterday: SI allows a database (single-system or distributed) to avoid the coordination (inter-machine, inter-process, or inter-thread) necessary to detect read-write conflicts. Because that coordination scales with reads, and reads tend to be more common than writes in many DB workloads, this can be a big win.

What about the perspective of the application builder?

The Application Builder’s View

The application builder needs to answer two questions about this anomaly (and write skew and constraint violation) more generally:

  1. Is it a problem?
  2. If it is, what do I do about it?

As much as it makes DB-heads uncomfortable, evidence shows that many (if not most) application builders have concluded that the answer to question 1 is no. This isn’t an unreasonable answer. But let’s assume they do care, what do they do?

One option is to choose a serializable database and use its serializable isolation level. This can work for some workloads, but certainly not all. Serializability comes with significant performance and concurrency implications.

But there are some more surgical fixes. For example, in Aurora DSQL’s snapshot mode you can use FOR UPDATE1.

begin; -- P1
begin; -- B1
select balance from test where id = 0 for update; -- B1
select balance from test where id = 1 for update; -- B1
...
commit; -- B1 (will not commit)

This works because the theoretical model of isolation levels doesn’t map to special SQL features like FOR UPDATE in a super clean way, but roughly they are ways to increase isolation for some transactions. The second way is to force the write-write conflict.

...
update test set balance = balance - 11 where id = 1; -- B1
update test set balance = balance where id = 0; -- B1
commit; -- B1 (will not commit)

Overall, Fekete’s update isn’t something to be spooked about, but is an interesting example of the trade-off between serializability and weaker isolation levels. The anomaly is a direct result of the reduced coordination needed for SI: the very same reduced coordination that brings significant performance and scalability.

Footnotes

  1. This is broken up in a strange way, rather than using OR or IN because of a preview limitation in DSQL.