In case you’ve been living on another planet, we recently open-sourced our high performance message passing framework.

I’m going to give a quick run-down on how we put messages into the ring buffer (the core data structure within the Disruptor) without using any locks.

Before going any further, it’s worth a quick read of Trish’s post, which gives a high-level overview of the ring buffer and how it works.

The salient points from this post are:

  1. The ring buffer is nothing but a big array.
  2. All “pointers” into the ring buffer (otherwise known as sequences or cursors) are Java longs (64 bit signed numbers) and count upward forever. (Don’t panic – even at 1,000,000 messages per second, it would take the best part of 300,000 years to wrap around the sequence numbers).
  3. These pointers are then “mod’ed” by the ring buffer size to figure out which array index holds the given entry. For performance, we actually force the ring buffer size to be the next power of two bigger than the size you ask for, and then we can use a simple bit-mask to figure out the array index.

Basic ring buffer structure

WARNING: In terms of the organisation of the code, much of what I’m about to say is a simplification. Conceptually, I think it’s simpler to understand starting from how I describe it.

The ring buffer maintains two pointers, “next” and “cursor”:

In the picture above, a ring buffer of size 7 (hey, you know how these hand-drawn diagrams work out sometimes!) has slots 0 through 2 filled with data. The next pointer refers to the first free slot. The cursor refers to the last filled slot. In an idle ring buffer, they will be adjacent to each other as shown.

Claiming a slot

The Disruptor API has a transactional feel about it. You “claim” a slot in the ring buffer, then you write your data into the claimed slot, then you “commit” the data.

Let’s assume there’s a thread that wants to put the letter “D” into the ring buffer. It claims a slot. The claim operation is nothing more than a CAS “get-and-increment” operation on the next pointer. That is, this thread (let’s call it thread D) simply does an atomic get-and-increment which moves the next pointer to 4, and returns 3. Thread D has now claimed slot 3:

Next, another thread (thread E) claims slot 4 in the same manner:

Committing the writes

Now, threads D and E can both safely and simultaneously write their data into their respective slots. But let’s say that thread E finishes first for some reason…

Thread E attempts to commit its write. The commit operation consists of a CAS operation in a busy-loop. Since thread E claimed slot 4, it does a CAS waiting for the cursor to get to 3 and then setting it to 4. Again, this is an atomic operation. So, as the ring buffer stands right now, thread E is going to spin because the cursor is set to 2 and it (thread E) is waiting for the cursor to be at 3.

Now thread D commits. It does a CAS operation and sets the cursor to 3 (the slot it claimed) iff the cursor is currently at 2. The cursor is currently at 2, so the CAS succeeds and the commit succeeds. At this point, cursor has been updated to 3 and all data up to that sequence number is available for reading.

This is an important point. Knowing how “full” the ring buffer is – i.e. how much data has been written, which sequence number represents the highest write, etc. – is purely a function of the cursor. The next pointer is only used for the transactional write protocol.

The final step in the puzzle is making thread E’s write visible. Thread E is still spinning trying to do an atomic update of the cursor from 3 to 4. Now the cursor is at 3, its next attempt will succeed:

Summary

The order that writes are visible is defined by the order in which threads claim slots rather than the order they commit their writes, but if you imagine these threads are pulling messages of a network messaging layer then this is really no different from the messages arriving at slightly different times, or the two threads racing to the slot claim in a different order.

So there we have it. It’s a pretty simple and elegant algorithm. (OK, I admit I was heavily involved in its creation!) Writes are atomic, transactional and lock-free, even with multiple writing threads.

(Thanks to Trish for the inspiration for the hand-drawn diagrams!)

Share this:
Facebook Twitter Digg Email