The Disruptor – Lock-free publishing

In case you’ve been living on another planet, we recently open-sourced our high performance message passing framework.

I’m going to give a quick run-down on how we put messages into the ring buffer (the core data structure within the Disruptor) without using any locks.

Before going any further, it’s worth a quick read of Trish’s post, which gives a high-level overview of the ring buffer and how it works.

The salient points from that post are:

  1. The ring buffer is nothing but a big array.
  2. All “pointers” into the ring buffer (otherwise known as sequences or cursors) are Java longs (64 bit signed numbers) and count upward forever. (Don’t panic – even at 1,000,000 messages per second, it would take the best part of 300,000 years to wrap around the sequence numbers).
  3. These pointers are then “mod’ed” by the ring buffer size to figure out which array index holds the given entry. For performance, we actually force the ring buffer size to be the next power of two bigger than the size you ask for, and then we can use a simple bit-mask to figure out the array index.
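To make the indexing trick concrete, here is a minimal sketch of the sizing and masking described in points 2 and 3. The class and method names are my own illustrations, not the Disruptor's actual API.

```java
// Illustrative sketch: power-of-two sizing and bit-mask indexing.
public final class SequenceMath
{
    // Round the requested size up to the next power of two, e.g. 1000 -> 1024.
    public static int ceilingNextPowerOfTwo(final int x)
    {
        return 1 << (32 - Integer.numberOfLeadingZeros(x - 1));
    }

    // With a power-of-two buffer size, (sequence mod size)
    // reduces to a single cheap bit-mask operation.
    public static int indexOf(final long sequence, final int bufferSize)
    {
        return (int) (sequence & (bufferSize - 1));
    }
}
```

So a sequence of 10 in a buffer of size 8 lands in slot 2, exactly as a modulo would give, but without the cost of a division.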

Basic ring buffer structure

WARNING: Much of what I’m about to say is a simplification of how the code is actually organised. Conceptually, though, I think it’s easier to understand described this way.

The ring buffer maintains two pointers, “next” and “cursor”:

In the picture above, a ring buffer of size 7 (hey, you know how these hand-drawn diagrams work out sometimes!) has slots 0 through 2 filled with data. The next pointer refers to the first free slot. The cursor refers to the last filled slot. In an idle ring buffer, they will be adjacent to each other as shown.

Claiming a slot

The Disruptor API has a transactional feel about it. You “claim” a slot in the ring buffer, then you write your data into the claimed slot, then you “commit” the data.

Let’s assume there’s a thread that wants to put the letter “D” into the ring buffer. It claims a slot. The claim operation is nothing more than a CAS “get-and-increment” operation on the next pointer. That is, this thread (let’s call it thread D) simply does an atomic get-and-increment which moves the next pointer to 4, and returns 3. Thread D has now claimed slot 3:

Next, another thread (thread E) claims slot 4 in the same manner:
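The claim step above can be sketched with an `AtomicLong` standing in for the next pointer. This is a hedged illustration, not the Disruptor's real code; the names are mine.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: the "next" pointer as an AtomicLong. A claim is a single
// atomic get-and-increment, so two racing threads always receive
// distinct sequence numbers.
public final class ClaimSketch
{
    private final AtomicLong next;

    public ClaimSketch(final long firstFreeSlot)
    {
        next = new AtomicLong(firstFreeSlot);
    }

    // Returns the claimed sequence and advances next in one atomic step.
    public long claim()
    {
        return next.getAndIncrement();
    }
}
```

In the walkthrough above, thread D's `claim()` returns 3 and leaves next at 4, so thread E's subsequent `claim()` returns 4.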

Committing the writes

Now, threads D and E can both safely and simultaneously write their data into their respective slots. But let’s say that thread E finishes first for some reason…

Thread E attempts to commit its write. The commit operation consists of a CAS operation in a busy-loop. Since thread E claimed slot 4, it does a CAS waiting for the cursor to get to 3 and then setting it to 4. Again, this is an atomic operation. So, as the ring buffer stands right now, thread E is going to spin because the cursor is set to 2 and it (thread E) is waiting for the cursor to be at 3.

Now thread D commits. It does a CAS operation and sets the cursor to 3 (the slot it claimed) iff the cursor is currently at 2. The cursor is currently at 2, so the CAS succeeds and the commit succeeds. At this point, cursor has been updated to 3 and all data up to that sequence number is available for reading.
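The commit protocol as described can be sketched like this (again, illustrative names, not the real API): the thread that claimed a given sequence busy-spins until it can CAS the cursor from one less than its sequence up to its sequence.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the commit protocol: commits become visible strictly in
// claim order, because each CAS succeeds only after the previous
// slot's owner has committed.
public final class CommitSketch
{
    private final AtomicLong cursor;

    public CommitSketch(final long lastCommitted)
    {
        cursor = new AtomicLong(lastCommitted);
    }

    public void commit(final long sequence)
    {
        // Busy-spin until the previous slot has been committed.
        while (!cursor.compareAndSet(sequence - 1, sequence))
        {
            // busy-wait
        }
    }

    public long cursor()
    {
        return cursor.get();
    }
}
```

With the cursor at 2, thread E's `commit(4)` spins while thread D's `commit(3)` succeeds immediately; once the cursor reaches 3, thread E's CAS goes through.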

This is an important point. Knowing how “full” the ring buffer is – i.e. how much data has been written, which sequence number represents the highest write, etc. – is purely a function of the cursor. The next pointer is only used for the transactional write protocol.

The final step in the puzzle is making thread E’s write visible. Thread E is still spinning, trying to do an atomic update of the cursor from 3 to 4. Now that the cursor is at 3, its next attempt will succeed:

Summary

The order in which writes become visible is defined by the order in which threads claim slots, rather than the order in which they commit their writes. But if you imagine these threads are pulling messages off a network messaging layer, then this is really no different from the messages arriving at slightly different times, or the two threads racing for the slot claim in a different order.

So there we have it. It’s a pretty simple and elegant algorithm. (OK, I admit I was heavily involved in its creation!) Writes are atomic, transactional and lock-free, even with multiple writing threads.

(Thanks to Trish for the inspiration for the hand-drawn diagrams!)


This entry was posted in Design, Java, Open Source, Performance.

14 Responses to The Disruptor – Lock-free publishing

  1. Trish says:

    Awesome, now I don’t need to explain writing to it! Did you actually draw and scan in your pictures?

  2. Trish says:

    Because I was thinking of taking that approach myself, but thought it would be more work. Suspect I was wrong…

  3. Olivier says:

    Hi Danny,

    Thanks for your post, that’s very helpful ;)

    Just a quick question regarding the “Committing the writes” section: which CAS operation are you talking about for the commit?

    private final class ConsumerTrackingProducerBarrier implements ProducerBarrier

    public void commit(final T entry)
    {
        long sequence = entry.getSequence();
        claimStrategy.waitForCursor(sequence - 1L, RingBuffer.this);
        cursor = sequence;
        waitStrategy.signalAll();
    }

    Seems to me cursor is modified by a volatile write, not by a CAS, but I am probably missing something ;-)

    • Danny says:

      Hi Olivier,

      Yes, you are right. There have been some recent performance improvements which I hadn’t spotted! :-) In this case, because the sequence number was atomically allocated, we can be sure that no other thread will be waiting for the cursor to reach (sequence – 1), so we don’t need an atomic CAS operation – it’s sufficient to simply loop (busy spin) doing a volatile read, and then separately update the cursor. The volatile write is key to ensuring the changes are visible to other threads (as Mike explained in the discussion group).
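      A sketch of the optimisation Danny describes, with illustrative names rather than the actual Disruptor code: because sequences are handed out atomically, only one thread will ever move the cursor from (sequence – 1) to sequence, so a spin on a volatile read followed by a volatile write can replace the CAS.

```java
// Sketch: commit without CAS. The volatile read in the spin loop sees
// the predecessor's commit; the volatile write publishes our own.
public final class VolatileCommitSketch
{
    private volatile long cursor;

    public VolatileCommitSketch(final long lastCommitted)
    {
        cursor = lastCommitted;
    }

    public void commit(final long sequence)
    {
        // Busy-spin on a volatile read until our predecessor has committed.
        while (cursor != sequence - 1)
        {
            // busy-wait
        }
        // Only this thread can perform this particular transition,
        // so a plain volatile write is enough to publish it.
        cursor = sequence;
    }

    public long cursor()
    {
        return cursor;
    }
}
```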

  4. Pingback: How does LMAX’s disruptor pattern work? - Programmers Goodies

  5. Pingback: 12306票池架构探讨(三) | 编程·早晨

  6. Pingback: 12306票池架构探讨(三) « 成人免费资源三级分享网站

  7. Pingback: 并发译文翻译计划(二) - 并发编程网 - ifeve.com | 并发编程网 - ifeve.com

  8. Pingback: How does LMAX’s disruptor pattern work? | Everyday I'm coding

  9. James I says:

    Hi – thanks for the article. It was very useful.

    Could you explain if/how publishing remains lock-free & coherent in the case that the number of messages written exceeds the size of the buffer (the circular-buffer over-writing scenario)? I understood Trish’s post as saying that this could happen.

    The way I’m naively imagining this is that it needs to move the start pointer, but that would be referred to by a separate thread.

    • Danny says:

      At the time of writing this article, there was no protection against overflow (or rather, there was, but it was external to the Disruptor and extremely crude).

      I haven’t actively worked on the Disruptor for some time now. I believe the overflow issue has since been solved, but I’m afraid I can’t really give you the low-level details you’re looking for. I suggest that the discussion group (https://groups.google.com/forum/#!forum/lmax-disruptor) would be a good place for low-level questions. I know Mike is very active on there.

      Thanks!

  10. Alexandr says:

    Hi Danny!

    Am I right in thinking that if ten processes start sequentially writing into the buffer, and the first of them is slower than the others, then the other nine will wait for the slowest thread?

    • Danny says:

      Hi, sorry for the slow response. I only just saw this message amongst the spam! Yes, you are right. Here’s a thought experiment for you… let’s say the slow one comes along last, so the 9 quick ones are done and the slow one is still in progress. Now what happens when more messages arrive? Those producers will be waiting behind the slow one. So where the slow one comes relative to faster ones is mostly irrelevant. The trick is not to be slow – not to block other producers for too long. Ultimately, there has to be a point at which the “correct” ordering is decided, and in this algorithm that happens to be when the producers “arrive” rather than when they “complete”. It’s a trade-off that was made to create a lock-free algorithm. I hope that helps.
