According to Alexa, my employer’s web site is the 13th most popular web site in the UK, the 66th most popular in the US and the 127th most popular in the world. At least it is today. Not too shabby, eh?
So real-time analytics is fun. At our peak time (UK lunch time) we get in the region of 500 views per second on our article pages. This doesn’t include the various index pages or the home page, or any of the analytics data we capture around video views, etc. And our editorial team want to see in real time how their articles are performing so they can tweak them, move them around on the home page, etc.
To enable real-time slice and dice capabilities we hold LOTS of counters in Redis. The general schema for these is:
dimension1:dimension2:dimension3:dimension4:articleId:dimension5:date:hour:minute = counter
What the individual dimensions are is largely irrelevant for this post. Many (but not all) of the fields can also hold “*” instead of real data. And there can be many permutations of fields with real data in and fields with “*” in. So, for example, we might have the following:
d1:d2:d3:d4:articleId:d5:date:hour:* = counter
to represent all the views to a particular article across the 5 other dimensions in a given hour. Or
d1:d2:d3:d4:*:d5:date:*:* = counter
to represent all the views across dimensions d1-d5 for any article on a given date.
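To make the fan-out concrete, here is a sketch of how one page view expands into its wildcard permutations. This is illustrative only: the function name is made up, and it only wildcards articleId, hour and minute, whereas the real schema allows "*" in other dimensions too.

```python
from itertools import product

def rollup_keys(d1, d2, d3, d4, article_id, d5, date, hour, minute):
    """Generate the counter keys one page view touches (sketch only)."""
    keys = []
    for art, hr, mn in product((article_id, "*"), (hour, "*"), (minute, "*")):
        if hr == "*" and mn != "*":
            continue  # a concrete minute without its hour is meaningless
        keys.append(f"{d1}:{d2}:{d3}:{d4}:{art}:{d5}:{date}:{hr}:{mn}")
    return keys
```

Even with only three wildcardable fields, a single view produces six counter updates; with more wildcardable dimensions the multiplier grows quickly, which is how 500 views per second becomes 20,000 Redis operations per second.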
So each of the 500 hits per second that the analytics platform receives can generate a large number of Redis updates to satisfy the various combinations of wildcards. Our two primary Redis instances rumble along at a sustained rate of around 20,000 operations per second.
Now, all of this data takes space. Serious space. We shard this data across a number of Redis instances based on what type of data it is (page views, visitor data, video data, metadata, etc.). Following the release of some new business requirements recently, our two primary instances were using in the region of 20-24 GB of RAM each, and holding something like 110,000,000 keys each. Ouch!
A bit of lateral thinking was required.
Those keys work out at around 40 bytes each on average, while the counters they hold are only 2-3 bytes. That’s a massive overhead for the key names: they account for around 90% of the memory usage, and the actual data only around 10%. And the keys contain an awful lot of duplication too – for example, dimension 1 can only take 2 possible values, yet they are represented by a 4-character string and a 5-character string. So we store that 4- or 5-character string in EVERY SINGLE KEY.
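The back-of-envelope version of that arithmetic, using the averages above (this only counts the raw payload; Redis's own per-entry bookkeeping pulls the observed split down toward the ~90/10 figure):

```python
key_bytes = 40     # average key name length, from the figures above
value_bytes = 2.5  # average counter payload
share = key_bytes / (key_bytes + value_bytes)
# share is roughly 0.94: the key names dominate the raw payload
```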
There are a number of possible approaches to this, including simply shortening the segments by making them a bit less semantically meaningful (for example, instead of storing dates as `20130815` we could store the number of days since January 1st 2010), or using some kind of dictionary-based approach whereby we assign every unique value a dimension can take to an integer and end up with keys that look like 12:3917:2385:1222:98737:3746:5543. Good luck debugging either of those.
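The date-shortening idea, sketched out (the helper name is hypothetical; we did not actually ship this):

```python
from datetime import date

EPOCH = date(2010, 1, 1)

def compact_date(yyyymmdd: str) -> str:
    """Encode a date as days since 2010-01-01 (hypothetical helper)."""
    d = date(int(yyyymmdd[:4]), int(yyyymmdd[4:6]), int(yyyymmdd[6:]))
    return str((d - EPOCH).days)
```

`compact_date("20130815")` gives `"1322"` – four characters saved in every key, paid for in debuggability every time you stare at a key in redis-cli.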
In the end we took a pretty simple approach. We realised that the article ID dimension had by far the largest cardinality. So instead of making it part of the key and having Redis keep simple counters, we instead changed the Redis data structures to hashes keyed on the article ID. Thus:
Before: d1:d2:d3:d4:articleId:d5:date:hour:minute = counter
After: d1:d2:d3:d4:d5:date:hour:minute = hash of articleId -> counter
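A toy in-memory model of the two write paths shows why the key count collapses. (A real client would issue INCR for the old schema and HINCRBY for the new one; the dimension values here are abbreviated placeholders.)

```python
from collections import defaultdict

flat = defaultdict(int)                          # old: one counter key per article
hashed = defaultdict(lambda: defaultdict(int))   # new: one hash per time slot

views = [("art1", "12:30"), ("art2", "12:30"), ("art1", "12:31")]
for article_id, hour_minute in views:
    flat[f"d1:d2:d3:d4:{article_id}:d5:20130815:{hour_minute}"] += 1   # INCR
    hashed[f"d1:d2:d3:d4:d5:20130815:{hour_minute}"][article_id] += 1  # HINCRBY
```

After three views, `flat` holds three keys (one per article per minute) while `hashed` holds only two (one per minute), with the article IDs folded into hash fields. With tens of thousands of articles per time slot, that per-key saving is what turned 110,000,000 keys into 2,500,000.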
By this simple expedient, we managed to reduce the memory usage on the primary servers from around 24GB to around 8GB (a saving of 66%) and the number of keys from 110,000,000 to around 2,500,000 (a saving of around 98%!). Ops team love us. CTO’s budget spreadsheet is looking more chilled out. Epic win.
There are a number of other advantages to this approach beyond the obvious reduction in hardware:
1) Because we’re using so much less memory, dumping RDB files is much quicker.
2) Because writing RDB files is much quicker, the number of changes that happen during the dump is greatly reduced, meaning less memory pressure during the dump.
3) The new data structure exhibits much better locality of reference, meaning that for any given workload that happens during the dump, fewer pages need to be copied. This means, again, even less memory pressure during RDB dump.
It does have some drawbacks though. In particular, we have changed the data access paths.
One of our screens has a “sparkline” of how a particular article has trended over the last 2 hours. Previously, we would gather the data for this by issuing a single Redis MGET command with 120 keys (one for each minute) where the hour and minute fields in the key ranged over the time period we were interested in.
Now, however, we cannot do that because there is no command in Redis to get one hash member (the article ID) out of 120 different hashes. The solution to this is a story for a different post…
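The old read path is easy to sketch: build the 120 per-minute keys and fetch them in one MGET round trip. (The function name and signature are illustrative, not from the real code; prefix/suffix stand in for the surrounding dimension fields.)

```python
from datetime import datetime, timedelta

def sparkline_keys(prefix, article_id, suffix, end, minutes=120):
    """Build the per-minute keys the old MGET-based sparkline fetched."""
    keys = []
    for i in range(minutes):
        t = end - timedelta(minutes=minutes - 1 - i)
        keys.append(f"{prefix}:{article_id}:{suffix}:{t:%Y%m%d}:{t:%H}:{t:%M}")
    return keys

# old read path, in one round trip: r.mget(sparkline_keys(...))
```

Under the new schema the article ID is a field inside 120 different hashes, and Redis has no multi-key HGET, so this one-liner no longer works.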
So the moral of this story is this. When you are working with large volumes of data in a data store (especially an in-memory one):
1) Never let a prototype go live.
2) Think about whether your primary workload will be storage or retrieval.
3) Optimise your schema (in this case, the Redis key structure) accordingly.
4) Understand your data and think about how you’re storing it.
(This is also a good opportunity to mention that we took the data in the old storage format and migrated it to the new format using my Node.JS-based streaming RDB toolkit).