Thanks for the (LinkedIn) memories

I got sucked into a LinkedIn post on “Why Dictionaries are 7 x slower than arrays”, and shuddered as the BigO crowd leapt all over it, completely missing the point entirely. It wasn’t supposed to be about O(n)Log(banana)^Transformer. It was about memory. And computer architecture.

So please, indulge me, here is the real reason why a C# Dictionary is 7x slower than an array.

If you have a standard array (e.g., MyClass[], not a List<>!), it’s contiguous in memory, which means:

  • Sequential access with great cache locality
  • No-brainer prefetching for the CPU
  • Little to no pointer indirection since the size and location of each instance are known ahead of time

You’ve got the size, you’ve got the location, it’s already in the cache—you’re golden!

Now, enter Dictionary<int, MyClass>.

Oh boy. First, we need a hash. Not great, but whatever, we can do a % without overheating, no big deal. Now we’ve got a bucket. If we’re lucky, the first entry is what we’re looking for—no need to walk the chain looking for our hash. Phew!

But even if we do have to probe further, it’s not that bad because the dictionary stores its entries in a contiguous array of structs (Entry<TKey, TValue>), which means:

  • When you pull in one entry, you’re pulling in a cacheline’s worth of them (~24 bytes per entry, rounded up).
  • So, as long as you stay within that cacheline, things remain decently fast.

Already, this is getting a bit beefy. And we still haven’t even touched our actual data yet.

The real problem? Heap allocation.

At this point, we’ve found our TValue in the Entry[] array. But now, all bets are off.

With a contiguous array, the next bit of data is guaranteed to be right after the last bit.
With a dictionary? Nope.

Your dict[0], dict[1], and dict[2] could live literally anywhere in memory.
Each lookup could cause cache misses, cache evictions, and CPU stalls all over the place.

But dict[0], dict[1], and dict[2] looks so convincing. That delicious syntactic sugar. But remember when you allocated the memory for those entries? It was off the heap. Was it immediately after each other? Maybe. Does that it make it contiguous? Hell no!

And that’s the real reason dictionaries are slower than arrays.

Memory is slow. CPUs stall if they don’t get the data they need. That’s it.

The big caveat

That said, computers are plenty fast, and dictionaries are super useful. I use them all over the place. And when I don’t, I use List<>.

Why? Because convenience outweighs speed… until it doesn’t.
And when it doesn’t?
You better know how to diagnose, debug, and fix it.

Randall Hyde has a fantastic series of books on Amazon called Write Great Code. if you haven’t, go read them. When you better understand the machine you will write better code.

Or at least, when you write sloppy code, you will know why its sloppy and you’ll be able to justify it 🙂