The CPU cache is a small, fast memory located on the CPU die itself. It stores recently used and prefetched data that the CPU will likely need, so that data can be accessed quickly. This is necessary to ensure the RAM doesn’t bottleneck the CPU.
Modern CPUs typically implement the CPU cache in three levels: L1, L2, and L3. These play an important part in determining CPU performance (especially for certain tasks like gaming).
So, let’s look at how CPU cache works, why it matters, and how much CPU cache you’ll need for your workloads.
What Does the CPU Cache Do
The programs that you run are first loaded into the RAM. The CPU fetches, decodes, and executes instructions from the main memory.
The ‘problem’ with this is that modern processors are extremely powerful (capable of executing billions of instructions per second).
For instance, the AMD Ryzen 9 3950X has a base clock speed of 3.5 GHz (3.5 billion cycles per second), and each of its 16 cores can execute several instructions in a single clock cycle.
However, accessing data from the RAM may take hundreds of cycles. That is a lot of wasted cycles during which the CPU is stalled.
If the CPU had to access data from the RAM every time, that would create a significant bottleneck and cripple system performance. This is where the CPU cache comes into play.
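To put that in perspective, here is a rough back-of-the-envelope calculation. The clock speed is the 3.5 GHz figure from above; the 200-cycle RAM access is only an assumed round number standing in for "hundreds of cycles", not a measured value.

```python
# Clock speed from the article; the RAM access cost is an assumed round number.
CLOCK_HZ = 3.5e9          # 3.5 GHz base clock
RAM_ACCESS_CYCLES = 200   # assumption: stands in for "hundreds of cycles"


def cycles_to_ns(cycles: float, clock_hz: float = CLOCK_HZ) -> float:
    """Convert a cycle count to nanoseconds at the given clock speed."""
    return cycles / clock_hz * 1e9


cycle_ns = cycles_to_ns(1)
stall_ns = cycles_to_ns(RAM_ACCESS_CYCLES)

print(f"One cycle lasts about {cycle_ns:.3f} ns")
print(f"A {RAM_ACCESS_CYCLES}-cycle RAM access stalls the core for about {stall_ns:.1f} ns")
```

A few dozen nanoseconds sounds tiny, but a core that could have been doing useful work on every one of those cycles is sitting idle instead.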
The CPU analyzes access patterns to predict what data and instructions it’ll likely need next. Then, it moves them from the RAM to the CPU cache before they’re actually needed (this is called prefetching).
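As a toy illustration of the idea, the sketch below counts cache misses for a sequential read with and without a simple "next-line" prefetcher. This is not how any particular CPU implements prefetching: the 64-byte line size, the unbounded cache, and the instant prefetch are all simplifying assumptions, and `count_misses` is a hypothetical helper.

```python
LINE_SIZE = 64  # bytes per cache line (assumed; common on x86 CPUs)


def count_misses(addresses, prefetch_next_line):
    """Count misses for a byte-address trace in an unbounded toy cache."""
    cached_lines = set()
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)
        if prefetch_next_line:
            # Bring in the following line before it is requested.
            cached_lines.add(line + 1)
    return misses


# Sequential walk over 4 KB, reading one 8-byte value at a time.
trace = range(0, 4096, 8)
print("misses without prefetching:", count_misses(trace, prefetch_next_line=False))
print("misses with prefetching:   ", count_misses(trace, prefetch_next_line=True))
```

Without prefetching, every new cache line causes a miss (64 in total). With the next-line prefetcher, only the very first access misses, because each later line has already been brought in by the time it is needed.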
Depending on the level, accessing data from the CPU cache can be over a hundred times faster than doing so from the RAM. So, the CPU delay is significantly reduced.
L1 vs L2 vs L3 Cache
Current CPUs implement 3 levels of CPU cache to maximize performance. This allows them to hit the sweet spot for cache size, latency, and hit rate.
You can get the exact numbers for your CPU online or by using system profiling tools like CPU-Z and HWiNFO.
On my Ryzen 7 5700G, you can see that the L1 cache is split into L1 Data and L1 Instructions. Each of the 8 cores has 32 KB of each, so the total L1 cache is 512 KB (8 × 64 KB).
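If you’re on Linux, the kernel exposes similar information through sysfs, so you can also read it with a short script. The sketch below assumes the usual `/sys/devices/system/cpu/cpu0/cache/index*/` layout with `level`, `type`, and `size` files; `parse_size` and `read_cpu0_caches` are hypothetical helper names, and on other systems the function simply returns an empty list.

```python
from pathlib import Path


def parse_size(text: str) -> int:
    """Convert a size string such as '32K' or '16M' to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def read_cpu0_caches(base="/sys/devices/system/cpu/cpu0/cache"):
    """Return (level, type, size_in_bytes) for each cache visible to CPU 0."""
    base_path = Path(base)
    if not base_path.is_dir():
        return []  # not Linux, or sysfs is not available
    caches = []
    for index_dir in sorted(base_path.glob("index*")):
        try:
            level = int((index_dir / "level").read_text())
            kind = (index_dir / "type").read_text().strip()
            size = parse_size((index_dir / "size").read_text())
        except (OSError, ValueError):
            continue  # skip entries that are missing or unreadable
        caches.append((level, kind, size))
    return caches


for level, kind, size in read_cpu0_caches():
    print(f"L{level} {kind}: {size // 1024} KB")
```

Note that this lists the caches as seen by a single core, so per-core caches still need to be multiplied by the core count to get the totals that profiling tools report.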
As the L1 cache is the smallest and fastest memory level, the CPU first checks whether the required data is in L1. If the data is present, it immediately reads from or writes to L1. This is called a cache hit.
Sometimes, the required data won’t be in L1. This is called a cache miss. In this case, the CPU checks the next fastest cache level, i.e., L2.
The L2 cache is larger but slower compared to L1. It can be implemented per core, or as a shared pool. On the 5700G, each of the 8 cores has its own 512 KB of L2, which totals 4 MB.
If a cache miss occurs in L2, the CPU checks L3 next. This is the largest CPU cache level, but it also has the highest latency. For instance, the 5700G has a 16 MB L3 cache implemented as a shared pool.
If a cache miss occurs again, the CPU fetches the data from the RAM. If the data isn’t in the RAM either, the operating system has to load it from the storage drive.
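The lookup order described above can be sketched as a small simulation. The latencies here are illustrative assumptions rather than measurements of any real CPU, the storage drive is left out for simplicity, and `lookup` is a hypothetical helper that just adds up the cost of each level it has to check.

```python
# Latencies in cycles are illustrative assumptions, not measured values.
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40), ("RAM", 200)]


def lookup(address, contents):
    """Walk the hierarchy in order and return (level_that_hit, total_cycles).

    `contents` maps a cache level name to the set of addresses it holds.
    RAM is treated as always holding the data.
    """
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency  # pay the cost of checking this level
        if name == "RAM" or address in contents.get(name, set()):
            return name, cycles


# Toy state: each larger level also holds what the smaller ones hold.
contents = {
    "L1": {0x10},
    "L2": {0x10, 0x20},
    "L3": {0x10, 0x20, 0x30},
}

for addr in (0x10, 0x20, 0x30, 0x40):
    level, cycles = lookup(addr, contents)
    print(f"{addr:#x}: served by {level} after {cycles} cycles")
```

Under these assumed numbers, an L1 hit costs 4 cycles, while a request that misses every cache level costs 256. That gap is why a high hit rate in the faster levels matters so much.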
CPU Cache Levels Up Close
Before moving on, let’s see what the CPU cache levels look like on an actual CPU die to understand things better.
If you take apart a CPU and sand the bottom layer of the CPU die, you may expose the actual CPU circuits.
For instance, the bottom layer of an i9-13900K CPU die looks something like this:
Rotate the picture anti-clockwise to make the closeup horizontal. Then, compare it to this die-shot interpretation. You’ll see exactly how the different cache levels are implemented.
By checking the data from system profiling tools, you’ll have an even clearer idea of the CPU cache distribution.
In the i9-13900K’s case, you may see how the L1 and L2 caches are distributed across the P-cores and E-cores.
How Much CPU Cache Do You Need
The CPU cache is clearly important for CPU performance. But what does that mean for the end-user? Are CPUs with higher cache always better?
It all depends on what you’ll use the CPU for.
There are many factors to consider when choosing a CPU: clock speed, core count, CPU generation, architecture, TDP, cache, and so on. All of these are interlinked and determine the CPU performance together.
So, generally, it’s hard to single out one element like the cache, and attribute performance to that. But there are exceptions.
Take AMD’s X3D gaming CPUs, for instance. The Ryzen 7 5800X and 5800X3D are mostly similar. The main differences are that the 5800X3D has a slightly lower clock speed but triple the L3 cache (96 MB vs 32 MB).
The benchmarks for these processors show that performance differs according to the workload.
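One way to build intuition for this is a toy simulation. The sketch below loops repeatedly over a "working set" of items using a simple least-recently-used (LRU) cache of 32 or 96 units, loosely mirroring the 32 MB vs 96 MB comparison. It is only a simplified model under stated assumptions: real caches aren’t pure LRU, real workloads aren’t pure loops, and `lru_hit_rate` is a hypothetical helper.

```python
from collections import OrderedDict


def lru_hit_rate(capacity, working_set, passes=10):
    """Hit rate when looping `passes` times over `working_set` distinct items
    in an LRU cache that holds at most `capacity` items."""
    cache = OrderedDict()
    hits = total = 0
    for _ in range(passes):
        for item in range(working_set):
            total += 1
            if item in cache:
                hits += 1
                cache.move_to_end(item)  # mark as most recently used
            else:
                cache[item] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits / total


for working_set in (16, 64, 128):
    small = lru_hit_rate(capacity=32, working_set=working_set)
    large = lru_hit_rate(capacity=96, working_set=working_set)
    print(f"working set {working_set:>3}: "
          f"32-unit cache {small:.0%} hits, 96-unit cache {large:.0%} hits")
```

In this model, a working set of 16 fits in both caches, so the extra capacity changes nothing. A working set of 64 fits only in the larger cache, so the hit rate jumps from 0% to 90%. A working set of 128 overflows both, and the extra capacity doesn’t help here either.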
To reiterate, there’s no set number for the best cache amount. It can have virtually no impact or make a massive difference depending on the workload. So, it just depends on what you’ll use the CPU for.
Most consumer CPUs have a standard amount of CPU cache intended to work for most people. Whatever CPU you’re planning to get, check the benchmarks online and see how it performs in the tasks that you’ll mostly use it for.
If there are similar options with higher or lower cache, check the benchmarks for them too. Then, decide which one will better fit your use cases.