1. Introduction
The purpose of this paper is twofold. The first part gives an overview of cache, while the
second part explains how the Pentium processor implements cache.
A simplified model of a cache system will be examined first. The simplified model is expanded
to explain how a cache works, what kinds of cycles a cache uses, common architectures, major
components, and common organization schemes.
Actual designs may differ in implementation, but the concepts remain the same. Working from a
simplified model avoids unnecessary detail and the need for background in hardware design or
a particular system. The second part of this paper gives more detail and specifics of how the
internal caches work on the Pentium processor.
An Overview of Cache
2.1 Basic Model
Figure 2-1 Basic Cache Model (blocks shown: CPU, Cache Memory, Main DRAM Memory, System Interface)
Figure 2-1 shows a simplified diagram of a system with cache. In this system, every time the
CPU performs a read or write, the cache may intercept the bus transaction, allowing the cache to
decrease the response time of the system. Before discussing this cache model, let's define
some of the common terms used when talking about cache.
2.1.1 Cache Hits
When the cache contains the information requested, the transaction is said to be
a cache hit.
2.1.2 Cache Miss
When the cache does not contain the information requested, the transaction is
said to be a cache miss.
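The two definitions above can be illustrated with a toy model. This is only a sketch: the class name ToyCache, the dictionary standing in for main memory, and the addresses are hypothetical and do not come from the paper.

```python
# Toy model of hits and misses. ToyCache and the sample addresses are
# illustrative assumptions, not part of the paper's design.

class ToyCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory  # stands in for main DRAM memory
        self.lines = {}                 # address -> cached data
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            # Cache hit: the cache already holds the requested information.
            self.hits += 1
        else:
            # Cache miss: fetch from main memory and keep a copy.
            self.misses += 1
            self.lines[address] = self.main_memory[address]
        return self.lines[address]


memory = {0x100: "a", 0x104: "b"}
toy = ToyCache(memory)
for addr in (0x100, 0x104, 0x100, 0x100):
    toy.read(addr)
print(toy.hits, toy.misses)
```

The first read of each address is a miss, and the repeated reads of 0x100 are hits.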
2.1.3 Cache Consistency
Now that we have some names for cache functions, let's see how caches are designed and how
this affects their function.
2.2 Cache Architecture
Caches have two characteristics: a read architecture and a write policy. The read architecture
may be either “Look Aside” or “Look Through.” The write policy may be either “Write-Back” or
“Write-Through.” Both types of read architecture may have either type of write policy,
depending on the design. Write policies will be described in more detail in the next section.
Let's examine the read architecture now.
2.2.1 Look Aside
Figure 2-2 Look Aside Cache (blocks shown: CPU, SRAM, Cache Controller, Tag RAM, System Interface)
2.2.2 Read Architecture: Look Through
Figure 2-3 Look Through Cache (blocks shown: CPU, SRAM, Cache Controller, Tag RAM, System Interface)
Figure 2-3 shows a simple diagram of the Look Through cache architecture. Again, main memory is
located opposite the system interface. The distinguishing feature of this cache unit is that it
sits between the processor and main memory. It is important to notice that the cache sees the
processor's bus cycle before allowing it to pass on to the system bus.
2.2.2.1 Look Through Read Cycle Example
When the processor starts a memory access, the cache checks to see if that address is a
cache hit.
HIT:
The cache responds to the processor’s request without starting an access to
therefore less expensive to implement. The performance with a Write-Through policy is lower
since the processor must wait for main memory to accept the data.
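The cost difference between the two write policies can be sketched by counting main-memory writes. This is a simplified illustration, not the paper's method: the function name count_memory_writes is hypothetical, each address is treated as one cache line, every write is assumed to hit the cache, and each dirty line is assumed to be written back exactly once.

```python
def count_memory_writes(write_addresses, policy):
    """Count main-memory writes for a stream of CPU writes.

    Assumptions (illustrative only): each address names one cache line,
    every write hits the cache, and under write-back each dirty line is
    eventually written to main memory exactly once.
    """
    if policy == "write-through":
        # Every CPU write is passed on to main memory immediately.
        return len(write_addresses)
    if policy == "write-back":
        # Writes only mark the line dirty; memory is updated once per line.
        return len(set(write_addresses))
    raise ValueError(f"unknown policy: {policy}")


writes = [0x40, 0x40, 0x40, 0x80, 0x40]
print(count_memory_writes(writes, "write-through"))
print(count_memory_writes(writes, "write-back"))
```

With repeated writes to the same line, write-through pays for every write, which is the waiting on main memory described above.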
2.3 Cache Components
The cache sub-system can be divided into three functional blocks: SRAM, Tag RAM, and the
Cache Controller. In actual designs, the blocks may be implemented by multiple chips or all
may be combined into a single chip.
2.3.1 SRAM
Static Random Access Memory (SRAM) is the memory block which holds the data. The size of
the SRAM determines the size of the cache.
2.3.2 Tag RAM
Tag RAM (TRAM) is a small piece of SRAM that stores the address of the data that is stored
in the SRAM.
2.3.3 Cache Controller
The cache controller is the brains behind the cache. Its responsibilities include performing the
snoops and snarfs, updating the SRAM and TRAM, and implementing the write policy. The
cache controller is also responsible for determining if a memory request is cacheable and if a
request is a cache hit or miss.
2.4 Cache Organization
Figure 2-4 (main memory divided into cache pages; each cache page divided into cache lines)
called a cache line. The size of a cache line is determined by both the processor and the cache
design. Figure 2-4 shows how main memory can be broken into cache pages and how each
cache page is divided into cache lines. We will discuss cache organizations and how to
determine the size of a cache page in the following sections.
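The page and line division can be sketched as simple address arithmetic. The function name split_address and the sizes in the example (16-byte lines, 64 lines per page) are illustrative assumptions, not values from the paper.

```python
def split_address(address, line_size, lines_per_page):
    """Split a byte address into (page, line, offset).

    line_size is in bytes; lines_per_page is the number of cache lines
    in one cache page. Both are parameters of a hypothetical design.
    """
    offset = address % line_size          # byte position inside the line
    line_number = address // line_size    # line counted from address 0
    line = line_number % lines_per_page   # line position inside its page
    page = line_number // lines_per_page  # cache page holding the line
    return page, line, offset


print(split_address(0x1234, line_size=16, lines_per_page=64))
```

With these sizes a cache page is 1024 bytes, so address 0x1234 falls in page 4, line 35, at byte offset 4.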
2.4.1 Fully-Associative
(Figure: Main Memory, Line 0 through Line m)
2.4.2 Direct Map
Figure 2-6 Direct Mapped (main memory pages Page 0 through Page m, each with Line 0 through Line n; cache memory with Line 0 through Line n)
Direct Mapped cache is also referred to as 1-way set-associative cache. Figure 2-6 shows a
diagram of a direct map scheme. In this scheme, main memory is divided into cache pages.
The size of each page is equal to the size of the cache. Unlike the fully associative cache, the
direct map cache may only store a specific line of memory within the same line of cache. For
example, Line 0 of any page in memory must be stored in Line 0 of cache memory. Therefore, if
Line 0 of Page 0 is stored within the cache and Line 0 of Page 1 is requested, then Line 0 of
Page 0 will be replaced with Line 0 of Page 1. This scheme directly maps a memory line into an
equivalent cache line, hence the name Direct Mapped cache.
A Direct Mapped cache scheme is the least complex of the three caching schemes. It requires
that the current requested address be compared with only one cache address. Because this
implementation is less complex, it is far less expensive than the other caching schemes. The
disadvantage is that a Direct Mapped cache is far less flexible, which makes performance much
lower, especially when jumping between cache pages.
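The replacement behavior described above can be sketched with a small simulation. The class name DirectMappedCache and the four-line cache size are hypothetical, and only the tag (page number) of each line is modeled, not the data.

```python
class DirectMappedCache:
    """Minimal direct-mapped model: one stored page number per cache line."""

    def __init__(self, num_lines):
        self.tags = [None] * num_lines  # page number currently held per line
        self.hits = 0
        self.misses = 0

    def access(self, page, line):
        # Only one comparison is needed: the tag stored at this line.
        if self.tags[line] == page:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[line] = page      # replace whatever was there


dm = DirectMappedCache(num_lines=4)
# Alternate between Line 0 of Page 0 and Line 0 of Page 1.
for page in (0, 1, 0, 1):
    dm.access(page, line=0)
print(dm.hits, dm.misses)
```

Every access in this pattern is a miss, because each page's Line 0 evicts the other's. This is the "jumping between cache pages" penalty.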
2.4.3 Set Associative
Associative cache scheme. In this scheme, two lines of memory that share the same line number
may be stored at the same time, one in each way. This helps to reduce the number of times the
cache line data is written over.
This scheme is less complex than a Fully-Associative cache because the number of comparators
is equal to the number of cache ways. A 2-Way Set-Associative cache only requires two
comparators, making this scheme less expensive than a fully-associative scheme.
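A 2-way scheme can be sketched in the same style as a direct-mapped model. The class name TwoWaySetAssociativeCache is hypothetical, and the least-recently-used replacement rule is an assumption made for this sketch; the text above does not say which way is replaced.

```python
class TwoWaySetAssociativeCache:
    """Minimal 2-way model: each line position holds up to two page numbers.

    Replacement is least recently used (LRU), which is an assumption of
    this sketch rather than something stated in the text.
    """

    def __init__(self, num_lines):
        # For each line position: page numbers, least recently used first.
        self.sets = [[] for _ in range(num_lines)]
        self.hits = 0
        self.misses = 0

    def access(self, page, line):
        ways = self.sets[line]
        if page in ways:                # at most two comparisons
            self.hits += 1
            ways.remove(page)
        else:
            self.misses += 1
            if len(ways) == 2:
                ways.pop(0)             # evict the least recently used way
        ways.append(page)               # mark as most recently used


sa = TwoWaySetAssociativeCache(num_lines=4)
# Alternate between Line 0 of Page 0 and Line 0 of Page 1.
for page in (0, 1, 0, 1):
    sa.access(page, line=0)
print(sa.hits, sa.misses)
```

Alternating between Line 0 of two pages misses only on the first touch of each page, since both lines fit in the two ways.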
3. The Pentium(R) Processor's Cache
This section examines internal cache on the Pentium(R) processor. The purpose of this section
is to describe the cache scheme that the Pentium(R) processor uses and to provide an overview
of how the Pentium(R) processor maintains cache consistency within a system.
The above section broke cache into neat little categories. However, in actual implementations,
Figure 3-1 Pentium(R) Processor with L2 cache (blocks shown: CPU, L1 Cache Memory, L2 Cache Memory, Main DRAM Memory, System Interface)
When developing a system with a Pentium(R) processor, it is common to add an external
cache. External cache is the second cache in a Pentium(R) processor system, therefore it is
called a Level 2 (or L2) cache. The internal processor cache is referred to as a Level 1 (or L1)
cache. The names L1 and L2 do not depend on where the cache is physically located (i.e.,
internal or external). Rather, they depend on what is first accessed by the processor (i.e., the
L1 cache is accessed before L2 whenever a memory request is generated). Figure 3-1 shows how
L1 and L2 caches relate to each other in a Pentium(R) processor system.
3.1 Cache Organization
Figure 3-2 Internal Pentium(R) Processor Cache Scheme (main memory pages Page 0 through Page m, each with Line 0 through Line 127; cache memory with Way 0 and Way 1, each with Line 0 through Line 127)
Both caches are 2-way set-associative in structure. The cache line size is 32 bytes, or 256 bits.
A cache line is filled by a burst of four reads on the processor’s 64-bit data bus. Each cache way
contains 128 cache lines. The cache page size is 4 KB, or 128 lines. Figure 3-2 shows a diagram
of the internal cache scheme.
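The figures quoted above are consistent with each other, as the following arithmetic sketch shows. The constant names are hypothetical, and the 32-bit address width used for the tag calculation is an assumption of this sketch rather than something stated in this section.

```python
# Geometry stated in the text for each internal cache.
LINE_SIZE_BYTES = 32
LINES_PER_WAY = 128
WAYS = 2
DATA_BUS_BITS = 64

# Assumption for this sketch: addresses are 32 bits wide.
ADDRESS_BITS = 32

page_size_bytes = LINE_SIZE_BYTES * LINES_PER_WAY   # one way = one cache page
cache_size_bytes = page_size_bytes * WAYS           # total per cache
burst_reads = LINE_SIZE_BYTES * 8 // DATA_BUS_BITS  # bus reads per line fill

# Address fields: byte offset within a line, line index within a way, tag.
offset_bits = LINE_SIZE_BYTES.bit_length() - 1
line_bits = LINES_PER_WAY.bit_length() - 1
tag_bits = ADDRESS_BITS - offset_bits - line_bits

print(page_size_bytes, cache_size_bytes, burst_reads)
print(offset_bits, line_bits, tag_bits)
```

The page size works out to 4096 bytes and the line fill to four bus reads, matching the text; two ways of 4 KB give 8 KB per cache.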
suggests, the CD bit allows the user to disable the Pentium(R) processor's internal cache. When
CD = 1, the cache is disabled; when CD = 0, the cache is enabled. The NW bit allows the cache
to be either write-through (NW = 0) or write-back (NW = 1).
The Pentium(R) processor maintains cache consistency with the MESI protocol. MESI is used
to allow the cache to decide if a memory entry should be updated or invalidated. With the
Pentium(R) processor, two functions are performed to allow its internal cache to stay consistent:
Snoop Cycles and Cache Flushing.
The Pentium(R) processor snoops during memory transactions on the system bus. That is,
when another bus master performs a write, the Pentium(R) processor snoops the address. If the
Pentium(R) processor contains the data, the processor will schedule a write-back.
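The snoop decision can be sketched with the four MESI states (Modified, Exclusive, Shared, Invalid). This is a generic textbook-style model, not the Pentium(R) processor's exact bus behavior: the names State and snoop are hypothetical, and the rule that only a Modified line needs a write-back is the usual MESI convention rather than a detail taken from this paper.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def snoop(state, other_master_writes):
    """Return (next_state, write_back_needed) for one snooped transaction.

    Generic MESI sketch (an assumption, not the processor's documented
    behavior): a Modified line is written back before it changes state;
    a snooped write invalidates the line; a snooped read leaves it Shared.
    """
    if state is State.INVALID:
        return State.INVALID, False      # line not held: nothing to do
    write_back = state is State.MODIFIED  # only dirty data must be saved
    if other_master_writes:
        return State.INVALID, write_back
    return State.SHARED, write_back


print(snoop(State.MODIFIED, other_master_writes=True))
print(snoop(State.EXCLUSIVE, other_master_writes=False))
```

A snooped write that hits a Modified line is the case where a write-back must happen before the line is given up.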
Cache flushing is the mechanism by which the Pentium(R) processor clears its cache. A cache
flush may result from actions in either hardware or software. During a cache flush, the
Pentium(R) processor writes back all modified (or dirty) data. It then invalidates its cache (i.e.,