
Let's Cache In

How Many Dang Caches Are There, Anyway?

What is Cache Memory:

Cache memory is a special high speed memory designed to supply the processor with the most frequently requested instructions and data. Instructions and data located in cache memory can be accessed many times faster than instructions and data located in main memory. The more instructions and data the processor can access directly from cache memory, the faster the computer runs as a whole.

Levels of Cache:

In general, there are two levels of cache memory: internal cache, which is typically located inside the CPU chip, and external cache, which is normally located on the system board. Internal cache is sometimes referred to as primary cache or level 1 (L1) cache. External cache is sometimes referred to as secondary cache or level 2 (L2) cache. In most desktop personal computers, the internal cache will range from 1KB to 32KB (1,024 to 32,768 bytes) in size. In contrast, external cache configurations are usually much larger, ranging in size from 64KB to 1MB (65,536 to 1,048,576 bytes). When we talk about upgrading cache, we are most often talking about external cache. Upgrading external cache may involve plugging individual cache components into sockets located on the system board or plugging a cache module into a dedicated cache expansion socket. In most cases, upgrading internal cache would require the replacement of the CPU.

Note: Some early model personal computers, such as 286- and 386-based systems, have CPU chips that contain no internal cache. In these cases, the external cache (if present) would actually be the primary cache, also referred to as the level 1 (L1) cache.

How does Cache Memory work:

An interesting way to look at cache is to imagine yourself at a party with a host who is required to serve you the exact beverage you request. The beverages are the data, the corner store is main memory, and the refrigerator is cache memory. If someone at the party requests a Diet Pepsi, the host of the party makes a trip to the refrigerator first, to see if it is there. If the Diet Pepsi is in the refrigerator, the requester can have it right away. However, if it is not in the refrigerator, the host has to run to the corner store to get it. This may take considerably longer. The host can save a lot of time by purchasing a 6-pack at the store. This logic ensures that most of the time, the next request can be fulfilled directly from the refrigerator.

In the same way, when the cache controller retrieves an instruction from main memory, it also takes the next several instructions back to cache with it. This increases the chances that the next instruction requested by the CPU is already in cache. (When a request from the CPU is found in cache, this is referred to as a "cache hit".)
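The "6-pack" effect can be sketched in a few lines of Python. This is only an illustration, not how a real cache controller is built: the function name count_hits, the block sizes, and the unlimited cache capacity are all assumptions made to isolate the effect of fetching neighbors along with the requested item.

```python
def count_hits(addresses, block_size):
    """Count cache hits when every miss brings in a whole block of neighbors.

    Capacity is unlimited here (an assumption) so that only the
    block-fetch effect shows up in the result.
    """
    cached_blocks = set()
    hits = 0
    for addr in addresses:
        block = addr // block_size      # which block this address belongs to
        if block in cached_blocks:
            hits += 1                   # already in the "refrigerator"
        else:
            cached_blocks.add(block)    # miss: fetch the whole block
    return hits


sequential = list(range(64))            # 64 consecutive instruction addresses
print(count_hits(sequential, 1))        # fetch one item per trip
print(count_hits(sequential, 8))        # fetch eight items per trip
```

With one item per trip, every request for a new address is a miss. With eight items per trip, only the first request in each group of eight misses, so 56 of the 64 requests are hits.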

How much cache should I have:

On a typical 100MHz Intel motherboard, it takes the CPU as much as 180ns to get information from main memory versus as little as 45ns to get information from cache memory. (This represents the total memory retrieval process, including request, verification, and data access time.) With the incredible performance advantage cache memory offers, it would seem logical to use cache for all the computer's main memory. However, cache memory typically uses SRAM (Static RAM) chips, which cost more than six times as much as the DRAM chips normally used for main memory. Thus, it is not cost effective to use a large amount of cache in a system. In our party example, using cache as main memory would be similar to buying the corner store in order to stock every type of beverage that exists. While having one refrigerator saves a lot of time and inconvenience, the added benefit of having the corner store in the back yard may not be worth the investment. This is how cache works as well. The first 256K of cache saves the computer a lot of time by holding the most frequently used instructions. However, adding 256K more of cache for a total of 512K does not increase the overall performance of the computer as much as the first 256K does.
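The diminishing return can be seen with some simple arithmetic. The sketch below uses the 45ns and 180ns figures quoted above and a simple weighted average. The hit rates are made-up values for illustration only; the text does not say what hit rate 256K or 512K of cache actually achieves.

```python
CACHE_NS = 45     # cache access time quoted in the text
MEMORY_NS = 180   # main memory access time quoted in the text


def average_access_ns(hit_rate):
    """Weighted average access time for a given fraction of cache hits."""
    return hit_rate * CACHE_NS + (1.0 - hit_rate) * MEMORY_NS


# Hypothetical hit rates, chosen only to show the shape of the curve.
for hit_rate in (0.0, 0.80, 0.90, 0.95):
    print(f"{hit_rate:.0%} hits -> {average_access_ns(hit_rate):.2f} ns")
```

Under these assumptions, going from no cache hits to 90% hits cuts the average from 180ns to 58.5ns, while going from 90% to 95% only saves another 6.75ns. Once the most frequently used instructions are already in cache, extra cache has less left to speed up.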

Glossary of terms:

Asynchronous SRAM - An SRAM that does not require a clock signal to validate its control signals. About 30% lower in price and performance compared to synchronous SRAM.

Burst - A type of synchronous cache; 30-50% faster than asynchronous and about 50% more expensive.

Cache controller - The circuit in control of the interface between the CPU, cache and DRAM (main memory) controller.

Cache hit - When the address requested by the CPU is found in cache.

Cache miss - When the address requested by the CPU is not found in cache.

CELP socket - Card Edge Low Profile. The type of socket normally used for cache modules.

COAST - Cache On A Stick. A popular design specification for cache modules.

Direct-mapped cache - A cache where there is only one possible location for each data entry.

External cache - Cache that resides outside the processor; usually is soldered on the system board close to the processor or in the form of a cache module in a socket near the processor.

Fully associative cache - A cache policy which allows any main memory location to be mapped to any cache line.

Index - The subset of the CPU address bits used to select a specific location within cache.

Internal cache - Cache that is typically located inside the CPU chip.

Level One (L1) - Cache that is closest to the processor; typically located inside the CPU chip. Also referred to as primary cache.

Level Two (L2) - Cache that is second closest to the processor; typically located on the system board. Also referred to as secondary cache.

Pipeline burst - A type of synchronous cache that is slightly less expensive than burst but has similar performance. (Burst is capable of going faster, but the system board does not often take advantage of this capability.)

Primary cache - Same as level 1 cache.

Secondary cache - Same as level 2 cache.

Set-associativity - The number of locations into which a single main memory address can be placed within cache.

SRAM - Static Random Access Memory. This is the type of memory chip that is normally used on cache modules. It is much faster and more expensive than DRAM chips.

Synchronous SRAM - An SRAM that requires a clock signal to validate its control signals. This enables the cache memory to run lockstep with the CPU. Can be either Burst or Pipelined Burst.

Tag - The subset of the CPU address bits that is compared against the tag bits stored in the cache directory to determine whether the main memory address being accessed is in cache.

Tag RAM - Cache is physically divided into two sections. The Tag RAM section stores the Tag address of the location of the data in cache. This section is smaller than the Data RAM section, which stores the actual data or instruction.

Write Back (or copy back) - Data written into the cache by the CPU is not written into main memory until that data line in the cache is to be replaced.

Write Through - A technique for writing data from the CPU simultaneously into the cache and into main memory to assure coherency.
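Several of the glossary terms above (direct-mapped cache, index, tag, Tag RAM, cache hit, cache miss) fit together in one small sketch. Everything specific here is an assumption made for illustration: the line size, the number of lines, the "offset" part of the address, and the names split_address and DirectMappedCache. A real cache is hardware and is far larger.

```python
LINE_BYTES = 16   # hypothetical size of one cache line
NUM_LINES = 4     # hypothetical, kept tiny so conflicts are easy to see


def split_address(addr):
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr % LINE_BYTES          # position of the byte inside its line
    line_number = addr // LINE_BYTES
    index = line_number % NUM_LINES     # the one cache location it may use
    tag = line_number // NUM_LINES      # identifies which memory line is there
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        # Plays the role of the Tag RAM: one stored tag per cache line.
        self.tags = [None] * NUM_LINES

    def access(self, addr):
        """Return True on a cache hit; on a miss, fill the line and return False."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag          # replace whatever was in that line
        return False


cache = DirectMappedCache()
results = [cache.access(a) for a in (0x00, 0x04, 0x40, 0x00)]
print(results)
```

Addresses 0x00 and 0x04 share a line, so the second access is a hit. Address 0x40 has the same index but a different tag, so it pushes the first line out, and the final access to 0x00 misses again. That is the cost of having only one possible location per entry; a set-associative cache gives each address more than one place to go.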

A Customer Sent Me This, Which Is a Good Explanation:

The level 1 (L1) cache is the smallest and fastest. It sits on the Pentium II chip itself. The level 2 (L2) cache is a little larger and slower and is off the chip, but inside that big Slot 1 cartridge. Newer chips have the L2 cache on the chip itself.

When the CPU needs some information, it looks in the L1 cache. If it is there, that's great. If not, it looks in the L2 cache. If it isn't there either, it goes to main memory. So, if you are careful about keeping the most-used information in your L1 or L2 cache, which are faster to access than main memory, your program will be faster.
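That lookup order can be written out as a short sketch. The dictionaries standing in for L1, L2, and main memory, the unlimited capacity, and the choice to copy fetched data into both cache levels are all simplifying assumptions, not a description of how any particular CPU behaves.

```python
def lookup(addr, l1, l2, memory):
    """Return (value, where_found), checking L1, then L2, then main memory."""
    if addr in l1:
        return l1[addr], "L1"
    if addr in l2:
        l1[addr] = l2[addr]       # copy into L1 so the next access is faster
        return l1[addr], "L2"
    value = memory[addr]          # slowest path: go all the way to main memory
    l2[addr] = value
    l1[addr] = value
    return value, "memory"


memory = {addr: addr * 2 for addr in range(8)}   # made-up contents
l1, l2 = {}, {}

print(lookup(3, l1, l2, memory))   # first access comes from main memory
print(lookup(3, l1, l2, memory))   # second access is found in L1
del l1[3]                          # pretend L1 ran out of room and dropped it
print(lookup(3, l1, l2, memory))   # now it is found in L2
```

The first request has to go to main memory, the repeat is served from L1, and after the entry is dropped from L1 the request is still caught by L2 before reaching main memory.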

Since the L2 cache on a Pentium II is 512KB, any program has a pretty good chance of finding its information there.

The principle is exactly the same as a disk cache. Keep the small amount of information you need to use often somewhere fast (cache), and keep the huge amount of information you almost never need somewhere slow (not cache).

Cache design is one of the most important aspects of the latest CPUs. For example, the newest Celerons have only 128KB of L2 cache, but perform about as well as Pentium IIs on most tasks, because that cache can be accessed more quickly.
