📅 Monday, June 3rd, 2024

When a friend is in trouble, don't annoy him by asking if there is anything you can do. Think up something appropriate and do it.

— A. Powell Davies

ECS154A Lecture: cache

How do we speed memory up? Caches! Caches exploit temporal and spatial locality

  • temporal locality: cache recently used data
  • spatial locality: cache data neighboring an access

How do we measure the performance of memory accesses?

  • cache hit rate = number of hits / number of memory accesses = 1 - miss rate
  • cache miss rate (MR) = number of misses / number of memory accesses = 1 - cache hit rate
  • average memory access time (AMAT) = t_cache + MR_cache × (t_memory + MR_memory × t_disk)
    • t_cache: time paid when the cache succeeds
    • t_memory: additional time paid when the cache misses but the main memory access succeeds (i.e., address is not in swapfile)
    • t_disk: additional time paid if we get a miss on all levels, and the processor ends up actually accessing the virtual memory (much slower since it’s accessing a swapfile/pagefile on disk)
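The AMAT formula can be sketched as follows; the latencies and miss rates here are made-up example numbers, not values from the lecture:

```python
# AMAT sketch for a cache -> main memory -> disk hierarchy.
# AMAT = t_cache + MR_cache * (t_memory + MR_memory * t_disk)

def amat(t_cache, mr_cache, t_memory, mr_memory, t_disk):
    """Average memory access time; each deeper level's cost is paid
    only for the fraction of accesses that miss the level above it."""
    return t_cache + mr_cache * (t_memory + mr_memory * t_disk)

# Hypothetical numbers: 1-cycle cache hit, 5% cache miss rate,
# 100-cycle DRAM access, 0.01% page-fault rate, 1,000,000-cycle disk penalty.
print(amat(t_cache=1, mr_cache=0.05, t_memory=100, mr_memory=0.0001,
           t_disk=1_000_000))  # ≈ 11 cycles
```

Note how the huge disk penalty barely matters as long as the page-fault rate stays tiny, while the cache miss rate dominates AMAT.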

terminology

  • capacity : size of cache

  • block size : size of one element of cache

  • number of blocks : blocks in cache = capacity / block size

  • cache is organized by sets, and each address maps to exactly one set

  • cache line = cache block

  • a cache set is a row in the cache, and a cache line/block is one entry within the set
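The field widths follow directly from these definitions. A sketch of the address breakdown for a direct-mapped cache; the 32-bit address width, 32 KiB capacity, and 64-byte block size are example assumptions, not values from the lecture:

```python
# Split a memory address into tag / set / byte-offset fields.
# Sizes below are hypothetical examples; all must be powers of two.

ADDRESS_BITS = 32
CAPACITY = 32 * 1024                      # 32 KiB cache
BLOCK_SIZE = 64                           # 64-byte blocks

num_blocks = CAPACITY // BLOCK_SIZE       # 512 blocks
offset_bits = BLOCK_SIZE.bit_length() - 1     # log2(64)  = 6
set_bits = num_blocks.bit_length() - 1        # log2(512) = 9 (direct mapped: 1 block/set)
tag_bits = ADDRESS_BITS - set_bits - offset_bits  # 17

def split(addr):
    """Return (tag, set index, byte offset) for a direct-mapped cache."""
    offset = addr & (BLOCK_SIZE - 1)
    set_index = (addr >> offset_bits) & (num_blocks - 1)
    tag = addr >> (offset_bits + set_bits)
    return tag, set_index, offset

print(split(0x12345678))  # → (9320, 345, 56)
```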

cache types

  • direct mapped: 1 block per set
    • memory address breakdown: (msb) tag - set - byte offset (lsb)
      • The less significant bits are used to identify a set because, more often than not, we are working within a limited range of memory where the upper bits are identical. If the set bits were the upper bits instead, we’d get conflict misses all the time.
    • cache access process
      • the set bits select the cache set
      • the address’s tag bits are compared to the tag stored in the block
      • if the tags match, the cached data is returned (at the specific offset of the requested size)
    • potential misses
      • compulsory miss: data was never read before; this happens equally to all types of caches
      • conflict miss: accesses to different addresses map to the same set; a direct-mapped cache is very susceptible to this type of miss, since there’s only one block per set
  • N-way set associative: N blocks per set
    • fewer set bits, more tag bits
    • address format: (msb) tag - set - byte offset (lsb), same as direct mapped but with fewer set bits
    • Now that there is more than one block per set, each memory access must identify the cache set and then compare the requested address’s tag with each block’s stored tag bits. If a tag matches, return the corresponding block.
  • fully associative: all blocks are in one set
    • there are no set bits, so we need to compare the address’s tag with every block’s stored tag bits
    • not likely to be used for CPU caches
    • pros: eliminates conflict misses (only compulsory and capacity misses remain)
    • cons: expensive to build (one tag comparator per block)
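The three organizations above can be sketched as one model, since direct mapped is just 1-way and fully associative is "all blocks in one set". The LRU replacement policy and the tiny sizes below are my assumptions for illustration, not from the lecture:

```python
from collections import OrderedDict

class Cache:
    """Set-associative cache model: direct mapped = 1 way,
    fully associative = ways equal to the number of blocks."""

    def __init__(self, num_blocks, ways, block_size):
        self.block_size = block_size
        self.ways = ways
        self.num_sets = num_blocks // ways
        # One OrderedDict per set, mapping tag -> None, ordered by recency.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = self.misses = 0

    def access(self, addr):
        block = addr // self.block_size       # strip the byte offset
        set_index = block % self.num_sets     # set bits select the set
        tag = block // self.num_sets          # remaining bits are the tag
        st = self.sets[set_index]
        if tag in st:                         # compare against stored tags
            self.hits += 1
            st.move_to_end(tag)               # mark most recently used
        else:
            self.misses += 1
            if len(st) == self.ways:
                st.popitem(last=False)        # evict the LRU block
            st[tag] = None

# Addresses 0 and 32 map to the same set of this direct-mapped cache
# (8 sets * 4-byte blocks = 32-byte stride), so they conflict:
direct = Cache(num_blocks=8, ways=1, block_size=4)
for a in [0, 32, 0, 32]:
    direct.access(a)
print(direct.hits, direct.misses)  # → 0 4 (all conflict misses)

# The same pattern in a 2-way cache: both blocks fit in one set.
assoc = Cache(num_blocks=8, ways=2, block_size=4)
for a in [0, 32, 0, 32]:
    assoc.access(a)
print(assoc.hits, assoc.misses)    # → 2 2 (only compulsory misses)
```

Running both caches on the same access pattern makes the conflict-miss bullet concrete: adding a second way removes every miss except the two compulsory ones.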