
Factors affecting Cache Memory Performance

Last Updated : 11 Jan, 2021

A computer is made of three primary blocks: a CPU, a memory, and an I/O system. The performance of a computer system depends heavily on how quickly the CPU can fetch instructions from memory and read and write data in that same memory. Computers use cache memory to bridge the gap between the rate at which the processor can execute instructions and the time it takes to fetch instructions and data from main memory.

The time a program takes to execute on a system with a cache depends on:

  • The number of instructions needed to perform the task.
  • The average number of CPU cycles needed per instruction.
  • The CPU’s cycle time.

When engineering any product or feature, the general structure of the device stays the same. What changes is the specific part that must be optimized to meet the client's requirements. How does an engineer go about improving the design? We start by building a mathematical model that connects the inputs to the outputs.

Execution Time = Instruction Count × Cycles per Instruction × Cycle Time
               = Instruction Count × (CPU Cycles per Instruction + Memory Cycles per Instruction) × Cycle Time
               = Instruction Count × [CPU Cycles per Instruction + (References per Instruction × Cycles per Reference)] × Cycle Time
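The model above can be evaluated directly. The following is a minimal sketch in Python; the function name, parameter names, and sample numbers are illustrative assumptions and do not describe any particular machine.

```python
def execution_time(instruction_count, cpu_cycles_per_instr,
                   refs_per_instr, cycles_per_ref, cycle_time_s):
    """Return execution time in seconds using the model above."""
    # Memory cycles per instruction = references per instruction x cycles per reference.
    memory_cycles_per_instr = refs_per_instr * cycles_per_ref
    cycles_per_instr = cpu_cycles_per_instr + memory_cycles_per_instr
    return instruction_count * cycles_per_instr * cycle_time_s


# Hypothetical workload: 1 million instructions, 1 CPU cycle each,
# 0.5 memory references per instruction, 1 ns cycle time.
slow_memory = execution_time(1_000_000, 1.0, 0.5, 4.0, 1e-9)  # 4 cycles per reference
fast_memory = execution_time(1_000_000, 1.0, 0.5, 2.0, 1e-9)  # 2 cycles per reference
```

With these assumed numbers, halving the cycles per reference takes the run from about 3 ms to about 2 ms, which shows how strongly the memory term can weigh on total execution time.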

Four factors influence the terms of this equation: the instruction set architecture, compiler technology, the CPU implementation, and the cache and memory hierarchy. Changing any of them can shift the machine's performance significantly, for better or worse. The first term of the equation, the number of instructions needed to perform a function, depends on the instruction set architecture and is the same across all implementations of that architecture. It also depends on how well the compiler is designed to produce efficient code. A compiler that carries out the same function with fewer executed instructions is desirable.

CPU cycles per instruction also depend on compiler optimizations, because the compiler can be made to choose instructions that are less CPU-intensive and have a shorter path length. Efficient pipelining of instructions improves this parameter as well, since it keeps the hardware resources better utilized.

The average number of memory references per instruction and the average number of cycles per memory reference combine to form the average number of memory cycles per instruction. The former is a function of the architecture and of the compiler's instruction selection algorithms, and it is constant across implementations of the architecture. The latter depends on the memory system, which is where the cache comes in.
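The cycles-per-reference term is where the cache has its effect. A common textbook estimate, which the article does not state and which is assumed here, is the hit time plus the miss rate times the miss penalty. The sketch below uses hypothetical names and numbers.

```python
def cycles_per_reference(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average cycles per memory reference for a single-level cache.

    Assumes every reference pays the hit time and a fraction `miss_rate`
    of references additionally pays the miss penalty.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical cache: 1-cycle hit, 100-cycle miss penalty.
good_cache = cycles_per_reference(1, 0.02, 100)  # 2% of references miss
poor_cache = cycles_per_reference(1, 0.10, 100)  # 10% of references miss
```

Under these assumptions, raising the miss rate from 2% to 10% increases the average cost from about 3 cycles to about 11 cycles per reference, even though the hit time is unchanged.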

Instruction Set Architecture :

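  • Reduced Instruction Set Computer (RISC) –
    A Reduced Instruction Set Computer (RISC) uses a small set of simple instructions and accesses memory through dedicated load and store instructions. It is one of the most popular styles of instruction set. ARM processors, among the most widely used chips in commercial products, are RISC designs.
     
  • Complex Instruction Set Computer (CISC) –
    A Complex Instruction Set Computer (CISC) has instructions that can each carry out several low-level operations, such as a memory load, an arithmetic operation, and a memory store. x86 is the best-known example.
     
  • Minimal instruction set computers (MISC) –
    Minimal instruction set computers (MISC) have a very small number of basic instructions. Compared with modern processors, the 8085 may loosely be considered to fall into this category.
     
  • Explicitly parallel instruction computing (EPIC) –
    In explicitly parallel instruction computing (EPIC), the compiler explicitly marks which instructions can execute in parallel instead of leaving that decision to the hardware. Intel's Itanium (IA-64) architecture, used in some servers and supercomputers, is based on it.
     
  • One instruction set computer (OISC) –
    A one instruction set computer (OISC) is an abstract machine that uses only a single instruction, such as "subtract and branch if less than or equal to zero".
     
  • Zero instruction set computer (ZISC) –
    A zero instruction set computer (ZISC) has no instructions in the classical sense and works by pattern matching. It is often compared to a neural network implemented in hardware.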

Compiler Technology :

  • Single Pass Compiler –
    The source code is converted directly into machine code in one pass.
     
  • Two-Pass Compiler –
    The source code is converted into an intermediate representation, which is then converted into machine code.
     
  • Multipass Compiler –
    The front end converts the source code into intermediate code, the middle end transforms that into further intermediate code, and the back end converts the result into machine code.
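The front end, middle end, and back end structure can be illustrated with a toy compiler for arithmetic expressions. This is only a sketch under stated assumptions: Python's standard `ast` module stands in for the front end, the middle end performs a single optimization (constant folding), and the back end targets a hypothetical stack machine whose instruction names (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`) are made up for illustration. Only `+`, `-`, and `*` are supported.

```python
import ast
import operator

# Supported operators: AST node type -> (stack-machine mnemonic, Python function).
_OPS = {
    ast.Add: ("ADD", operator.add),
    ast.Sub: ("SUB", operator.sub),
    ast.Mult: ("MUL", operator.mul),
}


def front_end(source):
    """Parse source text into an intermediate representation (a Python AST)."""
    return ast.parse(source, mode="eval").body


def middle_end(node):
    """Constant folding: replace an operation on two constants with its result."""
    if isinstance(node, ast.BinOp):
        left = middle_end(node.left)
        right = middle_end(node.right)
        foldable = (isinstance(left, ast.Constant)
                    and isinstance(right, ast.Constant)
                    and type(node.op) in _OPS)
        if foldable:
            fold = _OPS[type(node.op)][1]
            return ast.Constant(value=fold(left.value, right.value))
        return ast.BinOp(left=left, op=node.op, right=right)
    return node


def back_end(node, out=None):
    """Emit instructions for the hypothetical stack machine."""
    out = [] if out is None else out
    if isinstance(node, ast.Constant):
        out.append(("PUSH", node.value))
    elif isinstance(node, ast.Name):
        out.append(("LOAD", node.id))
    elif isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        back_end(node.left, out)
        back_end(node.right, out)
        out.append((_OPS[type(node.op)][0],))
    else:
        raise ValueError("unsupported expression")
    return out


def compile_expr(source, optimize=True):
    """Run the passes in order; skip the middle end when optimize is False."""
    ir = front_end(source)
    if optimize:
        ir = middle_end(ir)
    return back_end(ir)
```

For the expression `x + 2 * 3`, `compile_expr` emits three instructions with the middle end enabled and five without it, which is one small instance of a compiler reducing the number of executed instructions.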

CPU Implementation :

The micro-architecture depends on the design philosophy and methodology of the engineers involved in the process. Take a simple example: designing a circuit that takes input from a common jack, passes it through an amplifier, and then stores the data in a buffer.

Two approaches can be taken to solve the problem. One is to place a buffer at the input, followed by two amplifiers, and route the signal through one or the other. This makes sense if two different types of signal must be amplified, or if the amplifiers differ slightly in their saturation regions. The other is to use a single common signal path and make the stage in which the data is stored time-dependent, which removes the need for a separate buffer altogether.

Minute differences like these in the VLSI microarchitecture of the processor create large timing differences between two companies' implementations of the same instruction set.

Cache and Memory Hierarchy :

This again depends on the use case for which the system was built. A general-purpose computer, also called a personal computer, can perform a wide variety of calculations and produce reasonably accurate results for non-real-time workloads, but using one in a hard real-time system would be very unwise.

A major difference between such systems is the time taken to access data in the cache.

You can run a simple experiment on your own computer. Find the cache size of your particular processor model, then time accesses to the elements of arrays whose sizes are around that value. A marked slowdown appears once the array is larger than the cache.
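One way to try this experiment is sketched below in Python, using only the standard library. It walks a randomly ordered chain of indices so that each access is hard for the hardware to predict. The function names and the default sizes are assumptions made for illustration. Interpreter overhead in Python hides part of the effect, so the jump is usually sharper in a compiled language, and the exact numbers vary from machine to machine.

```python
import random
import time
from array import array


def make_chain(n, seed=0):
    """Build a random single-cycle permutation: chain[i] is the next index to visit."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    chain = array("q", [0]) * n          # n contiguous 8-byte integers
    for i in range(n):
        chain[order[i]] = order[(i + 1) % n]
    return chain


def ns_per_access(n, steps=1_000_000):
    """Average time in nanoseconds to follow one link in a chain of n elements."""
    chain = make_chain(n)
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = chain[i]                     # each load depends on the previous one
    elapsed = time.perf_counter() - start
    return elapsed / steps * 1e9


def sweep(sizes_kib=(16, 256, 4096, 32768)):
    """Print the average access time for several working-set sizes in KiB."""
    for kib in sizes_kib:
        n = kib * 1024 // 8              # number of 8-byte elements
        print(f"{kib:>6} KiB: {ns_per_access(n):.1f} ns per access")
```

Calling `sweep()` with sizes chosen below and above your processor's cache sizes should show the time per access rising as the working set stops fitting in the cache. The largest default size needs a few hundred megabytes of RAM while the chain is being built, so reduce it on small machines.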

