Abstract
Data exchange between a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU) can be very expensive in terms of performance. Data and cache memory access patterns differ between a CPU and a GPU. The motivation of this research is to analyze the cache memory access patterns of GPU architectures and to potentially improve data exchange between a CPU and a GPU. The methodology of this work uses the Multi2Sim GPU simulator for the AMD Radeon and NVIDIA Kepler GPU architectures. This simulator, which emulates the GPU architecture in software, allows code modifications to the L1 and L2 cache memory blocks. Multi2Sim was configured to run multiple benchmarks in order to analyze and record how the benchmarks access GPU cache memory. The recorded results were used to study three main metrics: (1) Most Recently Used (MRU) and Least Recently Used (LRU) accesses for the L1 and L2 caches, (2) inter-warp and intra-warp cache memory accesses in the GPU architecture for different sets of workloads, and (3) a comparison of the GPU cache access patterns of certain machine learning benchmarks with those of their general-purpose counterparts.
Library of Congress Subject Headings
Cache memory; Graphics processing units
Publication Date
7-2018
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Sonia Lopez Alarcon
Advisor/Committee Member
Amlan Ganguly
Advisor/Committee Member
Roy Melton
Recommended Citation
Nimkar, Yash, "Cache Memory Access Patterns in the GPU Architecture" (2018). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/9830
Campus
RIT – Main Campus
Plan Codes
CMPE-MS