Abstract
As modern computing workloads become increasingly data-intensive, the limitations of traditional digital computing hardware, especially in terms of communication bandwidth between memory and processors, are becoming more apparent. Surprisingly, the performance and energy efficiency of the state of the art are not primarily constrained by the computing power of advanced processing architectures but rather by the low communication bandwidth between the memory and the processor. This challenge necessitates re-architecting digital computing hardware to make the communication between the processor and the memory more energy-efficient, faster, and capable of supporting significantly higher bandwidth. Memory-centric computing emerges as a novel, alternative computing paradigm that addresses these requirements by integrating processing logic within the memory device itself. By enabling highly localized computing within the memory, memory-centric architectures not only demonstrate higher energy efficiency and lower computational latency but also support massively parallel computing within a compact form factor. However, this new computing paradigm also introduces unique design challenges, demanding careful architecting of novel, efficient computing techniques and dataflow designs tailored to the constraints of memory architectures. Furthermore, enhancing functional flexibility, modularity, and design scalability continues to pose significant challenges in this research domain. This work tackles the existing challenges in supporting versatile, heterogeneous computing within the memory while maintaining high energy efficiency and computational parallelism. This is achieved through a novel, programmable near-memory computing technique that employs a cluster of re-writable look-up tables (LUTs) working collectively to support various logic/arithmetic operations.
Additionally, the in-situ programmability of this processing architecture is facilitated by a custom-designed instruction set architecture. Furthermore, integrating multiple instances of this processing architecture within the memory cell arrays (i.e., banks) minimizes latency and energy loss while maximizing the bandwidth of data communication between the processor and memory. This design solution also opens the path to efficient design scaling, supporting highly versatile, heterogeneous computing workloads on the same computing platform.
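The core idea of the abstract, computing via re-writable look-up tables rather than dedicated arithmetic logic, can be illustrated with a minimal software sketch. This is not the dissertation's actual hardware design; the table width (4-bit operands), the example operations, and all function names are illustrative assumptions chosen only to show how re-programming a LUT swaps the operation while computation itself reduces to a memory read.

```python
# Illustrative sketch of LUT-based processing (hypothetical, software-only):
# an operation over small operands is "programmed" into a table by
# precomputing every result, so executing it later is a single lookup,
# analogous to a read from a re-writable LUT inside the memory device.

def program_lut(op, bits=4):
    """Program a LUT by precomputing op(a, b) for all bits-wide operand pairs."""
    size = 1 << bits          # number of distinct operand values (16 for 4 bits)
    mask = size - 1           # keep results within the operand width
    return [[op(a, b) & mask for b in range(size)] for a in range(size)]

def lut_compute(lut, a, b):
    """Execute the programmed operation: one table read, no ALU logic."""
    return lut[a][b]

# Re-programming the same table structure changes the function it performs,
# mirroring the functional flexibility the abstract describes.
add_lut = program_lut(lambda a, b: a + b)   # 4-bit adder (wraps modulo 16)
xor_lut = program_lut(lambda a, b: a ^ b)   # bitwise XOR unit
```

A cluster of such tables, each handling a narrow slice of the operands, could then be combined to cover wider words, which is the sense in which the LUTs "work collectively" in the abstract.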
Library of Congress Subject Headings
Memory management (Computer science); High performance processors; Computer architecture
Publication Date
5-29-2024
Document Type
Dissertation
Student Type
Graduate
Degree Name
Electrical and Computer Engineering (Ph.D.)
Department, Program, or Center
Electrical and Computer Engineering Technology
College
Kate Gleason College of Engineering
Advisor
Amlan Ganguly
Advisor/Committee Member
Minseok Kwon
Advisor/Committee Member
Mark Indovina
Recommended Citation
Sutradhar, Purab Ranjan, "A Programmable Look-up Table-based Processing in Memory Architecture for Massively Parallel, Energy-Efficient Processing of Data-intensive Applications within DRAM" (2024). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11812
Campus
RIT – Main Campus
Plan Codes
ECE-PHD