With the advent of artificial intelligence (AI), performance and model runtime feasibility pose a challenge to the advancement of AI technology. Novel methods of accelerating the core mathematical functions of AI applications are being explored. The crux of AI computations that would benefit most from hardware acceleration is matrix multiplication. This thesis explores the acceleration of matrix multiplication using systolic arrays and the Strassen algorithm, methods known for enhancing computational efficiency through parallel processing. The research focuses on the design, implementation, and comprehensive testing of these architectures to expedite matrix multiplication tasks crucial for applications in deep learning and signal processing. By comparing various design methodologies and evaluating their performance across different scenarios, the thesis aims to identify optimal configurations that maximize processing speed and efficiency, as well as to determine the circumstances under which each method should be deployed. This work contributes to the advancement of our understanding of high-performance computing trade-offs by providing insights into approaches to hardware acceleration.
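For readers unfamiliar with the Strassen algorithm named above, the following is a minimal sketch (not the thesis's hardware implementation) of its recursive scheme, which trades eight block multiplications for seven at each level. The `leaf` cutoff is a hypothetical parameter below which plain multiplication is used; square power-of-two dimensions are assumed.

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Strassen matrix multiplication: 7 sub-multiplications per level
    instead of 8, giving O(n^2.81) instead of O(n^3). Assumes square
    matrices whose size is a power of two."""
    n = A.shape[0]
    if n <= leaf:
        return A @ B  # fall back to the classical product
    h = n // 2
    # Partition each operand into four h-by-h blocks
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Strassen's seven products
    M1 = strassen(A11 + A22, B11 + B22, leaf)
    M2 = strassen(A21 + A22, B11, leaf)
    M3 = strassen(A11, B12 - B22, leaf)
    M4 = strassen(A22, B21 - B11, leaf)
    M5 = strassen(A11 + A12, B22, leaf)
    M6 = strassen(A21 - A11, B11 + B12, leaf)
    M7 = strassen(A12 - A22, B21 + B22, leaf)
    # Reassemble the result blocks
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C
```

In hardware, the same block decomposition maps each of the seven products onto parallel multiplier resources, which is the efficiency angle the thesis compares against systolic-array dataflow.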

Document Type

Master's Project

Degree Name

Electrical Engineering (MS)

Department, Program, or Center

Electrical and Microelectronic Engineering, Department of


Kate Gleason College of Engineering


Mark A. Indovina

Advisor/Committee Member

Ferat Sahin


RIT – Main Campus