Abstract

Software engineering exams serve as a tool for evaluating a broad range of skills in the classroom, including theoretical understanding, practical application, and process reasoning. Despite the importance of exams, post-assessment analysis is often overlooked, and the absence of structured reflection by instructors can limit the effectiveness of exams and mask patterns in student performance. By treating exams as data, educators can uncover trends that drive more effective teaching strategies, refine evaluation methods, and strengthen student support systems. We conducted a systematic analysis of existing exam data and administered a student survey, asking students to predict and postdict their exam scores, to determine whether student perceptions align with actual outcomes. We examined 160 graded paper and online exam sessions from three different software engineering courses, totaling 4259 individual exams, to determine whether the submission order of exams correlates with grade distributions. Using polynomial regression, rolling standard deviation, Spearman rank correlation, stratification, and paired t-tests, we analyzed both exam data and survey responses to identify patterns in student performance and perception. Results indicate that there is no statistically significant correlation between exam turn-in order and performance. Students who remained until the end of the exam session tended to cluster at the extremes, representing both the lowest and highest scoring groups. Grade variability remained largely consistent within individual exam sessions, showing no discernible trend in 71% of cases. Across exams, student performance was highly consistent, with scores tending to fall within a margin of 5.73 points of one another. These findings highlight that while exam timing behaviors may reflect individual strategies or confidence levels, overall performance remains stable across contexts.
Survey data indicate that students postdicted their exam scores more accurately than they predicted them, and that calibration did occur, as performance improved with repeated measurements. Although most analyses showed no significant correlations, we believe the results are still worth publishing, because reporting null findings helps challenge unsupported assumptions. This paper presents a replicable model for enhancing assessment practices in engineering education, empowering instructors to implement evidence-based strategies that support student success. Though software engineering education was the primary subject of this study, its results are likely to be applicable to other pedagogical fields as well.
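
For readers unfamiliar with the Spearman rank correlation named in the abstract, the following is a minimal sketch of how turn-in order could be compared against exam scores. It is not the thesis's own code. The function names (average_ranks, pearson, spearman) and the sample data are hypothetical and chosen only for illustration, and the sketch uses the Python standard library rather than any particular statistics package.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up illustration data, NOT taken from the thesis:
    # position in which each exam was handed in, and the score it received.
    turn_in_order = [1, 2, 3, 4, 5, 6]
    scores = [70, 95, 82, 60, 88, 99]
    print(f"Spearman rho = {spearman(turn_in_order, scores):.3f}")
```

A coefficient near zero would be consistent with the reported finding that turn-in order and performance are not significantly correlated. A full analysis would also need a significance test, which this sketch omits.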

Publication Date

4-23-2026

Document Type

Thesis

Student Type

Graduate

Degree Name

Software Engineering (MS)

Department, Program, or Center

Software Engineering, Department of

College

Golisano College of Computing and Information Sciences

Advisor

Samuel Malachowsky

Advisor/Committee Member

Christian Newman

Advisor/Committee Member

Daqing Hou

Source Code.zip (268 kB)

Campus

RIT – Main Campus
