Abstract

Data Science has ignited unprecedented academic, industrial, and pedagogical fervor, yet its status as a science in the classical sense, comparable to physics or biology, remains profoundly unsettled. This article interrogates the epistemological foundations of Data Science by examining its hybrid theoretical lineage, from the Universal Approximation Theorem to the No-Free-Lunch Theorems, with special emphasis on the fundamental Bayesian optimality results for both regression and classification. We argue that Data Science is in a vigorous gestational period, characterized not by an absence of principles but by a creative tension between empirical pragmatism and deep mathematical theory. The Cross-Validation score emerges as the Swiss Army knife of computational epistemology, enabling estimation of generalization error from finite data. We analyze candidate "laws" (bias-variance tradeoff, Vapnik-Chervonenkis theory), present complete formulations of Bayesian optimal predictors under both squared and zero-one losses, and propose that Data Science's maturity will be marked by the codification of its beautiful, necessary tensions. The fundamental theorems of Data Science are indeed hybrid, bridging centuries of mathematical analysis with modern computational epistemology.

Publication Date

Spring 3-31-2026

Document Type

Technical Report

Department, Program, or Center

Mathematics and Statistics, School of

College

College of Science

Campus

RIT – Main Campus
