Abstract

Smart‑meter data are pivotal to modern demand‑side management and predictive maintenance, yet strict confidentiality rules at Dubai Electricity and Water Authority (DEWA) hinder internal access. Synthetic data generated by Generative Adversarial Networks (GANs) offer a promising privacy‑preserving alternative, potentially unlocking analytics without exposing customer information. This study assesses the ability of three GAN variants, namely vanilla GAN, Tabular GAN (TGAN), and Conditional Tabular GAN (CTGAN), to replicate real consumption patterns. The publicly available "Smart Meters in London" daily dataset, comprising 3.51 million records, was used. After exploratory data analysis (EDA) and cleansing, the identifier (LCLid) and date (day) variables were removed, leaving seven numerical energy features (median, mean, max, min, sum, count, and standard deviation). Each model was trained for 300 epochs with a batch size of 128 and a latent dimension of 128, then tasked with producing 1,000 synthetic samples. Evaluation combined marginal distribution similarity (Kolmogorov–Smirnov statistic), dependence preservation (mean absolute Pearson‑correlation error), and visual inspection via kernel‑density overlays, heat‑maps, and pair‑plot analyses. Results indicate that TGAN attains the highest fidelity, recording an average KS statistic of 0.037 and a correlation error of 0.06. CTGAN follows (0.052 and 0.09, respectively), while vanilla GAN lags behind (0.078 and 0.12). TGAN most accurately captures heavy‑tailed behaviour in energy_max and energy_sum, whereas CTGAN exhibits mild mode collapse and vanilla GAN under‑represents extremes. All models retain correlation structure within ±0.10 for dominant feature pairs, suggesting the synthetic data preserve relationships critical to downstream analytics.
The findings demonstrate that TGAN‑generated data can serve as a reliable, privacy‑friendly proxy for DEWA’s smart‑meter records, facilitating exploratory analytics, demand forecasting, and anomaly‑detection prototyping without breaching confidentiality. Integrating such synthetic datasets could accelerate AI adoption across DEWA while aligning with Dubai’s broader digital‑transformation goals. Future work should incorporate differential‑privacy regularisation to provide formal guarantees, expand validation to multi‑temporal energy datasets, and explore hybrid GAN architectures for enhanced fidelity.

Library of Congress Subject Headings

Dubai Electricity and Water Authority--Data processing; Dwellings--Energy consumption--United Arab Emirates--Dubai--Data processing; Big data--Industrial applications; Generative adversarial networks (Computer networks); Predictive analytics; Data protection

Publication Date

5-2025

Document Type

Thesis

Student Type

Graduate

Degree Name

Professional Studies (MS)

Department, Program, or Center

Graduate Programs & Research

Advisor

Ioannis Karamitsos

Campus

RIT Dubai

Plan Codes

PROFST-MS
