Jinhyuk Kwon
Department of Computer Science and Engineering, Dongguk University, Seoul 04620, Republic of Korea
Soojin Park
School of Software, Soongsil University, Seoul 06978, Republic of Korea
Taeyoung Lim*
Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
taeyoung.lim@seoultech.ac.kr

DOI: https://doi.org/10.63646/datamind.2025.030304

Abstract

Federated learning deployments on heterogeneous edge devices generate a continuous stream of compressed model updates, differential privacy budgets, training metrics, checkpoint states, and synchronisation logs that collectively define the provenance and reproducibility of every trained model. Yet no purpose-built database exists that structures, versions, and exposes this edge FL state in a machine-queryable and audit-ready form. This paper introduces EdgeDB-FL, a lightweight relational database system specifically designed to manage the full lifecycle of federated learning state on resource-constrained edge infrastructure. EdgeDB-FL organises data into six core tables (EdgeDevice, ClientUpdate, ModelVersion, TrainingRound, PrivacyBudget, and SyncLog) linked by foreign-key constraints and supported by seven composite indexes optimised for the access patterns of FL orchestration, privacy accounting, and convergence analysis workloads. A five-stage processing pipeline ingests on-device training events, compresses gradient updates using top-k sparsification and 8-bit quantisation before database storage, tracks differential privacy epsilon consumption per device, validates updates against server-side model hashes, and exports versioned model artefacts to downstream lakehouse and vector store targets. Experiments across a 200-device benchmark network demonstrate a 77.3% reduction in per-round per-device communication payload relative to standard FL baselines (4.5 KB vs. 19.8 KB), a global test accuracy of 94.6% with differential privacy (ε = 2.0) after 50 rounds, and a median database query latency of 17 ms under 500-device concurrent load. EdgeDB-FL is released as open-source software under the MIT licence with a Python SDK, REST and GraphQL APIs, and reproducible experiment notebooks.
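
To make the compression stage concrete, the following is a minimal sketch of top-k sparsification followed by 8-bit quantisation, written in plain Python. It is not EdgeDB-FL's implementation: the abstract does not specify the quantisation scheme, so symmetric linear quantisation to signed 8-bit integers is assumed here, and all function names and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple


def top_k_sparsify(grad: List[float], k: int) -> Tuple[List[int], List[float]]:
    """Keep the k entries with the largest magnitude; return (indices, values)."""
    if not 0 < k <= len(grad):
        raise ValueError("k must be between 1 and len(grad)")
    # Rank positions by descending magnitude, keep k, then restore positional order.
    ranked = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    idx = sorted(ranked[:k])
    return idx, [grad[i] for i in idx]


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Assumed scheme: symmetric linear quantisation to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def compress(grad: List[float], k: int) -> Dict[str, object]:
    """Sparsify, then quantise; the result is what a client would transmit."""
    idx, vals = top_k_sparsify(grad, k)
    codes, scale = quantize_int8(vals)
    return {"indices": idx, "codes": codes, "scale": scale, "size": len(grad)}


def decompress(update: Dict[str, object]) -> List[float]:
    """Rebuild a dense vector; entries that were dropped come back as zero."""
    dense = [0.0] * update["size"]
    for i, code in zip(update["indices"], update["codes"]):
        dense[i] = code * update["scale"]
    return dense


update = compress([0.1, -2.0, 0.05, 1.5, -0.3], k=2)
restored = decompress(update)
```

In this sketch only the two largest-magnitude entries survive, each stored as a one-byte code plus a shared scale factor, and the reconstruction error on a kept entry is bounded by half the scale. The payload figures reported in the abstract come from the authors' experiments, not from this example.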

Article details

How to Cite

Kwon, J., Park, S., & Lim, T. (2025). EdgeDB-FL: An Edge Database for Federated Learning Updates and Model-State Management. DATAMIND, 3(3), 45-59. https://doi.org/10.63646/datamind.2025.030304