The distributed nature of edge computing infrastructures requires significant effort to avoid overload conditions caused by the uneven distribution of incoming load from sensors placed over a wide area. While optimisation algorithms operating offline can address this issue in the medium to long term, sudden and unexpected traffic surges require an online approach in which load balancing actions are taken at a smaller time scale. However, when the service time of a single request becomes comparable with the latency needed to take and actuate load balancing decisions, the design of online approaches becomes particularly challenging. This paper focuses on the class of online load balancing algorithms based on resource sharing among random nodes. While this randomisation principle is a straightforward and effective way to share resources and achieve load balance, it fails to work properly when the interval between the time a decision is made and the time it is actuated (called schedule lag) becomes comparable with the time required to execute a job. This condition is not rare in edge computing systems, and it causes scheduling decisions to rely on stale (out-of-date) information. Our analysis combines (1) a theoretical model that evaluates how stale information reduces the effectiveness of the balancing mechanism and describes the correlation between the system state at decision-making time and at decision-actuation time, and (2) a simulation approach to study a wide range of algorithm parameters and possible usage scenarios. The results of our analysis provide the designers of distributed edge systems with useful hints for deciding, based on the scenario, which load balancing protocol is the most suitable.
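The effect of stale information described in the abstract can be illustrated with a small toy simulation. The sketch below is not the model or the protocol from the paper. It is a simplified stand-in built on assumptions: time is discrete, each node knows its own queue length exactly, and it sees the queue length of one randomly probed peer as it was `lag` steps earlier. The function name `simulate`, the arrival and service probabilities, and the "regretted forward" metric are all hypothetical choices made for illustration.

```python
import random
from collections import deque


def simulate(lag, n_nodes=10, steps=5000, seed=1):
    """Toy discrete-time model (an assumption, not the paper's model).

    Returns the fraction of forwarded jobs that landed on a peer which,
    at the moment of forwarding, was not actually less loaded than the
    sender ("regretted" forwards).
    """
    rng = random.Random(seed)
    # Hypothetical uneven load: the first half of the nodes is "hot".
    arrival_p = [0.9 if i < n_nodes // 2 else 0.3 for i in range(n_nodes)]
    service_p = 0.7  # chance that a non-empty node completes a job per step
    queues = [0] * n_nodes
    # Snapshots of past queue lengths; history[0] is the oldest one kept.
    history = deque([list(queues)], maxlen=max(lag, 1))
    forwarded = regretted = 0

    for _ in range(steps):
        # With lag == 0 peers are observed live; otherwise the observation
        # is the snapshot taken `lag` steps ago (or the oldest available).
        observed = queues if lag == 0 else history[0]
        for i in range(n_nodes):
            if rng.random() < arrival_p[i]:
                j = rng.randrange(n_nodes)  # probe one random peer
                if j != i and observed[j] < queues[i]:
                    forwarded += 1
                    # The decision used old data: check the real state.
                    if queues[j] >= queues[i]:
                        regretted += 1
                    queues[j] += 1
                else:
                    queues[i] += 1
        for i in range(n_nodes):
            if queues[i] > 0 and rng.random() < service_p:
                queues[i] -= 1
        history.append(list(queues))

    return regretted / forwarded if forwarded else 0.0


if __name__ == "__main__":
    for lag in (0, 1, 5, 20):
        print(f"lag={lag:2d}  regretted forwards: {simulate(lag):.3f}")
```

With `lag=0` the observed and actual peer states coincide, so no forward can be regretted. With a positive lag, the fraction of regretted forwards gives a rough feel for how outdated observations can mislead a randomised balancing rule in this toy setting.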
2022, COMPUTER NETWORKS, Pages 108935- (volume: 210)
On the impact of stale information on distributed online load balancing protocols for edge computing (01a Journal article)
Beraldi R., Canali C., Lancellotti R., Mattia G. P.
Research group: Distributed Systems