Abstract
Fog Computing is emerging as the dominant paradigm for bridging the compute and connectivity gap between sensing devices and latency-sensitive services. However, as fog deployments scale by accumulating numerous devices interconnected over highly dynamic and volatile network fabrics, the need for self-healing in the presence of failures becomes more evident. Using the prevailing methodology of self-stabilization, we propose a fault-tolerant framework for control planes that enables fog services to cope with and recover from a very broad range of faults. Specifically, our fault model considers network uncertainties, packet drops, node fail-stops, and violations of the assumptions according to which the system was designed to operate (e.g., system state corruption). Our self-stabilizing algorithms guarantee automatic recovery within a constant number of communication rounds without the need for external (human) intervention. To showcase the framework's effectiveness, the correctness proof of the self-stabilizing algorithmic process is accompanied by a comprehensive evaluation featuring an open and reproducible testbed that uses real-world data from the smart vehicle domain. Results show that our framework ensures that a fog system recovers from faults in constant time and that analytics are computed correctly, while the control plane overhead scales linearly with the IoT load.
Original language | English |
---|---|
Publication status | Published - 2020 |
Event | 13th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2020) - Duration: 7 Dec 2020 → 10 Dec 2020 |