Towards Low-Cost and Energy-Aware Inference for EdgeAI Services via Model Swapping

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    Over the past decade, key advancements in Artificial Intelligence (AI) and Edge Computing (EC) have led to the development of EdgeAI services that provide the intelligent, low-latency responses essential for mission-critical applications. However, expanding EdgeAI services to the network extremes faces challenges such as load fluctuations that delay AI inference, along with concerns over energy efficiency. This paper proposes 'model swapping', where the model employed by an EdgeAI service is swapped on-the-fly with another readily available model so that cost and energy savings are achieved during runtime inference tasks. ModelSwapper achieves this through a low-cost algorithmic technique that explores meaningful trade-offs between computational overhead and model accuracy. By doing so, edge nodes adapt to load fluctuations by substituting complex models with simpler ones, thus meeting desired latency requirements, albeit with potentially higher uncertainty. Our evaluation with two EdgeAI services (object detection, NLU) demonstrates that ModelSwapper can reduce energy usage and inference delays by at least 27% and 68%, respectively, with only a 1% reduction in accuracy.
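    The swapping idea described above can be sketched as a simple decision rule: under high load, substitute the most accurate model with a simpler, faster one so the latency target is still met, at some cost in accuracy. The following is a minimal illustrative sketch; all model names, latency/accuracy profiles, and the queueing estimate are assumptions for illustration, not the paper's actual ModelSwapper algorithm.

    ```python
    from dataclasses import dataclass

    @dataclass
    class ModelProfile:
        name: str
        latency_ms: float   # measured per-request inference latency (assumed)
        accuracy: float     # offline validation accuracy (assumed)

    def pick_model(profiles, queue_len, latency_budget_ms):
        """Pick the most accurate model whose expected completion time
        (crude FIFO queueing delay + inference) fits the latency budget."""
        for p in sorted(profiles, key=lambda p: p.accuracy, reverse=True):
            expected_ms = (queue_len + 1) * p.latency_ms
            if expected_ms <= latency_budget_ms:
                return p
        # Nothing fits the budget: fall back to the fastest available model.
        return min(profiles, key=lambda p: p.latency_ms)

    # Hypothetical object-detection model variants.
    profiles = [
        ModelProfile("detector-large", latency_ms=80.0, accuracy=0.91),
        ModelProfile("detector-tiny",  latency_ms=15.0, accuracy=0.84),
    ]

    print(pick_model(profiles, queue_len=0, latency_budget_ms=100).name)  # light load
    print(pick_model(profiles, queue_len=5, latency_budget_ms=100).name)  # heavy load
    ```

    Under light load the accurate model fits the budget; once requests queue up, the rule swaps in the simpler model, mirroring the accuracy-for-latency trade-off the paper targets.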

    Original language: English
    Title of host publication: Proceedings - 2024 IEEE International Conference on Cloud Engineering, IC2E 2024
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 168-177
    Number of pages: 10
    ISBN (Electronic): 9798331528690
    DOIs
    Publication status: Published - 2024
    Event: 12th IEEE International Conference on Cloud Engineering, IC2E 2024 - Paphos, Cyprus
    Duration: 24 Sept 2024 – 27 Sept 2024

    Publication series

    Name: Proceedings - 2024 IEEE International Conference on Cloud Engineering, IC2E 2024

    Conference

    Conference: 12th IEEE International Conference on Cloud Engineering, IC2E 2024
    Country/Territory: Cyprus
    City: Paphos
    Period: 24/09/24 – 27/09/24

    Keywords

    • Edge Computing
    • Machine Learning
