Sniper: Cloud-Edge Collaborative Inference Scheduling with Neural Network Similarity Modeling
Time
Wednesday, July 13th, 11:15am - 11:37am PDT
Location
3000, Level 3
Event Type
Research Manuscript
ML Algorithms and Applications
Description
Cloud-edge collaborative inference requires efficiently scheduling artificial intelligence (AI) tasks to appropriate intelligent edge devices. However, continuously iterating deep neural networks (DNNs) and heterogeneous devices make efficient scheduling of inference tasks challenging. In this paper, we propose Sniper, a time-aware cloud-edge collaborative inference scheduling system. Considering that similar networks behave similarly, we develop a non-invasive performance characterization network (PCN) based on neural network similarity (NNS) to predict the inference times of DNNs. The least laxity first (LLF) algorithm is then applied to schedule the DNNs. As a result, Sniper improves both the transactions per second and the waiting time in DNN scheduling.
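
To make the scheduling step concrete, the following is a minimal sketch of least laxity first on a single device, not the Sniper implementation. It assumes a non-preemptive variant in which each task runs to completion, and that each task carries an absolute deadline and a predicted inference time. The `Task` class, the `llf_order` function, the model names, and all numbers are hypothetical; in Sniper the predicted times would come from the PCN.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    name: str
    deadline: float        # absolute deadline in ms (illustrative value)
    predicted_time: float  # predicted inference time in ms (illustrative value)


def laxity(task: Task, now: float) -> float:
    """Slack remaining if the task were started at time `now`."""
    return task.deadline - now - task.predicted_time


def llf_order(tasks: List[Task], start: float = 0.0) -> List[Tuple[str, float, float]]:
    """Non-preemptive LLF on one device; returns (name, start, finish) tuples."""
    pending = list(tasks)
    now = start
    schedule = []
    while pending:
        # Pick the pending task with the least laxity at the current time.
        nxt = min(pending, key=lambda t: laxity(t, now))
        pending.remove(nxt)
        finish = now + nxt.predicted_time
        schedule.append((nxt.name, now, finish))
        now = finish
    return schedule


tasks = [
    Task("resnet", deadline=100.0, predicted_time=30.0),
    Task("mobilenet", deadline=40.0, predicted_time=10.0),
    Task("bert", deadline=90.0, predicted_time=50.0),
]
schedule = llf_order(tasks)
for name, begin, end in schedule:
    print(f"{name}: start={begin} finish={end}")
```

Laxity is recomputed before every dispatch because it shrinks as time passes for tasks that are still waiting, so a task with a distant deadline but a long predicted inference time can overtake one with a nearer deadline. A preemptive variant would re-evaluate laxities at finer time steps rather than only at task completion.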