Powering Multi-Task Federated Learning with Competitive GPU Resource Sharing
Time: Wednesday, July 13th, 6pm - 7pm PDT
Location: Level 2 Lobby
Description: Federated learning (FL) now involves heterogeneous compound learning tasks as cognitive applications grow more complex. For example, a self-driving system hosts multiple tasks simultaneously (e.g., detection, classification, and segmentation) and expects FL to sustain lifelong intelligence across them. However, our analysis demonstrates that, when deploying compound FL models for multiple training tasks on a GPU, a critical issue arises: because different tasks’ skewed data distributions and corresponding models cause highly imbalanced learning workloads, current GPU scheduling methods fail to allocate resources effectively. To address this issue, we propose a multi-task GPU scheduling scheme. Specifically, our work illustrates that competitive resource sharing widely exists among parallel model executions, and that the proposed concept of “virtual resource” can effectively characterize and guide per-task resource utilization and allocation in practice. Our experiments demonstrate that FL throughput can be increased by 2.16× to 2.38× in various multi-task scenarios.
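As a rough illustration of the idea (not the authors' actual algorithm), one way to reason about workload-aware scheduling is to assign each task a "virtual resource" share proportional to its estimated workload, so heavier tasks receive a larger slice of GPU capacity. The function name, the workload numbers, and the proportional rule below are all hypothetical assumptions for the sketch:

```python
def allocate_virtual_resources(workloads, total_resource=1.0):
    """Hypothetical sketch: split a GPU's capacity among tasks in
    proportion to each task's estimated workload (e.g., local sample
    count times per-sample model cost). Illustration only; this is
    not the scheduling method proposed in the work above."""
    total = sum(workloads.values())
    return {task: total_resource * w / total
            for task, w in workloads.items()}

# Skewed multi-task FL workloads: detection is far heavier than the rest.
shares = allocate_virtual_resources(
    {"detection": 6.0, "classification": 1.0, "segmentation": 3.0})
print(shares)  # detection receives the largest share of the GPU
```

An equal-split scheduler would give each task 1/3 of the GPU regardless of load; the proportional rule instead lets the imbalanced workloads, rather than the task count, drive the allocation.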