Heuristic Adaptability to Input Dynamics for SpMM on GPUs*
Time: Wednesday, July 13th, 1:52pm - 2:15pm PDT
Location: 3005, Level 3
AI/ML Design: System and Platform
Description: Previous studies have proposed designs for sparse-dense matrix multiplication (SpMM) on GPUs. We point out that the performance of a static SpMM design depends on input dynamics. The following challenges need to be solved: (1) defining an algorithm space that accounts for sparsity and is built on orthogonal design principles; (2) nontrivial implementations that combine previous designs; (3) heuristic adaptability to input dynamics. We propose a novel three-loop algorithm space that extracts orthogonal design principles specifically for sparse problems. We propose novel techniques to implement the algorithms missing from that space. We further propose a heuristic kernel that optimizes code according to input dynamics. We achieve a 1.26x to 1.37x speedup over the best cuSPARSE algorithm, which brings a 5.59x end-to-end speedup to GNNs.
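
For readers unfamiliar with SpMM, the computation and the idea of choosing a kernel from input statistics can be sketched on the CPU in plain Python. This is a minimal illustration, not the authors' implementation: the CSR layout, the function names (spmm_csr, row_stats, pick_kernel), the strategy labels, and the thresholds are all assumptions made for this example, and whether these three loops match the paper's three-loop algorithm space is also an assumption.

```python
def spmm_csr(indptr, indices, data, B):
    """Reference C = A @ B with A in CSR form and B a dense list of rows."""
    n_rows = len(indptr) - 1
    n_cols = len(B[0]) if B else 0
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):                          # loop 1: rows of A
        for p in range(indptr[i], indptr[i + 1]):    # loop 2: nonzeros of row i
            k, a = indices[p], data[p]
            for j in range(n_cols):                  # loop 3: columns of B
                C[i][j] += a * B[k][j]
    return C


def row_stats(indptr):
    """Mean and standard deviation of nonzeros per row (simple input features)."""
    lengths = [indptr[i + 1] - indptr[i] for i in range(len(indptr) - 1)]
    if not lengths:
        return 0.0, 0.0
    mean = sum(lengths) / len(lengths)
    var = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return mean, var ** 0.5


def pick_kernel(indptr, n_dense_cols):
    """Toy selector. Labels and thresholds are hypothetical, not from the paper."""
    mean, std = row_stats(indptr)
    if std > mean:            # very uneven rows: balance work by nonzeros
        return "balance-nonzeros"
    if n_dense_cols >= 32:    # wide dense matrix: parallelize across its columns
        return "parallelize-dense-columns"
    return "row-per-thread"


# A = [[1, 0, 2], [0, 0, 0], [0, 3, 0]] stored in CSR form
indptr, indices, data = [0, 2, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
B = [[1, 2], [3, 4], [5, 6]]
C = spmm_csr(indptr, indices, data, B)   # [[11.0, 14.0], [0.0, 0.0], [9.0, 12.0]]
choice = pick_kernel(indptr, len(B[0]))
```

The point of the sketch is only that the same three loops can be mapped to hardware in different ways, and that cheap statistics of the input (row-length distribution, dense width) are the kind of signal a selection heuristic could use.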