Design Space Exploration of Streaming Implementations of CNNs on the Xilinx AIEngine Processor Array
Time: Wednesday, July 13th, 6pm - 7pm PDT
Location: Level 2 Lobby
Description: The performance of DNN accelerators heavily depends on the ability of a compiler to generate optimized mappings tailored to the accelerator's architecture. However, while previous work has investigated layer-by-layer mappings extensively, streaming implementations, in which DNN layers run concurrently on the hardware resources, remain largely unexplored. In this work, we investigate the algorithmic changes compilers must support to generate streaming implementations of DNNs when targeting the Xilinx AIEngine Processor Array. Starting from a simple data-centric representation, we express the hardware resource allocation problem as a path-finding problem on graphs. We demonstrate the performance of the compiler on modern DNNs.
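To give a feel for the path-finding formulation, here is a minimal, hypothetical sketch (not the paper's actual compiler or cost model): layers of a streamed pipeline are assigned AIEngine tiles, states are (layer, tiles used) nodes in a graph, and a Dijkstra-style search finds the allocation that minimizes the pipeline's bottleneck stage latency. All function names, the toy cost model, and the numbers are illustrative assumptions.

```python
import heapq

def estimated_latency(work, tiles):
    """Toy cost model (assumption): a layer's latency shrinks
    linearly as it is spread over more tiles."""
    return work / tiles

def allocate_tiles(layer_work, total_tiles):
    """Assign tiles to each layer of a streamed pipeline.

    Nodes are (layer index, tiles used so far); an edge gives the next
    layer some tile count. Since layers run concurrently, the path cost
    is the MAXIMUM stage latency (the pipeline bottleneck), which stays
    monotone along a path, so a Dijkstra-style search remains valid.
    Returns (bottleneck_latency, allocation) or None if infeasible.
    """
    n = len(layer_work)
    heap = [(0.0, 0, 0, ())]  # (bottleneck so far, layer, tiles used, allocation)
    seen = set()
    while heap:
        cost, layer, used, alloc = heapq.heappop(heap)
        if layer == n:
            return cost, list(alloc)
        if (layer, used) in seen:
            continue
        seen.add((layer, used))
        remaining = n - layer
        # Leave at least one tile for every remaining layer.
        for t in range(1, total_tiles - used - (remaining - 1) + 1):
            stage = estimated_latency(layer_work[layer], t)
            heapq.heappush(
                heap,
                (max(cost, stage), layer + 1, used + t, alloc + (t,)),
            )
    return None

# Three layers with work 8, 4, 4 and a budget of 4 tiles: giving the
# heaviest layer two tiles balances all stages at latency 4.0.
print(allocate_tiles([8, 4, 4], 4))
```

The min-max (bottleneck) objective, rather than a sum, reflects that in a streaming implementation throughput is set by the slowest concurrently running stage.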