BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20220715T000513Z
LOCATION:3002\, Level 3
DTSTART;TZID=America/Los_Angeles:20220713T133000
DTEND;TZID=America/Los_Angeles:20220713T150000
UID:dac_DAC 2022_sess137@linklings.com
SUMMARY:Accelerating the Inference: Transformers, Graphs and Others
DESCRIPTION:Research Manuscript\n\nThis session presents novel accele
 rators for neural network inference. The papers target accelerators f
 or graph neural networks and transformer networks, among others. Furt
 her, one accelerator focuses on diagonal matrix multiplication, as us
 ed in sparse self-attention heads. All papers present various trade-o
 ffs and demonstrate significant progress over the state of the art.\n
 \nGNNIE: GNN Inference Engine with Load-balancing and Graph-specific C
 aching\n\nMondal, Manasi, Kunal, S, Sapatnekar\n\nGraph neural networ
 k (GNN) inferencing involves weighting vertex feature vectors, follow
 ed by aggregating the weighted vectors over a vertex neighborhood. Hi
 gh and variable sparsity in the input vertex feature vectors, and hig
 h sparsity and power-law degree distributions in the adjacency matrix
 , can le...\n\n---------------------\nNN-LUT: Neural Approximation o
 f Non-Linear Operations for Efficient Transformer Inference\n\nYu, Pa
 rk, Park, Kim, Lee...\n\nNon-linear operations such as GELU, layer no
 rmalization, and Softmax are essential yet costly building blocks of T
 ransformer models. Several prior works simplified these operations wi
 th look-up tables or integer computations, but such approximations su
 ffer from inferior accuracy or considerable hardware ...\n\n---------
 ------------\nSelf Adaptive Reconfigurable Arrays (SARA): Learning Fl
 exible GEMM Accelerator Configuration and Mapping-space using ML\n\nS
 amajdar, Qin, Pellauer, Krishna\n\nAbstract to be added.\n\n---------
 ------------\nSALO: An Efficient Spatial Accelerator Enabling Hybrid S
 parse Attention Mechanisms for Long Sequences\n\nShen, Zhao, Chen, Le
 ng, Li...\n\nThe attention mechanisms of transformers effectively ext
 ract pertinent information from a sequence. However, the quadratic co
 mplexity of self-attention w.r.t. the sequence length incurs heavy co
 mputational and memory burdens, especially for tasks with long sequen
 ces. Existing accelerators face perf...\n\n\nTopic: Design\n\nKeyword
 : AI/ML Design: Circuits and Architecture
END:VEVENT
END:VCALENDAR