05 Jul 2021
14:30 Master's Defense (fully remote)
Title
Accelerated Pooling: Creating a Hardware Accelerated Implementation Based on Im2col and Col2im
Student
Caio Salvador Rohwedder
Advisor
Guido Costa Souza de Araújo
Brief summary
Convolution is a crucial operation in Deep Learning applications and has therefore been the focus of many optimization efforts in the field. Image-to-column (Im2col) and column-to-image (Col2im) are transformations widely used to map convolution to matrix multiplication. They rearrange the convolution inputs to avoid their non-contiguous memory access pattern, providing a data layout better suited to CPUs and GPUs. In artificial intelligence (AI) accelerators, these transformations allow convolution to be executed on matrix multiplication units. When implemented in software, however, they add significant computing time that must be offset by the efficiency gains of the matrix multipliers. DaVinci is an AI accelerator architecture that provides instructions to optimize Im2col and Col2im, thus reducing the overhead of executing convolution on its matrix multiplier. Pooling is another central layer of convolutional neural networks, with a memory access pattern similar to that of convolution. Pooling is typically executed on vector computing units; however, implementations based on the Im2col and Col2im transformations can be used to improve its execution. This work explores the use of DaVinci's Im2col and Col2im instructions to accelerate pooling layers. The proposed approach uses a general-purpose vector computing unit together with instructions designed primarily for convolution. An experimental evaluation shows that the proposed pooling implementations achieve speedups of up to 5.8 times over baseline implementations that do not use these specialized instructions. The speedups come from an improved arrangement of the pooling inputs, which leads to better vectorization of the pooling instructions.
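To make the idea of Im2col-based pooling concrete, the following is a minimal software sketch in plain Python. It is not the DaVinci implementation and does not model its instructions; the function names (im2col, max_pool_im2col) and the list-of-lists layout are assumptions chosen for illustration. It only shows how gathering each pooling window into a contiguous row turns max pooling into a simple reduction over rows, which is the kind of data arrangement that vector units can process efficiently.

```python
# Illustrative sketch only: max pooling through an Im2col-style rearrangement.
# All names and the data layout are hypothetical, not taken from the thesis.

def im2col(image, kh, kw, stride):
    """Gather every kh x kw window of a 2D image into one contiguous row."""
    h, w = len(image), len(image[0])
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = []
    for i in range(out_h):
        for j in range(out_w):
            # Flatten the window whose top-left corner is (i*stride, j*stride).
            window = [image[i * stride + di][j * stride + dj]
                      for di in range(kh) for dj in range(kw)]
            cols.append(window)
    return cols, out_h, out_w


def max_pool_im2col(image, kh, kw, stride):
    """Max pooling as a row-wise reduction over the Im2col matrix."""
    cols, out_h, out_w = im2col(image, kh, kw, stride)
    flat = [max(window) for window in cols]  # one reduction per window
    # Reshape the flat result back into an out_h x out_w feature map.
    return [flat[r * out_w:(r + 1) * out_w] for r in range(out_h)]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_im2col(image, 2, 2, 2)
print(pooled)  # [[6, 8], [14, 16]]
```

In this sketch the rearrangement is done in software, which is exactly the overhead the abstract describes; the point of hardware Im2col support is to obtain the same layout without paying that cost.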
Examination Board
Members:
Guido Costa Souza de Araújo IC / UNICAMP
Sandra Eliza Fontes de Avila IC / UNICAMP
Bruno de Carvalho Albertini EPUSP
Alternates:
Hervé Cédric Yviquel IC / UNICAMP
Nahri Balesdent Moreano FACOM / UFMS