Visual self-attention accelerator optimization method based on FPGA

Bibliographic Details
Main Authors: LUO CONGHUI, LIN HAIYAN, HUANG YIHUA
Format: Patent
Language: Chinese; English
Published: 27.02.2024

Summary: The invention discloses an FPGA-based optimization method for a visual self-attention accelerator, comprising the following steps: dynamic token pruning is applied to the visual self-attention model through a dynamic token pruning scheme, removing redundant information and reducing the model's computational load; using a design for a single visual self-attention computation layer on the FPGA, the computation is partitioned by matrix tiling, and an optimal compute-resource allocation strategy is solved with a genetic algorithm to maximize load balancing. The method reduces the amount of computation, shortens the model's running time, and improves the operating efficiency of the accelerator.
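The abstract names dynamic token pruning but does not specify how tokens are scored or how many are kept. The following is a minimal Python sketch of one common way such pruning can work, keeping the tokens that receive the most attention from the class token; the scoring rule, the keep_ratio parameter, and the function name are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def dynamic_token_prune(tokens, attn, keep_ratio=0.7):
    """Prune tokens after one self-attention layer (illustrative sketch).

    tokens : (N, D) token embeddings, tokens[0] assumed to be the CLS token
    attn   : (H, N, N) per-head attention weights from the same layer
    keep_ratio : fraction of non-CLS tokens retained (hypothetical parameter)
    """
    # Score each token by how much attention the CLS token pays to it,
    # averaged over heads (a common proxy; the patent does not fix the rule).
    cls_attention = attn[:, 0, 1:].mean(axis=0)            # (N-1,)
    num_keep = max(1, int(round(keep_ratio * cls_attention.size)))
    keep = np.argsort(cls_attention)[::-1][:num_keep]      # highest-scoring tokens
    keep = np.sort(keep) + 1                                # original order, offset past CLS
    kept_tokens = np.concatenate([tokens[:1], tokens[keep]], axis=0)
    return kept_tokens, keep

# Usage with random data shaped like a ViT-Base layer (197 tokens, 768 dims, 12 heads)
rng = np.random.default_rng(0)
pruned, kept_idx = dynamic_token_prune(rng.standard_normal((197, 768)),
                                        rng.random((12, 197, 197)))
```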
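Likewise, the genetic-algorithm search for a compute-resource allocation that balances the load across the tiled stages of a self-attention layer is only named, not specified. The sketch below illustrates the general idea under stated assumptions: hypothetical per-stage MAC counts, an assumed DSP budget (TOTAL_DSP), and simple crossover/mutation operators that minimize the bottleneck (maximum per-stage) latency. None of these figures or operators come from the patent.

```python
import random

# Hypothetical per-stage workloads (MACs) for one self-attention layer with
# N tokens, embedding dim D, and H heads; real figures depend on the model.
N, D, H = 197, 768, 12
WORKLOADS = {
    "qkv_proj": 3 * N * D * D,
    "attn_qk":  H * N * N * (D // H),
    "attn_v":   H * N * N * (D // H),
    "out_proj": N * D * D,
}
TOTAL_DSP = 2048  # assumed DSP budget of the target FPGA

def latency(alloc):
    """Bottleneck latency: each stage's MACs divided by its DSP share."""
    return max(w / max(1, d) for w, d in zip(WORKLOADS.values(), alloc))

def random_alloc():
    """Random split of the DSP budget across the four stages."""
    cuts = sorted(random.sample(range(1, TOTAL_DSP), len(WORKLOADS) - 1))
    return [b - a for a, b in zip([0] + cuts, cuts + [TOTAL_DSP])]

def crossover(a, b):
    child = [(x + y) // 2 for x, y in zip(a, b)]
    child[-1] += TOTAL_DSP - sum(child)   # repair so the budget is respected
    return child

def mutate(alloc, step=16):
    i, j = random.sample(range(len(alloc)), 2)
    delta = random.randint(1, step)
    if alloc[i] > delta:                  # shift a few DSPs between two stages
        alloc[i] -= delta
        alloc[j] += delta
    return alloc

def genetic_search(pop_size=40, generations=200):
    pop = [random_alloc() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=latency)             # lower bottleneck latency is better
        elite = pop[: pop_size // 4]
        children = [mutate(crossover(*random.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=latency)

best = genetic_search()
print(dict(zip(WORKLOADS, best)), latency(best))
```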
Bibliography: Application Number: CN202311355863