Lightweight transformer for high resolution images

Bibliographic Details
Main Authors: Ding, Mingyu; Jin, Xiaojie; Yang, Linjie; Wang, Peng; Lian, Xiaochen
Format: Patent
Language: English
Published: 14.05.2024

Summary: Systems and methods for obtaining attention features are described. Some examples may include: receiving, at a projector of a transformer, a plurality of tokens associated with image features of a first dimensional space; generating, at the projector of the transformer, projected features by concatenating the plurality of tokens with a positional map, the projected features having a second dimensional space that is less than the first dimensional space; receiving, at an encoder of the transformer, the projected features and generating encoded representations of the projected features using self-attention; decoding, at a decoder of the transformer, the encoded representations and obtaining a decoded output; and projecting the decoded output to the first dimensional space and adding the image features of the first dimensional space to obtain attention features associated with the image features.
Bibliography: Application Number: US202117342483
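
The data flow in the summary (project down, self-attend, decode, project back up, add the input) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function name `lightweight_attention`, the reading of "dimensional space" as the number of tokens, average pooling as the reduction, a linear ramp as the positional map, single-head attention, a decoder that queries the encoded representations with the projected features, and random matrices standing in for learned weights.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Single-head scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def lightweight_attention(features, pool, pos_dim, rng):
    """Hypothetical sketch: features is (n tokens, c channels)."""
    n, c = features.shape
    if n % pool != 0:
        raise ValueError("token count must be divisible by pool")
    m = n // pool  # assumed "second dimensional space": fewer tokens

    # Projector: average each group of `pool` tokens, then concatenate
    # a positional map (here a simple linear ramp, an assumption).
    pooled = features.reshape(m, pool, c).mean(axis=1)
    pos_map = np.linspace(0.0, 1.0, m)[:, None] * np.ones((1, pos_dim))
    projected = np.concatenate([pooled, pos_map], axis=1)  # (m, c + pos_dim)

    # Random matrices stand in for learned weights.
    d = c + pos_dim
    w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    w_out = rng.standard_normal((d, c)) / np.sqrt(d)

    # Encoder: self-attention over the m projected tokens.
    encoded = attention(projected @ w_q, projected @ w_k, projected @ w_v)

    # Decoder (assumed form): projected tokens query the encoded ones,
    # then a linear map drops the positional channels.
    decoded = attention(projected, encoded, encoded) @ w_out  # (m, c)

    # Project back to the first space by repeating each token `pool`
    # times, then add the original image features (residual).
    restored = np.repeat(decoded, pool, axis=0)  # (n, c)
    return features + restored


rng = np.random.default_rng(0)
features = rng.standard_normal((16, 8))
out = lightweight_attention(features, pool=4, pos_dim=2, rng=rng)
print(out.shape)  # (16, 8)
```

Under these assumptions the attention matrices are 4 × 4 rather than 16 × 16, which is the kind of saving that makes running attention in a smaller space attractive for high resolution inputs; the residual addition keeps the output the same shape as the original features.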