Apparatus and method for efficient prefix sum operation

An apparatus and method are described for performing a prefix sum. For example, one embodiment of an apparatus comprises: a graphics processor unit comprising one or more execution units to execute single instruction multiple data (SIMD) instructions, the GPU to be provided with a plurality of data...

Full description

Saved in:
Bibliographic Details
Main Authors Sarangi Satyajit, Raoux Thomas F
Format Patent
LanguageEnglish
Published 25.04.2017
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:An apparatus and method are described for performing a prefix sum. For example, one embodiment of an apparatus comprises: a graphics processor unit comprising one or more execution units to execute single instruction multiple data (SIMD) instructions, the GPU to be provided with a plurality of data elements as input for a prefix sum operation; a first register of the GPU to store the plurality of data elements in specified data element positions; and the one or more execution units to perform a series of single instruction multiple data (SIMD) operations using the plurality of data elements, the SIMD operations performed using regioning techniques to generate the prefix sum, the SIMD operations including a first plurality of simultaneous addition operations to add specified data elements to generate intermediate results and further including a second plurality of simultaneous addition operations to add the intermediate results to other intermediate results to generate the prefix sum.
Bibliography:Application Number: US201514727826