APPARATUS TO OPTIMIZE GPU THREAD SHARED LOCAL MEMORY ACCESS
One embodiment provides for a graphics processor comprising first logic coupled with a first execution unit, the first logic to receive a first single instruction multiple data (SIMD) message from the first execution unit; second logic coupled with a second execution unit, the second logic to receive...
Main Authors | , , , |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 05.02.2019 |
Summary: | One embodiment provides for a graphics processor comprising first logic coupled with a first execution unit, the first logic to receive a first single instruction multiple data (SIMD) message from the first execution unit; second logic coupled with a second execution unit, the second logic to receive a second SIMD message from the second execution unit; and third logic coupled with a bank of shared local memory (SLM), the third logic to receive a first request to access the bank of SLM from the first logic, a second request to access the bank of SLM from the second logic, and in a single access cycle, schedule a read access to a read port for the first request and a write access to a write port for the second request. |
---|---|
Bibliography: | Application Number: CN201780035842 |
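
The scheduling behavior described in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `Request` and `schedule_cycle`, the source labels, and the first-come ordering are all hypothetical, and the model assumes a bank with exactly one read port and one write port.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Request:
    """Hypothetical SLM bank request; field names are illustrative only."""
    source: str     # label for the requesting logic, e.g. "first_logic"
    is_write: bool  # True for a write access, False for a read access
    address: int    # address within the bank


def schedule_cycle(
    pending: List[Request],
) -> Tuple[Optional[Request], Optional[Request], List[Request]]:
    """Model one access cycle of a bank with one read and one write port.

    Returns (read_port_request, write_port_request, deferred_requests).
    At most one read and one write are scheduled in the same cycle;
    anything else waits for a later cycle (assumed first-come order).
    """
    read_port: Optional[Request] = None
    write_port: Optional[Request] = None
    deferred: List[Request] = []
    for req in pending:
        if req.is_write and write_port is None:
            write_port = req
        elif not req.is_write and read_port is None:
            read_port = req
        else:
            deferred.append(req)
    return read_port, write_port, deferred
```

In this model, a read from one source and a write from another are both served in the same cycle, while two reads to the same bank still take two cycles because they compete for the single read port.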