[Workshop] Hot Chips 2020 Alibaba's Hanguang 800 NPU

Hot Chips 2020 Alibaba's Hanguang 800 NPU [1]

The four cores are connected by a ring bus, and memory is shared across the cores over this bus.

Data is pre-processed in the Memory Engine.

Tasks are controlled and mapped by software, which makes the design more flexible.
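
To make the ring topology concrete, here is a minimal sketch of how many hops a transfer between two of the four cores would take. The talk notes do not say whether the ring is unidirectional or bidirectional, so both cases are modeled; `ring_hops` and its parameters are hypothetical names for illustration, not part of any Alibaba software.

```python
def ring_hops(src: int, dst: int, num_cores: int = 4, bidirectional: bool = True) -> int:
    """Return the number of ring hops from core `src` to core `dst`.

    Assumption: cores are numbered 0..num_cores-1 around the ring.
    """
    if not (0 <= src < num_cores and 0 <= dst < num_cores):
        raise ValueError("core index out of range")
    # Hops when travelling in the "forward" direction only.
    forward = (dst - src) % num_cores
    if not bidirectional:
        return forward
    # On a bidirectional ring, take the shorter of the two directions.
    return min(forward, num_cores - forward)


if __name__ == "__main__":
    for dst in range(4):
        print(f"core 0 -> core {dst}: {ring_hops(0, dst)} hop(s)")
```

Under the bidirectional assumption, no core is more than two hops from any other on a four-core ring, which is one reason a ring is a reasonable fit for a small core count.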

08:01PM EDT - 4 cores with ring bus

08:01PM EDT - 192 MB local memory, distributed shared, no DDR

08:01PM EDT - Command processor above all four cores

08:01PM EDT - PCIe 4.0 x16

08:02PM EDT - Each core has three engines: Tensor, Pooling, Memory
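
The figures above lend themselves to some quick arithmetic. The sketch below estimates the per-direction bandwidth of a PCIe 4.0 x16 link (16 GT/s per lane with 128b/130b encoding, ignoring packet and protocol overhead) and the local memory per core. The even split of the 192 MB across the four cores is an assumption made here for illustration, and the function names are hypothetical.

```python
def pcie_bandwidth_gb_s(lanes: int = 16, gt_per_s: float = 16.0,
                        encoding: float = 128 / 130) -> float:
    """Approximate one-direction PCIe bandwidth in GB/s.

    Each transfer carries one bit per lane; 128b/130b encoding means
    128 of every 130 bits are payload. Protocol overhead is ignored.
    """
    return lanes * gt_per_s * encoding / 8  # divide by 8: bits -> bytes


def local_memory_per_core_mb(total_mb: float = 192.0, num_cores: int = 4) -> float:
    """Local memory per core, ASSUMING an even split across cores."""
    return total_mb / num_cores


if __name__ == "__main__":
    print(f"PCIe 4.0 x16: ~{pcie_bandwidth_gb_s():.1f} GB/s per direction")
    print(f"Local memory per core: {local_memory_per_core_mb():.0f} MB")
```

With no DDR on the chip, everything that does not fit in the 192 MB of local memory has to cross this PCIe link, so the gap between the two numbers is worth keeping in mind.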


  1. Anandtech. https://www.anandtech.com/show/16009/hot-chips-2020-live-blog-alibabas-hanguang-800-npu-500pm-pt