Posts By Tag

3D IC

  • [Glean] 2.5D and 3D Interposer Mar 23, 2021

    Interposers are wide, extremely fast electrical signal conduits used between die in a 2.5D configuration.

3DConv

3DIC

ACA

AMBA

  • [Glean] AXI Bus Introduction Apr 28, 2024

    An introduction to the AXI bus, including its five channels (read address, read data, write address, write data, and write response) and the VALID/READY handshake.

AMD

ANSI

  • [Glean] ANSI Escape Codes Aug 06, 2020

    ANSI escape sequences are a standard for in-band signaling to control the cursor location, color, and other options on video text terminals and terminal emulators.

APR

  • [Glean] Library Formats: CCS, ECSM, and NLDM Apr 03, 2023

    This article, which was generated by ChatGPT4, provides an overview of three common library formats used in the design and analysis of digital circuits: Composite Current Source (CCS), Effective Current Source Model (ECSM), and Non-Linear Delay Model (NLDM).

ASIC

AXI

  • [Glean] AXI Bus Introduction Apr 28, 2024

    An introduction to the AXI bus, including its five channels (read address, read data, write address, write data, and write response) and the VALID/READY handshake.

Adapters

  • [Emulate] Refinement of Computation and Communication Jun 19, 2020

    This post introduces the refinement of computation and communication in SystemC, including different kinds of communication refinement such as channel refinement, module refinement, hw-hw refinement, sw-sw refinement, and hw-sw refinement. It also introduces the steps in communication refinement.

Agile

  • [Glean] Computer Architectures for Next Generation Applications Jan 18, 2021

    This post is mainly translated from a Zhihu answer by Bao Yungang. It introduces three laws (Moore's law, Makimoto's wave, and Bell's law) along with design methods and optimizations for performance, power, and the fragmented requirements of the AIoT age. These methods include reducing data movement, reducing data precision, improving parallelism, and agile hardware development.

  • [Glean] Impact Map Sep 17, 2020

    An introduction to the impact map and how to create one.

  • [Weekly Review] 2020/04/20-26 Apr 26, 2020

    This weekly review includes an introduction to SystemC, modeling, JVM memory, and Rocket Chip's interrupts (PLIC and CLINT). It also includes CS61B's Graph.

Algorithm

Algorithms and parallel computing

AllReduce

Atomic

Attention

  • [Read Paper] Attention Is All You Need Jan 07, 2021

    This post combines two blog posts that introduce the paper Attention Is All You Need. Its shortcomings and one improvement are also shown.

AyarLabs

BF16

BNN

Bell

  • [Glean] Computer Architectures for Next Generation Applications Jan 18, 2021

    This post is mainly translated from a Zhihu answer by Bao Yungang. It introduces three laws (Moore's law, Makimoto's wave, and Bell's law) along with design methods and optimizations for performance, power, and the fragmented requirements of the AIoT age. These methods include reducing data movement, reducing data precision, improving parallelism, and agile hardware development.

Benchmark

  • [Survey] MLforHPC Benchmarks Feb 16, 2020

    I attached my recent survey on ML4HPC benchmarks, including three papers 1) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; 2) HPC AI500: A Benchmark Suite for HPC AI Systems; 3) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; and some other presentation slides.

BlackBox

BrainFloat16

C++

CAM

  • [Glean] Content-addressable memory Aug 28, 2020

    Content-addressable memories (CAMs) are hardware search engines that are much faster than algorithmic approaches for search-intensive applications. CAMs are composed of conventional semiconductor memory (usually SRAM) with added comparison circuitry that enables a search operation to complete in a single clock cycle. The two most common search-intensive tasks that use CAMs are packet forwarding and packet classification in Internet routers. I introduce CAM architecture and circuits by first describing the application of address lookup in Internet routers, and then describe how to implement this lookup function with CAM.

CCS

  • [Glean] Library Formats: CCS, ECSM, and NLDM Apr 03, 2023

    This article, which was generated by ChatGPT4, provides an overview of three common library formats used in the design and analysis of digital circuits: Composite Current Source (CCS), Effective Current Source Model (ECSM), and Non-Linear Delay Model (NLDM).

CI

CLINT

  • [Weekly Review] 2020/04/20-26 Apr 26, 2020

    This weekly review includes an introduction to SystemC, modeling, JVM memory, and Rocket Chip's interrupts (PLIC and CLINT). It also includes CS61B's Graph.

CLion

CNN

CPP

CRV

  • [Survey] Current Verification Methods And Their Limited Situations Jan 11, 2021

    This post introduces current verification methods, their steps and their limitations, including formal verification, constrained random verification (CRV), and hardware-software co-verification using a virtual platform with hardware emulation and acceleration.

CS61B

  • [Weekly Review] 2020/04/27-05/03 May 03, 2020

    This weekly review contains spanning trees, A*, Prim's algorithm, Kruskal's algorithm, MST, dynamic programming and LIS. It also introduces some basic cache terms, such as offset, cache line, way, and cache thrashing.

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation, plus the definitions of simulator, emulation and formal verification.

  • [Weekly Review] 2020/04/06-12 Apr 12, 2020

    Contains cutset retiming, the Zen of Python, and some knowledge of CS61B.

  • [Weekly Review] 2020/03/30-04/05 Apr 05, 2020

    Contains BDD, TDD, CI and CS61B. Plus the ICR in Chisel.

CSC

CUDA

CXL

Cache

  • [Glean] Content-addressable memory Aug 28, 2020

    Content-addressable memories (CAMs) are hardware search engines that are much faster than algorithmic approaches for search-intensive applications. CAMs are composed of conventional semiconductor memory (usually SRAM) with added comparison circuitry that enables a search operation to complete in a single clock cycle. The two most common search-intensive tasks that use CAMs are packet forwarding and packet classification in Internet routers. I introduce CAM architecture and circuits by first describing the application of address lookup in Internet routers, and then describe how to implement this lookup function with CAM.

  • [Weekly Review] 2020/05/04-10 May 10, 2020

    This weekly review covers cache indexing and tagging methods, the TLB, coherence between the cache and DMA, coherence between the iCache and dCache, and coherence between multiple processors.

  • [Weekly Review] 2020/04/27-05/03 May 03, 2020

    This weekly review contains spanning trees, A*, Prim's algorithm, Kruskal's algorithm, MST, dynamic programming and LIS. It also introduces some basic cache terms, such as offset, cache line, way, and cache thrashing.

  • [Weekly Review] 2019/12/09-15 Dec 15, 2019

    This review contains some basic knowledge related to Git, RISC-V, Chipyard, the RoCC interface, SHA3 and cache.

Cache Coherency

Cadence

Category

Cbo

CentOS

CentOS7

CentOS8

Cerebras

Channel

ChatGPT

Chipyard

Chisel

  • [Tutorial] Develop Chisel with Dev Container in Idea. Oct 19, 2023

    This article introduces how to develop Chisel with Dev Container in IntelliJ IDEA.

  • [Tutorial] Chisel simulation with the hierarchical BlackBox Module Jul 23, 2023

    This article introduces how to do Chisel simulation with the hierarchical BlackBox Module.

  • [Tutorial] Quick Debug and Run Test on Chisel Repos based on CI Flow Files Feb 28, 2023

    This tutorial introduces a quick way to debug code in Chisel environments such as Chisel3, playground, and Rocket Chip. The method can also be used for other repos.

  • [CodeStudy] Scala Excel Read: POI XSSF Nov 11, 2020

    In this article, I introduce how to read a workbook, a sheet, a row and a specific cell. The methods to obtain the row number and column number are also given, along with one way to filter empty cells.

  • [CodeStudy] RocketChip Optional Bundle Oct 08, 2020

    Some Chisel tips learned from RocketChip. This post introduces how to make bundles optional.

  • [CodeStudy] RocketChip MultiWidthFIFO Sep 25, 2020

    Learned some tips of Chisel via RocketChip. This includes the Imp of Multi-Width-FIFO.

  • [CodeStudy] Some Chisel details in the project RocketChip Sep 24, 2020

    Some Chisel tips learned from RocketChip. This post includes some implicit classes and one implementation of a Gray counter.

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation, plus the definitions of simulator, emulation and formal verification.

  • [Weekly Review] 2020/03/30-04/05 Apr 05, 2020

    Contains BDD, TDD, CI and CS61B. Plus the ICR in Chisel.

  • [Tutorial] Suggest Using private Before val Apr 03, 2020

    This tutorial suggests using private as a prefix of val when creating a wire or register, and mentions one possible problem when using private.

  • [Weekly Review] 2020/03/23-29 Mar 29, 2020

    This weekly review contains Scala intersection, union and complement, as well as ScalaDoc tags. It also introduces using the console to print colorful logs, and an error that occurred when I used `RegInit` without giving the UInt a width.

  • [CodeStudy] RocketChip Fuzzer Mar 29, 2020

    A study of the fuzzer code in RocketChip, including how to generate source IDs and how to send requests via TileLink.

  • [Weekly Review] 2020/03/16-22 Mar 22, 2020

    Includes Scala higher-order functions, Scala Regex, and Chisel forkwithRegion, as well as the definitions of `base address` and `offset`.

  • [Tutorial] TileLink Spec Mar 21, 2020

    A study of SiFive TileLink, including TileLink buses, nodes, and its Chisel code in Chipyard.

  • [Tutorial] TileLink RegMap Mar 20, 2020

    The study of TileLink TLRegMap

  • [Weekly Review] 2020/03/09-15 Mar 15, 2020

    Git commit types, Chisel `withRegion`, Scala `collect`, etc.

  • [Weekly Review] 2020/03/02-08 Mar 08, 2020

    This weekly review contains the usage of Linux `tree` and Chisel `<>` as well as `:=`. Also, DecoupledDriver.

  • [Weekly Review] 2020/02/17-23 Feb 23, 2020

    This week, I continued the survey of ML4HPC and found several papers from Indiana University that describe the definitions of ML4HPC and its subcategories. I also finished the draft implementation of the GLB cluster with some tests.

  • [Weekly Review] 2020/02/10-16 Feb 16, 2020

    This review contains one way to think about matrix multiplication, a Chisel class named DataMirror that can monitor the details of ports, and a discussion of how a RoCC accelerator can communicate with the L2 cache. Also, I continued my survey of AI for HPC.

  • [Weekly Review] 2020/02/03-09 Feb 09, 2020

    This review contains the usage of general data types in Chisel, the basic architecture of NNs, and introductions to BNN and the BitFlow algorithm. It also includes some materials related to HPC+ML.

  • [Weekly Review] 2020/01/13-19 Jan 19, 2020

    This weekly review contains Chisel syntax such as mem, Vec and mem test.

  • [Weekly Review] 2020/01/06-12 Jan 12, 2020

    This review contains some study notes on Chisel and Scala syntax.

Chrono

Co-Verification

  • [Survey] Current Verification Methods And Their Limited Situations Jan 11, 2021

    This post introduces current verification methods, their steps and their limitations, including formal verification, constrained random verification (CRV), and hardware-software co-verification using a virtual platform with hardware emulation and acceleration.

CoDesign

CoVerification

CodeBloat

  • [Glean] Code bloat Aug 25, 2020

    In computer programming, code bloat is the production of program code (source code or machine code) that is perceived as unnecessarily long, slow, or otherwise wasteful of resources. Code bloat can be caused by inadequacies in the programming language in which the code is written, the compiler used to compile it, or the programmer writing it. Thus, while code bloat generally refers to source code size (as produced by the programmer), it can be used to refer instead to the generated code size or even the binary file size.

Codesign

  • [Glean] Hardware/Software Codesign Aug 29, 2020

    As the name implies, Hardware/Software Codesign (HSCD) denotes design methodologies for electronic systems that exploit the trade-offs and the synergy of Hardware (HW) and Software (SW).

Coherence

  • [Weekly Review] 2020/05/04-10 May 10, 2020

    This weekly review covers cache indexing and tagging methods, the TLB, coherence between the cache and DMA, coherence between the iCache and dCache, and coherence between multiple processors.

Compression

Computer Architecture

  • [Weekly Review] 2021/01/18-2021/01/24 Jan 24, 2021

    The weekly review 2021/01/18-2021/01/24

  • [Glean] Turing Tax Jan 24, 2021

    Turing Tax is a term taught in the Advanced Computer Architecture course by Paul H J Kelly at Imperial College London. It describes the overhead (performance, cost, or energy) of universality in universal computing devices. It can be caused by instructions, data routing, register access and configurable ALUs, all of which are places where the Turing Tax can be reduced.

  • [Glean] Tomasulo Algorithm Jan 24, 2021

    The Tomasulo algorithm eliminates three kinds of hazards (RAW, WAR and WAW) by forwarding and renaming. The three stages of this algorithm are issue, execute and write back.

Converters

  • [Emulate] Refinement of Computation and Communication Jun 19, 2020

    This post introduces the refinement of computation and communication in SystemC, including different kinds of communication refinement such as channel refinement, module refinement, hw-hw refinement, sw-sw refinement, and hw-sw refinement. It also introduces the steps in communication refinement.

Crontab

DAL

DG

DLA

DMA

  • [Weekly Review] 2020/02/10-16 Feb 16, 2020

    This review contains one way to think about matrix multiplication, a Chisel class named DataMirror that can monitor the details of ports, and a discussion of how a RoCC accelerator can communicate with the L2 cache. Also, I continued my survey of AI for HPC.

DNN

DSA

  • [Glean] Computer Architectures for Next Generation Applications Jan 18, 2021

    This post is mainly translated from a Zhihu answer by Bao Yungang. It introduces three laws (Moore's law, Makimoto's wave, and Bell's law) along with design methods and optimizations for performance, power, and the fragmented requirements of the AIoT age. These methods include reducing data movement, reducing data precision, improving parallelism, and agile hardware development.

  • [Read Paper] Domain-Specific Hardware Sep 24, 2020

    Some tricks for designing domain-specific accelerators.

DSH

  • [Weekly Review] 2019/12/23-29 Dec 29, 2019

    This review contains some problems I ran into while setting up the Chisel development environment.

  • [Weekly Review] 2019/12/16-22 Dec 22, 2019

    This review contains some basic knowledge of Scala, and the tutorial on deep learning accelerator design named 'Efficient Processing of Deep Neural Networks: from Algorithms to Hardware Architectures'.

Data format

Debug

Dequantization

  • [Workshop] tinyML Talks: AIML SoC for Ultra-Low-Power Mobile and IoT devices Jul 22, 2020

    This workshop introduces two computation optimization methods and three memory optimization methods. An Address Generation HW Unit and a pipelined architecture help computation optimization. Dequantization, entropy compression and pooling on the fly benefit memory optimization.

Design Compiler

Design Pattern

DesignAutomation

Detector

DianNao

Docker

E1400

  • [Glean] What Is Memory-Hard Apr 19, 2021

    In cryptography, a memory-hard function (MHF) is a function that costs a significant amount of memory to evaluate. I also show the solution from Linzhi.

ECSM

  • [Glean] Library Formats: CCS, ECSM, and NLDM Apr 03, 2023

    This article, which was generated by ChatGPT4, provides an overview of three common library formats used in the design and analysis of digital circuits: Composite Current Source (CCS), Effective Current Source Model (ECSM), and Non-Linear Delay Model (NLDM).

EIE

ESL

ESWEEK

Einstein Summation Convention

Einsum

Emulation

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation, plus the definitions of simulator, emulation and formal verification.

Entropy

Eyeriss

EyerissV2

Eyexam

FFmpeg

FP16

FP32

FP8

FalseSharing

Find

  • [Glean] Remove Empty File Folder Jan 11, 2021

    Introduces two Linux commands, find and xargs. By combining these two commands, you can easily remove empty directories and accomplish other tasks.

FloatingPoint

FogComputation

Folding

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation, plus the definitions of simulator, emulation and formal verification.

Formal

  • [Weekly Review] 2021/02/01-2021/02/07 Feb 07, 2021

    The weekly review 2021/02/01-2021/02/07

  • [Glean] VC Formal Apps Feb 02, 2021

    This post introduces the Apps of VC formal, including AEP, FCA, CC, SEQ, FRV, FXP, FPV, FTA, FSV, DPV, RMA, AIP and FuSa.

  • [Glean] Static Sign-Off, Formal & Simulation Feb 01, 2021

    This post introduces the differences between Static Sign-Off, Formal and Simulation using three key functional verification metrics: analysis always finishes, all the violations flagged by the analysis, 100% of the failures are found.

  • [Weekly Review] 2021/01/25-2021/01/31 Jan 31, 2021

    The weekly review 2021/01/25-2021/01/31

  • [Glean] Formal Signoff Jan 29, 2021

    This post introduces VC Formal Apps and the reason for and goal of formal signoff. It then lists seven steps of formal signoff based on Synopsys.

  • [Survey] Current Verification Methods And Their Limited Situations Jan 11, 2021

    This post introduces current verification methods, their steps and their limitations, including formal verification, constrained random verification (CRV), and hardware-software co-verification using a virtual platform with hardware emulation and acceleration.

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation, plus the definitions of simulator, emulation and formal verification.

Forwarding

  • [Glean] Tomasulo Algorithm Jan 24, 2021

    The Tomasulo algorithm eliminates three kinds of hazards (RAW, WAR and WAW) by forwarding and renaming. The three stages of this algorithm are issue, execute and write back.

Fusion Compiler

GEMM

GPT

GTKWave

Genus

Git

GitHub

GoF

Graph

  • [Weekly Review] 2020/04/20-26 Apr 26, 2020

    This weekly review includes an introduction to SystemC, modeling, JVM memory, and Rocket Chip's interrupts (PLIC and CLINT). It also includes CS61B's Graph.

Grouped Convolution

  • [Glean] Grouped Convolution May 27, 2021

    Grouped convolution is a variant of convolution where the channels of the input feature map are grouped and convolution is performed independently for each group of channels. There are also visualised graphs showing both the spatial and channel domains of convolution, grouped convolution and other convolutions.

HLS

HM-NoC

HPC

  • [Survey] MLforHPC Benchmarks Feb 16, 2020

    I attached my recent survey on ML4HPC benchmarks, including three papers 1) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; 2) HPC AI500: A Benchmark Suite for HPC AI Systems; 3) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; and some other presentation slides.

  • [Weekly Review] 2020/02/03-09 Feb 09, 2020

    This review contains the usage of general data types in Chisel, the basic architecture of NNs, and introductions to BNN and the BitFlow algorithm. It also includes some materials related to HPC+ML.

  • [Weekly Review] 2020/01/27-02/02 Feb 02, 2020

    This review contains some hotchip19's slides and materials of HPC

HPCforML

  • [Survey] HPCforML and MLforHPC Feb 23, 2020

    This survey contains two papers 1) Understanding ML driven HPC: Applications and Infrastructure; 2) Learning Everywhere: A Taxonomy for the Integration of Machine Learning and Simulations.

HPML

  • [Survey] HPCforML and MLforHPC Feb 23, 2020

    This survey contains two papers 1) Understanding ML driven HPC: Applications and Infrastructure; 2) Learning Everywhere: A Taxonomy for the Integration of Machine Learning and Simulations.

  • [Survey] MLforHPC Benchmarks Feb 16, 2020

    I attached my recent survey on ML4HPC benchmarks, including three papers 1) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; 2) HPC AI500: A Benchmark Suite for HPC AI Systems; 3) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; and some other presentation slides.

  • [Weekly Review] 2020/02/03-09 Feb 09, 2020

    This review contains the usage of general data types in Chisel, the basic architecture of NNs, and introductions to BNN and the BitFlow algorithm. It also includes some materials related to HPC+ML.

  • [Weekly Review] 2020/01/27-02/02 Feb 02, 2020

    This review contains some hotchip19's slides and materials of HPC

HUAWEI

HWPE

  • [CodeStudy] HWPE and Its Interfaces between Hardware and Software Dec 26, 2020

    This article introduces the MMIO register files of HWPE in the Pulp SoC, with the related C code for simulation. It also gives hints on customizing the code to use more registers or more events. I think it can also help you understand the interaction between hardware and software.

Habana

Hanguang

Haswell

HierarchicalNeuralNetwork

  • [Workshop] tinyML Talks: Low-Power Computer Vision Jun 28, 2020

    By using a hierarchical neural network, we can split a big neural network into much smaller ones, reducing training time and inference power consumption. However, it might increase latency.

HotChips

Hotchip

Hwacha

  • [Weekly Review] 2020/02/10-16 Feb 16, 2020

    This review contains one way to think about matrix multiplication, a Chisel class named DataMirror that can monitor the details of ports, and a discussion of how a RoCC accelerator can communicate with the L2 cache. Also, I continued my survey of AI for HPC.

IIO

INT16

INT8

Idea

Img2Col

Impact Map

  • [Glean] Impact Map Sep 17, 2020

    An introduction to the impact map and how to create one.

Inference

Inner Product

Innovus

Integer

Interface

Interposer

  • [Glean] 2.5D and 3D Interposer Mar 23, 2021

    Interposers are wide, extremely fast electrical signal conduits used between die in a 2.5D configuration.

IoU

  • [Glean] IoU and NMS Dec 03, 2020

    Intersection over Union (IoU) is an evaluation metric used to measure the accuracy of an object detector on a particular dataset. Non-maximum suppression (NMS) is a technique to remove duplicates and false positives in object detection.

JVM

  • [Weekly Review] 2020/04/20-26 Apr 26, 2020

    This weekly review includes an introduction to SystemC, modeling, JVM memory, and Rocket Chip's interrupts (PLIC and CLINT). It also includes CS61B's Graph.

L2DC

LLC

LSTM

Latex

Linux

Linzhi

  • [Glean] What Is Memory-Hard Apr 19, 2021

    In cryptography, a memory-hard function (MHF) is a function that costs a significant amount of memory to evaluate. I also show the solution from Linzhi.

  • [Glean] Five Steps to Make an ASIC for Algorithm X Apr 19, 2021

    Five Steps to Make an ASIC for Algorithm X: Math first, Optimization Target, Hardware-Software Boundary, Building Blocks, Physical Implementation

Logic Synthesis

Low Power

ML

  • [Glean] Grouped Convolution May 27, 2021

    Grouped convolution is a variant of convolution where the channels of the input feature map are grouped and convolution is performed independently for each group of channels. There are also visualised graphs showing both the spatial and channel domains of convolution, grouped convolution and other convolutions.

  • [Weekly Review] 2020/05/25-31 May 25, 2020

  • [Read Paper] Learning to Design Circuits Jan 12, 2020

    Han Song's Paper: Learning to Design Circuits. Using ML to design analogue circuits.

ML4HPC

  • [Survey] HPCforML and MLforHPC Feb 23, 2020

    This survey contains two papers 1) Understanding ML driven HPC: Applications and Infrastructure; 2) Learning Everywhere: A Taxonomy for the Integration of Machine Learning and Simulations.

  • [Survey] MLforHPC Benchmarks Feb 16, 2020

    I attached my recent survey on ML4HPC benchmarks, including three papers 1) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; 2) HPC AI500: A Benchmark Suite for HPC AI Systems; 3) A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning; and some other presentation slides.

  • [Weekly Review] 2020/02/03-09 Feb 09, 2020

    This review contains the usage of general data types in Chisel, the basic architecture of NNs, and introductions to BNN and the BitFlow algorithm. It also includes some materials related to HPC+ML.

MLPerf

MLforSystem

  • [Survey] HPCforML and MLforHPC Feb 23, 2020

    This survey contains two papers 1) Understanding ML driven HPC: Applications and Infrastructure; 2) Learning Everywhere: A Taxonomy for the Integration of Machine Learning and Simulations.

MST

  • [Weekly Review] 2020/04/27-05/03 May 03, 2020

    This weekly review contains spanning trees, A*, Prim's algorithm, Kruskal's algorithm, MST, dynamic programming and LIS. It also introduces some basic cache terms, such as offset, cache line, way, and cache thrashing.

Makimoto

  • [Glean] Computer Architectures for Next Generation Applications Jan 18, 2021

    This post is mainly translated from a Zhihu answer by Bao Yungang. It introduces three laws (Moore's law, Makimoto's wave, and Bell's law) along with design methods and optimizations for performance, power, and the fragmented requirements of the AIoT age. These methods include reducing data movement, reducing data precision, improving parallelism, and agile hardware development.

Matplotlib

Maxpool

Memory Hard

  • [Glean] What Is Memory-Hard Apr 19, 2021

    In cryptography, a memory-hard function (MHF) is a function that costs a significant amount of memory to evaluate. I also show the solution from Linzhi.

Memory Pooling

Memory Profiler

Mesh

  • [Glean] Network Topology Jul 10, 2020

    Network topology is the arrangement of the elements of a communication network. Topologies include point-to-point, bus, star, ring or circular, mesh, tree, hybrid, and daisy chain.

Mill

Model

Modelling

  • [Weekly Review] 2020/08/24-30 Aug 30, 2020

    This week, the accelerator model successfully simulated the first layer of MobileNet V2. I also took almost two days to write my research proposal.

  • [Weekly Review] 2020/08/17-23 Aug 23, 2020

    This week, I still worked on the Loosely Timed TLM.

  • [Weekly Review] 2020/08/10-16 Aug 16, 2020

    This week, I still worked on the Loosely Timed TLM. I gained a little knowledge of the concepts of memory cell and memory structure, and spent a lot of time optimizing the memory structure. I also learned a little about the SystemC TLM quantum keeper, but didn't use it in my modelling as I didn't think I needed it to sync the time.

  • [Weekly Review] 2020/08/03-09 Aug 09, 2020

    This week, I still worked on the Loosely Timed TLM. This post contains some thinking about implementation and modelling, Chisel and SystemC.

  • [Emulate] Different Abstraction Models Jun 11, 2020

    Six different abstraction models by NCTU. The models include the specification model, component assembly model, bus arbitration model, cycle-accurate computation model and RTL model.

Models

  • [Weekly Review] 2020/04/20-26 Apr 26, 2020

    This weekly review includes an introduction to SystemC, modeling, JVM memory, and Rocket Chip's interrupts (PLIC and CLINT). It also includes CS61B's Graph.

Moore

  • [Glean] Computer Architectures for Next Generation Applications Jan 18, 2021

    This post is mainly translated from a Zhihu answer by Bao Yungang. It introduces three laws (Moore's law, Makimoto's wave, and Bell's law) along with design methods and optimizations for performance, power, and the fragmented requirements of the AIoT age. These methods include reducing data movement, reducing data precision, improving parallelism, and agile hardware development.

MultiWidthFIFO

Multiprocessing

N3XT

NAS

NCCL

NLDM

  • [Glean] Library Formats: CCS, ECSM, and NLDM Apr 03, 2023

    This article, which was generated by ChatGPT4, provides an overview of three common library formats used in the design and analysis of digital circuits: Composite Current Source (CCS), Effective Current Source Model (ECSM), and Non-Linear Delay Model (NLDM).

NLP

  • [Read Paper] Attention Is All You Need Jan 07, 2021

    This post combines two blog posts that introduce the paper Attention Is All You Need. Its shortcomings and one improvement are also shown.

NMS

NPU

NVIDIA

Network

  • [Glean] Network Topology Jul 10, 2020

    Network topology is the arrangement of the elements of a communication network. Topologies include point-to-point, bus, star, ring or circular, mesh, tree, hybrid, and daisy chain.

NumPy

OperatorFusion

  • [Glean] Operator Fusion Jun 28, 2020

    There are many opportunities, where fused operators—in terms of fused chains of basic operators—can significantly improve performance.

Outer Product

Outstanding

PDK

PIPT

  • [Weekly Review] 2020/05/04-10 May 10, 2020

    This weekly review covers cache indexing and tagging methods, the TLB, coherence between the cache and DMA, coherence between the iCache and dCache, and coherence between multiple processors.

PLIC

  • [Weekly Review] 2020/04/20-26 Apr 26, 2020

    This weekly review includes an introduction to SystemC, modeling, JVM memory, and Rocket Chip's interrupts (PLIC and CLINT). It also includes CS61B's Graph.

POI

  • [CodeStudy] Scala Excel Read: POI XSSF Nov 11, 2020

    In this article, I introduce how to read a workbook, a sheet, a row and a specific cell. The methods to obtain the row number and column number are also given, along with one way to filter empty cells.

Parallelism

  • [Glean] Computer Architectures for Next Generation Applications Jan 18, 2021

    This post is mainly translated from a Zhihu answer by Bao Yungang. It introduces three laws (Moore's law, Makimoto's wave, and Bell's law) along with design methods and optimizations for performance, power, and the fragmented requirements of the AIoT age. These methods include reducing data movement, reducing data precision, improving parallelism, and agile hardware development.

ParetoOptimal

Physical Synthesis

Posit

PowerWall

Programming Model

  • [Glean] CUDA Programming Model Apr 02, 2024

    This post introduces the CUDA programming model, including kernels, thread hierarchy, thread blocks, thread block clusters, and memory hierarchy.

Prompt

Protocol

Pulp

Pulpissimo

Python

QNN

QPI

Qiskit

Quantum Circuit

Quantum Neural Networks

RISC

RISC-V

RRAM

Refinement

  • [Emulate] Refinement of Computation and Communication Jun 19, 2020

    This post introduces the refinement of computation and communication in SystemC. It covers different kinds of communication refinement, such as channel refinement, module refinement, hw-hw refinement, sw-sw refinement, and hw-sw refinement, and describes the steps in communication refinement.

RegMap

Regex

Renaming

  • [Glean] Tomasulo Algorithm Jan 24, 2021

    The Tomasulo algorithm handles three kinds of hazards: it resolves RAW hazards by forwarding and eliminates WAR and WAW hazards by renaming. The three stages of the algorithm are issue, execute, and write back.

ResNet-50

Retiming

RoCC

  • [Weekly Review] 2020/02/10-16 Feb 16, 2020

    This review contains one way to think about matrix multiplication, a Chisel class named DataMirror that can inspect the details of ports, and a discussion of how a RoCC accelerator can communicate with the L2 cache. I also continued my survey on AI for HPC.

  • [Weekly Review] 2019/12/09-15 Dec 15, 2019

    This review contains some basic knowledge related to git, RISC-V, Chipyard, the RoCC interface, SHA3, and cache.

Rocket Chip

RocketChip

RoeketChip

RoundRobin

  • [Glean] Round-Robin Arbitration Jun 25, 2020

    Round-robin arbitration is a scheduling scheme that gives each requestor its share of a common resource for a limited time or a limited number of data elements.
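
    As a rough illustration of the idea (not taken from the linked post), a round-robin arbiter can be sketched as a search that starts just after the most recently granted requestor. The function name `round_robin_arbiter` and its arguments are hypothetical and chosen only for this sketch.

```python
def round_robin_arbiter(requests, last_grant):
    """Pick the next requestor to grant, or None if nobody is requesting.

    requests:   list of booleans, one entry per requestor (hypothetical input format)
    last_grant: index of the requestor granted most recently
    """
    n = len(requests)
    # Scan starting from the requestor right after the last grant, wrapping
    # around, so the previous winner gets the lowest priority this round.
    for offset in range(1, n + 1):
        idx = (last_grant + offset) % n
        if requests[idx]:
            return idx
    return None


# Usage: requestors 0, 2, and 3 keep requesting; grants rotate among them.
requests = [True, False, True, True]
grant = 0
order = []
for _ in range(4):
    grant = round_robin_arbiter(requests, grant)
    order.append(grant)
print(order)
```

    Starting the search after the last grant is what keeps any single requestor from holding the resource indefinitely while others are waiting.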

SAD

SDC

SHA3

  • [Weekly Review] 2019/12/09-15 Dec 15, 2019

    This review contains some basic knowledge related to git, RISC-V, Chipyard, the RoCC interface, SHA3, and cache.

SINT16

SINT8

Sbt

Scala

  • [CodeStudy] Scala Excel Read: POI XSSF Nov 11, 2020

    In this article, I introduce how to read a workbook, a sheet, a row, and a specific cell. Methods to obtain the row number and column number are also given, along with one way to filter empty cells.

  • [Weekly Review] 2020/03/23-29 Mar 29, 2020

    This weekly review covers Scala intersection, union, and complement, as well as ScalaDoc tags. It also introduces printing colorful logs to the console, and an error that occurred when I used `RegInit` without giving a width to the UInt.

  • [Weekly Review] 2020/03/16-22 Mar 22, 2020

    Includes Scala higher-order functions, Scala Regex, and Chisel forkwithRegion, as well as the definitions of `base address` and `offset`.

  • [Weekly Review] 2020/03/09-15 Mar 15, 2020

    Git commit types, Chisel `withRegion`, Scala `collect`, etc.

  • [Weekly Review] 2020/02/24-03/01 Mar 01, 2020

    This week I read a deep learning accelerator survey named 'A Survey of Accelerator Architectures for Deep Neural Networks'. Also, I tried to use a Scala library named `Breeze`.

  • [Weekly Review] 2020/02/17-23 Feb 23, 2020

    This week, I continued the survey of ML4HPC and found several papers from Indiana University that define ML4HPC and its subcategories. I also finished the draft implementation of the GLB cluster with some tests.

  • [Weekly Review] 2020/01/20-26 Jan 26, 2020

    This weekly review covers the usage of `grep` as well as Scala pattern matching.

  • [Weekly Review] 2020/01/06-12 Jan 12, 2020

    This review contains study notes on Chisel and Scala syntax.

  • [Tutorial] Establish Linux Environment for Chisel and Chipyard Developments Jan 02, 2020

    This tutorial will help you set up a Linux environment for Chisel and Chipyard development quickly and with few errors.

  • [Weekly Review] 2019/12/16-22 Dec 22, 2019

    This review contains some basic knowledge of Scala, and a tutorial on deep learning accelerator design titled 'Efficient Processing of Deep Neural Networks: From Algorithms to Hardware Architectures'.

ScalaDoc

  • [Weekly Review] 2020/03/23-29 Mar 29, 2020

    This weekly review covers Scala intersection, union, and complement, as well as ScalaDoc tags. It also introduces printing colorful logs to the console, and an error that occurred when I used `RegInit` without giving a width to the UInt.

ScaledML

Scrum

  • [Tutorial] Scrum Master Guide Apr 07, 2023

    Generated by ChatGPT4 -- This Scrum Master Guide provides an overview of the Scrum Master's role in five key Scrum meetings -- Sprint Planning, Daily Stand-up, Backlog Refinement, Sprint Review, and Sprint Retrospective. It also discusses the process of assisting in task breakdown and the importance of having a clear and concise Definition of Done. The guide is designed to be a helpful resource for Scrum Masters to facilitate team communication, collaboration, and continuous improvement.

Scrum Master

  • [Tutorial] Scrum Master Guide Apr 07, 2023

    Generated by ChatGPT4 -- This Scrum Master Guide provides an overview of the Scrum Master's role in five key Scrum meetings -- Sprint Planning, Daily Stand-up, Backlog Refinement, Sprint Review, and Sprint Retrospective. It also discusses the process of assisting in task breakdown and the importance of having a clear and concise Definition of Done. The guide is designed to be a helpful resource for Scrum Masters to facilitate team communication, collaboration, and continuous improvement.

Signoff

  • [Glean] Static Sign-Off, Formal & Simulation Feb 01, 2021

    This post introduces the differences between static sign-off, formal, and simulation using three key functional verification metrics: the analysis always finishes, all the violations flagged by the analysis, and 100% of the failures are found.

  • [Glean] Formal Signoff Jan 29, 2021

    This post introduces VC Formal Apps and the reasons for and goals of formal signoff. It then lists seven steps of formal signoff based on Synopsys.

Simulation

  • [Glean] Static Sign-Off, Formal & Simulation Feb 01, 2021

    This post introduces the differences between static sign-off, formal, and simulation using three key functional verification metrics: the analysis always finishes, all the violations flagged by the analysis, and 100% of the failures are found.

Simulator

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation; and the definitions of simulator, emulation, and formal verification.

Software2

SpanningTree

  • [Weekly Review] 2020/04/27-05/03 May 03, 2020

    This weekly review covers spanning trees, A*, Prim's algorithm, Kruskal's algorithm, MST, dynamic programming, and LIS. It also introduces some basic cache terms, such as offset, cache line, way, and cache thrashing.

Sparsity

Strassen

Subfloat

Sublime

Survey

Synthesis

  • [Glean] Cadence Genus Synthesis Check List Mar 06, 2023

    This post lists several messages that should be checked in the Genus synthesis log file to make sure there are no errors and no mismatches between the simulation and synthesis results.

SystemC

SystemforML

  • [Survey] HPCforML and MLforHPC Feb 23, 2020

    This survey contains two papers 1) Understanding ML driven HPC: Applications and Infrastructure; 2) Learning Everywhere: A Taxonomy for the Integration of Machine Learning and Simulations.

Systolic

Systolic Array

TCM

TDD

TF32

TLB

  • [Weekly Review] 2020/05/04-10 May 10, 2020

    This weekly review covers cache indexing and tagging methods, the TLB, and coherence between cache and DMA, between iCache and dCache, and between multiple processors.

TLM

TPU

TSMC

Tcl

  • [Tutorial] Obtain Objects in the collection in Genus Using Tcl Mar 06, 2023

    A collection is an extension provided by EDA vendors like Synopsys to support a list of objects in their Tcl API. Most database query operations in Cadence and Synopsys tools return a collection object. Complex query operations with filters may be slow in large designs, so storing the query results in advance might reduce runtime when they are used in multiple places.

  • [Tutorial] Background Execution of Reporting Commands in Cadence Genus Mar 06, 2023

    Cadence Genus supports generating reports in parallel and running them in the background. This tutorial introduces how to conditionally enable this feature using Tcl syntax.

TensorFloat32

Timing

Tomasulo

  • [Glean] Tomasulo Algorithm Jan 24, 2021

    The Tomasulo algorithm handles three kinds of hazards: it resolves RAW hazards by forwarding and eliminates WAR and WAW hazards by renaming. The three stages of the algorithm are issue, execute, and write back.

Trace

Transformer

  • [Read Paper] Attention Is All You Need Jan 07, 2021

    This post combines two blog posts that introduce the paper Attention Is All You Need. Its shortcomings and one improvement are also covered.

Tunstall

Turing Tax

  • [Glean] Turing Tax Jan 24, 2021

    Turing Tax is a term taught in the Advanced Computer Architecture course by Paul H J Kelly at Imperial College London. It describes the overhead (performance, cost, or energy) that universal computing devices pay for their universality. It can be caused by instructions, data routing, register access, and configurable ALUs, which are also where the Turing Tax can be reduced.

Tutorial

UINT16

UINT8

UPF

  • [Workshop] Using UPF for Low Power Design and Verification Nov 27, 2021

    This workshop describes UPF in detail, including its definition, terminology, and some Tcl commands.

  • [Glean] Unified Power Format Jun 25, 2020

    The Unified Power Format (UPF) is a published IEEE standard. It is intended to ease the job of specifying, simulating and verifying IC designs that have a number of power states and power islands.

Ubuntu

Unfolding

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation; and the definitions of simulator, emulation, and formal verification.

Upmem

V2F

VC Formal

VCD

  • [Weekly Review] 2020/04/13-19 Apr 19, 2020

    Contains CS61B binary search trees, red-black trees, hashing, and heaps; three methods to dump VCD files (waveforms) in Chisel Testers2; the first two generations of verification and the coming third generation; and the definitions of simulator, emulation, and formal verification.

VIPT

  • [Weekly Review] 2020/05/04-10 May 10, 2020

    This weekly review covers cache indexing and tagging methods, the TLB, and coherence between cache and DMA, between iCache and dCache, and between multiple processors.

VIVT

  • [Weekly Review] 2020/05/04-10 May 10, 2020

    This weekly review covers cache indexing and tagging methods, the TLB, and coherence between cache and DMA, between iCache and dCache, and between multiple processors.

VLIW

Verdi

Verification

Verilog

Version Control

Vim

VirtualPrototypes

Windows

Winograd

Xargs

  • [Glean] Remove Empty File Folder Jan 11, 2021

    Introduces two Linux commands, find and xargs. By combining these two commands, you can easily remove empty directories and accomplish other tasks.

alias

cProfile

data precesion

db

ddc

formal

grey

implicit class

interface

  • [CodeStudy] HWPE and Its Interfaces between Hardware and Software Dec 26, 2020

    This article introduces the MMIO register files of HWPE in the Pulp SoC, together with the related C code for simulation. It also gives hints on customizing the code to use more registers or more events. I think it can also help you understand the interaction between hardware and software.

offset

  • [Weekly Review] 2020/04/27-05/03 May 03, 2020

    This weekly review covers spanning trees, A*, Prim's algorithm, Kruskal's algorithm, MST, dynamic programming, and LIS. It also introduces some basic cache terms, such as offset, cache line, way, and cache thrashing.

playgroud

tcl

tinyML

verification

way

  • [Weekly Review] 2020/04/27-05/03 May 03, 2020

    This weekly review covers spanning trees, A*, Prim's algorithm, Kruskal's algorithm, MST, dynamic programming, and LIS. It also introduces some basic cache terms, such as offset, cache line, way, and cache thrashing.

xargs