[Survey] MLforHPC Benchmarks
The screenshot below shows the materials related to HPC benchmarks, although most of them target HPC for AI rather than AI for HPC.
Implications of Integration of Deep Learning and HPC for Benchmarking
HPC AI500: A Benchmark Suite for HPC AI Systems
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning
This benchmark uses the Open Neural Network Exchange (ONNX) format to store DNNs reproducibly. Almost all frameworks can export models to this format, so the benchmark can combine them under a single infrastructure.
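As a minimal sketch of this idea (assuming PyTorch and the `onnx` package are available; the toy model and file names below are placeholders, not part of Deep500 itself), a model can be exported to ONNX and reloaded by any ONNX-aware tool:

```python
# Minimal sketch: export a DNN to the ONNX format and reload it.
# Assumes PyTorch and the `onnx` package are installed; the network and
# file names are illustrative only.
import torch
import torch.nn as nn
import onnx

# A tiny example network standing in for a real DNN.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
dummy_input = torch.randn(1, 32)

# Export the graph and weights to a framework-neutral .onnx file.
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["logits"])

# Any ONNX-aware tool (including a benchmark harness) can reload and verify it.
reloaded = onnx.load("model.onnx")
onnx.checker.check_model(reloaded)
```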
The essence of Deep500 is its layered and modular design that allows one to independently extend and benchmark DL procedures related to simple operators, whole neural networks, training schemes, and distributed training. The principles behind this design can be reused to enable interpretable and reproducible benchmarking of extreme-scale codes in domains outside DL.
Benchmark Challenges
- customizability: the ability to seamlessly and effortlessly combine different features of different frameworks and still be able to provide fair analysis of their performance and accuracy.
- metrics: considering performance, correctness, and convergence in shared- as well as distributed-memory environments.
  - some metrics can test the performance of both the whole computation and fine-grained elements, for example latency or overhead.
  - others, such as accuracy or bias, assess the quality of a given algorithm, its convergence, and its generalization towards previously-unseen data.
  - combine performance and accuracy (time-to-accuracy) to analyze the tradeoffs (see the sketch after this list).
  - propose metrics for the distributed part of DL codes: communication volume and I/O latency.
- performance and scalability: the design of distributed benchmarking of DL to ensure high performance and scalability.
- validation: a benchmarking infrastructure for DL must make it possible to validate results with respect to several criteria; Deep500 offers validation of convergence, correctness, accuracy, and performance (discussed in § V of the paper).
- reproducibility: the ability to reproduce or at least interpret results.
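A minimal sketch of the time-to-accuracy idea mentioned above: measure the wall-clock time a training run needs to reach a target accuracy. The `train_one_epoch` and `evaluate` callbacks are hypothetical hooks supplied by the benchmarked framework, not part of any specific benchmark API.

```python
# Sketch of a time-to-accuracy measurement combining performance and accuracy.
# `train_one_epoch` and `evaluate` are hypothetical callbacks provided by the
# framework under test.
import time

def time_to_accuracy(train_one_epoch, evaluate, target_accuracy, max_epochs=100):
    """Return the wall-clock seconds needed to reach `target_accuracy`,
    or None if the target is not reached within `max_epochs`."""
    start = time.perf_counter()
    for _ in range(max_epochs):
        train_one_epoch()                 # one pass over the training data
        if evaluate() >= target_accuracy: # validation accuracy check
            return time.perf_counter() - start
    return None
```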
Design and Implementation of Deep500
The core enabler in Deep500 is the modular design that groups all the required functionalities into four levels:
- “Operators”
- “Network Processing”
- “Training”
- “Distributed Training”
Each level provides relevant abstractions, interfaces, reference implementations, validation procedures, and metrics.
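A rough, hypothetical illustration of what such layered abstractions could look like is sketched below; the class and method names are invented for this sketch and do not reflect the actual Deep500 interfaces.

```python
# Hypothetical layered design loosely inspired by the four Deep500 levels.
# All names are illustrative, not the real Deep500 API.
from abc import ABC, abstractmethod

class Operator(ABC):          # Level 0: single operators (e.g. convolution)
    @abstractmethod
    def forward(self, *inputs): ...
    @abstractmethod
    def backward(self, *grad_outputs): ...

class Network(ABC):           # Level 1: network processing (whole graphs)
    @abstractmethod
    def inference(self, batch): ...

class TrainingScheme(ABC):    # Level 2: training (optimizers, schedules)
    @abstractmethod
    def step(self, network, batch): ...

class DistributedTrainer(ABC):  # Level 3: distributed training
    @abstractmethod
    def run(self, scheme, network, dataset, num_workers): ...

# A harness can time and validate each level independently, swapping
# reference implementations for framework-specific ones.
```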