PreScaler: An Efficient System-aware Precision Scaling Framework on Heterogeneous Systems

Title
PreScaler: An Efficient System-aware Precision Scaling Framework on Heterogeneous Systems
Author
Yongjun Park (박영준)
Keywords
HSA; Precision Scaling; Profile-guided; Compiler; Runtime
Issue Date
2020-02
Publisher
Association for Computing Machinery
Citation
CGO 2020: Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization, pp. 280-292
Abstract
Graphics processing units (GPUs) have been commonly utilized to accelerate multiple emerging applications, such as big data processing and machine learning. While GPUs are proven to be effective, approximate computing, to trade off performance with accuracy, is one of the most common solutions for further performance improvement. Precision scaling of originally high-precision values into lower-precision values has recently been the most widely used GPU-side approximation technique, including hardware-level half-precision support. Although several approaches to find optimal mixed-precision configuration of GPU-side kernels have been introduced, total program performance gain is often low because total execution time is the combination of data transfer, type conversion, and kernel execution. As a result, kernel-level scaling may incur high type-conversion overhead of the kernel input/output data. To address this problem, this paper proposes an automatic precision scaling framework called PreScaler that maximizes the program performance at the memory object level by considering whole OpenCL program flows. The main difficulty is that the best configuration cannot be easily predicted due to various application- and system-specific characteristics. PreScaler solves this problem using search space minimization and decision-tree-based search processes. First, it minimizes the number of test configurations based on the information from system inspection and dynamic profiling. Then, it finds the best memory-object level mixed-precision configuration using a decision-tree-based search. PreScaler achieves an average performance gain of 1.33x over the baseline while maintaining the target output quality level.
URI
https://dl.acm.org/doi/10.1145/3368826.3377917
https://repository.hanyang.ac.kr/handle/20.500.11754/161202
ISBN
978-1-4503-7047-9
DOI
10.1145/3368826.3377917
Appears in Collections:
COLLEGE OF ENGINEERING[S] > COMPUTER SCIENCE > Articles