Repository at Hanyang University: 멀티 디바이스 환경에서의 연산 병렬화 기법 적용 방안 및 HLS 기반 FPGA 가속기 자원 최적화

Title: 멀티 디바이스 환경에서의 연산 병렬화 기법 적용 방안 및 HLS 기반 FPGA 가속기 자원 최적화

Other Titles: Inference Parallelization Strategy on Multi Edge Devices and Hardware Resource Optimization for HLS based FPGA Accelerators

Author: 김규진

Alternative Author(s): Kyujin Kim

Advisor(s): 박영준

Issue Date: 2022-02

Publisher: Hanyang University

Degree: Master

Abstract: Several techniques have been proposed to speed up computation by splitting the heavy workload of an artificial intelligence network, which would otherwise run on a single device, across multiple edge devices. Research on efficiency in such multi-device environments can pursue two main goals. The first is to optimize the edge device hardware itself: making the most of the limited resources on each device improves overall performance. The second is to divide and conquer neural network operations by using a smart compiler to distribute workloads efficiently across the edge devices. This thesis takes both approaches, presenting a hardware optimization method and a software optimization method for improving performance in multi-edge-device environments. In hardware optimization, the size of the hardware is a very important factor, affecting among other things the number of edge devices and power consumption. I propose an optimization direction for a single piece of hardware, TVM's VTA. The VTA hardware is developed in HLS-C; I propose a more efficient implementation that uses HLS-Blackbox[7] to insert Verilog code into the HLS design.
Furthermore, based on TVM, I analyze data parallelization and model parallelization techniques using the ResNet18 network model[5]. A virtual multi-edge-device environment and data input scenarios are set up, and the most efficient workload split method for each given environment is analyzed and proposed.

URI: http://hanyang.dcollection.net/common/orgView/200000592430 https://repository.hanyang.ac.kr/handle/20.500.11754/167514

Appears in Collections: GRADUATE SCHOOL[S](대학원) > COMPUTER SCIENCE(컴퓨터·소프트웨어학과) > Theses (Master)
