
Full metadata record

DC Field | Value | Language
dc.contributor.author | 서지원 (Seo, Jiwon) | -
dc.date.accessioned | 2022-03-21T07:19:56Z | -
dc.date.available | 2022-03-21T07:19:56Z | -
dc.date.issued | 2020-07 | -
dc.identifier.citation | 2020 57th ACM/IEEE Design Automation Conference (DAC), pp. 1-6 | en_US
dc.identifier.isbn | 978-1-7281-1085-1 | -
dc.identifier.issn | 0738-100X | -
dc.identifier.uri | https://ieeexplore.ieee.org/document/9218518 | -
dc.identifier.uri | https://repository.hanyang.ac.kr/handle/20.500.11754/169267 | -
dc.description.abstract | Training a deep neural network (DNN) is expensive, requiring a large amount of computation time. While the training overhead is high, not all computation in DNN training is equal. Some parameters converge faster, so their gradient computation may contribute little to the parameter update; near stationary points, a subset of parameters may change very little. In this paper we exploit parameter convergence to optimize gradient computation in DNN training. We design a lightweight monitoring technique to track parameter convergence, and we prune the gradient computation stochastically for groups of semantically related parameters, exploiting their convergence correlations. These techniques are efficiently implemented in existing GPU kernels. In our evaluation, the optimization techniques substantially and robustly improve the training throughput for four DNN models on three public datasets. | en_US
dc.description.sponsorship | This work is supported by Samsung Research, Samsung Electronics Co., Ltd., by the National Research Foundation of Korea (NRF) grant (No. 2018R1D1A1B07050609), and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2013-0-00109, WiseKB: Big data based self-evolving knowledge base and reasoning platform). We thank Jinwon Lee for the preliminary experiments. The corresponding authors are Jiwon Seo and Yongjun Park. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.subject | Convergence | en_US
dc.subject | Training | en_US
dc.subject | Neurons | en_US
dc.subject | Correlation | en_US
dc.subject | Monitoring | en_US
dc.subject | Kernel | en_US
dc.subject | History | en_US
dc.title | Convergence-Aware Neural Network Training | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/DAC18072.2020.9218518 | -
dc.relation.page | 1-6 | -
dc.contributor.googleauthor | Oh, Hyungjun | -
dc.contributor.googleauthor | Yu, Yongseung | -
dc.contributor.googleauthor | Ryu, Giha | -
dc.contributor.googleauthor | Ahn, Gunjoo | -
dc.contributor.googleauthor | Jeong, Yuri | -
dc.contributor.googleauthor | Park, Yongjun | -
dc.contributor.googleauthor | Seo, Jiwon | -
dc.relation.code | 20200058 | -
dc.sector.campus | S | -
dc.sector.daehak | COLLEGE OF ENGINEERING[S] | -
dc.sector.department | SCHOOL OF COMPUTER SCIENCE | -
dc.identifier.pid | seojiwon | -
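The abstract describes convergence monitoring and stochastic gradient pruning only at a high level. Below is a minimal sketch of that general idea in PyTorch; it is not the authors' GPU-kernel implementation. The class name ConvergenceMonitor, the EMA-based update tracking, and the skip-probability rule are illustrative assumptions.

```python
# A minimal sketch (assumed, not the paper's implementation) of
# convergence-aware gradient pruning in PyTorch.
import torch

class ConvergenceMonitor:
    """Tracks an exponential moving average (EMA) of each parameter
    group's update magnitude and stochastically marks near-converged
    groups so their gradient computation can be skipped."""

    def __init__(self, params, beta=0.9, eps=1e-3):
        self.params = list(params)
        self.beta = beta                      # EMA smoothing factor (assumed)
        self.eps = eps                        # "converged" update threshold (assumed)
        self.prev = [p.detach().clone() for p in self.params]
        self.ema = [None] * len(self.params)

    def skip_decisions(self):
        """Call after optimizer.step(); returns one skip flag per group."""
        skips = []
        for i, p in enumerate(self.params):
            delta = (p.detach() - self.prev[i]).abs().mean().item()
            self.prev[i] = p.detach().clone()
            self.ema[i] = delta if self.ema[i] is None else (
                self.beta * self.ema[i] + (1.0 - self.beta) * delta)
            # Skip probability rises as the smoothed update size falls
            # below eps, i.e., as the group approaches convergence.
            p_skip = max(0.0, 1.0 - self.ema[i] / self.eps)
            skips.append(torch.rand(()).item() < p_skip)
        if all(skips):                        # keep at least one group live
            skips[0] = False
        return skips

# Usage sketch: freeze near-converged groups for the next iteration so
# autograd does not compute their gradients at all.
model = torch.nn.Linear(16, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
monitor = ConvergenceMonitor(model.parameters())
for _ in range(100):
    x, y = torch.randn(8, 16), torch.randn(8, 4)
    loss = torch.nn.functional.mse_loss(model(x), y)
    opt.zero_grad(set_to_none=True)           # frozen groups keep grad=None
    loss.backward()
    opt.step()                                # SGD skips params with no grad
    for p, skip in zip(model.parameters(), monitor.skip_decisions()):
        p.requires_grad_(not skip)
```

Because the skip decision is stochastic, a frozen group occasionally wakes up for an iteration, which refreshes its convergence estimate; this loosely mirrors the stochastic pruning the abstract describes, though the paper's grouping of semantically related parameters and its in-kernel implementation are not reproduced here.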
Appears in Collections:
COLLEGE OF ENGINEERING[S](공과대학) > COMPUTER SCIENCE(컴퓨터소프트웨어학부) > Articles
Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
