Repository at Hanyang University: Solely Transformer Based Variational Auto Encoder for Language Modeling

Title: Solely Transformer Based Variational Auto Encoder for Language Modeling

Other Titles: 언어모델을 위한 트랜스포머 변분 오토인코더 (Transformer Variational Autoencoder for Language Modeling)

Author: 옥창원

Alternative Author(s): 옥창원

Advisor(s): 이기천

Issue Date: 2021. 8

Publisher: Hanyang University

Degree: Master

Abstract: In natural language processing (NLP), the Transformer is widely used and has reached state-of-the-art performance in many tasks, such as language modeling, summarization, and classification. The variational autoencoder (VAE) is an efficient generative model for representation learning that combines deep learning with statistical inference over encoded representations. However, using a VAE in NLP often brings practical difficulties such as posterior collapse, also known as KL vanishing. To reduce this problem and improve the parallelization of language data processing, we propose a new language model that integrates two seemingly different deep learning models: a Transformer model coupled solely with a variational autoencoder. We compare the proposed model with previous work, namely VAEs built on recurrent neural networks (RNNs). Our experiments on four real-life datasets show that our implementation with KL annealing, a technique for reducing posterior collapse, does reduce it. The results also show that the proposed Transformer model outperforms RNN-based models in reconstruction and representation learning, and that its encoded representations are more informative than those of the other tested models.
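The abstract names KL annealing as the remedy for posterior collapse but does not give the schedule used in the thesis. As a minimal sketch only, assuming a linear warm-up and a diagonal Gaussian posterior with a standard normal prior (both common choices, not confirmed by this record), the annealed objective could look like the following. The function names and the `warmup_steps` parameter are hypothetical.

```python
import math

def kl_weight(step, warmup_steps):
    """Hypothetical linear schedule: weight rises from 0 to 1 over warmup_steps."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)

def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))

def annealed_loss(recon_nll, mu, logvar, step, warmup_steps):
    """Negative ELBO with the KL term scaled by the annealing weight.

    recon_nll is the reconstruction negative log-likelihood produced by the
    decoder; here it is passed in as a plain number (assumption for brevity).
    """
    return recon_nll + kl_weight(step, warmup_steps) * gaussian_kl(mu, logvar)
```

Early in training the weight is near zero, so the decoder is pushed to use the latent code before the KL penalty is applied in full; this is the usual motivation for annealing, and other schedules (cyclical, sigmoid) would slot into `kl_weight` in the same way.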

URI: http://hanyang.dcollection.net/common/orgView/200000496917 https://repository.hanyang.ac.kr/handle/20.500.11754/164007

Appears in Collections: GRADUATE SCHOOL[S] > INDUSTRIAL ENGINEERING > Theses (Master)
