Multi-Armed Bandit for Slotted Random Access Systems

Title
Multi-Armed Bandit for Slotted Random Access Systems
Author
이동우
Alternative Author(s)
이동우
Advisor(s)
이주현
Issue Date
2021. 8
Publisher
Hanyang University(한양대학교)
Degree
Master
Abstract
This work investigates a random access (RA) game for time-slotted RA systems in both single-cell and multi-cell settings. In the single-cell RA system, there is a single access point (AP), each frame consists of M time slots, and each of N players chooses a set of slots in a frame. We obtain the pure-strategy Nash equilibria (PNEs) of this RA game, at which slots are fully utilized, as in centralized scheduling. To realize the PNEs, we propose a multi-agent (MA) learning algorithm based on the Exponential-weight algorithm for Exploration and Exploitation (EXP3). EXP3 is a bandit algorithm designed to find an optimal strategy in a multi-armed bandit (MAB) problem in which users do not know the expected payoff of each strategy. Our simulation results show that the proposed algorithm can achieve PNEs. Moreover, it can adapt to time-varying environments in which the number of players varies over time. For the multi-cell RA system, our goal is to maximize the system throughput of a time-slotted uplink multi-cell RA system. To this end, we propose a two-stage reinforcement learning (RL)-based algorithm built on EXP3. In each macro-time slot, which consists of multiple time slots, players run the RL-based algorithm to choose an AP. Then, in each time slot, a transmission policy determines the sub-time slot in which the player transmits data. Another RL-based learning algorithm is used to obtain an optimal transmission policy. To show that our method is efficient, we compare the proposed algorithm with the ε-greedy algorithm in two different scenarios. The simulation results show that the proposed algorithm achieves a higher average system throughput than the ε-greedy algorithm.
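To make the role of EXP3 in the single-cell setting concrete, the following is a minimal sketch, not the thesis's algorithm. It uses the standard textbook EXP3 update and several simplifying assumptions that do not come from the abstract: each player picks exactly one slot per frame (the abstract allows a set of slots), the reward is 1 when a player is alone in its slot and 0 on a collision, and the names `Exp3`, `simulate`, and the exploration parameter `gamma` are hypothetical.

```python
import math
import random


class Exp3:
    """Textbook EXP3 over n_arms arms, with rewards assumed to lie in [0, 1]."""

    def __init__(self, n_arms, gamma=0.1, rng=None):
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate (assumed value)
        self.weights = [1.0] * n_arms
        self.rng = rng or random.Random()

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def select(self):
        probs = self.probabilities()
        return self.rng.choices(range(self.n_arms), weights=probs)[0]

    def update(self, arm, reward):
        # Importance-weighted estimate: only the played arm is updated.
        estimate = reward / self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(n_players=3, n_slots=3, n_frames=5000, gamma=0.1, seed=0):
    """Return the fraction of slots that carried exactly one transmission."""
    rng = random.Random(seed)
    players = [Exp3(n_slots, gamma, rng) for _ in range(n_players)]
    successes = 0
    for _ in range(n_frames):
        choices = [p.select() for p in players]
        for player, slot in zip(players, choices):
            alone = choices.count(slot) == 1   # assumed collision model
            player.update(slot, 1.0 if alone else 0.0)
            successes += int(alone)
    return successes / (n_frames * n_slots)


if __name__ == "__main__":
    print("slot utilization:", simulate())
```

Each player learns only from its own success or collision feedback, which is the bandit setting the abstract describes. Whether this simplified model settles on a collision-free assignment depends on `gamma` and the number of frames; the sketch makes no claim about reproducing the thesis's results.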
URI
http://hanyang.dcollection.net/common/orgView/200000498737
https://repository.hanyang.ac.kr/handle/20.500.11754/163642
Appears in Collections:
GRADUATE SCHOOL[S](대학원) > DEPARTMENT OF ELECTRICAL AND ELECTRONIC ENGINEERING(전자공학과) > Theses (Master)
Files in This Item:
There are no files associated with this item.