Dynamic Video Delivery using Deep Reinforcement Learning for Device-to-Device Underlaid Cache-Enabled Internet-of-vehicle Networks
- Authors
- Choi, Minseok; Shin, Myungjae; Kim, Joongheon
- Issue Date
- April 2021
- Publisher
- KOREAN INST COMMUNICATIONS SCIENCES (K I C S)
- Keywords
- Deep reinforcement learning; device-to-device underlaid network; vehicular networks; video delivery; wireless caching
- Citation
- JOURNAL OF COMMUNICATIONS AND NETWORKS, v.23, no.2, pp.117 - 128
- Indexed
- SCIE; SCOPUS; KCI
- Journal Title
- JOURNAL OF COMMUNICATIONS AND NETWORKS
- Volume
- 23
- Number
- 2
- Start Page
- 117
- End Page
- 128
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/128296
- DOI
- 10.23919/JCN.2021.000006
- ISSN
- 1229-2370
- Abstract
- This paper addresses an Internet-of-vehicle network that utilizes a device-to-device (D2D) underlaid cellular system, where distributed caching is available at each vehicle and the video streaming service is provided via D2D links. Given the spectrum reuse policy, three decisions with different timescales are investigated in such a D2D underlaid cache-enabled vehicular network: 1) the selection of cache-enabled vehicles for providing content, 2) power allocation for D2D users, and 3) power allocation for cellular vehicles. Since wireless link activation for video delivery could introduce delays, node association is determined on a larger timescale than the power allocations. We jointly optimize these delivery decisions by maximizing the average video quality under constraints on the playback delays of streaming users and the data rate guarantees for cellular vehicles. Depending on the channel and queue states of users, the cache-enabled vehicle for video delivery is chosen adaptively, based on frame-based Lyapunov optimization theory, by comparing the expected costs of the candidate vehicles. For each cache-enabled vehicle, the expected cost is obtained from a stochastic shortest path problem that is solved by deep reinforcement learning without knowledge of global channel state information. Specifically, the deep deterministic policy gradient (DDPG) algorithm is adopted to deal with the very large state space, i.e., time-varying channel states. Simulation results verify that the proposed video delivery algorithm achieves all the given goals, i.e., average video quality, smooth playback, and reliable data rates for cellular vehicles.
- Appears in Collections
- College of Engineering > School of Electrical Engineering > 1. Journal Articles