Many-to-many voice conversion experiments using a Korean speech corpus
- Authors
- Yook, D.; Seo, H.; Ko, B.; Yoo, I.-C.
- Issue Date
- 2022
- Publisher
- Acoustical Society of Korea
- Keywords
- Conditional Cycle-Consistent Generative Adversarial Network (CC-GAN); Cycle-Consistent Variational AutoEncoder (CycleVAE); Generative Adversarial Network (GAN); Variational AutoEncoder (VAE); Voice conversion
- Citation
- Journal of the Acoustical Society of Korea, v.41, no.3, pp.351 - 358
- Indexed
- SCOPUS; KCI
- Journal Title
- Journal of the Acoustical Society of Korea
- Volume
- 41
- Number
- 3
- Start Page
- 351
- End Page
- 358
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/146983
- DOI
- 10.7776/ASK.2022.41.3.351
- ISSN
- 1225-4428
- Abstract
- Recently, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs) have been applied to voice conversion using non-parallel training data. In particular, Conditional Cycle-Consistent Generative Adversarial Networks (CC-GANs) and Cycle-Consistent Variational AutoEncoders (CycleVAEs) show promising results in many-to-many voice conversion among multiple speakers. However, conventional voice conversion studies using CC-GANs and CycleVAEs have involved relatively few speakers. In this paper, we extend the number of speakers to 100 and experimentally analyze the performance of these many-to-many voice conversion methods. The experiments show that the CC-GAN achieves 4.5 % lower Mel-Cepstral Distortion (MCD) for a small number of speakers, whereas the CycleVAE achieves 12.7 % lower MCD under a limited training time for a large number of speakers. Copyright © 2022 The Acoustical Society of Korea.
- Appears in Collections
- Graduate School > Department of Computer Science and Engineering > 1. Journal Articles