Text-to-speech with linear spectrogram prediction for quality and speed improvement

윤혜빈; 남호성

doi:10.13064/KSSS.2021.13.3.071

Detailed Information

Title: Text-to-speech with linear spectrogram prediction for quality and speed improvement

Original Title: 음질 및 속도 향상을 위한 선형 스펙트로그램 활용 Text-to-speech

Authors: 윤혜빈; 남호성

Issue Date: 2021

Publisher: The Korean Society of Speech Sciences (한국음성학회)

Keywords: artificial intelligence; machine learning; speech synthesis; text-to-speech (TTS)

Citation: Phonetics and Speech Sciences (말소리와 음성과학), v.13, no.3, pp.71-78

Indexed: KCI

Journal Title: Phonetics and Speech Sciences (말소리와 음성과학)

Volume: 13

Number: 3

Start Page: 71

End Page: 78

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/138207

DOI: 10.13064/KSSS.2021.13.3.071

ISSN: 2005-8063

Abstract: Most neural-network-based speech synthesis models rely on neural vocoders to convert mel-scaled spectrograms into high-quality, human-like speech. However, pairing a neural vocoder with a mel-scaled spectrogram prediction model demands considerable memory and time during training, and inference is slow in environments without a GPU. Linear spectrogram prediction models avoid this problem because they do not use neural vocoders, but they suffer from low voice quality. As a solution, this paper proposes a linear spectrogram prediction model based on Tacotron 2 and Transformer that produces high-quality speech without a neural vocoder. Experiments suggest that this model can serve as the foundation of a high-quality text-to-speech model with fast inference speed.

Files in This Item: There are no files associated with this item.

Appears in Collections: College of Liberal Arts > Department of English Language and Literature > 1. Journal Articles

Related Researcher

Nam, Ho sung: College of Liberal Arts (Department of English Language and Literature)
