화자 임베딩과 발화 리듬의 연관관계에 대한 연구 (Does the Speaker Embedding Encode Speech Rhythm?)
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 김서현 | - |
dc.contributor.author | 남호성 | - |
dc.date.accessioned | 2022-03-06T10:40:36Z | - |
dc.date.available | 2022-03-06T10:40:36Z | - |
dc.date.created | 2022-02-10 | - |
dc.date.issued | 2021 | - |
dc.identifier.issn | 1225-4975 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/137973 | - |
dc.description.abstract | The present study investigates whether speech rhythm is encoded in the utterance-level speaker embedding, which is the average of frame-level speaker embeddings. When speaker encoders are used in Computer Assisted Pronunciation Training (CAPT), identifying what information speaker embeddings contain is crucial because it defines which features a learner should acquire to become a fluent speaker. Rhythm has been regarded as a speaker-identifying feature. Speaker embeddings, however, may fail to capture rhythm features, since the temporal dependency of prosody is likely to be lost through simple averaging. To quantify the degree to which rhythm information is encoded in the speaker embedding, the speaker embeddings were projected onto the feature space by least-squares linear regression. The R-squared values for the rhythm features were consistently low across models with different numbers of parameters, in contrast to the acoustic features, which showed significantly high R-squared values. The results indicate that the utterance-mean embeddings did not encode the speech rhythm of individual speakers. Based on these results, ways to better adopt speaker embeddings in CAPT systems are discussed. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | 한국외국어대학교 외국어교육연구소 | - |
dc.title | 화자 임베딩과 발화 리듬의 연관관계에 대한 연구 | - |
dc.title.alternative | Does the Speaker Embedding Encode Speech Rhythm? | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | 남호성 | - |
dc.identifier.doi | 10.16933/sfle.2021.35.2.131 | - |
dc.identifier.bibliographicCitation | 외국어교육연구, v.35, no.2, pp.131 - 144 | - |
dc.relation.isPartOf | 외국어교육연구 | - |
dc.citation.title | 외국어교육연구 | - |
dc.citation.volume | 35 | - |
dc.citation.number | 2 | - |
dc.citation.startPage | 131 | - |
dc.citation.endPage | 144 | - |
dc.type.rims | ART | - |
dc.identifier.kciid | ART002719268 | - |
dc.description.journalClass | 2 | - |
dc.description.journalRegisteredClass | kci | - |
dc.subject.keywordAuthor | Computer Assisted Pronunciation Training | - |
dc.subject.keywordAuthor | Speaker Embedding | - |
dc.subject.keywordAuthor | Voice Conversion | - |
dc.subject.keywordAuthor | 음성 변환 | - |
dc.subject.keywordAuthor | 컴퓨터 보조 발음 학습 | - |
dc.subject.keywordAuthor | 화자 임베딩 | - |
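
The abstract describes projecting speaker embeddings onto a feature space with least-squares linear regression and reading the R-squared value as a measure of how much of a feature the embedding encodes. The following is a minimal sketch of that kind of probe, not the authors' implementation. The function name `r_squared_from_embeddings`, the embedding dimension, the sample count, and the synthetic data are all assumptions made for illustration, and the R-squared is computed in-sample.

```python
import numpy as np


def r_squared_from_embeddings(embeddings, feature):
    """Fit feature ~ embeddings @ w + b by ordinary least squares; return R^2.

    Hypothetical helper for illustration; not taken from the paper.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(feature, dtype=float)
    # Append a column of ones so the fit includes an intercept term.
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residual = y - X1 @ coef
    ss_res = float(residual @ residual)          # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())  # total variation
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins (assumptions, not data from the study).
rng = np.random.default_rng(0)
n_utterances, dim = 500, 16
emb = rng.normal(size=(n_utterances, dim))

# A feature that is a noisy linear function of the embedding.
encoded = emb @ rng.normal(size=dim) + 0.1 * rng.normal(size=n_utterances)
# A feature generated independently of the embedding.
unrelated = rng.normal(size=n_utterances)

r2_encoded = r_squared_from_embeddings(emb, encoded)
r2_unrelated = r_squared_from_embeddings(emb, unrelated)
print(f"encoded feature   R^2 = {r2_encoded:.3f}")
print(f"unrelated feature R^2 = {r2_unrelated:.3f}")
```

In this toy setup, a feature that is linearly recoverable from the embedding gives an R-squared near 1, while an independent feature gives a value near 0. This mirrors the contrast the abstract reports between acoustic and rhythm features. A held-out split would give a stricter estimate than the in-sample value used here.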