Does the Speaker Embedding Encode Speech Rhythm? (화자 임베딩과 발화 리듬의 연관관계에 대한 연구)

김서현; 남호성

doi:10.16933/sfle.2021.35.2.131

Full metadata record

DC Field	Value	Language
dc.contributor.author	김서현	-
dc.contributor.author	남호성	-
dc.date.accessioned	2022-03-06T10:40:36Z	-
dc.date.available	2022-03-06T10:40:36Z	-
dc.date.created	2022-02-10	-
dc.date.issued	2021	-
dc.identifier.issn	1225-4975	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/137973	-
dc.description.abstract	The present study investigates whether speech rhythm is encoded in the utterance-level speaker embedding, which is the average of frame-level speaker embeddings. When speaker encoders are used in Computer Assisted Pronunciation Training (CAPT), identifying what information speaker embeddings contain is crucial because it determines which features a learner must acquire to become a fluent speaker. Rhythm has been regarded as a speaker-identifying feature. Speaker embeddings, however, may fail to capture rhythm features, since the temporal dependency of prosody is likely to be lost through simple averaging. To quantify the degree to which rhythm information is encoded, the speaker embeddings were projected onto the feature space by least-squares linear regression. The R-squared values for the rhythm features were consistently low across models with different numbers of parameters, in contrast to the acoustic features, which showed significantly higher R-squared values. The results indicate that the utterance-mean embeddings did not encode the speech rhythm of individual speakers. Based on these results, ways to better adopt speaker embeddings in CAPT systems are discussed.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	한국외국어대학교 외국어교육연구소	-
dc.title	화자 임베딩과 발화 리듬의 연관관계에 대한 연구	-
dc.title.alternative	Does the Speaker Embedding Encode Speech Rhythm?	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	남호성	-
dc.identifier.doi	10.16933/sfle.2021.35.2.131	-
dc.identifier.bibliographicCitation	외국어교육연구, v.35, no.2, pp.131 - 144	-
dc.relation.isPartOf	외국어교육연구	-
dc.citation.title	외국어교육연구	-
dc.citation.volume	35	-
dc.citation.number	2	-
dc.citation.startPage	131	-
dc.citation.endPage	144	-
dc.type.rims	ART	-
dc.identifier.kciid	ART002719268	-
dc.description.journalClass	2	-
dc.description.journalRegisteredClass	kci	-
dc.subject.keywordAuthor	Computer Assisted Pronunciation Training	-
dc.subject.keywordAuthor	Speaker Embedding	-
dc.subject.keywordAuthor	Voice Conversion	-
dc.subject.keywordAuthor	음성 변환	-
dc.subject.keywordAuthor	컴퓨터 보조 발음 학습	-
dc.subject.keywordAuthor	화자 임베딩	-
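
The probing procedure summarized in the abstract (projecting utterance-level embeddings onto a feature space by least-squares linear regression and reading off R-squared per feature) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the names `r_squared_per_feature`, `acoustic`, and `rhythm` are hypothetical, and the R-squared is computed in-sample. The record does not specify the study's actual encoders, rhythm metrics, or evaluation protocol.

```python
import numpy as np

def r_squared_per_feature(embeddings, features):
    """Regress each feature column on the embeddings by ordinary least
    squares (with an intercept) and return the in-sample R^2 per column."""
    n = embeddings.shape[0]
    X = np.hstack([embeddings, np.ones((n, 1))])      # append intercept column
    W, *_ = np.linalg.lstsq(X, features, rcond=None)  # least-squares weights
    pred = X @ W
    ss_res = ((features - pred) ** 2).sum(axis=0)
    ss_tot = ((features - features.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-ins (assumptions, not the study's data).
rng = np.random.default_rng(0)
n_utts, dim = 500, 16
emb = rng.normal(size=(n_utts, dim))                  # utterance-mean embeddings

# A feature that is linearly recoverable from the embedding.
acoustic = emb @ rng.normal(size=dim) + 0.1 * rng.normal(size=n_utts)
# A feature that is statistically independent of the embedding.
rhythm = rng.normal(size=n_utts)

feats = np.column_stack([acoustic, rhythm])
r2 = r_squared_per_feature(emb, feats)
print("R^2 (acoustic-like, rhythm-like):", r2)
```

In this toy setup the linearly encoded feature yields an R-squared close to 1, while the independent feature stays near 0, which is the kind of contrast the abstract reports between acoustic and rhythm features.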

Files in This Item: There are no files associated with this item.

Appears in Collections: College of Liberal Arts > Department of English Language and Literature > 1. Journal Articles

Related Researcher

Nam, Ho sung: College of Liberal Arts (Department of English Language and Literature)
