BERT Learns More than Word Frequency Information: A Case Study of Do-Be Constructions

신운섭; 송상헌

doi:10.18855/lisoko.2022.47.3.004

Full metadata record

DC Field	Value	Language
dc.contributor.author	신운섭	-
dc.contributor.author	송상헌	-
dc.date.accessioned	2022-10-06T13:41:57Z	-
dc.date.available	2022-10-06T13:41:57Z	-
dc.date.created	2022-10-06	-
dc.date.issued	2022	-
dc.identifier.issn	1229-4039	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/144123	-
dc.description.abstract	This study aims to understand BERT’s linguistic ability using naturally occurring data. In particular, the study collected marginal language data, such as "what we do is create Frankenstein", which is referred to as a Do-Be construction (DBC) (Flickinger & Wasow, 2013). Using web corpora, the study first collected 17,737 instances of the DBC across text genres and English dialects. The corpus analysis supports the idea that the DBC is a computationally challenging phenomenon for data-driven language systems because of its statistical sparsity and linguistic complexity. With manual annotations of DBCs, the study designed two computational prediction tasks, subject-verb agreement and synonym substitution, based on the introspective judgments of linguists. The study found that BERT is highly sensitive to the linguistic acceptability of grammatical forms and felicitous words in the prediction tasks, even though the target phenomenon is rarely observed in corpus data. These results show that the neural language model BERT can learn abstract linguistic properties beyond surface frequency information.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	한국언어학회	-
dc.title	BERT Learns More than Word Frequency Information: A Case Study of Do-Be Constructions	-
dc.title.alternative	BERT Learns More than Word Frequency Information: A Case Study of Do-Be Constructions	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	송상헌	-
dc.identifier.doi	10.18855/lisoko.2022.47.3.004	-
dc.identifier.bibliographicCitation	언어, v.47, no.3, pp.467 - 489	-
dc.relation.isPartOf	언어	-
dc.citation.title	언어	-
dc.citation.volume	47	-
dc.citation.number	3	-
dc.citation.startPage	467	-
dc.citation.endPage	489	-
dc.type.rims	ART	-
dc.identifier.kciid	ART002882708	-
dc.description.journalClass	2	-
dc.description.journalRegisteredClass	kci	-
dc.subject.keywordAuthor	Do-Be construction	-
dc.subject.keywordAuthor	agreement attraction	-
dc.subject.keywordAuthor	neural language model	-
dc.subject.keywordAuthor	synonym substitution	-
dc.subject.keywordAuthor	web corpora	-

Files in This Item: There are no files associated with this item.

Appears in Collections: College of Liberal Arts > Department of Linguistics > 1. Journal Articles
