BERT Learns More than Word Frequency Information: A Case Study of Do-Be Constructions
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 신운섭 | - |
dc.contributor.author | 송상헌 | - |
dc.date.accessioned | 2022-10-06T13:41:57Z | - |
dc.date.available | 2022-10-06T13:41:57Z | - |
dc.date.created | 2022-10-06 | - |
dc.date.issued | 2022 | - |
dc.identifier.issn | 1229-4039 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/144123 | - |
dc.description.abstract | This study aims to understand BERT’s linguistic ability using naturally occurring data. In particular, the study collected marginal language data, such as "what we do is create Frankenstein", which is referred to as a Do-Be construction (DBC) (Flickinger & Wasow, 2013). Using web corpora, the study first collected 17,737 instances of the DBC across text genres and English dialects. The corpus analysis supports the idea that the DBC is a computationally challenging phenomenon for data-driven language systems due to its statistical sparsity and linguistic complexity. With manual annotations of DBCs, the study designed two computational prediction tasks, subject-verb agreement and synonym substitution, based on the introspective judgment of linguists. The study found that BERT is highly sensitive to the linguistic acceptability of grammatical forms and felicitous words in the prediction tasks, even though the target phenomenon is rarely observed in corpus data. These results show that the neural language model, BERT, can learn abstract linguistic properties beyond surface frequency information. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | 한국언어학회 | - |
dc.title | BERT Learns More than Word Frequency Information: A Case Study of Do-Be Constructions | - |
dc.title.alternative | BERT Learns More than Word Frequency Information: A Case Study of Do-Be Constructions | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | 송상헌 | - |
dc.identifier.doi | 10.18855/lisoko.2022.47.3.004 | - |
dc.identifier.bibliographicCitation | 언어, v.47, no.3, pp.467 - 489 | - |
dc.relation.isPartOf | 언어 | - |
dc.citation.title | 언어 | - |
dc.citation.volume | 47 | - |
dc.citation.number | 3 | - |
dc.citation.startPage | 467 | - |
dc.citation.endPage | 489 | - |
dc.type.rims | ART | - |
dc.identifier.kciid | ART002882708 | - |
dc.description.journalClass | 2 | - |
dc.description.journalRegisteredClass | kci | - |
dc.subject.keywordAuthor | Do-Be construction | - |
dc.subject.keywordAuthor | agreement attraction | - |
dc.subject.keywordAuthor | neural language model | - |
dc.subject.keywordAuthor | synonym substitution | - |
dc.subject.keywordAuthor | web corpora | - |