Image classification and captioning model considering a CAM-based disagreement loss

Yoon, Yeo Chan; Park, So Young; Park, Soo Myoung; Lim, Heuiseok

doi:10.4218/etrij.2018-0621

Authors: Yoon, Yeo Chan; Park, So Young; Park, Soo Myoung; Lim, Heuiseok

Issue Date: February 2020

Publisher: WILEY

Keywords: deep learning; image captioning; image classification

Citation: ETRI JOURNAL, v.42, no.1, pp.67 - 77

Indexed: SCIE; SCOPUS; KCI

Journal Title: ETRI JOURNAL

Volume: 42

Number: 1

Start Page: 67

End Page: 77

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/57765

DOI: 10.4218/etrij.2018-0621

ISSN: 1225-6463

Abstract: Image captioning has received significant interest in recent years, and notable results have been achieved. Most previous approaches have focused on generating visual descriptions from images, whereas only a few approaches have exploited visual descriptions for image classification. This study demonstrates that good performance can be achieved for both description generation and image classification through an end-to-end joint learning approach with a loss function that encourages the two tasks to reach a consensus. Given images and visual descriptions, the proposed model learns a multimodal intermediate embedding that represents both the textual and visual characteristics of an object. Sharing this multimodal embedding improves performance on both tasks. Through a novel loss function based on class activation mapping, which localizes the discriminative image region of a model, we achieve a higher score when the captioning and classification models reach a consensus on the key parts of the object. Using the proposed model, we achieved substantially improved performance for each task on the UCSD Birds and Oxford Flowers datasets.
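
Illustrative sketch (not taken from the paper): the abstract does not state the exact form of the CAM-based loss, so the NumPy code below is only one plausible reading. It assumes the standard class activation map, a class-weighted sum of the final convolutional feature maps, and it assumes that disagreement between the two models is measured as one minus the cosine similarity of their normalized maps. The function names, the cosine measure, and the array shapes are all hypothetical choices made for illustration.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Weighted sum of feature maps: (C, H, W) and (C,) -> (H, W)."""
    return np.tensordot(class_weights, features, axes=([0], [0]))


def normalize_map(cam, eps=1e-8):
    """Clip negative evidence and scale the map so it sums to about 1."""
    cam = np.maximum(cam, 0.0)
    return cam / (cam.sum() + eps)


def disagreement_loss(cam_a, cam_b, eps=1e-8):
    """ASSUMED measure: 1 - cosine similarity of the two normalized maps.

    Close to 0 when both models attend to the same region,
    close to 1 when their attended regions do not overlap.
    """
    a = normalize_map(cam_a).ravel()
    b = normalize_map(cam_b).ravel()
    cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - cosine


# Toy usage with hypothetical shapes: 8 channels on a 7x7 grid.
rng = np.random.default_rng(0)
features = rng.random((8, 7, 7))
classifier_cam = class_activation_map(features, rng.random(8))
captioner_cam = class_activation_map(features, rng.random(8))
loss = disagreement_loss(classifier_cam, captioner_cam)
```

In a joint training setup, a term of this kind would presumably be added to the classification and captioning losses with a weighting coefficient; that weighting is likewise not specified in the abstract.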

Files in This Item: There are no files associated with this item.

Appears in Collections: Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
