Three-stream fusion network for first-person interaction recognition
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Ye-Ji | - |
dc.contributor.author | Lee, Dong-Gyu | - |
dc.contributor.author | Lee, Seong-Whan | - |
dc.date.accessioned | 2021-08-30T20:31:50Z | - |
dc.date.available | 2021-08-30T20:31:50Z | - |
dc.date.created | 2021-06-18 | - |
dc.date.issued | 2020-07 | - |
dc.identifier.issn | 0031-3203 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/54928 | - |
dc.description.abstract | First-person interaction recognition is a challenging task because of unstable video conditions resulting from the camera wearer's movement. For human interaction recognition from a first-person viewpoint, this paper proposes a three-stream fusion network with two main parts: a three-stream architecture and three-stream correlation fusion. The three-stream architecture captures the characteristics of the target appearance, target motion, and camera ego-motion. Meanwhile, the three-stream correlation fusion combines the feature maps of the three streams to consider the correlations among the target appearance, target motion, and camera ego-motion. The fused feature vector is robust to camera movement and compensates for the noise of the camera ego-motion. Short-term intervals are modeled using the fused feature vector, and a long short-term memory (LSTM) model considers the temporal dynamics of the video. We evaluated the proposed method on two public benchmark datasets to validate the effectiveness of our approach. The experimental results show that the proposed fusion method successfully generated a discriminative feature vector, and our network outperformed all competing activity recognition methods in first-person videos in which considerable camera ego-motion occurs. © 2020 Published by Elsevier Ltd. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | ELSEVIER SCI LTD | - |
dc.title | Three-stream fusion network for first-person interaction recognition | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Lee, Seong-Whan | - |
dc.identifier.doi | 10.1016/j.patcog.2020.107279 | - |
dc.identifier.scopusid | 2-s2.0-85079886700 | - |
dc.identifier.wosid | 000530845000025 | - |
dc.identifier.bibliographicCitation | PATTERN RECOGNITION, v.103 | - |
dc.relation.isPartOf | PATTERN RECOGNITION | - |
dc.citation.title | PATTERN RECOGNITION | - |
dc.citation.volume | 103 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.subject.keywordAuthor | First-person vision | - |
dc.subject.keywordAuthor | First-person interaction recognition | - |
dc.subject.keywordAuthor | Three-stream fusion network | - |
dc.subject.keywordAuthor | Three-stream correlation fusion | - |
dc.subject.keywordAuthor | Camera ego-motion | - |
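The abstract describes fusing feature maps from three streams (target appearance, target motion, camera ego-motion) so that their pairwise correlations are preserved, before an LSTM models the temporal dynamics. The following is a minimal NumPy sketch of that idea; the element-wise-product fusion, the feature dimensions, and the function name `correlation_fuse` are illustrative assumptions, not the paper's exact operations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-frame feature maps for the three streams over one
# short-term interval: T frames, D-dimensional features per stream.
T, D = 8, 16
appearance = rng.standard_normal((T, D))   # target appearance stream
motion = rng.standard_normal((T, D))       # target motion stream
ego_motion = rng.standard_normal((T, D))   # camera ego-motion stream

def correlation_fuse(a, m, e):
    """Fuse three streams while keeping their pairwise correlations.

    Element-wise products (a*m, a*e, m*e) stand in for the paper's
    correlation fusion; this is an assumed, simplified operation.
    """
    pairwise = np.concatenate([a * m, a * e, m * e], axis=-1)
    return np.concatenate([a, m, e, pairwise], axis=-1)

fused = correlation_fuse(appearance, motion, ego_motion)
# A temporal model such as an LSTM would then consume `fused`
# frame by frame to capture the video's long-term dynamics.
print(fused.shape)  # → (8, 96): 3 streams + 3 pairwise terms, 6*D dims
```

In this sketch the fused vector concatenates the raw streams with their pairwise products, so a downstream classifier can weigh both individual-stream evidence and cross-stream correlations.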
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.