CATs++: Boosting Cost Aggregation with Convolutions and Transformers
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cho, S. | - |
dc.contributor.author | Hong, S. | - |
dc.contributor.author | Kim, S. | - |
dc.date.accessioned | 2022-12-11T16:41:15Z | - |
dc.date.available | 2022-12-11T16:41:15Z | - |
dc.date.created | 2022-12-08 | - |
dc.date.issued | 2022 | - |
dc.identifier.issn | 0162-8828 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/147045 | - |
dc.description.abstract | Cost aggregation is a process in image matching tasks that aims to disambiguate noisy matching scores. Existing methods generally tackle this with hand-crafted or CNN-based techniques, which either lack robustness to severe deformations or inherit the limitations of CNNs, failing to discriminate incorrect matches due to limited receptive fields and inadaptability. In this paper, we introduce Cost Aggregation with Transformers (CATs) to tackle this by exploring global consensus among the initial correlation maps, with the help of architectural designs that allow us to benefit from the global receptive fields of the self-attention mechanism. To this end, we include appearance affinity modeling, which helps to disambiguate the noisy initial correlation maps. Furthermore, we introduce additional techniques, including multi-level aggregation to exploit the rich semantics present at different feature levels, and swapping self-attention to obtain reciprocal matching scores that act as a regularization. Although CATs attains competitive performance, it faces limitations, *i.e.*, high computational costs, which restrict its applicability to limited resolutions and can hurt performance. To overcome this, we propose CATs++, an extension of CATs. Concretely, we introduce early convolutions prior to cost aggregation with a transformer to control the number of tokens and inject convolutional inductive bias, then propose a novel transformer architecture for both efficient and effective cost aggregation, which results in an apparent performance boost and cost reduction. With the reduced costs, we are able to compose our network with a hierarchical structure to process higher-resolution inputs. We show that the proposed method with these components integrated outperforms the previous state-of-the-art methods by large margins. Codes and pretrained weights are available at: https://ku-cvlab.github.io/CATs-PlusPlus-Project-Page/ | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | IEEE Computer Society | - |
dc.title | CATs++: Boosting Cost Aggregation with Convolutions and Transformers | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Kim, S. | - |
dc.identifier.doi | 10.1109/TPAMI.2022.3218727 | - |
dc.identifier.scopusid | 2-s2.0-85141643700 | - |
dc.identifier.bibliographicCitation | IEEE Transactions on Pattern Analysis and Machine Intelligence, pp.1 - 20 | - |
dc.relation.isPartOf | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
dc.citation.title | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
dc.citation.startPage | 1 | - |
dc.citation.endPage | 20 | - |
dc.type.rims | ART | - |
dc.type.docType | Article in Press | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordAuthor | Computer architecture | - |
dc.subject.keywordAuthor | Correlation | - |
dc.subject.keywordAuthor | cost aggregation | - |
dc.subject.keywordAuthor | Costs | - |
dc.subject.keywordAuthor | efficient transformer | - |
dc.subject.keywordAuthor | Feature extraction | - |
dc.subject.keywordAuthor | Semantic visual correspondence | - |
dc.subject.keywordAuthor | Semantics | - |
dc.subject.keywordAuthor | Task analysis | - |
dc.subject.keywordAuthor | Transformers | - |