Towards robust explanations for deep neural networks
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Dombrowski, Ann-Kathrin | - |
dc.contributor.author | Anders, Christopher J. | - |
dc.contributor.author | Mueller, Klaus-Robert | - |
dc.contributor.author | Kessel, Pan | - |
dc.date.accessioned | 2022-02-23T03:41:25Z | - |
dc.date.available | 2022-02-23T03:41:25Z | - |
dc.date.created | 2022-02-11 | - |
dc.date.issued | 2022-01 | - |
dc.identifier.issn | 0031-3203 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/136578 | - |
dc.description.abstract | Explanation methods shed light on the decision process of black-box classifiers such as deep neural networks. But their usefulness can be compromised because they are susceptible to manipulations. With this work, we aim to enhance the resilience of explanations. We develop a unified theoretical framework for deriving bounds on the maximal manipulability of a model. Based on these theoretical insights, we present three different techniques to boost robustness against manipulation: training with weight decay, smoothing activation functions, and minimizing the Hessian of the network. Our experimental results confirm the effectiveness of these approaches. (c) 2021 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). | - |
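The abstract names three robustness techniques. Two of them can be illustrated in a few lines: replacing ReLU with a softplus activation (which, unlike ReLU, is twice differentiable, so the curvature terms that make gradient-based explanations manipulable stay bounded), and adding weight decay to the gradient update (which shrinks weight norms). This is a minimal NumPy sketch of the general ideas only, not the authors' implementation; the function names, the `beta` parameter, and the update rule are illustrative assumptions.

```python
import numpy as np

def relu(x):
    """Standard ReLU; its derivative is discontinuous at 0."""
    return np.maximum(0.0, x)

def softplus(x, beta=5.0):
    """Smooth stand-in for ReLU (illustrative; beta is a sharpness knob).

    softplus(x) = log(1 + exp(beta * x)) / beta converges to ReLU as
    beta grows, but is infinitely differentiable everywhere, which
    keeps the network's second derivatives (Hessian) finite.
    """
    return np.log1p(np.exp(beta * x)) / beta

def sgd_step_with_weight_decay(w, grad, lr=0.1, weight_decay=1e-2):
    """One SGD step with L2 weight decay (illustrative update rule).

    Weight decay adds weight_decay * w to the gradient, pulling the
    weights toward zero and thereby reducing the network's effective
    Lipschitz constant.
    """
    return w - lr * (grad + weight_decay * w)
```

For example, `softplus(10.0)` is nearly `relu(10.0) == 10.0`, while near zero the two differ but softplus remains smooth; and with a zero loss gradient, `sgd_step_with_weight_decay` still shrinks the weights slightly each step.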
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | ELSEVIER SCI LTD | - |
dc.title | Towards robust explanations for deep neural networks | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Mueller, Klaus-Robert | - |
dc.identifier.doi | 10.1016/j.patcog.2021.108194 | - |
dc.identifier.scopusid | 2-s2.0-85112531912 | - |
dc.identifier.wosid | 000701175900010 | - |
dc.identifier.bibliographicCitation | PATTERN RECOGNITION, v.121 | - |
dc.relation.isPartOf | PATTERN RECOGNITION | - |
dc.citation.title | PATTERN RECOGNITION | - |
dc.citation.volume | 121 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.subject.keywordAuthor | Explanation method | - |
dc.subject.keywordAuthor | Saliency map | - |
dc.subject.keywordAuthor | Adversarial attacks | - |
dc.subject.keywordAuthor | Manipulation | - |
dc.subject.keywordAuthor | Neural networks | - |
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
(02841) 145 Anam-ro, Seongbuk-gu, Seoul | Tel: 02-3290-1114
COPYRIGHT © 2021 Korea University. All Rights Reserved.
Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.