Compact and Computationally Efficient Representation of Deep Neural Networks

Wiedemann, Simon; Mueller, Klaus-Robert; Samek, Wojciech

doi:10.1109/TNNLS.2019.2910073

Full metadata record

DC Field	Value	Language
dc.contributor.author	Wiedemann, Simon	-
dc.contributor.author	Mueller, Klaus-Robert	-
dc.contributor.author	Samek, Wojciech	-
dc.date.accessioned	2021-08-31T09:00:32Z	-
dc.date.available	2021-08-31T09:00:32Z	-
dc.date.created	2021-06-18	-
dc.date.issued	2020-03	-
dc.identifier.issn	2162-237X	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/57562	-
dc.description.abstract	At the core of any inference procedure in deep neural networks are dot product operations, which are the component that requires the most computational resources. For instance, deep neural networks such as VGG-16 require up to 15 giga-operations to perform the dot products in a single forward pass, which results in significant energy consumption and thus limits their use in resource-limited environments, e.g., on embedded devices or smartphones. One common approach to reducing the complexity of inference is to prune and quantize the weight matrices of the neural network. Usually, this results in matrices whose entropy values are low, as measured relative to the empirical probability mass distribution of their elements. In order to exploit such matrices efficiently, one usually relies on, inter alia, sparse matrix representations. However, most of these common matrix storage formats make strong statistical assumptions about the distribution of the elements and therefore cannot efficiently represent the entire set of matrices that exhibit low-entropy statistics (and thus the entire set of compressed neural network weight matrices). In this paper, we address this issue and present new efficient representations for matrices with low-entropy statistics. Like sparse matrix data structures, these formats exploit the statistical properties of the data in order to reduce size and execution complexity. Moreover, we show that the proposed data structures can not only be regarded as a generalization of sparse formats but are also more energy and time efficient under practically relevant assumptions. Finally, we test the storage requirements and execution performance of the proposed formats on compressed neural networks and compare them to dense and sparse representations. We experimentally show that we are able to attain up to 42x compression ratios, 5x speedups, and 90x energy savings when we losslessly convert state-of-the-art networks, such as AlexNet, VGG-16, ResNet152, and DenseNet, into the new data structures and benchmark their respective dot products.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	Compact and Computationally Efficient Representation of Deep Neural Networks	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Mueller, Klaus-Robert	-
dc.identifier.doi	10.1109/TNNLS.2019.2910073	-
dc.identifier.wosid	000521961300006	-
dc.identifier.bibliographicCitation	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, v.31, no.3, pp.772 - 785	-
dc.relation.isPartOf	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS	-
dc.citation.title	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS	-
dc.citation.volume	31	-
dc.citation.number	3	-
dc.citation.startPage	772	-
dc.citation.endPage	785	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture	-
dc.relation.journalWebOfScienceCategory	Computer Science, Theory & Methods	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordAuthor	Computationally efficient deep learning	-
dc.subject.keywordAuthor	data structures	-
dc.subject.keywordAuthor	lossless coding	-
dc.subject.keywordAuthor	neural network compression	-
dc.subject.keywordAuthor	sparse matrices	-
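
The abstract measures a matrix's entropy relative to the empirical probability mass distribution of its elements and notes that low-entropy matrices can be exploited to reduce dot-product cost. The following is a minimal Python sketch of those two ideas, not the data structures proposed in the paper: `empirical_entropy`, `grouped_row_dot`, `grouped_matvec`, and the example matrix `W` are hypothetical names and values chosen for illustration. The grouping step assumes that a quantized row holds few distinct values, so inputs sharing a weight can be summed first and multiplied once.

```python
import math
from collections import Counter, defaultdict


def empirical_entropy(matrix):
    """Shannon entropy, in bits per element, of the empirical
    distribution of the matrix entries."""
    values = [v for row in matrix for v in row]
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def grouped_row_dot(row, x):
    """Dot product of one row with x that performs one multiplication
    per distinct nonzero value instead of one per element."""
    sums = defaultdict(float)
    for w, xi in zip(row, x):
        if w != 0:          # zeros contribute nothing, as in sparse formats
            sums[w] += xi   # accumulate inputs that share the same weight
    return sum(w * s for w, s in sums.items())


def grouped_matvec(matrix, x):
    """Matrix-vector product built from the grouped row dot product."""
    return [grouped_row_dot(row, x) for row in matrix]


# Illustrative quantized matrix with only three distinct values (0, 2, 4).
W = [
    [0, 2, 2, 0],
    [4, 4, 4, 4],
    [2, 0, 4, 2],
    [0, 0, 0, 2],
]
x = [1.0, 2.0, 3.0, 4.0]

print(empirical_entropy(W))
print(grouped_matvec(W, x))
```

Note that the second row of `W` is dense yet needs a single multiplication here, which is the kind of case a purely sparsity-based format would not benefit from.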

Files in This Item: There are no files associated with this item.

Appears in Collections: Graduate School > Department of Artificial Intelligence > 1. Journal Articles

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
