Design of Processing-"Inside"-Memory Optimized for DRAM Behaviors

Lee, Won Jun; Kim, Chang Hyun; Paik, Yoonah; Park, Jongsun; Park, Il; Kim, Seon Wook

doi:10.1109/ACCESS.2019.2924240

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Won Jun	-
dc.contributor.author	Kim, Chang Hyun	-
dc.contributor.author	Paik, Yoonah	-
dc.contributor.author	Park, Jongsun	-
dc.contributor.author	Park, Il	-
dc.contributor.author	Kim, Seon Wook	-
dc.date.accessioned	2021-09-01T22:47:18Z	-
dc.date.available	2021-09-01T22:47:18Z	-
dc.date.created	2021-06-19	-
dc.date.issued	2019	-
dc.identifier.issn	2169-3536	-
dc.identifier.uri	https://scholar.korea.ac.kr/handle/2021.sw.korea/68945	-
dc.description.abstract	The focus of today's computer systems is shifting rapidly from arithmetic to data processing as data volumes grow exponentially. As a result, processing-in-memory (PIM) has been actively studied as a way to process data in or near memory devices, addressing the limited bandwidth and high power consumption caused by data movement between the CPU/GPU and memory. However, most PIM studies so far have designed the processing units only as an accelerator on the base die of 3D-stacked DRAM, not inside the memory itself, and do not service standard DRAM requests during PIM execution. Therefore, in this paper, we show how to design and operate PIM computing units inside DRAM by effectively coordinating with standard DRAM operations while achieving full computing performance and minimizing the implementation cost. To meet these goals, we extend a standard DRAM state diagram to depict the PIM behaviors in the same way that standard DRAM commands are scheduled and operated on DRAM devices, and we exploit several levels of parallelism to overlap memory and computing operations. We also present how all architecture layers, from applications to operating systems, memory controllers, and PIM devices, should work together for effective execution by applying our approaches to our experimental platform. On our HBM2-based experimental platform, which includes 16-cycle MAC (Multiply-and-Add) units and 8-cycle reducers for matrix-vector multiplication, the all-bank and per-bank scheduling achieved 406% and 35.2% faster performance, respectively, for a (1024 x 1024) x (1024 x 1) 8-bit integer matrix-vector multiplication than executing only its operand burst reads, assuming the full external DRAM bandwidth. Note that the performance of PIM on the base die of a 3D-stacked memory can never exceed that provided by the full bandwidth.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.subject	ENERGY	-
dc.subject	COMPRESSION	-
dc.title	Design of Processing-"Inside"-Memory Optimized for DRAM Behaviors	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Park, Jongsun	-
dc.contributor.affiliatedAuthor	Kim, Seon Wook	-
dc.identifier.doi	10.1109/ACCESS.2019.2924240	-
dc.identifier.scopusid	2-s2.0-85068645446	-
dc.identifier.wosid	000475356000001	-
dc.identifier.bibliographicCitation	IEEE ACCESS, v.7, pp.82633 - 82648	-
dc.relation.isPartOf	IEEE ACCESS	-
dc.citation.title	IEEE ACCESS	-
dc.citation.volume	7	-
dc.citation.startPage	82633	-
dc.citation.endPage	82648	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Telecommunications	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.relation.journalWebOfScienceCategory	Telecommunications	-
dc.subject.keywordPlus	ENERGY	-
dc.subject.keywordPlus	COMPRESSION	-
dc.subject.keywordAuthor	Processing-in-memory	-
dc.subject.keywordAuthor	DRAM	-
dc.subject.keywordAuthor	parallelism	-
dc.subject.keywordAuthor	matrix-vector multiplication	-
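
The workload named in the abstract, an 8-bit integer (1024 x 1024) x (1024 x 1) matrix-vector multiplication built from multiply-and-add steps followed by a reduction, can be illustrated with a minimal sketch. This is not the paper's scheduling or hardware design: the bank count, the column-wise partitioning, and all function names are assumptions made only for illustration. The sketch shows that per-bank partial sums followed by a reduction give the same result as a plain matrix-vector product, and it counts the operand bytes that the baseline would read in bursts.

```python
# Illustrative sketch only: NUM_BANKS and the column-wise split are assumptions,
# not the partitioning or scheduling used in the paper.
NUM_BANKS = 16  # hypothetical number of banks


def matvec_reference(matrix, vector):
    """Plain matrix-vector product: one dot product per matrix row."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def matvec_banked(matrix, vector, num_banks=NUM_BANKS):
    """Split each row's columns across banks, accumulate per bank, then reduce."""
    cols = len(vector)
    if cols % num_banks != 0:
        raise ValueError("column count must be divisible by num_banks")
    chunk = cols // num_banks
    result = []
    for row in matrix:
        partials = []
        for bank in range(num_banks):
            lo, hi = bank * chunk, (bank + 1) * chunk
            acc = 0
            for a, x in zip(row[lo:hi], vector[lo:hi]):
                acc += a * x  # multiply-and-add step inside one bank
            partials.append(acc)
        result.append(sum(partials))  # reduction across the bank partial sums
    return result


def operand_bytes(rows, cols, bytes_per_element=1):
    """Bytes of matrix plus vector operands (8-bit integers: 1 byte each)."""
    return rows * cols * bytes_per_element + cols * bytes_per_element


if __name__ == "__main__":
    # Small deterministic int8-range example, kept small so it runs quickly.
    matrix = [[(i * j) % 256 - 128 for j in range(32)] for i in range(8)]
    vector = [(3 * j) % 256 - 128 for j in range(32)]
    assert matvec_banked(matrix, vector, 4) == matvec_reference(matrix, vector)
    print(operand_bytes(1024, 1024))
```

For the problem size in the abstract, `operand_bytes(1024, 1024)` counts 1024 x 1024 matrix bytes plus 1024 vector bytes, which is the data volume the burst-read baseline has to move.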

Appears in Collections: College of Engineering > School of Electrical Engineering > 1. Journal Articles
