Overlapping computation and communication of three-dimensional FDTD on a GPU cluster
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Ki-Hwan | - |
dc.contributor.author | Park, Q-Han | - |
dc.date.accessioned | 2021-09-06T13:58:24Z | - |
dc.date.available | 2021-09-06T13:58:24Z | - |
dc.date.created | 2021-06-14 | - |
dc.date.issued | 2012-11 | - |
dc.identifier.issn | 0010-4655 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/107118 | - |
dc.description.abstract | Large-scale electromagnetic field simulations using the FDTD (finite-difference time-domain) method require the use of GPU (graphics processing unit) clusters. However, the communication overhead caused by slow interconnections becomes a major performance bottleneck. In this paper, as a way to remove the bottleneck, we propose the 'kernel-split method' and the 'host-buffer method', which overlap computation and communication for the FDTD simulation on the GPU cluster. The host-buffer method in particular enables overlapping without any modifications to the update kernels that are already in use. We also present theoretical formulas to predict the overlap threshold and the total throughput for each method. By using our overlap methods with 6 GPU nodes, we demonstrate that the total performance of 3D FDTD reaches 92% of a six-fold increase, which is the upper limit that would be reached if there were no communication overhead. (C) 2012 Elsevier B.V. All rights reserved. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | ELSEVIER | - |
dc.subject | IMPLEMENTATION | - |
dc.title | Overlapping computation and communication of three-dimensional FDTD on a GPU cluster | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Park, Q-Han | - |
dc.identifier.doi | 10.1016/j.cpc.2012.06.003 | - |
dc.identifier.scopusid | 2-s2.0-84864442871 | - |
dc.identifier.wosid | 000308122000005 | - |
dc.identifier.bibliographicCitation | COMPUTER PHYSICS COMMUNICATIONS, v.183, no.11, pp.2364 - 2369 | - |
dc.relation.isPartOf | COMPUTER PHYSICS COMMUNICATIONS | - |
dc.citation.title | COMPUTER PHYSICS COMMUNICATIONS | - |
dc.citation.volume | 183 | - |
dc.citation.number | 11 | - |
dc.citation.startPage | 2364 | - |
dc.citation.endPage | 2369 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Physics | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Interdisciplinary Applications | - |
dc.relation.journalWebOfScienceCategory | Physics, Mathematical | - |
dc.subject.keywordPlus | IMPLEMENTATION | - |
dc.subject.keywordAuthor | FDTD | - |
dc.subject.keywordAuthor | GPU cluster | - |
dc.subject.keywordAuthor | CUDA | - |
dc.subject.keywordAuthor | OpenCL | - |
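
The abstract's headline figure and the idea of hiding communication behind computation can be illustrated with a small calculation. The sketch below is not the paper's formula for the overlap threshold or throughput, which this record does not reproduce. It assumes a generic idealized model in which a time step costs `t_compute + t_comm` without overlap and `max(t_compute, t_comm)` with full overlap. The function names and the timing values are hypothetical and chosen only for illustration.

```python
def step_time(t_compute, t_comm, overlap):
    """Time for one time step on one node, in arbitrary units (assumed model)."""
    if overlap:
        # Ideal overlap: the shorter phase is hidden behind the longer one.
        return max(t_compute, t_comm)
    # No overlap: the boundary exchange runs after the field update.
    return t_compute + t_comm


def speedup(n_nodes, t_single, t_comm, overlap):
    """Speedup over a single node when the grid is split evenly across n_nodes."""
    t_compute = t_single / n_nodes
    return t_single / step_time(t_compute, t_comm, overlap)


# Figure reported in the abstract: 92% of the ideal six-fold increase.
reported_speedup = 0.92 * 6

# Hypothetical timings, not measurements from the paper.
no_overlap = speedup(6, t_single=6.0, t_comm=0.5, overlap=False)
with_overlap = speedup(6, t_single=6.0, t_comm=0.5, overlap=True)

print(f"reported: {reported_speedup:.2f}x of a possible 6x")
print(f"model without overlap: {no_overlap:.2f}x")
print(f"model with overlap: {with_overlap:.2f}x")
```

In this simplified model, communication is fully hidden whenever `t_comm` does not exceed `t_compute`. Past that point the step time is bound by communication, and adding nodes no longer helps.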