DimensionSlice: A main-memory data layout for fast scans of multidimensional data
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Suh, Ilhyun | - |
dc.contributor.author | Chung, Yon Dohn | - |
dc.date.accessioned | 2021-08-30T07:06:24Z | - |
dc.date.available | 2021-08-30T07:06:24Z | - |
dc.date.created | 2021-06-18 | - |
dc.date.issued | 2020-12 | - |
dc.identifier.issn | 0306-4379 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/51342 | - |
dc.description.abstract | Multidimensional data are exploited in many application areas such as scientific data analysis, business intelligence, and geographic information systems. One of the most frequent operations applied to such multidimensional data is the selection of a subspace of the given multidimensional space, which involves predicate evaluation on multiple dimensions. Existing main-memory data layouts optimized for evaluating predicates on the columnar data can be used to accelerate the subspace extraction by sequentially performing filter scans on each dimension one at a time. However, optimization opportunities emerge if we can consider all predicates together. In this paper, we propose DimensionSlice, a new main-memory data layout optimized for evaluating predicates on multiple dimensions. More specifically, the dimension values are sliced into portions and the portions with the same order of each dimension are arranged together. Multiple predicates are simultaneously evaluated with the sliced dimension values during the scan. In addition, by storing the different portions separately, unnecessary loads and computations of lower portions can be eliminated if the evaluation results are assured after examining the upper portions. For further acceleration of scans, the DimensionSlice layout is designed to easily leverage the SIMD capabilities that most mainstream processors are equipped with. Through experiments, we demonstrate the performance gains of the proposed method over the columnar main-memory layout that evaluates the partial predicates one dimension at a time. We also show that the proposed method outperforms the state-of-the-art multidimensional index structure when the selectivity is over a very low threshold. (C) 2020 Elsevier Ltd. All rights reserved. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | - |
dc.subject | DATA-MANAGEMENT | - |
dc.title | DimensionSlice: A main-memory data layout for fast scans of multidimensional data | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Chung, Yon Dohn | - |
dc.identifier.doi | 10.1016/j.is.2020.101602 | - |
dc.identifier.scopusid | 2-s2.0-85088831965 | - |
dc.identifier.wosid | 000567083300006 | - |
dc.identifier.bibliographicCitation | INFORMATION SYSTEMS, v.94 | - |
dc.relation.isPartOf | INFORMATION SYSTEMS | - |
dc.citation.title | INFORMATION SYSTEMS | - |
dc.citation.volume | 94 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.subject.keywordPlus | DATA-MANAGEMENT | - |
dc.subject.keywordAuthor | Multidimensional data | - |
dc.subject.keywordAuthor | Data layout | - |
dc.subject.keywordAuthor | Main-memory processing | - |
dc.subject.keywordAuthor | SIMD | - |
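
The slicing idea described in the abstract can be illustrated with a small scalar sketch. It is not the authors' implementation: the abstract does not give the portion width, the physical arrangement, or the SIMD kernels, so the 16-bit values, the 8-bit portions, and the names `build_layout` and `scan_less_than` are all assumptions made for illustration. The sketch evaluates a conjunction of `value < bound` predicates over every dimension at once and reads the lower portions of a row only when the upper portions leave the result undecided.

```python
from typing import List, Sequence, Tuple

PORTION_BITS = 8                      # assumed portion width (not from the paper)
LOW_MASK = (1 << PORTION_BITS) - 1    # mask selecting the lower portion

Layout = Tuple[List[List[int]], List[List[int]]]


def build_layout(rows: Sequence[Sequence[int]]) -> Layout:
    """Split each 16-bit value into an upper and a lower portion.

    Portions of the same order are kept together: one store holds the
    upper portions of all dimensions, another holds the lower portions.
    """
    upper = [[v >> PORTION_BITS for v in row] for row in rows]
    lower = [[v & LOW_MASK for v in row] for row in rows]
    return upper, lower


def scan_less_than(layout: Layout, bounds: Sequence[int]) -> Tuple[List[int], int]:
    """Return ids of rows with row[d] < bounds[d] for every dimension d.

    Also returns how many rows needed their lower portions loaded.
    """
    upper, lower = layout
    bound_up = [b >> PORTION_BITS for b in bounds]
    bound_lo = [b & LOW_MASK for b in bounds]

    matches: List[int] = []
    lower_loads = 0
    for row_id, ups in enumerate(upper):
        # A larger upper portion in any dimension already fails the conjunction.
        if any(u > b for u, b in zip(ups, bound_up)):
            continue
        # Dimensions whose upper portions tie are still undecided.
        undecided = [d for d, (u, b) in enumerate(zip(ups, bound_up)) if u == b]
        if undecided:
            lower_loads += 1
            lows = lower[row_id]
            if not all(lows[d] < bound_lo[d] for d in undecided):
                continue
        matches.append(row_id)
    return matches, lower_loads


rows = [(100, 200), (300, 50), (1000, 1000), (256, 10), (511, 511)]
bounds = (300, 260)
matches, lower_loads = scan_less_than(build_layout(rows), bounds)
print(matches, lower_loads)
```

In this example two of the five rows are resolved from the upper portions alone, which is the saving the abstract attributes to storing the portions separately. A SIMD version would compare many packed portions per instruction; that part is left out here.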