Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Studer, Stefan | - |
dc.contributor.author | Bui, Thanh Binh | - |
dc.contributor.author | Drescher, Christian | - |
dc.contributor.author | Hanuschkin, Alexander | - |
dc.contributor.author | Winkler, Ludwig | - |
dc.contributor.author | Peters, Steven | - |
dc.contributor.author | Mueller, Klaus-Robert | - |
dc.date.accessioned | 2022-03-01T01:41:29Z | - |
dc.date.available | 2022-03-01T01:41:29Z | - |
dc.date.created | 2022-02-15 | - |
dc.date.issued | 2021-06 | - |
dc.identifier.issn | 2504-4990 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/137317 | - |
dc.description.abstract | Machine learning is an established and frequently used technique in industry and academia, but a standard process model to improve success and efficiency of machine learning applications is still missing. Project organizations and machine learning practitioners face manifold challenges and risks when developing machine learning applications and have a need for guidance to meet business expectations. This paper therefore proposes a process model for the development of machine learning applications, covering six phases from defining the scope to maintaining the deployed machine learning application. Business and data understanding are executed simultaneously in the first phase, as both have considerable impact on the feasibility of the project. The next phases are comprised of data preparation, modeling, evaluation, and deployment. Special focus is applied to the last phase, as a model running in changing real-time environments requires close monitoring and maintenance to reduce the risk of performance degradation over time. With each task of the process, this work proposes quality assurance methodology that is suitable to address challenges in machine learning development that are identified in the form of risks. The methodology is drawn from practical experience and scientific literature, and has proven to be general and stable. The process model expands on CRISP-DM, a data mining process model that enjoys strong industry support, but fails to address machine learning specific tasks. The presented work proposes an industry- and application-neutral process model tailored for machine learning applications with a focus on technical tasks for quality assurance. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | MDPI | - |
dc.subject | FEATURE-SELECTION | - |
dc.subject | MULTIPLE IMPUTATION | - |
dc.subject | KNOWLEDGE DISCOVERY | - |
dc.subject | NYSTROM METHOD | - |
dc.subject | CLASSIFICATION | - |
dc.subject | MATRIX | - |
dc.subject | CANCER | - |
dc.title | Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Mueller, Klaus-Robert | - |
dc.identifier.doi | 10.3390/make3020020 | - |
dc.identifier.wosid | 000646865900001 | - |
dc.identifier.bibliographicCitation | MACHINE LEARNING AND KNOWLEDGE EXTRACTION, v.3, no.2, pp.392 - 413 | - |
dc.relation.isPartOf | MACHINE LEARNING AND KNOWLEDGE EXTRACTION | - |
dc.citation.title | MACHINE LEARNING AND KNOWLEDGE EXTRACTION | - |
dc.citation.volume | 3 | - |
dc.citation.number | 2 | - |
dc.citation.startPage | 392 | - |
dc.citation.endPage | 413 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Interdisciplinary Applications | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.subject.keywordPlus | CANCER | - |
dc.subject.keywordPlus | CLASSIFICATION | - |
dc.subject.keywordPlus | FEATURE-SELECTION | - |
dc.subject.keywordPlus | KNOWLEDGE DISCOVERY | - |
dc.subject.keywordPlus | MATRIX | - |
dc.subject.keywordPlus | MULTIPLE IMPUTATION | - |
dc.subject.keywordPlus | NYSTROM METHOD | - |
dc.subject.keywordAuthor | automotive industry and academia | - |
dc.subject.keywordAuthor | best practices | - |
dc.subject.keywordAuthor | guidelines | - |
dc.subject.keywordAuthor | machine learning applications | - |
dc.subject.keywordAuthor | process model | - |
dc.subject.keywordAuthor | quality assurance methodology | - |
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.