Human-guided auto-labeling for network traffic data: The GELM approach

Kim, Meejoung; Lee, Inkyu

doi:10.1016/j.neunet.2022.05.007

Authors: Kim, Meejoung; Lee, Inkyu

Issue Date: August 2022

Publisher: PERGAMON-ELSEVIER SCIENCE LTD

Keywords: Human-guided labeling; Auto-labeling process; Generalized extreme learning machine; Moore-Penrose generalized inverse; Network traffic; Attack prediction

Citation: NEURAL NETWORKS, v.152, pp.510 - 526

Indexed: SCIE; SCOPUS

Journal Title: NEURAL NETWORKS

Volume: 152

Start Page: 510

End Page: 526

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/142741

DOI: 10.1016/j.neunet.2022.05.007

ISSN: 0893-6080

Abstract: Data labeling is crucial in many areas, including network security, and is a prerequisite for applying statistics-based classification and supervised learning techniques. Developing labeling methods that ensure good performance is therefore important. We propose a human-guided auto-labeling algorithm that draws on the self-supervised learning concept, with the aim of labeling data quickly, accurately, and consistently. It consists of three processes: auto-labeling, validation, and update. In the auto-labeling process, a labeling scheme based on weighted features is proposed; in the validation process, the generalized extreme learning machine (GELM), which enables fast training, is applied to validate the assigned labels. In the update process, two different approaches to labeling new data are considered in order to investigate labeling speed and accuracy. We conduct experiments to verify the suitability and accuracy of the algorithm for network traffic, applying it to five traffic datasets, some of which include distributed denial of service (DDoS), DoS, BruteForce, and PortScan attacks. Numerical results show that the algorithm labels unlabeled datasets quickly, accurately, and consistently, and that the GELM's learning speed enables data to be labeled in real time. They also show that performance with auto-labels and with conventional labels is nearly identical on datasets containing only DDoS attacks, which implies that the algorithm is well suited to such datasets. However, the performance differences between the two kinds of labels are not negligible on datasets that include various attacks. Several possible reasons require further investigation, including the selected features and the reliability of the conventional labels. Even with this limitation of the current study, the algorithm will provide a criterion for real-time data labeling, a need that arises in many areas. (C) 2022 Elsevier Ltd. All rights reserved.
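
For readers unfamiliar with the keywords "extreme learning machine" and "Moore-Penrose generalized inverse", the following is a minimal sketch of a standard single-hidden-layer extreme learning machine, not the paper's GELM, whose details are not given in this record. Hidden-layer weights are drawn at random and left untrained, and the output weights are obtained in one step from the pseudoinverse of the hidden-layer output matrix, which is why training is fast. The function names `train_elm` and `predict_elm`, the hidden-layer size, the tanh activation, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np


def train_elm(X, y, n_hidden=32, seed=0):
    """Fit a basic ELM classifier (illustrative only, not the paper's GELM)."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    # Random input weights and biases; these are never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)        # hidden-layer output matrix
    T = np.eye(n_classes)[y]      # one-hot targets
    # Least-squares output weights via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def predict_elm(model, X):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


# Synthetic stand-in data (an assumption): two well-separated clusters
# playing the role of "benign" (0) and "attack" (1) traffic records.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=-2.0, scale=0.5, size=(100, 4)),
    rng.normal(loc=2.0, scale=0.5, size=(100, 4)),
])
y = np.array([0] * 100 + [1] * 100)

model = train_elm(X, y)
accuracy = float(np.mean(predict_elm(model, X) == y))
print(f"agreement with given labels: {accuracy:.3f}")
```

In a validation step of the kind the abstract describes, a model like this could be trained on automatically assigned labels and its predictions compared against them; how the paper actually performs that comparison is not specified here.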

Appears in Collections: College of Engineering > School of Electrical Engineering > 1. Journal Articles

Related Researcher: Lee, In kyu, College of Engineering (School of Electrical Engineering)
