An SSH predictive model using machine learning with web proxy session logs

Lee, Junwon; Lee, Heejo

doi:10.1007/s10207-021-00555-6

Authors: Lee, Junwon; Lee, Heejo

Issue Date: Apr-2022

Publisher: SPRINGER

Keywords: Web proxy; SSH; HTTP CONNECT; TCP tunneling; Machine learning; Random forest; Decision tree; PCA

Citation: INTERNATIONAL JOURNAL OF INFORMATION SECURITY, v.21, no.2, pp.311 - 322

Indexed: SCIE; SCOPUS

Journal Title: INTERNATIONAL JOURNAL OF INFORMATION SECURITY

Volume: 21

Number: 2

Start Page: 311

End Page: 322

URI: https://scholar.korea.ac.kr/handle/2021.sw.korea/142137

DOI: 10.1007/s10207-021-00555-6

ISSN: 1615-5262

Abstract: An adversary can use SSH communication as a route for information leakage or hacking. Many studies have focused on TCP header analysis to detect encrypted communication. However, SSH detection based on TCP header analysis is limited when an adversary changes the TCP port information or modifies components of the SSH protocol. Various machine-learning (ML) techniques have been introduced to enhance network traffic classification by analyzing TCP headers, and most ML-based traffic classification research has analyzed network packet flows. However, because of the complex structure and varied implementations of the TCP protocol, reassembling network packet flows requires considerable time and resources. This paper addresses the problems of network packet analysis by using web proxy session logs, which do not require packet reassembly to prepare a dataset for analysis. Moreover, we propose a hybrid predictive model suited to web proxy session log analysis. In the modeling process, we collected web proxy logs from an actual network of ICT companies (more than 10,000 employees, Seoul, South Korea) and used the random forest and decision tree algorithms for supervised learning. The detection rate (DR) for the training dataset was 99.9%, which is similar to or higher than that of other studies using ML and deep learning. Using the DARPA99 dataset, we showed that the DR and false positive rate (FPR) of our proposed model were better than those achieved by Alshammari et al.'s model. We expect that the proposed predictive model can be used to block illegal attempts at SSH communication over HTTP CONNECT that change the destination port, and to detect novel illegal communication protocols.
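The abstract reports its results as a detection rate (DR) and a false positive rate (FPR) without defining them. The sketch below assumes the definitions commonly used in intrusion detection, DR = TP / (TP + FN) and FPR = FP / (FP + TN), which may differ from the paper's exact formulation. The label values (`"ssh"`, `"other"`) and the sample predictions are hypothetical and are not taken from the paper's data.

```python
def confusion_counts(y_true, y_pred, positive="ssh"):
    """Count TP, FP, TN, FN for a binary labelling of sessions."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # SSH session correctly flagged
            else:
                fp += 1  # non-SSH session wrongly flagged
        else:
            if truth == positive:
                fn += 1  # SSH session missed
            else:
                tn += 1  # non-SSH session correctly passed
    return tp, fp, tn, fn


def detection_rate(tp, fn):
    """Assumed definition: share of actual SSH sessions that were flagged."""
    total = tp + fn
    return tp / total if total else 0.0


def false_positive_rate(fp, tn):
    """Assumed definition: share of non-SSH sessions that were flagged."""
    total = fp + tn
    return fp / total if total else 0.0


# Hypothetical ground truth and model output for eight proxy sessions.
y_true = ["ssh", "ssh", "ssh", "other", "other", "other", "other", "ssh"]
y_pred = ["ssh", "ssh", "other", "other", "ssh", "other", "other", "ssh"]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
dr = detection_rate(tp, fn)
fpr = false_positive_rate(fp, tn)
print(f"DR={dr:.2f} FPR={fpr:.2f}")
```

Under these assumed definitions, a model is better when its DR is higher and its FPR is lower, which is the sense in which the abstract compares the proposed model with Alshammari et al.'s.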

Appears in Collections: Graduate School > Department of Computer Science and Engineering > 1. Journal Articles

Related Researcher

Lee, Hee jo: Department of Computer Science
