Detailed Information

Classification of web robots: An empirical study based on over one billion requests

Full metadata record
DC Field: Value
dc.contributor.author: Lee, Junsup
dc.contributor.author: Cha, Sunydeok
dc.contributor.author: Lee, Dongkun
dc.contributor.author: Lee, Hyungkyu
dc.date.accessioned: 2021-09-08T12:02:00Z
dc.date.available: 2021-09-08T12:02:00Z
dc.date.created: 2021-06-11
dc.date.issued: 2009-11
dc.identifier.issn: 0167-4048
dc.identifier.uri: https://scholar.korea.ac.kr/handle/2021.sw.korea/119014
dc.description.abstract: Many studies on the detection and classification of web robots have focused mostly on text crawlers, and their empirical experiments used relatively small data sets collected at universities. In this paper, we analyzed more than one billion requests made to www.microsoft.com over 24 h. Web logs were anonymized to eliminate potential privacy concerns while preserving essential characteristics (e.g., frequency and queries). We developed effective characterization metrics, based on workload characteristics and resource types, for detecting and classifying various web robots, including text crawlers, link checkers, and icon crawlers. As expected, web robot behavior was clearly different from that of typical interactive users, and different types of web robots also exhibited different characteristics. However, comparison of web robots of the same type, text crawlers in particular, revealed different characteristics, thereby enabling characterization with a reasonably high confidence level. We divided the feature metrics into five groups, and the effectiveness of each group in classification is shown in a polar diagram, in decreasing order of effectiveness in the clockwise direction. One can use the findings to infer the likely identity of unknown web robots, and organizations can develop appropriate measures to deal with them. Our analysis is based on recent web log data collected at one of the best-known sites offering a truly global service. Crown Copyright (C) 2009 Published by Elsevier Ltd. All rights reserved.
dc.language: English
dc.language.iso: en
dc.publisher: ELSEVIER ADVANCED TECHNOLOGY
dc.subject: CRAWLER BEHAVIOR
dc.title: Classification of web robots: An empirical study based on over one billion requests
dc.type: Article
dc.contributor.affiliatedAuthor: Cha, Sunydeok
dc.identifier.doi: 10.1016/j.cose.2009.05.004
dc.identifier.scopusid: 2-s2.0-71849091131
dc.identifier.wosid: 000272742700007
dc.identifier.bibliographicCitation: COMPUTERS & SECURITY, v.28, no.8, pp.795 - 802
dc.relation.isPartOf: COMPUTERS & SECURITY
dc.citation.title: COMPUTERS & SECURITY
dc.citation.volume: 28
dc.citation.number: 8
dc.citation.startPage: 795
dc.citation.endPage: 802
dc.type.rims: ART
dc.type.docType: Article
dc.description.journalClass: 1
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
dc.relation.journalWebOfScienceCategory: Computer Science, Information Systems
dc.subject.keywordPlus: CRAWLER BEHAVIOR
dc.subject.keywordAuthor: Web robot classification
dc.subject.keywordAuthor: Web robot detection
dc.subject.keywordAuthor: Web robot characterization
dc.subject.keywordAuthor: Web security
dc.subject.keywordAuthor: Web usage mining
Appears in Collections
Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
