Crawler detection: A Bayesian approach

Athena Stassopoulou, Marios D. Dikaiakos

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

In this paper, we introduce a probabilistic modeling approach to the problem of detecting Web robots from Web-server access logs. More specifically, we construct a Bayesian network that automatically classifies access-log sessions as crawler- or human-induced by combining various pieces of evidence shown to characterize crawler and human behavior. Our approach uses machine learning techniques to determine the parameters of the probabilistic model. We apply our method to real Web-server logs and obtain results that demonstrate the robustness and effectiveness of probabilistic reasoning for crawler detection.
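
As a rough illustration of the idea in the abstract, the sketch below classifies sessions with a naive Bayes model, which is the simplest possible Bayesian network (one class node with conditionally independent evidence nodes). It is not the network described in the paper: the abstract does not list the evidence variables or the network structure, so the three binary features, the toy training sessions, and the function names `train` and `posterior` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical binary evidence variables; the abstract does not name the
# features actually used, so these are illustrative assumptions only.
FEATURES = ("requested_robots_txt", "no_image_requests", "high_request_rate")


def train(sessions):
    """Estimate P(class) and P(feature = 1 | class) from labeled sessions.

    sessions: list of (features_dict, label), label in {"crawler", "human"}.
    Laplace (add-one) smoothing keeps every probability strictly in (0, 1).
    """
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for feats, label in sessions:
        class_counts[label] += 1
        for name in FEATURES:
            if feats[name]:
                feature_counts[label][name] += 1
    total = sum(class_counts.values())
    priors = {c: n / total for c, n in class_counts.items()}
    likelihoods = {
        c: {
            name: (feature_counts[c][name] + 1) / (class_counts[c] + 2)
            for name in FEATURES
        }
        for c in class_counts
    }
    return priors, likelihoods


def posterior(feats, priors, likelihoods):
    """Return P(class | evidence) for one session via Bayes' rule."""
    log_scores = {}
    for c, prior in priors.items():
        score = math.log(prior)
        for name in FEATURES:
            p = likelihoods[c][name]
            score += math.log(p if feats[name] else 1.0 - p)
        log_scores[c] = score
    # Normalize in a numerically stable way (subtract the max log score).
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def session(robots, no_images, fast):
    """Build a feature dict for one session (helper for the toy data)."""
    return dict(zip(FEATURES, (robots, no_images, fast)))


# Made-up training sessions, not data from the paper.
TOY_SESSIONS = [
    (session(True, True, True), "crawler"),
    (session(True, True, False), "crawler"),
    (session(False, True, True), "crawler"),
    (session(False, False, False), "human"),
    (session(False, False, False), "human"),
    (session(False, True, False), "human"),
]

priors, likelihoods = train(TOY_SESSIONS)
crawler_like = posterior(session(True, True, True), priors, likelihoods)
human_like = posterior(session(False, False, False), priors, likelihoods)
print(crawler_like, human_like)
```

Working in log space and normalizing at the end avoids underflow when many evidence variables are combined. A full Bayesian network would additionally allow dependencies between the evidence variables, which this sketch ignores.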

Original language: English
Title of host publication: International Conference on Internet Surveillance and Protection, ICISP'06
Publication status: Published - 2006
Event: International Conference on Internet Surveillance and Protection, ICISP'06 - Cote d'Azur, France
Duration: 26 Aug 2006 to 28 Aug 2006

Other

Other: International Conference on Internet Surveillance and Protection, ICISP'06
Country/Territory: France
City: Cote d'Azur
Period: 26/08/06 to 28/08/06
