Makro- und Mikro-Mining am Beispiel von Webserver Logfiles

Mayr, Philipp and Nançoz, Christian. Makro- und Mikro-Mining am Beispiel von Webserver Logfiles. 2005. In Knowledge eXtended, Jülich (Germany), 2-4 November 2005. (In Press) [Conference paper]

English abstract

Webserver log files are a rich data source for analysing the accessibility, visibility and interlinking of web content. This paper presents two recent approaches to log file analysis, or web mining: macro-mining and micro-mining of webserver log files. We bring together the widely used macro analysis, which aggregates common server request counts (e.g. the number of downloads of a certain document), with micro analysis, a method that is less well known in log analysis. The micro-mining approach focuses on segmented log files that can be drilled down to the transactions of single users. Both methods are explained by means of an example. Furthermore, we try to identify new use cases and sketch ways of combining the two web mining methods in one analysis.
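The contrast between the two views can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes log lines in the Apache Common Log Format, treats the client host as a rough stand-in for a single user, and splits a host's requests into transactions after 30 minutes of inactivity. The names `parse`, `macro_counts`, `micro_sessions` and the `SAMPLE` data are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumed input: Common Log Format
# host ident user [time] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse(lines):
    """Yield (host, timestamp, path, status) for each well-formed line."""
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
            yield m["host"], ts, m["path"], int(m["status"])


def macro_counts(records):
    """Macro view: successful requests aggregated per document."""
    return Counter(path for _, _, path, status in records if status == 200)


def micro_sessions(records, gap=timedelta(minutes=30)):
    """Micro view: per-host request sequences, split after `gap` of inactivity.

    The 30-minute default is an illustrative assumption, not a value
    taken from the paper.
    """
    by_host = defaultdict(list)
    for host, ts, path, _ in sorted(records, key=lambda r: (r[0], r[1])):
        sessions = by_host[host]
        if sessions and ts - sessions[-1][-1][0] <= gap:
            sessions[-1].append((ts, path))  # continue the current transaction
        else:
            sessions.append([(ts, path)])    # start a new transaction
    return by_host


# Invented sample data for illustration only.
SAMPLE = [
    '10.0.0.1 - - [14/Aug/2005:10:00:00 +0200] "GET /paper.pdf HTTP/1.1" 200 331776',
    '10.0.0.1 - - [14/Aug/2005:10:05:00 +0200] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [14/Aug/2005:10:06:00 +0200] "GET /paper.pdf HTTP/1.1" 200 331776',
    '10.0.0.1 - - [14/Aug/2005:12:00:00 +0200] "GET /paper.pdf HTTP/1.1" 200 331776',
]

if __name__ == "__main__":
    records = list(parse(SAMPLE))
    print(macro_counts(records).most_common(1))
    print({host: len(s) for host, s in micro_sessions(records).items()})
```

On the sample data, the macro view reports three downloads of `/paper.pdf`, while the micro view shows that two of them came from the same host in two separate visits. A host address is only a proxy for a user; proxies and shared machines blur this mapping, so any real micro analysis would need a more careful user identification step.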

German abstract (translated)

Webserver log files are a highly interesting source of information for studying the accessibility, visibility and interlinking of arbitrary web content. This contribution presents two recent approaches to log file analysis, or web mining (macro-mining and micro-mining). The widespread method of macro analysis, which mainly aggregates general access counts (e.g. the number of downloads of a document), is contrasted with the so far less well-known method of micro analysis. Micro analysis concentrates on narrow segments of the log file that go down to the transactions of individual users. Both analysis methods are explained by means of an example. In addition, the paper attempts to identify new areas of application for the two web mining methods and to outline forms of combined use.

Item type: Conference paper
Keywords: Logfile Analysis, Webserver Logfiles, Webmining, Macro-Analysis, Micro-Analysis
Subjects: L. Information technology and library technology > LJ. Software.
I. Information treatment for information services > II. Filtering.
B. Information use and sociology of information > BZ. None of these, but in this section.
C. Users, literacy and reading. > CZ. None of these, but in this section.
Depositing user: Philipp Mayr
Date deposited: 14 Sep 2005
Last modified: 02 Oct 2014 12:01
URI: http://hdl.handle.net/10760/6710
