LinuxReviews.org: get your Linux knowledge

Apache: A good Webalizer.conf for the Webalizer Apache Log Analyzer utility

Webalizer is a great tool for getting detailed information about your website's visitors. It generates pretty graphs and useful numbers such as daily unique visitors, page impressions and bytes served.


  1. Why Webalizer?
  2. Your logs
  3. A nice webalizer.conf configuration


1. Why Webalizer?

Webalizer has not been updated since April 2002, two years before this article appeared on the net. Yet Webalizer remains a powerful, efficient and elegant tool loved by many. Why? It is written in optimized C and is therefore faster than many similar tools.

Also, it supports parsing partial log files, meaning you can rotate your log files without breaking any of the statistics.
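The idea behind processing partial logs can be sketched in Python. This is a simplified illustration, not Webalizer's actual implementation: the function name process_incremental and the state dictionary (a stand-in for a saved state file) are assumptions made for this example.

```python
from datetime import datetime

def process_incremental(records, state):
    """Count hits that are newer than anything seen in earlier runs.

    records: iterable of (timestamp, url) tuples from one log file.
    state:   dict carried over between runs (hypothetical stand-in for
             the file a log analyzer would save its state in).
    """
    cutoff = state.get("last_seen")   # newest timestamp from earlier runs
    newest = cutoff
    hits = state.get("hits", 0)
    for timestamp, _url in records:
        if cutoff is not None and timestamp <= cutoff:
            continue                  # assumed to be counted already
        hits += 1
        if newest is None or timestamp > newest:
            newest = timestamp
    return {"hits": hits, "last_seen": newest}

# Two runs over logs that overlap after a rotation.
first_log = [
    (datetime(2004, 4, 1, 10, 0, 0), "/a.html"),
    (datetime(2004, 4, 1, 10, 0, 0), "/b.html"),
    (datetime(2004, 4, 1, 11, 0, 0), "/c.html"),
]
second_log = [
    (datetime(2004, 4, 1, 11, 0, 0), "/c.html"),  # already seen in first_log
    (datetime(2004, 4, 2, 9, 0, 0), "/d.html"),
]
state = process_incremental(first_log, {})
state = process_incremental(second_log, state)
```

Because the state survives between runs, the overlapping record in the second log is not counted twice. The sketch simply skips everything up to the saved timestamp, so it would miss new hits that share a second with the last saved one.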

1.1. Install

How to install it:

  • Fedora users without apt can install it with:
    • up2date -i webalizer
  • Apt users can install it with:
    • apt-get install webalizer
  • Gentoo Linux users can install it with:
    • emerge webalizer
  • The source can be downloaded from mrunix and compiled on most Linux and Unix systems.

2. Your logs

Webalizer works out of the box with standard Apache logs, that is, logs written in the format that Apache 1.3 and 2.0 call combined. A correct Apache log setting looks like this:

  CustomLog logs/access_log combined
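To show what a combined log line contains, here is a minimal Python sketch that splits one line into its fields. The regular expression is a simplification (it does not handle escaped quotes inside a field), and the names COMBINED, parse_combined and the sample line are made up for this example. The referrer and user agent fields at the end are what the SearchEngine and GroupAgent settings work on.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Invented sample line for illustration only.
sample = (
    '192.0.2.1 - - [10/Apr/2004:13:55:36 +0200] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=webalizer" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"'
)
fields = parse_combined(sample)
```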

How it looks

[Screenshots: daily usage graph; visitors by country]

Webalizer generates many other numbers and graphs as well.

3. A nice webalizer.conf configuration

This configuration has a wide range of SearchEngine and GroupAgent entries that make your statistics show more accurate and detailed information about your visitors.


  
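  # Log file to analyze
  LogFile         /your/webroot/statistics/logs/access_log
  # Directory the HTML reports and graphs are written to
  OutputDir       /your/webroot/statistics/webstat/
  # File holding the totals of previous months
  HistoryName     /your/webroot/statistics/webstat/webalizer.hist
  # Save state between runs so rotated (partial) logs can be processed
  Incremental     yes
  IncrementalName /your/webroot/statistics/webstat/webalizer.current
  # File extensions counted as pages
  PageType        htm*
  PageType        cgi
  PageType        php
  PageType        shtml
  # Cache file and number of child processes for reverse DNS lookups
  DNSCache        /var/lib/webalizer/dns_cache.db
  DNSChildren     10
  # Suppress informational messages
  Quiet           yes
  # Include out-of-sequence log records instead of dropping them
  FoldSeqErr      yes
  # Keep images and audio out of the top URL tables (they are still counted)
  HideURL         *.gif
  HideURL         *.GIF
  HideURL         *.jpg
  HideURL         *.JPG
  HideURL         *.png
  HideURL         *.PNG
  HideURL         *.ra
  # Leave requests for the statistics pages themselves out of the analysis
  IgnoreURL       /webstat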
   
  SearchEngine    348north.com    search=
  SearchEngine    abcsearch.com   terms=
  SearchEngine    alltheweb.com   q=
  SearchEngine    altavista.com   q=
  SearchEngine    antisearch.net  KEYWORDS=
  SearchEngine    aolsearch       query=
  SearchEngine    ask.com ask=
  SearchEngine    ask.co.uk       ask=
  SearchEngine    augurnet.ch     q=
  SearchEngine    baidu.com       word=
  SearchEngine    barrahome.org   query=
  SearchEngine    blogdex.net     q=
  SearchEngine    blogdigger.com  queryString=
  SearchEngine    blogosphere.us  s=
  SearchEngine    blogmatrix.com  search=
  SearchEngine    blogwise.com    query=
  SearchEngine    boitho.com      query=
  SearchEngine    buscador.ya.com q=
  SearchEngine    by.com  query=
  SearchEngine    daypop.com      q=
  SearchEngine    dir.com req=
  SearchEngine    dmoz.org        search=
  SearchEngine    dogpile.com     q=
  SearchEngine    dpxml   qkw=
  SearchEngine    egoto.com       keywords=
  SearchEngine    elf8888.at      query0=
  SearchEngine    eureka.com      q=
  SearchEngine    excite  search=
  SearchEngine    feedster.com    q=
  SearchEngine    gais.cs.ccu.edu.tw      q=
  SearchEngine    galaxy.com      k=
  SearchEngine    gigablast.com   q=
  SearchEngine    google  q=
  SearchEngine    goo.ne.jp       MT=
  SearchEngine    hotbot.com      query=
  SearchEngine    infoseek.com    qt=
  SearchEngine    ixquick.com     query=
  SearchEngine    kobala.nl       qr=
  SearchEngine    lycos.com       query=
  SearchEngine    look.com        q=
  SearchEngine    looksmart       key=
  SearchEngine    mamma.com       query=
  SearchEngine    metacrawler     q=
  SearchEngine    msn.com q=
  SearchEngine    msxml   qkw=
  SearchEngine    mysearch.com    searchfor=
  SearchEngine    naver.com       query=
  SearchEngine    netscape.com    search=
  SearchEngine    northernlight.com       qr=
  SearchEngine    ntlworld.com    q=
  SearchEngine    openfind        query=
  SearchEngine    overture.com    Keywords=
  SearchEngine    picsearch.com   q=
  SearchEngine    popdex  query=
  SearchEngine    quepasa.com     q=
  SearchEngine    search.com      qt=
  SearchEngine    searchspider.com        q=
  SearchEngine    search.earthlink        q=
  SearchEngine    suchmaschine21.de       search=
  SearchEngine    syndic8 ShowMatch=
  SearchEngine    technorati      query=
  SearchEngine    teensearch      query=
  SearchEngine    teoma.com       q=
  SearchEngine    teradex.com     q=
  SearchEngine    texis   q=
  SearchEngine    voila   kw=
  SearchEngine    walhello        key=
  SearchEngine    waypath.com     key=
  SearchEngine    webcrawler      searchText=
  SearchEngine    webfanatic.lunarpages.com       q=
  SearchEngine    whois.sc        q=
  SearchEngine    wisenut.com     q=
  SearchEngine    yahoo   p=
  
  GroupAgent      Check&Get       Program: Check&Get (Bookmark Manager)
  GroupAgent      eXactSite       Program: eXactSite (HTML authoring. stupid user!)
  GroupAgent      FavOrg  Program: FavOrg (Bookmark Manager)
  GroupAgent      Fetch   Program: Fetch (Offline browser)
  GroupAgent      GetRight        Program: GetRight (Download Manager)
  GroupAgent      HTTrack Program: HTTrack (Website Copier)
  GroupAgent      Lachesis        Program: Packet Loss Report (ftp.intel.com)
  GroupAgent      lachesis        Program: Packet Loss Report (ftp.intel.com)
  GroupAgent      MSFrontPage     Programming: Microsoft FrontPage (stupid user!)
  GroupAgent      Offline Program: Offline Explorer (Offline Browser)
  GroupAgent      Powermarks      Program: Powermarks (Bookmark Manager)
  GroupAgent      SuperBot        Program: SuperBot (Web Site Copier)
  GroupAgent      Teleport        Program: Teleport Pro (Offline Browser tenmax.com)
  GroupAgent      WebStripper     Program: WebStripper (Offline Browser)
  GroupAgent      WebZIP  Program: WebZIP (Offline Browser)
  GroupAgent      Alcatel-        Device: Alcatel Mobile Phone
  GroupAgent      AvantGo Device: AvantGo (Offline Browser)
  GroupAgent      Blazer  Device: Blazer (PalmOS browser)
  GroupAgent      DoCoMo  Device: I-mode Compatible Mobile Phone
  GroupAgent      Elaine  Device: Palm browser
  GroupAgent      Ericsson        Device: Ericsson Mobile Phone
  GroupAgent      MOT-    Device: Motorola Mobile Phone
  GroupAgent      jBrowser        Device: WAP Browser jBrowser (built by Jataayu)
  GroupAgent      Liberate        Device: Liberate (Digital TV)
  GroupAgent      Mitsu   Device: Mitsubishi Mobile Phone
  GroupAgent      Nokia   Device: Nokia Mobile Phone
  GroupAgent      Panasonic       Device: Panasonic Mobile Phone
  GroupAgent      PHILIPS-        Device: Philips Mobile Phone
  GroupAgent      SAGEM-  Device: SAGEM Mobile Phone
  GroupAgent      SAMSUNG-        Device: Samsung Mobile Phone
  GroupAgent      SEC-    Device: Samsung Mobile Phone
  GroupAgent      SHARP-  Device: Sharp Mobile Phone
  GroupAgent      SIE-    Device: Siemens Mobile Phone
  GroupAgent      SonyEricsson    Device: Sony/Ericsson Mobile Phone
  GroupAgent      www.wapsilon.com        Device: www.wapsilon.com (WAP browser)
  GroupAgent      WebGo   Device: Offline Browser WebGo (Windows/CE)
  GroupAgent      WebTV   Device: WebTV
  GroupAgent      AmphetaDesk     RSS: AmphetaDesk
  GroupAgent      Awasu   RSS: Awasu
  GroupAgent      FeedDemon       RSS: Feed Demon
  GroupAgent      Feedreader      RSS: FeedReader
  GroupAgent      FeedOnFeeds     RSS: FeedOnFeeds Reader (http://minutillo.com/steve/feedonfeeds/)
  GroupAgent      FeedValidator   RSS: Archive.org Feed Validator
  GroupAgent      MagpieRSS       RSS: MagpieRSS (PHP-based reader)
  GroupAgent      MyHeadlines     RSS: MyHeadlines (http://www.jmagar.com/myh4)
  GroupAgent      NetNewsWire     RSS: NetNewsWire
  GroupAgent      NewsGator       RSS: NewsGator
  GroupAgent      Newz    RSS: Newz Crawler
  GroupAgent      nntp//rss       RSS: nntp//rss (http://www.methodize.org/nntprss/)
  GroupAgent      Radio*  RSS: Radio Userland
  GroupAgent      Oddbot  RSS: OddPost.com
  GroupAgent      PocketFeed      RSS: PocketFeed (Pocket PC RSS reader)
  GroupAgent      PostNuke        RSS: PostNuke CMS
  GroupAgent      SharpReader     RSS: SharpReader
  GroupAgent      Syndigator      RSS: Syndigator
  GroupAgent      Syndirella      RSS: Syndirella
  GroupAgent      UltraLiberalFeedParser  RSS: Ultra Liberal Feed Parser from Mark Pilgrim
  GroupAgent      Wildgrape       RSS: Wildgrape NewsDesk
  GroupAgent      china   SpamBot: china local browse 2.6
  GroupAgent      cloakBrowser    SpamBot: Fantoma
  GroupAgent      compatible)     SpamBot: Pretends to be Mozilla 3.0
  GroupAgent      Dattatec.com-Sitios-Top SpamBot: Referrer Spam for Dattatec.com
  GroupAgent      DTS     SpamBot: Beijing Express Email Address Extractor
  GroupAgent      EmailSiphon     SpamBot: EmailSiphon
  GroupAgent      fantomBrowser   SpamBot: Fantoma
  GroupAgent      fantomCrew      SpamBot: Fantoma
  GroupAgent      Franklin        SpamBot: Franklin Locator
  GroupAgent      Finder  SpamBot: Mac Finder
  GroupAgent      iaea.org        SpamBot: Atomic Harvester 2000
  GroupAgent      Industry        SpamBot: Industry Program
  GroupAgent      IUFW    SpamBot: IUFW Web
  GroupAgent      IUPUI   SpamBot: IUPUI Research Bot
  GroupAgent      Lincoln SpamBot: Lincoln State Web Browser
  GroupAgent      LinkSweeper     SpamBot: LinkSweeper
  GroupAgent      Microcomputers  SpamBot: Franklin Locator
  GroupAgent      Missauga        SpamBot: Missauga Locate
  GroupAgent      Missigua        SpamBot: Missauga Locate
  GroupAgent      NationalDirectory       SpamBot: National Directory Email Harvester
  GroupAgent      Rainbow SpamBot: Under the Rainbow
  GroupAgent      Shareware       SpamBot: Program Shareware
  GroupAgent      stealthBrowser  SpamBot: Fantoma
  GroupAgent      Sweeper SpamBot: Mail Sweeper
  GroupAgent      WEP     SpamBot: WEP Search
  GroupAgent      Xenu    SpamBot: Xenu
  GroupAgent      348NorthNews    Spider: 348north.com
  GroupAgent      almaden.ibm.com/cs/crawler      Spider: almaden.ibm.com
  GroupAgent      antibot Spider: Antidot.net http://www.antidot.net/Welcome/jsp/robots.html
  GroupAgent      http://Ask.24x.Info/    Spider: MnogoSearch.org
  GroupAgent      ASPseek Spider: ASPseek.org free search engine software
  GroupAgent      aspseek Spider: ASPseek.org free search engine software
  GroupAgent      augurfind       Spider: augurnet.ch (Swiss Search Engine)
  GroupAgent      Baiduspider     Spider: Baidu.com
  GroupAgent      BarraHomeCrawler        Spider: Barrahome.org
  GroupAgent      BBot    Spider: http://www.otthon.net/search/
  GroupAgent      Bilbo   Spider: wise-guys.nl
  GroupAgent      blo.gs  Spider: blo.gs
  GroupAgent      BlogBot Spider: Blogdex.net
  GroupAgent      Blogosphere     Spider: Blogosphere.us
  GroupAgent      BlogPulse       Spider: Blogpulse.com
  GroupAgent      BlogShares      Spider: BlogShares.com
  GroupAgent      Blogwise.com    Spider: Blogwise.com
  GroupAgent      boitho.com      Spider: boitho.com
  GroupAgent      bookwatch@onfocus.com   Spider: OnFocus.com Weblog BookWatch
  GroupAgent      brainoff.com/geoblog/   Spider: The World as a Blog (brainoff.com/geoblog/)
  GroupAgent      www.business-socket.com Spider: www.business-socket.com
  GroupAgent      CJNetworkQuality        Spider: CommissionJunction.com
  GroupAgent      combine Spider: http://www.lub.lu.se/combine/
  GroupAgent      COMBINE Spider: http://www.lub.lu.se/combine/
  GroupAgent      CoolBot Spider: www.suchmaschine21.de (German Search Engine)
  GroupAgent      CoologFeedSpider        Spider: CoolLog http://www.webfanatic.lunarpages.com/coolog/
  GroupAgent      CopyHunter      Spider: AWstats referrer log analyzer
  GroupAgent      daypopbot Spider: DayPop.com
  GroupAgent      Ecosystem/development   Spider: Blogging Ecosystem
  GroupAgent      EgotoBot        Spider: Egoto.com
  GroupAgent      ETS     Spider: Freetranslation.com
  GroupAgent      exactseek.com   Spider: exactseek.com
  GroupAgent      Exalead Spider: Exalead.com (AOL France)
  GroupAgent      FAST    Spider: All The Web
  GroupAgent      Fast    Spider: All The Web
  GroupAgent      Feedster        Spider: Feedster.com
  GroupAgent      FlickBot        Spider: DivX Networks FlickBot
  GroupAgent      Gaisbot Spider: GAIS (http://gais.cs.ccu.edu.tw/ )
  GroupAgent      GalaxyBot       Spider: Galaxy.com
  GroupAgent      Genome  Spider: Waypath.com
  GroupAgent      Gigabot Spider: Gigablast.com
  GroupAgent      Google* Spider: Google.com 
  GroupAgent      gossamer-threads.com    Spider: Links SQL
  GroupAgent      grub-client     Spider: Grub.org
  GroupAgent      htdig   Spider: htdig (Open Source Search Engine)
  GroupAgent      ia_archiver     Spider: Archive.org
  GroupAgent      INGRID/3.0      Spider: ilse.nl (Dutch search engine)
  GroupAgent      InternetSeer    Spider: InternetSeer.com (Web Site Monitoring)
  GroupAgent      internetseer    Spider: InternetSeer.com (Web Site Monitoring)
  GroupAgent      IXE     Spider: ideare.com
  GroupAgent      janes-blogosphere       Spider: BlogMatrix.com
  GroupAgent      jiffe   Spider: jiffe.com
  GroupAgent      k2spider        Spider: Verity Spider
  GroupAgent      larbin  Spider: larbin (http://sourceforge.net/projects/larbin/)
  GroupAgent      Leknor.com      Spider: Leknor.com GZIP Tester
  GroupAgent      Linkbot Spider: Linkbot link monitoring tool (Watchfire.com)
  GroupAgent      LinkHype        Spider: LinkHype.com
  GroupAgent      LinksManager.com        Spider: LinksManager.com
  GroupAgent      LinkWalker      Spider: seventwentyfour.com
  GroupAgent      LNSpiderguy     Spider: Lexis-Nexis
  GroupAgent      MnogoSearch     Spider: MnogoSearch.org
  GroupAgent      mogimogi        Spider: www.goo.ne.jp (Japanese Search Engine)
  GroupAgent      MSNBOT  Spider: MSN.com
  GroupAgent      MyWireServiceBot        Spider: MyWireService.com
  GroupAgent      NaverRobot      Spider: Naver.com (Korean Search Engine)
  GroupAgent      Netcraft        Spider: Netcraft Web Survey
  GroupAgent      NetResearchServer       Spider: Look.com
  GroupAgent      NIF     Spider: Newsisfree.com
  GroupAgent      NG/1.0  Spider: Exalead.com (AOL France)
  GroupAgent      NITLE   Spider: Blogcensus.net
  GroupAgent      NPBot   Spider: NameProtect.com
  GroupAgent      NRK-bruker      Spider: NRK.no
  GroupAgent      Openbot Spider: OpenFind (http://www.openfind.com.tw/)
  GroupAgent      Pompos  Spider: Dir.com
  GroupAgent      Popdexter       Spider: Popdex.com
  GroupAgent      psbot   Spider: Picsearch.com
  GroupAgent      QuepasaCreep    Spider: Quepasa.com (Spanish site)
  GroupAgent      Robozilla       Spider: Link Checker for Dmoz.org
  GroupAgent      Scooter Spider: Altavista
  GroupAgent      searchspider.com        Spider: searchspider.com
  GroupAgent      semanticdiscovery       Spider: semanticdiscovery.com
  GroupAgent      SideWinder      Spider: Infoseek
  GroupAgent      slurp@inktomi.com       Spider: Inktomi
  GroupAgent      spider@spider.ilab.sztaki.hu    Spider: http://www.ilab.sztaki.hu/websearch/
  GroupAgent      Spinne  Spider: webauskunft.at
  GroupAgent      Steeler Spider: Kitsuregawa Laboratory, The University of Tokyo
  GroupAgent      SurveyBot       Spider: whois.sc
  GroupAgent      Syndic8 Spider: Syndic8
  GroupAgent      Tagword Spider: Tagword - http://tagword.com/dmoz_survey.php
  GroupAgent      Teoma   Spider: Teoma 
  GroupAgent      Teradex Spider: Teradex.com (directory)
  GroupAgent      Terrar  Spider: Terrar (http://www.terrar.com)
  GroupAgent      Technoratibot Spider: Technorati
  GroupAgent      T-H-U-N-D-E-R-S-T-O-N-E Spider: Webinator (http://www.thunderstone.com/texis/site/pages/webinator.html)
  GroupAgent      timboBot        Spider: BreakingBlogs.com
  GroupAgent      TurnitinBot     Spider: Turnitin.com
  GroupAgent      http://www.tutorgig.com/        Spider: tutorgig.com
  GroupAgent      Vagabondo       Spider: kobala.nl
  GroupAgent      verzamelgids    Spider: verzamelgids.nl
  GroupAgent      VoilaBot        Spider: Voila.com
  GroupAgent      W3C_Validator   Spider: W3C Validator
  GroupAgent      www.walhello.com        Spider: Walhello.com
  GroupAgent      WebCapture      Spider: WebCapture.biz
  GroupAgent      Webclipping     Spider: Webclipping.com
  GroupAgent      WebFilter       Spider: http://www.ils.unc.edu/webfilter/
  GroupAgent      WebGather       Spider: City Polytechnic of Hong Kong
  GroupAgent      WebRACE Spider: WebRACE (University of Cyprus, Distributed Crawler)
  GroupAgent      websitealert.net        Spider: websitealert.net (Monitoring System)
  GroupAgent      Zealbot Spider: Looksmart.com
  GroupAgent      ZyBorg  Spider: WiseNut.com
  GroupAgent      curl    Programming: curl library (PHP)
  GroupAgent      Indy    Programming: Indy (Delphi-based client)
  GroupAgent      Java    Programming: Java-based client
  GroupAgent      Jakarta Programming: Jakarta (Java)
  GroupAgent      libwww-perl     Programming: LIB-WWW (Perl library)
  GroupAgent      LWP:    Programming: LWP::Simple (Perl library)
  GroupAgent      OPWV-SDK        Programming: OpenWave Mobile Development SDK
  GroupAgent      PEAR    Programming: PEAR Library (PHP)
  GroupAgent      PHP     Programming: PHP-based client
  GroupAgent      Python-urllib   Programming: URLLIB (Python library)
  GroupAgent      rdflib  Programming: rdflib (Python RDF library)
  GroupAgent      RPT-HTTPClient  Programming: RPT-HTTP (Java)
  GroupAgent      Snoopy  Programming: Snoopy (PHP class - http://snoopy.sourceforge.net/ )
  GroupAgent      SOFTWING_TEAR_AGENT     Programming: Softwing Tear Agent (Active Server Pages)
  GroupAgent      Wget    Programming: Wget library (http://www.gnu.org/software/wget/wget.html)
  GroupAgent      WinHttp.WinHttpRequest  Programming: WinHttp.WinHttpRequest library (Visual Basic)
  GroupAgent      Bison   Proxy: Proxomitron (Proxomitron.info)
  GroupAgent      BorderManager   Proxy: Novell Border Manager Security Suite
  GroupAgent      CE-Preload      Proxy: Cisco Content Engine
  GroupAgent      DA      Proxy: DA
  GroupAgent      junkbuster      Proxy: junkbuster (junkbusters.com)
  GroupAgent      AppleWebKit     Safari (OSX)
  GroupAgent      BFS_method      BeOS browser
  GroupAgent      Camino  Mozilla-based browser Camino (OSX)
  GroupAgent      iCab    iCab (Mac)
  GroupAgent      Konqueror       Konqueror
  GroupAgent      Links   Links (Text-based browser)
  GroupAgent      Lynx*   Lynx    (Text-based browser)
  GroupAgent      NCBrowser       NCBrowser (RISC OS)
  GroupAgent      Opera   Opera
  GroupAgent      SlimBrowser     SlimBrowser (http://www.flashpeak.com/sbrowser/sbrowser.htm)
  GroupAgent      w3m     w3m (Text-based browser - http://w3m.sourceforge.net/ )
  GroupAgent      rv:1.4  Mozilla 1.4
  GroupAgent      3.01    Navigator 3.01 (16-bit version)
  GroupAgent      4.01    Internet Explorer 4.01
  GroupAgent      5.01    Internet Explorer 5.01
  GroupAgent      5.0     Internet Explorer 5.0
  GroupAgent      5.23    Internet Explorer (Mac)
  GroupAgent      5.22    Internet Explorer (Mac)
  GroupAgent      5.21    Internet Explorer (Mac)
  GroupAgent      5.17    Internet Explorer (Mac)
  GroupAgent      5.16    Internet Explorer (Mac)
  GroupAgent      5.15    Internet Explorer (Mac)
  GroupAgent      5.14    Internet Explorer (Mac)
  GroupAgent      5.13    Internet Explorer (Mac)
  GroupAgent      5.12    Internet Explorer (Mac)
  GroupAgent      5.5     Internet Explorer 5.5 (Windows)
  GroupAgent      6.0     Internet Explorer 6.0 (Windows)
  GroupAgent      Mozilla/3.04Gold        Netscape 3.04 Gold
  GroupAgent      Mozilla/4.04    Netscape 4
  GroupAgent      Mozilla/4.06    Netscape 4
  GroupAgent      Mozilla/4.08    Netscape 4
  GroupAgent      Mozilla/4.5     Netscape 4.5
  GroupAgent      Mozilla/4.7     Netscape 4.7
  GroupAgent      Mozilla/4.8     Netscape 4.8
  GroupAgent      MSIE    Internet Explorer
  GroupAgent      Mozilla Netscape
  
  
  
  

Change the LogFile, OutputDir, HistoryName and IncrementalName paths to suit your needs.
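To make the SearchEngine and GroupAgent entries easier to reason about, here is a rough Python sketch of how such entries can be applied to a referrer and a user agent string. It is not Webalizer's code. It assumes that a pattern without a wildcard matches anywhere in the string, that a trailing * anchors the pattern at the start, and that the first matching entry in file order wins, which is what the catch-all MSIE and Mozilla entries at the end of the list suggest. The function names and the small subsets of entries are chosen for this example only.

```python
from urllib.parse import urlsplit, parse_qs

# Subset of the SearchEngine entries: (referrer substring, query variable).
SEARCH_ENGINES = [
    ("altavista.com", "q="),
    ("google", "q="),
    ("yahoo", "p="),
]

# Subset of the GroupAgent entries, in the same order as in the file.
AGENT_GROUPS = [
    ("Google*", "Spider: Google.com"),
    ("Opera", "Opera"),
    ("MSIE", "Internet Explorer"),
    ("Mozilla", "Netscape"),
]

def search_terms(referrer, engines=SEARCH_ENGINES):
    """Return (engine, terms) for the first engine found in the referrer."""
    for name, variable in engines:
        if name in referrer:
            query = parse_qs(urlsplit(referrer).query)
            values = query.get(variable.rstrip("="))
            return (name, values[0]) if values else None
    return None

def pattern_matches(pattern, text):
    """Assumed wildcard rules: trailing * = prefix, leading * = suffix."""
    if pattern.endswith("*"):
        return text.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return text.endswith(pattern[1:])
    return pattern in text

def agent_group(agent, groups=AGENT_GROUPS):
    """Return the label of the first matching group (assumption: first wins)."""
    for pattern, label in groups:
        if pattern_matches(pattern, agent):
            return label
    return agent  # ungrouped agents keep their own name
```

With these assumptions, a user agent such as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" lands in the Internet Explorer group because MSIE is listed before Mozilla, so the order of the entries matters when you add your own.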


Thanks to http://www.tnl.net/blog/entry/More_Webalizer.conf_hacking

