Apache: A good Webalizer.conf for the Webalizer Apache Log Analyzer utility

v0.3, LinuxReviews.org

Webalizer is a great tool for getting detailed information about your website's visitors. It generates pretty graphs and useful numbers such as daily unique visitors, page impressions and bytes served.


  1. Why Webalizer?
  2. Your logs
  3. A nice webalizer.conf configuration


1. Why Webalizer?

Webalizer has not been updated since April 2002, two years before this article appeared on the net. Yet it remains a powerful, efficient and elegant tool loved by many. Why? It is written in pure, optimized C and is therefore faster than many similar tools.

It also supports incremental processing of partial log files, so you can rotate your logs without breaking the statistics. The Incremental yes setting in the configuration in section 3 turns this on.

1.1. Install

How to install it:

  • Fedora users without apt can install it with:
    • up2date -i webalizer
  • Apt users can install it with:
    • apt-get install webalizer
  • Gentoo Linux users can install it with:
    • emerge webalizer
  • The source can be downloaded from mrunix and compiled on most Linux and Unix systems.

2. Your logs

Webalizer works out of the box with standard Apache 1.3 and 2.0 logs, where standard means the log format called combined. A correct Apache log setting looks like this:

  CustomLog logs/access_log combined
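
As a rough illustration of what the combined format carries, the following Python sketch splits one log line into its fields. The regular expression, the function name parse_combined and the sample line are assumptions made for this example; this is not how Webalizer itself parses logs. The referrer and user agent fields at the end of each line are the ones that SearchEngine and GroupAgent entries work on.

```python
import re

# Fields of Apache's "combined" format, in order:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of fields, or None if the line is not in combined format."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Made-up sample line, for illustration only.
sample = ('192.0.2.1 - - [10/Oct/2004:13:55:36 +0200] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"http://www.google.com/search?q=webalizer" '
          '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"')
fields = parse_combined(sample)
print(fields["referer"])
print(fields["agent"])
```

A line in the shorter common format, which lacks the last two fields, makes parse_combined return None. This simple pattern does not handle escaped quotes inside a field.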

How it looks

[Figures: daily usage graph; visitors by country]

Webalizer generates many other numbers and graphs as well.

3. A nice webalizer.conf configuration

This configuration has a wide range of SearchEngine and GroupAgent entries that make your statistics show more accurate and detailed information about your visitors.
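
To show what a SearchEngine entry does, here is a small Python sketch. It assumes that the first word of an entry is matched as a substring of the referrer and that the second word names the URL variable holding the search terms. The function search_terms and the rule that the first matching entry wins are illustrative assumptions, not Webalizer's actual code.

```python
from urllib.parse import urlsplit, parse_qs

# A few (substring, query variable) pairs copied from the SearchEngine lines.
SEARCH_ENGINES = [
    ("altavista.com", "q="),
    ("google", "q="),
    ("lycos.com", "query="),
    ("yahoo", "p="),
]

def search_terms(referrer):
    """Return the search string found in a referrer URL, or None."""
    for substring, variable in SEARCH_ENGINES:
        if substring in referrer:
            # parse_qs decodes the query string, turning "+" into a space.
            query = parse_qs(urlsplit(referrer).query)
            values = query.get(variable.rstrip("="))
            return values[0] if values else None
    return None

print(search_terms("http://www.google.com/search?q=webalizer+conf&hl=en"))
print(search_terms("http://search.yahoo.com/search?p=apache"))
```

A referrer that matches no entry, or that matches an entry but lacks its query variable, yields None in this sketch. That is why an entry with the wrong variable name would silently report no search strings for that engine.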

  • Download: webAlizerConf


      
      LogFile  /your/webroot/statistics/logs/access_log
      OutputDir /your/webroot/statistics/webstat/
      HistoryName	/your/webroot/statistics/webstat/webalizer.hist
      Incremental	yes
      IncrementalName	/your/webroot/statistics/webstat/webalizer.current
      PageType	htm*
      PageType	cgi
      PageType        php
      PageType        shtml
      DNSCache	/var/lib/webalizer/dns_cache.db
      DNSChildren	10
      Quiet		yes
      FoldSeqErr	yes
      HideURL		*.gif
      HideURL		*.GIF
      HideURL		*.jpg
      HideURL		*.JPG
      HideURL		*.png
      HideURL		*.PNG
      HideURL		*.ra
      IgnoreURL       /webstat
       
      SearchEngine    348north.com    search=
      SearchEngine    abcsearch.com   terms=
      SearchEngine    alltheweb.com   q=
      SearchEngine    altavista.com   q=
      SearchEngine    antisearch.net  KEYWORDS=
      SearchEngine    aolsearch       query=
      SearchEngine    ask.com ask=
      SearchEngine    ask.co.uk       ask=
      SearchEngine    augurnet.ch     q=
      SearchEngine    baidu.com       word=
      SearchEngine    barrahome.org   query=
      SearchEngine    blogdex.net     q=
      SearchEngine    blogdigger.com  queryString=
      SearchEngine    blogosphere.us  s=
      SearchEngine    blogmatrix.com  search=
      SearchEngine    blogwise.com    query=
      SearchEngine    boitho.com      query=
      SearchEngine    buscador.ya.com q=
      SearchEngine    by.com  query=
      SearchEngine    daypop.com      q=
      SearchEngine    dir.com req=
      SearchEngine    dmoz.org        search=
      SearchEngine    dogpile.com     q=
      SearchEngine    dpxml   qkw=
      SearchEngine    egoto.com       keywords=
      SearchEngine    elf8888.at      query0=
      SearchEngine    eureka.com      q=
      SearchEngine    excite  search=
      SearchEngine    feedster.com    q=
      SearchEngine    gais.cs.ccu.edu.tw      q=
      SearchEngine    galaxy.com      k=
      SearchEngine    gigablast.com   q=
      SearchEngine    google  q=
      SearchEngine    goo.ne.jp       MT=
      SearchEngine    hotbot.com      query=
      SearchEngine    infoseek.com    qt=
      SearchEngine    ixquick.com     query=
      SearchEngine    kobala.nl       qr=
      SearchEngine    lycos.com       query=
      SearchEngine    look.com        q=
      SearchEngine    looksmart       key=
      SearchEngine    mamma.com       query=
      SearchEngine    metacrawler     q=
      SearchEngine    msn.com q=
      SearchEngine    msxml   qkw=
      SearchEngine    mysearch.com    searchfor=
      SearchEngine    naver.com       query=
      SearchEngine    netscape.com    search=
      SearchEngine    northernlight.com       qr=
      SearchEngine    ntlworld.com    q=
      SearchEngine    openfind        query=
      SearchEngine    overture.com    Keywords=
      SearchEngine    picsearch.com   q=
      SearchEngine    popdex  query=
      SearchEngine    quepasa.com     q=
      SearchEngine    search.com      qt=
      SearchEngine    searchspider.com        q=
      SearchEngine    search.earthlink        q=
      SearchEngine    suchmaschine21.de       search=
      SearchEngine    syndic8 ShowMatch=
      SearchEngine    technorati      query=
      SearchEngine    teensearch      query=
      SearchEngine    teoma.com       q=
      SearchEngine    teradex.com     q=
      SearchEngine    texis   q=
      SearchEngine    voila   kw=
      SearchEngine    walhello        key=
      SearchEngine    waypath.com     key=
      SearchEngine    webcrawler      searchText=
      SearchEngine    webfanatic.lunarpages.com       q=
      SearchEngine    whois.sc        q=
      SearchEngine    wisenut.com     q=
      SearchEngine    yahoo   p=
      
      GroupAgent      Check&Get       Program: Check&Get (Bookmark Manager)
      GroupAgent      eXactSite       Program: eXactSite (HTML authoring. stupid user!)
      GroupAgent      FavOrg  Program: FavOrg (Bookmark Manager)
      GroupAgent      Fetch   Program: Fetch (Offline browser)
      GroupAgent      GetRight        Program: GetRight (Download Manager)
      GroupAgent      HTTrack Program: HTTrack (Website Copier)
      GroupAgent      Lachesis        Program: Packet Loss Report (ftp.intel.com)
      GroupAgent      lachesis        Program: Packet Loss Report (ftp.intel.com)
      GroupAgent      MSFrontPage     Program: Microsoft FrontPage (stupid user!)
      GroupAgent      Offline Program: Offline Explorer (Offline Browser)
      GroupAgent      Powermarks      Program: Powermarks (Bookmark Manager)
      GroupAgent      SuperBot        Program: SuperBot (Web Site Copier)
      GroupAgent      Teleport        Program: Teleport Pro (Offline Browser tenmax.com)
      GroupAgent      WebStripper     Program: WebStripper (Offline Browser)
      GroupAgent      WebZIP  Program: WebZIP (Offline Browser)
      GroupAgent      Alcatel-        Device: Alcatel Mobile Phone
      GroupAgent      AvantGo Device: AvantGo (Offline Browser)
      GroupAgent      Blazer  Device: Blazer (PalmOS browser)
      GroupAgent      DoCoMo  Device: I-mode Compatible Mobile Phone
      GroupAgent      Elaine  Device: Palm browser
      GroupAgent      Ericsson        Device: Ericsson Mobile Phone
      GroupAgent      MOT-    Device: Motorola Mobile Phone
      GroupAgent      jBrowser        Device: WAP Browser jBrowser (built by Jataayu)
      GroupAgent      Liberate        Device: Liberate (Digital TV)
      GroupAgent      Mitsu   Device: Mitsubishi Mobile Phone
      GroupAgent      Nokia   Device: Nokia Mobile Phone
      GroupAgent      Panasonic       Device: Panasonic Mobile Phone
      GroupAgent      PHILIPS-        Device: Philips Mobile Phone
      GroupAgent      SAGEM-  Device: SAGEM Mobile Phone
      GroupAgent      SAMSUNG-        Device: Samsung Mobile Phone
      GroupAgent      SEC-    Device: Samsung Mobile Phone
      GroupAgent      SHARP-  Device: Sharp Mobile Phone
      GroupAgent      SIE-    Device: Siemens Mobile Phone
      GroupAgent      SonyEricsson    Device: Sony/Ericsson Mobile Phone
      GroupAgent      www.wapsilon.com        Device: www.wapsilon.com (WAP browser)
      GroupAgent      WebGo   Device: Offline Browser WebGo (Windows/CE)
      GroupAgent      WebTV   Device: WebTV
      GroupAgent      AmphetaDesk     RSS: AmphetaDesk
      GroupAgent      Awasu   RSS: Awasu
      GroupAgent      FeedDemon       RSS: Feed Demon
      GroupAgent      Feedreader      RSS: FeedReader
      GroupAgent      FeedOnFeeds     RSS: FeedOnFeeds Reader (http://minutillo.com/steve/feedonfeeds/)
      GroupAgent      FeedValidator   RSS: Archive.org Feed Validator
      GroupAgent      MagpieRSS       RSS: MagpieRSS (PHP-based reader)
      GroupAgent      MyHeadlines     RSS: MyHeadlines (http://www.jmagar.com/myh4)
      GroupAgent      NetNewsWire     RSS: NetNewsWire
      GroupAgent      NewsGator       RSS: NewsGator
      GroupAgent      Newz    RSS: Newz Crawler
      GroupAgent      nntp//rss       RSS: nntp//rss (http://www.methodize.org/nntprss/)
      GroupAgent      Radio*  RSS: Radio Userland
      GroupAgent      Oddbot  RSS: OddPost.com
      GroupAgent      PocketFeed      RSS: PocketFeed (Pocket PC RSS reader)
      GroupAgent      PostNuke        RSS: PostNuke CMS
      GroupAgent      SharpReader     RSS: SharpReader
      GroupAgent      Syndigator      RSS: Syndigator
      GroupAgent      Syndirella      RSS: Syndirella
      GroupAgent      UltraLiberalFeedParser  RSS: Ultra Liberal Feed Parser from Mark Pilgrim
      GroupAgent      Wildgrape       RSS: Wildgrape NewsDesk
      GroupAgent      china   SpamBot: china local browse 2.6
      GroupAgent      cloakBrowser    SpamBot: Fantoma
      GroupAgent      compatible)     SpamBot: Pretends to be Mozilla 3.0
      GroupAgent      Dattatec.com-Sitios-Top SpamBot: Referrer Spam for Dattatec.com
      GroupAgent      DTS     SpamBot: Beijing Express Email Address Extractor
      GroupAgent      EmailSiphon     SpamBot: EmailSiphon
      GroupAgent      fantomBrowser   SpamBot: Fantoma
      GroupAgent      fantomCrew      SpamBot: Fantoma
      GroupAgent      Franklin        SpamBot: Franklin Locator
      GroupAgent      Finder  SpamBot: Mac Finder
      GroupAgent      iaea.org        SpamBot: Atomic Harvester 2000
      GroupAgent      Industry        SpamBot: Industry Program
      GroupAgent      IUFW    SpamBot: IUFW Web
      GroupAgent      IUPUI   SpamBot: IUPUI Research Bot
      GroupAgent      Lincoln SpamBot: Lincoln State Web Browser
      GroupAgent      LinkSweeper     SpamBot: LinkSweeper
      GroupAgent      Microcomputers  SpamBot: Franklin Locator
      GroupAgent      Missauga        SpamBot: Missauga Locate
      GroupAgent      Missigua        SpamBot: Missauga Locate
      GroupAgent      NationalDirectory       SpamBot: National Directory Email Harvester
      GroupAgent      Rainbow SpamBot: Under the Rainbow
      GroupAgent      Shareware       SpamBot: Program Shareware
      GroupAgent      stealthBrowser  SpamBot: Fantoma
      GroupAgent      Sweeper SpamBot: Mail Sweeper
      GroupAgent      WEP     SpamBot: WEP Search
      GroupAgent      Xenu    SpamBot: Xenu
      GroupAgent      348NorthNews    Spider: 348north.com
      GroupAgent      almaden.ibm.com/cs/crawler      Spider: almaden.ibm.com
      GroupAgent      antibot Spider: Antidot.net http://www.antidot.net/Welcome/jsp/robots.html
      GroupAgent      http://Ask.24x.Info/    Spider: MnogoSearch.org
      GroupAgent      ASPseek Spider: ASPseek.org free search engine software
      GroupAgent      aspseek Spider: ASPseek.org free search engine software
      GroupAgent      augurfind       Spider: augurnet.ch (Swiss Search Engine)
      GroupAgent      Baiduspider     Spider: Baidu.com
      GroupAgent      BarraHomeCrawler        Spider: Barrahome.org
      GroupAgent      BBot    Spider: http://www.otthon.net/search/
      GroupAgent      Bilbo   Spider: wise-guys.nl
      GroupAgent      blo.gs  Spider: blo.gs
      GroupAgent      BlogBot Spider: Blogdex.net
      GroupAgent      Blogosphere     Spider: Blogosphere.us
      GroupAgent      BlogPulse       Spider: Blogpulse.com
      GroupAgent      BlogShares      Spider: BlogShares.com
      GroupAgent      Blogwise.com    Spider: Blogwise.com
      GroupAgent      boitho.com      Spider: boitho.com
      GroupAgent      bookwatch@onfocus.com   Spider: OnFocus.com Weblog BookWatch
      GroupAgent      brainoff.com/geoblog/   Spider: The World as a Blog (brainoff.com/geoblog/)
      GroupAgent      www.business-socket.com Spider: www.business-socket.com
      GroupAgent      CJNetworkQuality        Spider: CommissionJunction.com
      GroupAgent      combine Spider: http://www.lub.lu.se/combine/
      GroupAgent      COMBINE Spider: http://www.lub.lu.se/combine/
      GroupAgent      CoolBot Spider: www.suchmaschine21.de (German Search Engine)
      GroupAgent      CoologFeedSpider        Spider: CoolLog http://www.webfanatic.lunarpages.com/coolog/
      GroupAgent      CopyHunter      Spider: AWstats referrer log analyzer
      GroupAgent      daypopbot Spider: DayPop.com
      GroupAgent      Ecosystem/development   Spider: Blogging Ecosystem
      GroupAgent      EgotoBot        Spider: Egoto.com
      GroupAgent      ETS     Spider: Freetranslation.com
      GroupAgent      exactseek.com   Spider: exactseek.com
      GroupAgent      Exalead Spider: Exalead.com (AOL France)
      GroupAgent      FAST    Spider: All The Web
      GroupAgent      Fast    Spider: All The Web
      GroupAgent      Feedster        Spider: Feedster.com
      GroupAgent      FlickBot        Spider: DivX Networks FlickBot
      GroupAgent      Gaisbot Spider: GAIS (http://gais.cs.ccu.edu.tw/ )
      GroupAgent      GalaxyBot       Spider: Galaxy.com
      GroupAgent      Genome  Spider: Waypath.com
      GroupAgent      Gigabot Spider: Gigablast.com
      GroupAgent      Google* Spider: Google.com 
      GroupAgent      gossamer-threads.com    Spider: Links SQL
      GroupAgent      grub-client     Spider: Grub.org
      GroupAgent      htdig   Spider: htdig (Open Source Search Engine)
      GroupAgent      ia_archiver     Spider: Archive.org
      GroupAgent      INGRID/3.0      Spider: ilse.nl (Dutch search engine)
      GroupAgent      InternetSeer    Spider: InternetSeer.com (Web Site Monitoring)
      GroupAgent      internetseer    Spider: InternetSeer.com (Web Site Monitoring)
      GroupAgent      IXE     Spider: ideare.com
      GroupAgent      janes-blogosphere       Spider: BlogMatrix.com
      GroupAgent      jiffe   Spider: jiffe.com
      GroupAgent      k2spider        Spider: Verity Spider
      GroupAgent      larbin  Spider: larbin (http://sourceforge.net/projects/larbin/)
      GroupAgent      Leknor.com      Spider: Leknor.com GZIP Tester
      GroupAgent      Linkbot Spider: Linkbot link monitoring tool (Watchfire.com)
      GroupAgent      LinkHype        Spider: LinkHype.com
      GroupAgent      LinksManager.com        Spider: LinksManager.com
      GroupAgent      LinkWalker      Spider: seventwentyfour.com
      GroupAgent      LNSpiderguy     Spider: Lexis-Nexis
      GroupAgent      MnogoSearch     Spider: MnogoSearch.org
      GroupAgent      mogimogi        Spider: www.goo.ne.jp (Japanese Search Engine)
      GroupAgent      MSNBOT  Spider: MSN.com
      GroupAgent      MyWireServiceBot        Spider: MyWireService.com
      GroupAgent      NaverRobot      Spider: Naver.com (Korean Search Engine)
      GroupAgent      Netcraft        Spider: Netcraft Web Survey
      GroupAgent      NetResearchServer       Spider: Look.com
      GroupAgent      NIF     Spider: Newsisfree.com
      GroupAgent      NG/1.0  Spider: Exalead.com (AOL France)
      GroupAgent      NITLE   Spider: Blogcensus.net
      GroupAgent      NPBot   Spider: NameProtect.com
      GroupAgent      NRK-bruker      Spider: NRK.no
      GroupAgent      Openbot Spider: OpenFind (http://www.openfind.com.tw/)
      GroupAgent      Pompos  Spider: Dir.com
      GroupAgent      Popdexter       Spider: Popdex.com
      GroupAgent      psbot   Spider: Picsearch.com
      GroupAgent      QuepasaCreep    Spider: Quepasa.com (Spanish site)
      GroupAgent      Robozilla       Spider: Link Checker for Dmoz.org
      GroupAgent      Scooter Spider: Altavista
      GroupAgent      searchspider.com        Spider: searchspider.com
      GroupAgent      semanticdiscovery       Spider: semanticdiscovery.com
      GroupAgent      SideWinder      Spider: Infoseek
      GroupAgent      slurp@inktomi.com       Spider: Inktomi
      GroupAgent      spider@spider.ilab.sztaki.hu    Spider: http://www.ilab.sztaki.hu/websearch/
      GroupAgent      Spinne  Spider: webauskunft.at
      GroupAgent      Steeler Spider: Kitsuregawa Laboratory, The University of Tokyo
      GroupAgent      SurveyBot       Spider: whois.sc
      GroupAgent      Syndic8 Spider: Syndic8
      GroupAgent      Tagword Spider: Tagword - http://tagword.com/dmoz_survey.php
      GroupAgent      Teoma   Spider: Teoma 
      GroupAgent      Teradex Spider: Teradex.com (directory)
      GroupAgent      Terrar  Spider:  Terrar (http://www.terrar.com)
      GroupAgent      Technoratibot Spider: Technorati
      GroupAgent      T-H-U-N-D-E-R-S-T-O-N-E Spider: Webinator (http://www.thunderstone.com/texis/site/pages/webinator.html)
      GroupAgent      timboBot        Spider: BreakingBlogs.com
      GroupAgent      TurnitinBot     Spider: Turnitin.com
      GroupAgent      http://www.tutorgig.com/        Spider: tutorgig.com
      GroupAgent      Vagabondo       Spider: kobala.nl
      GroupAgent      verzamelgids    Spider: verzamelgids.nl
      GroupAgent      VoilaBot        Spider: Voila.com
      GroupAgent      W3C_Validator   Spider: W3C Validator
      GroupAgent      www.walhello.com        Spider: Walhello.com
      GroupAgent      WebCapture      Spider: WebCapture.biz
      GroupAgent      Webclipping     Spider: Webclipping.com
      GroupAgent      WebFilter       Spider: http://www.ils.unc.edu/webfilter/
      GroupAgent      WebGather       Spider: City Polytechnic of Hong Kong
      GroupAgent      WebRACE Spider: WebRACE (University of Cyprus, Distributed Crawler)
      GroupAgent      websitealert.net        Spider: websitealert.net (Monitoring System)
      GroupAgent      Zealbot Spider: Looksmart.com
      GroupAgent      ZyBorg  Spider: WiseNut.com
      GroupAgent      curl    Programming: curl library (PHP)
      GroupAgent      Indy    Programming: Indy (Delphi-based client)
      GroupAgent      Java    Programming: Java-based client
      GroupAgent      Jakarta Programming: Jakarta (Java)
      GroupAgent      libwww-perl     Programming: LIB-WWW (Perl library)
      GroupAgent      LWP:    Programming: LWP::Simple (Perl library)
      GroupAgent      OPWV-SDK        Programming: OpenWave Mobile Development SDK
      GroupAgent      PEAR    Programming: PEAR Library (PHP)
      GroupAgent      PHP     Programming: PHP-based client
      GroupAgent      Python-urllib   Programming: URLLIB (Python library)
      GroupAgent      rdflib  Programming: rdflib (Python RDF library)
      GroupAgent      RPT-HTTPClient  Programming: RPT-HTTP (Java)
      GroupAgent      Snoopy  Programming: Snoopy (PHP class - http://snoopy.sourceforge.net/ )
      GroupAgent      SOFTWING_TEAR_AGENT     Programming: Softwing Tear Agent (Active Server Pages)
      GroupAgent      Wget    Programming: Wget library (http://www.gnu.org/software/wget/wget.html)
      GroupAgent      WinHttp.WinHttpRequest  Programming: WinHttp.WinHttpRequest library (Visual Basic)
      GroupAgent      Bison   Proxy: Proxomitron (Proxomitron.info)
      GroupAgent      BorderManager   Proxy: Novell Border Manager Security Suite
      GroupAgent      CE-Preload      Proxy: Cisco Content Engine
      GroupAgent      DA      Proxy: DA
      GroupAgent      junkbuster      Proxy: junkbuster (junkbusters.com)
      GroupAgent      AppleWebKit     Safari (OSX)
      GroupAgent      BFS_method      BeOS browser
      GroupAgent      Camino  Mozilla-based browser Camino (OSX)
      GroupAgent      iCab    iCab (Mac)
      GroupAgent      Konqueror       Konqueror
      GroupAgent      Links   Links (Text-based browser)
      GroupAgent      Lynx*   Lynx    (Text-based browser)
      GroupAgent      NCBrowser       NCBrowser (RISC OS)
      GroupAgent      Opera   Opera
      GroupAgent      SlimBrowser     SlimBrowser (http://www.flashpeak.com/sbrowser/sbrowser.htm)
      GroupAgent      w3m     w3m (Text-based browser - http://w3m.sourceforge.net/ )
      GroupAgent      rv:1.4  Mozilla 1.4
      GroupAgent      3.01    Navigator 3.01 (16-bit version)
      GroupAgent      4.01    Internet Explorer 4.01
      GroupAgent      5.01    Internet Explorer 5.01
      GroupAgent      5.0     Internet Explorer 5.0
      GroupAgent      5.23    Internet Explorer (Mac)
      GroupAgent      5.22    Internet Explorer (Mac)
      GroupAgent      5.21    Internet Explorer (Mac)
      GroupAgent      5.17    Internet Explorer (Mac)
      GroupAgent      5.16    Internet Explorer (Mac)
      GroupAgent      5.15    Internet Explorer (Mac)
      GroupAgent      5.14    Internet Explorer (Mac)
      GroupAgent      5.13    Internet Explorer (Mac)
      GroupAgent      5.12    Internet Explorer (Mac)
      GroupAgent      5.5     Internet Explorer 5.5 (Windows)
      GroupAgent      6.0     Internet Explorer 6.0 (Windows)
      GroupAgent      Mozilla/3.04Gold        Netscape 3.04 Gold
      GroupAgent      Mozilla/4.04    Netscape 4
      GroupAgent      Mozilla/4.06    Netscape 4
      GroupAgent      Mozilla/4.08    Netscape 4
      GroupAgent      Mozilla/4.5     Netscape 4.5
      GroupAgent      Mozilla/4.7     Netscape 4.7
      GroupAgent      Mozilla/4.8     Netscape 4.8
      GroupAgent      MSIE    Internet Explorer
      GroupAgent      Mozilla Netscape
      
      
      
      
    


Change the LogFile, OutputDir, HistoryName and IncrementalName paths to suit your needs.
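
The order of the GroupAgent entries above matters if, as their arrangement suggests, the first matching entry wins. The Python sketch below illustrates this under two assumptions that you should check against the Webalizer README for your version: a plain pattern matches anywhere in the user agent string, and a trailing * anchors the match at the start of the string (a leading * at the end). The names matches and group_for are hypothetical helpers for this example only.

```python
# A few (pattern, label) pairs copied from the GroupAgent lines, in file order.
GROUP_AGENTS = [
    ("Google*", "Spider: Google.com"),
    ("Opera", "Opera"),
    ("Mozilla/4.7", "Netscape 4.7"),
    ("MSIE", "Internet Explorer"),
    ("Mozilla", "Netscape"),
]

def matches(pattern, agent):
    """Assumed rule: 'abc*' = starts with, '*abc' = ends with, 'abc' = contains."""
    if pattern.endswith("*"):
        return agent.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return agent.endswith(pattern[1:])
    return pattern in agent

def group_for(agent):
    """Return the label of the first matching entry, or None (assumed first-match)."""
    for pattern, label in GROUP_AGENTS:
        if matches(pattern, agent):
            return label
    return None

print(group_for("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))
print(group_for("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
print(group_for("Mozilla/4.79 [en] (X11; U; Linux 2.4.18 i686)"))
```

Under these assumptions, Internet Explorer is grouped correctly only because MSIE is listed before the generic Mozilla entry. If you add your own entries, put the more specific patterns first and the catch-all ones last.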


Thanks to http://www.tnl.net/blog/entry/More_Webalizer.conf_hacking

