Webalizer/Configuration file example

This is a rather long Webalizer configuration file example you can use as a basis for creating your own.

Note: Webalizer does not accept every log format. The following Apache configuration produces logs it can read:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog /home/httpd/vhosts/linuxreviews.org/statistics/logs/access_log combined
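
To check whether an existing log is already in this format before pointing Webalizer at it, a rough sketch like the following can help. The regular expression is written by hand to mirror the LogFormat string above; the function name and the sample line are made up for illustration, and Webalizer's own parser may be more or less tolerant than this.

```python
import re

# Hand-written regex mirroring the LogFormat string above:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" '
    r'"(?P<agent>(?:[^"\\]|\\.)*)"$'
)


def parse_combined(line):
    """Return a dict of fields, or None if the line is not in combined format."""
    match = COMBINED.match(line.rstrip("\n"))
    return match.groupdict() if match else None


# Made-up sample line (203.0.113.0/24 is a documentation address range).
sample = (
    '203.0.113.7 - - [10/Oct/2021:13:55:36 +0200] '
    '"GET /Webalizer HTTP/1.1" 200 5120 '
    '"https://www.google.com/search?q=webalizer" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:93.0) Gecko/20100101 Firefox/93.0"'
)
fields = parse_combined(sample)
print(fields["status"], fields["agent"])
```

A line that lacks the referrer and user agent fields makes parse_combined return None, which is a hint that the log was written with a different LogFormat.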

The following configuration can be run with nice -n 19 /bin/webalizer -Y -c $f 2>/dev/null, where $f is the path to the configuration file. Because the configuration enables incremental processing and keeps history files, you can safely truncate the log file (echo > logfile) once the run is done, unless you want to keep the log for some reason.

HostName linuxreviews.org
LogFile   /home/httpd/vhosts/linuxreviews.org/statistics/logs/access_log
OutputDir /home/httpd/vhosts/linuxreviews.org/httpdocs/webstat/

HistoryName     /home/httpd/vhosts/linuxreviews.org/statistics/logs/webalizer.hist
IncrementalName /home/httpd/vhosts/linuxreviews.org/statistics/logs/webalizer.current
Incremental     yes

UseHTTPS yes

# GeoIP
GeoDB yes
GeoDBDatabase /home/httpd/webalizer/GeoDB.dat
# Don't do DNS lookups
DNSChildren 0

TopURLs         50
TopReferrers    50
TopAgents       40
TopSites 0
TopKSites 0

AllSearchStr yes

Quiet           yes
FoldSeqErr      yes

MangleAgents 0

IgnoreURL /w/
IgnoreURL /w/load.php
IgnoreURL /load.php
IgnoreURL /favicon.ico
IgnoreURL /robots.txt
IgnoreURL /wp-login.php
IgnoreURL /webstat
IgnoreURL /Special:NewsFeed
IgnoreURL /ads.txt
IgnoreURL /en.rss
IgnoreURL /feed.rss
IgnoreURL /atom.rss
IgnoreURL /Special:RecentChanges
IgnoreURL /Special:MyTalk
IgnoreURL /Special:CreateAccount

IgnoreReferrer gkkiparis
IgnoreReferrer mosbordell

PageType        htm*
PageType        cgi
PageType        php
PageType        shtml

HideURL         *.gif
HideURL         *.GIF
HideURL         *.jpg
HideURL         *.JPG
HideURL         *.png
HideURL         *.PNG
HideURL         *.ra
HideURL         *.ogg
HideURL         *.mp3
HideURL         *.webm
 
SearchEngine    348north.com    search=
SearchEngine    abcsearch.com   terms=
SearchEngine    alltheweb.com   q=
SearchEngine    altavista.com   q=
SearchEngine    antisearch.net  KEYWORDS=
SearchEngine    aolsearch       query=
SearchEngine    ask.com ask=
SearchEngine    ask.co.uk       ask=
SearchEngine    augurnet.ch     q=
SearchEngine    baidu.com       word=
SearchEngine    barrahome.org   query=
SearchEngine    blogdex.net     q=
SearchEngine    blogdigger.com  queryString=
SearchEngine    blogosphere.us  s=
SearchEngine    blogmatrix.com  search=
SearchEngine    blogwise.com    query=
SearchEngine    boitho.com      query=
SearchEngine    buscador.ya.com q=
SearchEngine    by.com  query=
SearchEngine    daypop.com      q=
SearchEngine    dir.com req=
SearchEngine    dmoz.org        search=
SearchEngine    dogpile.com     q=
SearchEngine    dpxml   qkw=
SearchEngine    egoto.com       keywords=
SearchEngine    elf8888.at      query0=
SearchEngine    eureka.com      q=
SearchEngine    excite  search=
SearchEngine    feedster.com    q=
SearchEngine    gais.cs.ccu.edu.tw      q=
SearchEngine    galaxy.com      k=
SearchEngine    gigablast.com   q=
SearchEngine    google  q=
SearchEngine    goo.ne.jp       MT=
SearchEngine    hotbot.com      query=
SearchEngine    infoseek.com    qt=
SearchEngine    ixquick.com     query=
SearchEngine    kobala.nl       qr=
SearchEngine    lycos.com       query=
SearchEngine    look.com        q=
SearchEngine    looksmart       key=
SearchEngine    mamma.com       query=
SearchEngine    metacrawler     q=
SearchEngine    msn.com q=
SearchEngine    msxml   qkw=
SearchEngine    mysearch.com    searchfor=
SearchEngine    naver.com       query=
SearchEngine    netscape.com    search=
SearchEngine    northernlight.com       qr=
SearchEngine    ntlworld.com    q=
SearchEngine    openfind        query=
SearchEngine    overture.com    Keywords=
SearchEngine    picsearch.com   q=
SearchEngine    popdex  query=
SearchEngine    quepasa.com     q=
SearchEngine    search.com      qt=
SearchEngine    searchspider.com        q=
SearchEngine    search.earthlink        q=
SearchEngine    suchmaschine21.de       search=
SearchEngine    syndic8 ShowMatch=
SearchEngine    technorati      query=
SearchEngine    teensearch      query=
SearchEngine    teoma.com       q=
SearchEngine    teradex.com     q=
SearchEngine    texis   q=
SearchEngine    voila   kw=
SearchEngine    walhello        key=
SearchEngine    waypath.com     key=
SearchEngine    webcrawler      searchText=
SearchEngine    webfanatic.lunarpages.com       q=
SearchEngine    whois.sc        q=
SearchEngine    wisenut.com     q=
SearchEngine    yahoo   p=

IgnoreAgent 360Spider
IgnoreAgent FemtosearchBot
IgnoreAgent www.semrush.com/bot
IgnoreAgent www.bing.com/bingbot
IgnoreAgent www.sogou.com
IgnoreAgent ahrefs.com/robot
IgnoreAgent yandex.com/bots
IgnoreAgent Go-http-client
IgnoreAgent go.mail.ru/help/robots
IgnoreAgent Chrome/41.0.2272.96
IgnoreAgent www.google.com/bot
IgnoreAgent Googlebot-Image
IgnoreAgent GrapeshotCrawler
IgnoreAgent Mediapartners-Google
IgnoreAgent apple.com/go/applebot
IgnoreAgent baidu.com/search
IgnoreAgent ZoominfoBot
IgnoreAgent FeedFetcher-Google
IgnoreAgent python-requests
IgnoreAgent Apache-HttpClient
IgnoreAgent opensiteexplorer.org/dotbot
IgnoreAgent researchscan.comsys
IgnoreAgent Nimbostratus
IgnoreAgent SeznamBot
IgnoreAgent www.exabot.com/go/robot
IgnoreAgent tt-rss.org
IgnoreAgent commoncrawl.org/faq
IgnoreAgent OpenGraphReader
IgnoreAgent yacybot
IgnoreAgent Ocarinabot
IgnoreAgent VelenPublicWebCrawler
IgnoreAgent QuiteRSS
IgnoreAgent TweetmemeBot
IgnoreAgent seozoom
IgnoreAgent Firefox/40.1
IgnoreAgent Bytespider
IgnoreAgent cliqz.com/company
IgnoreAgent lipboardProxy
IgnoreAgent ideasandcode
IgnoreAgent chimebot
IgnoreAgent hatena.ne.jp
IgnoreAgent GarlikCrawler
IgnoreAgent G-i-g-a-b-o-t
IgnoreAgent DomainCrawler
IgnoreAgent The Knowledge AI
IgnoreAgent Upflow/1
IgnoreAgent picoFeed
IgnoreAgent MojeekBot
IgnoreAgent Seekport
IgnoreAgent Datanyze
IgnoreAgent SemanticScholarBot
IgnoreAgent trendictionbot
IgnoreAgent BLEXBot
IgnoreAgent mj12bot
IgnoreAgent MegaIndex.ru/2
IgnoreAgent megaindex
IgnoreAgent Qwantify/Bleriot
IgnoreAgent MetaJobBot
IgnoreAgent help.coccoc.com
IgnoreAgent Barkrowler
IgnoreAgent www.qwant.com
IgnoreAgent domaincrawler.com
IgnoreAgent website-datenbank
IgnoreAgent Mastodon
IgnoreAgent rssbot
IgnoreAgent AccompanyBot
IgnoreAgent Scrapy
IgnoreAgent facebookexternalhit
IgnoreAgent Twitterbot
IgnoreAgent PetalBot
IgnoreAgent Linespider
IgnoreAgent okhttp
IgnoreAgent DuckDuckBot
IgnoreAgent Jetslide
IgnoreAgent NewsBlur
IgnoreAgent Amazonbot
IgnoreAgent SiteCheckerBotCrawler
IgnoreAgent redditbot
IgnoreAgent QuiteRss
IgnoreAgent Amazonb
IgnoreAgent app.hypefactors.com
IgnoreAgent archive.org_bot
IgnoreAgent Synapse
IgnoreAgent proximic
IgnoreAgent special_archiver
IgnoreAgent Pleroma

GroupAgent      "Mozilla/5.0 (X11; CrOS x86_64" ChromiumOS
HideAgent       Mozilla/5.0 (X11; CrOS x86_64

GroupAgent      "Win64; x64; rv:" Firefox on Windows
HideAgent       Win64; x64; rv:

GroupAgent      "Mozilla/5.0 (Windows NT 10.0; rv:" Firefox on Windows
HideAgent       Mozilla/5.0 (Windows NT 10.0; rv:

GroupAgent      "Mozilla/5.0 (X11; Linux x86_64; rv" Firefox on Linux
HideAgent       Mozilla/5.0 (X11; Linux x86_64; rv

GroupAgent      "X11; Linux aarch64; rv:" Firefox on Linux
HideAgent       X11; Linux aarch64; rv:

GroupAgent      "X11; Fedora; Linux x86_64; rv" Firefox on Linux (Fedora)
HideAgent       X11; Fedora; Linux x86_64; rv

GroupAgent      "X11; Ubuntu; Linux x86_64; rv:" Firefox on Linux (Ubuntu)
HideAgent       X11; Ubuntu; Linux x86_64; rv:

GroupAgent      "Mozilla/5.0 (X11; Linux i686; rv:" Firefox on Linux
HideAgent       Mozilla/5.0 (X11; Linux i686; rv:

GroupAgent      "Mozilla/5.0 (Android 10; Mobile; rv:" Firefox on Android
HideAgent       Mozilla/5.0 (Android 10; Mobile; rv:

GroupAgent      "Mozilla/5.0 (Android 9; Mobile; rv:" Firefox on Android
HideAgent       Mozilla/5.0 (Android 9; Mobile; rv:

GroupAgent      "Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/" Chrome on Windows
HideAgent       Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/

GroupAgent      "WOW64) AppleWebKit" Chrome on Windows
HideAgent       WOW64) AppleWebKit

GroupAgent      "Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit" Chrome/Chromium on Linux
HideAgent       Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit

GroupAgent      "SamsungBrowser" Samsung Web Browser (Android)
HideAgent       SamsungBrowser

GroupAgent      "Mozilla/5.0 (iPhone; CPU iPhone OS" iPhone/Safari
HideAgent       Mozilla/5.0 (iPhone; CPU iPhone OS

GroupAgent      "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome" Chromium on Linux
HideAgent       Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome

GroupAgent      "Raspbian Chromium" Chromium on Raspbian
HideAgent       Raspbian Chromium

GroupAgent      "Dalvik/2.1.0 (Linux; U; Android 10; Redmi" Xiaomi Redmi
HideAgent       Dalvik/2.1.0 (Linux; U; Android 10; Redmi
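
The Ignore*, Hide* and Group* lines in the configuration above all take a pattern. The sketch below models how such patterns are commonly described as matching in Webalizer's README: a leading * anchors the rest at the end of the string, a trailing * anchors it at the start, and a pattern without a wildcard matches anywhere. Case-sensitive comparison and "first matching group wins" are assumptions made here, and the function names and sample user agent are made up for illustration. This is not Webalizer's actual code.

```python
def pattern_matches(pattern, value):
    """Assumed Webalizer-style match.

    Leading "*"  -> value must end with the rest of the pattern.
    Trailing "*" -> value must start with the rest of the pattern.
    No wildcard  -> the pattern may occur anywhere in the value.
    """
    if pattern.startswith("*"):
        return value.endswith(pattern[1:])
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return pattern in value


# A few (pattern, label) pairs copied from the GroupAgent lines above.
GROUPS = [
    ("Mozilla/5.0 (X11; CrOS x86_64", "ChromiumOS"),
    ("Win64; x64; rv:", "Firefox on Windows"),
    ("X11; Ubuntu; Linux x86_64; rv:", "Firefox on Linux (Ubuntu)"),
    ("SamsungBrowser", "Samsung Web Browser (Android)"),
]


def group_label(user_agent, groups=GROUPS):
    """Return the label of the first group that matches, or None (assumed order)."""
    for pattern, label in groups:
        if pattern_matches(pattern, user_agent):
            return label
    return None


# Made-up user agent string for illustration.
firefox_ubuntu = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:93.0) "
    "Gecko/20100101 Firefox/93.0"
)
print(group_label(firefox_ubuntu))
print(pattern_matches("*.gif", "/images/logo.gif"))
print(pattern_matches("/w/", "/w/load.php"))
```

Each GroupAgent line in the configuration is paired with a HideAgent line using the same pattern, so matching agents show up only under the group label rather than also being listed individually. If the substring rule assumed here holds, IgnoreURL /w/ would already cover /w/load.php, so the more specific entry is harmless but redundant.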
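
The run-then-truncate routine mentioned before the configuration can be wrapped in a small script. The following is a minimal sketch, assuming Webalizer lives at /bin/webalizer as in the command line quoted earlier. The function names are hypothetical, and unlike the plain echo > approach it empties the log only when Webalizer exits with status 0.

```python
import subprocess


def webalizer_command(config_path, binary="/bin/webalizer"):
    """Build the argument list for the command line quoted in the article."""
    # nice -n 19 runs Webalizer at the lowest scheduling priority;
    # -Y and -c are the options used in the article's command.
    return ["nice", "-n", "19", binary, "-Y", "-c", config_path]


def truncate_log(log_path):
    """Empty the log file in place; opening a file with mode "w" truncates it."""
    with open(log_path, "w"):
        pass


def run_and_truncate(config_path, log_path, binary="/bin/webalizer"):
    """Run Webalizer, then truncate the log only if the run succeeded."""
    result = subprocess.run(
        webalizer_command(config_path, binary),
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
    )
    if result.returncode == 0:
        truncate_log(log_path)
    return result.returncode
```

A call such as run_and_truncate("/path/to/webalizer.conf", "/path/to/access_log") would then replace the two manual steps; both paths here are placeholders. Requests logged between the end of the Webalizer run and the truncation are lost in either approach.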