AWStats

[Figure: AWStats showing visitors by country]
Original author(s): Laurent Destailleur
Initial release: May 2, 2000
Stable release: 7.8 (April 30, 2020)
Written in: Perl
Operating system: Cross-platform
Type: Web traffic analytics
License: GNU General Public License
Documentation: awstats.org/docs/

AWStats is a free, very configurable and highly advanced log analyzer written in Perl. It can be used to create detailed statistics from web server logs, mail logs or FTP server logs. AWStats builds a statistics file from parsed logs (using a cron job or system timer). That file can then be used either to create web traffic reports on the fly or to generate static HTML pages with a timer/cron job. The reports cover a wide range of areas.

AWStats does not require you to keep old web server logs around once it has parsed them and added the relevant information to its statistics file.

AWStats is highly configurable, and it comes with a really long, very well-documented example configuration file.

AWStats is the most feature-complete, most mature and most powerful free software web server log analyzer there is. It has more features than Webalizer, and analog is kind of a joke in comparison.

At A Glance

AWStats at a glance

Pros:
  • Provides a nice and simple overview of a website's traffic.
  • Uses web server logs. No web page tags or JavaScript required (it can optionally use a JS tracker script for additional details).
  • Generates numerous reports in all areas; it has all the bases covered.
  • Creates a statistics file from parsed logs, so you can throw the old logs away once it's done with them.
  • Offers the option of adding a JavaScript tracker to pages for additional metrics like visitors' screen size, color depth and JavaScript status (enabled/disabled).
  • Supports using a GeoIP database (optional) to provide geographical information about visitors.
  • Can either create static HTML pages (using /tools/awstats_buildstaticpages.pl) or create them on the fly from the statistics file.

Cons:
  • A bit hard and time-consuming to install and configure.
    • Requires configuring the web server to allow Perl to run in a /awstats/ ScriptAlias.
  • Takes a while to set up and configure; just reading through the example configuration file takes forever.
  • Some of the pages in the on-the-fly generated statistics portal are a bit resource-intensive and slow to load. That's especially true if you enable a GeoIP database and click "Hosts, Full list". AWStats is meant to be used with authentication, and you should use authentication if you deploy it this way; it is unsuitable as a public-facing portal.
    • This is a disadvantage compared to less feature-rich alternatives like Webalizer. It doesn't matter if a million people hammer the static HTML pages Webalizer creates.
  • No way to specify what is and isn't a web crawler bot without editing a Perl source file.

Features

[Figure: The AWStats summary page.]

AWStats creates a portal where you can view pretty much everything you would want to know about a website's traffic on a single Summary page. That page lists:

  • A monthly summary
  • Traffic by day in the current month
  • Traffic by weekdays
  • Traffic by hours
  • Top countries (if you configured it to use a GeoIP database)
  • Top 10 hosts (IPs)
  • Top 10 Robots / web crawlers
  • Visit duration
  • Top file types
  • Top "Downloads" (you can configure what file types are and aren't "downloads")
  • Top page URLs
  • Top 10 operating systems
  • Top 10 web browsers
  • A "Connect to site from" section with
    • "Links from an Internet Search Engine"
    • "Links from an external page"
  • Top search phrases and top search keywords

[Figure: Hours and countries on the Summary page.]

Most of the items listed in the Summary offer a "Full list" view where you can see more than just the top 10 (or top N) items. How many items are shown in each section of the summary is configurable.

[Figure: You can expand the short list of operating systems used by visitors on the Summary page and get a very long list. The same goes for all the other things AWStats makes statistics of.]

AWStats is highly configurable, but only to a degree. The default configuration file is very long. Most of its length is comments; the example configuration file has 10+ lines above each item with a detailed description of what the configuration value does. That's good, because some manual configuration will likely be required depending on what web server and web content management system you use. As an example, if you use it with MediaWiki you will want to add:

SkipFiles="/ads.txt REGEX[^\/w] REGEX[^\/Special] REGEX[^\/awstats]"

or it will show things like /w/load.php, used by MediaWiki to send CSS and JavaScript, as a page.
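It can be handy to preview which request paths a set of SkipFiles patterns would exclude before putting them in the configuration. The following is a minimal sketch, not part of AWStats: the function name skipped_paths is made up for illustration, grep -E stands in for the Perl regular expressions AWStats uses, and it assumes a plain entry like /ads.txt is matched as an exact path.

```shell
#!/bin/sh
# Hypothetical helper: read request paths on stdin and print the ones
# that the SkipFiles patterns from the example above would exclude.
# Assumption: grep -E behaves like the Perl regexes for simple anchored
# patterns, and the plain "/ads.txt" entry is an exact match.
skipped_paths() {
  grep -E -e '^/ads\.txt$' -e '^/w' -e '^/Special' -e '^/awstats'
}

# Try it on a few sample paths.
printf '%s\n' /w/load.php /wiki/AWStats /Special:Search /index.html |
  skipped_paths
```

Note that ^/w also matches paths that start with /wiki/. That is harmless on a wiki that serves its articles from the root, but a wiki that serves articles under /wiki/ would probably want REGEX[^\/w\/] instead.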

There are some things you can't configure. There is no way to add browser user-agents or new countries or operating systems in the configuration file. It is unlikely that a new country shows up, but it is likely that a new web browser or, more importantly, that new web crawlers show up.

Reverse DNS look-ups and caching

AWStats supports reverse DNS look-ups and it can cache those results. You can, alternatively, turn that off in the configuration file with:

DNSLookup=0
DynamicDNSLookup=0

Being able to run dig -x ${ip} and cache the result is somehow a feature. Why anyone would want to enable such a feature is a good question; it's utterly meaningless. You can, in fact, simply disable the list of top hosts that visited your site entirely with:

ShowHostsStats=0

Listing a bunch of random IP addresses seems kind of meaningless, and AWStats will create a page with all the IPs that last visited your site (with or without reverse DNS lookups) if that's enabled. That is a real privacy concern if you make your statistics publicly available.

Accuracy

The AWStats code contains a decent, but not great or complete, list of web crawlers. It does have a useful Browsers ▸ Unknown page where you can see a list of the User-Agents it considers to be "Unknown" web browsers. That page will fill up with a metric ton of User-Agents that are not actually web browsers after just a few hours, or a day or two, if web crawlers like your site.

AWStats does support a single SkipUserAgents= configuration line where you can add a potentially really long list of User-Agents you'd like to completely exclude from your statistics. There is no way to configure what is and isn't a robot beyond changing /usr/share/awstats/lib/robots.pm. You can do that, but it seems wrong to edit a Perl file in order to add something that should be in a configuration file.
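The raw access log is another place to look when deciding which User-Agents deserve a spot on that line. This is a small sketch under the assumption that the log is in Apache's combined format (what LogFormat = 1 refers to); the function name top_user_agents is invented here.

```shell
#!/bin/sh
# Hypothetical helper: list the most frequent User-Agent strings in a
# combined-format access log. Splitting each line on double quotes makes
# the User-Agent the 6th field:
#   ip - - [date] "request" status bytes "referer" "user-agent"
# $1 = log file, $2 = number of entries to show (default 20).
top_user_agents() {
  awk -F'"' '{ print $6 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-20}"
}

# Example, using the LogFile path from the configuration shown elsewhere
# in this article:
# top_user_agents /home/httpd/vhosts/linuxreviews.org/statistics/logs/access_log 30
```

Feed readers, fediverse servers and HTTP libraries tend to float to the top of that list, and those are the ones worth wrapping in REGEX[...].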

AWStats will obviously show inflated statistics unless you either edit robots.pm or add a SkipUserAgents line to the configuration file, something like:

SkipUserAgents="REGEX[^.*Mastodon] REGEX[^.*Pleroma] REGEX[^.*Synapse] REGEX[^.*okhttp] REGEX[^.*rss2email] REGEX[^Tiny_Tiny_RSS] REGEX[^.*NextCloud-News] REGEX[^Hatena_Antenna] REGEX[^.*Miniflux] REGEX[^.*hypefactors] REGEX[^.*WebexTeams]"

Having a long list of user-agents wrapped in REGEX on one line isn't ideal, but you can NOT do

SkipUserAgents="REGEX[^.*Mastodon]"
SkipUserAgents="REGEX[^.*Pleroma]"

Each configuration option has to be set on just one line. You can repeat a line as many times as you want, but the last one is the only one AWStats will care about.
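Since only the last occurrence counts, an accidental duplicate silently throws away everything set on the earlier lines. The following sketch (the function name duplicate_directives is made up) lists every directive that is set more than once in a configuration file. LoadPlugin appears to be an exception that can legitimately be repeated, so a hit for that one can probably be ignored.

```shell
#!/bin/sh
# Hypothetical helper: print every directive name that is set more than
# once in an AWStats configuration file ($1). Comment lines and blank
# lines are ignored; files pulled in with Include are not followed.
duplicate_directives() {
  awk -F'=' '
    /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
    /=/ {
      name = $1
      gsub(/[ \t]/, "", name)            # "MaxNbOfDomain = 20" -> "MaxNbOfDomain"
      if (++seen[name] == 2) print name  # report each duplicate once
    }
  ' "$1"
}

# Example:
# duplicate_directives /etc/awstats/awstats.linuxreviews.org.conf
```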

Traffic By Country

[Figure: The Americans are in the lead followed by the Germans and the Chinese. Britannia will never rule again.]

AWStats can show you what countries your visitors are coming from if you enable GeoIP and you have a MaxMind GeoIP database handy. MaxMind changed the license for it in December 2019. You'll therefore have to submit to their license OR get an older GeoLite2-Country.mmdb. It's bundled with lots of software such as Elasticsearch, so there is actually a fair chance you'll find one if you run locate GeoLite2-Country.mmdb on a server that's been around for a while. GeoIP can be activated by adding this fine statement to the configuration file:

LoadPlugin="geoip2_country /etc/awstats/geoip/GeoLite2-Country.mmdb"

Verdict And Conclusion

AWStats is overall a very powerful and mature web log analyzer. The option to add JavaScript for additional tracking is a nice option for those who see deployment of client-side spyware JavaScript as acceptable. It's not, but there are people who lack a moral compass and they should have the freedom to choose.

Installation is not as straightforward as going to a commercial company's website and copying a JavaScript one-liner. AWStats works, and it works well, if you do take the time to install and configure it.

There are some requirements: you need to be able to install and run software on the web server, and you need to know how to do that. If that's within your control and skill-set then AWStats really is a very nice tool worth considering. It is the best pure log analyzer there is. It has more features than Webalizer and the reports are prettier. And it's two decades ahead of analog.

AWStats is the obvious choice if you want a pure log analyzer. Open Web Analytics is a good alternative (though it uses cookies, which could be a problem) if you want an analytics server that runs on PHP and uses either PHP or JavaScript to get statistical data.

Installation & Configuration

AWStats has been around for a very long time and it's still actively maintained so all the GNU/Linux distributions have a package called awstats available. That package may or may not add a httpd/Apache configuration file for you. You will need to change that file or eradicate it if that's the case.

Writing the awstats configuration file is probably the most time-consuming part of an AWStats setup. AWStats comes with a very long and extremely well-documented (in comments) example configuration file called awstats.model.conf in /etc/awstats. The configuration files must be placed in that folder and they need to be named awstats.WEBSITENAME.conf (as in awstats.linuxreviews.org.conf).
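The WEBSITENAME part of the file name is what the AWStats tools expect as the site name, so it can be useful to list the names AWStats will see. A minimal sketch, assuming the naming scheme described above (the function name list_awstats_sites is made up):

```shell
#!/bin/sh
# Hypothetical helper: print the site name for every
# awstats.WEBSITENAME.conf file in a directory ($1, default /etc/awstats).
# The template awstats.model.conf is skipped.
list_awstats_sites() {
  for conf in "${1:-/etc/awstats}"/awstats.*.conf; do
    [ -e "$conf" ] || continue      # nothing matched: the glob stays literal
    site=${conf##*/awstats.}        # strip the directory and "awstats."
    site=${site%.conf}              # strip ".conf"
    [ "$site" = "model" ] || printf '%s\n' "$site"
  done
}

# Example:
# list_awstats_sites /etc/awstats
```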

All the options in the configuration file may seem overwhelming since the awstats.model.conf is a whopping 1619 lines long. That's a very long configuration file. Most of it is comments, and the defaults are mostly fine. Consider these two files as examples if the default configuration file seems a bit too long and a bit overwhelming for you:

File: awstats.linuxreviews.org.conf
LogFile="/home/httpd/vhosts/linuxreviews.org/statistics/logs/access_log"

SiteDomain="linuxreviews.org"
HostAliases="REGEX[^.*linuxreviews\.org$]"

Include "/etc/awstats/standard-configuration/mediawiki-standard.conf"

File: standard-configuration/mediawiki-standard.conf
LogFormat = 1
LogSeparator=" "
LogType=W

DNSLookup=0
DynamicDNSLookup=0
AllowToUpdateStatsFromBrowser=0
AllowFullYearView=1
EnableLockForUpdate=1

# Security. Set 0 if you do it from the web servers configuration
AllowAccessFromWebToAuthenticatedUsersOnly=0

SkipUserAgents="REGEX[^.*Mastodon] REGEX[^.*Pleroma] REGEX[^.*Synapse] REGEX[^.*okhttp] REGEX[^.*rss2email] REGEX[^Tiny_Tiny_RSS] REGEX[^.*NextCloud-News] REGEX[^Hatena_Antenna] REGEX[^.*Miniflux] REGEX[^.*hypefactors] REGEX[^.*WebexTeams]"
SkipFiles="/ads.txt REGEX[^\/w] REGEX[^\/Special] REGEX[^\/awstats]"
NotPageList="css js class gif jpg jpeg png bmp ico rss xml swf eot woff woff2 mp3 mp4 ogg oga webm"

LoadPlugin="geoip2_country /etc/awstats/geoip/GeoLite2-Country.mmdb"

# Folders
DirData="/var/lib/awstats"
DirCgi="/awstats"
DirIcons="/awstatsicons"

# No pop-ups or new tabs when creating HTML pages
DetailedReportsOnNewWindows=0

# HTTP expires header, only relevant if you use CGI
Expires=3600

# Number of things to show
MaxNbOfDomain = 20
# MaxNbOfHostsShown = 20
ShowHostsStats=0
MaxNbOfLoginShown = 20
MaxNbOfRobotShown = 20
MaxNbOfDownloadsShown = 20
MaxNbOfPageShown = 20
MaxNbOfOsShown = 20
MaxNbOfBrowsersShown = 20
MaxNbOfRefererShown = 20
MaxNbOfKeyphrasesShown = 20
MaxNbOfKeywordsShown = 20

The AWStats documentation recommends copying awstats.model.conf to a new file you can edit. Simply copying the values you need and/or want to change to a new file that's shorter and more manageable may be preferable.
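One way to get a manageable starting point is to look at the example file with all the comments stripped out. This is just a sketch; the function name active_settings is made up, and you will still want to read the comments for any value you are unsure about.

```shell
#!/bin/sh
# Hypothetical helper: print only the active settings of an AWStats
# configuration file ($1), dropping comment lines and blank lines.
active_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example:
# active_settings /etc/awstats/awstats.model.conf
```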

You need a cron job running in order to create a statistics file from the web server's logs.

The awstats package may or may not install a cron job in /etc/cron.hourly/awstats. What you need, if it doesn't, is:

exec /usr/share/awstats/tools/awstats_updateall.pl now -configdir="/etc/awstats" -awstatsprog="/usr/share/awstats/wwwroot/cgi-bin/awstats.pl" >/dev/null

There are two deployment strategies available: You can either make static HTML pages with a cron job (or a systemd timer) OR you can run awstats.pl using CGI and create reports on-demand. You will have to set up a cron job that parses the logs and creates a statistics file, or files, in /var/lib/awstats/ regardless of which strategy you choose.

The awstats package may or may not add a web server configuration file at /etc/httpd/conf.d/awstats.conf. You will need to modify that file or add something similar to your web server's configuration if you want to go with the CGI option.

For a virtual host you'll need something like:

  <Directory "/usr/share/awstats/wwwroot">
    Options None
    AllowOverride None
    <IfModule mod_authz_core.c>
      # apache 2.4+
      Require ip 2001:2002:51ed:cee0::/64
    </IfModule>
  </Directory>
  # Additional Perl modules
  <IfModule mod_env.c>
    SetEnv PERL5LIB /usr/share/awstats/lib:/usr/share/awstats/plugins
  </IfModule>
  Alias /awstatsclasses "/usr/share/awstats/wwwroot/classes/"
  Alias /awstatscss "/usr/share/awstats/wwwroot/css/"
  Alias /awstatsicons "/usr/share/awstats/wwwroot/icon/" 
  ScriptAlias /awstats/ "/usr/share/awstats/wwwroot/cgi-bin/"

Take note of the Require ip line. You will want to change that to your IP/subnet or you won't have access. You can also restrict access in the AWStats configuration file.

Building static HTML pages

AWStats can create static HTML pages as an alternative to deploying it using CGI.

The developer either did not intend this to be a use-case or didn't think it through. Like at all.

There is a /usr/share/awstats/tools/awstats_buildstaticpages.pl tool available. It needs a -config= option and an output -dir=. The -config option expects you to name a host/domain. AWStats requires you to create configuration files in /etc/awstats/ named awstats.WEBSITENAME.conf. The -config option should in that case be -config=WEBSITENAME, not the name of the configuration file (and no path). You can use it like:

/usr/share/awstats/tools/awstats_buildstaticpages.pl -config=linuxreviews.org -dir=/home/httpd/vhosts/linuxreviews.org/httpdocs/webstat/awstats

One huge problem with the awstats_buildstaticpages.pl tool is that it will create the Summary page with links to detailed pages it does not create. There is, of course, no -all option for creating all the sub-pages since that would make it easy to use and be practically useful.

The only way to create an HTML version of the report and all the sub-pages is to call awstats.pl a bunch of times with an -output= parameter naming each and every report, and redirect the output of each call to its own page. That's just silly and plain stupid, but that's what you're required to do.

for WEBSITE in linuxreviews.org ; do 
  AWSTATS="/usr/share/awstats/wwwroot/cgi-bin/awstats.pl"
  FOLDER="/home/httpd/vhosts/${WEBSITE}/httpdocs/webstat/awstats/"

  ## Main Report
  ${AWSTATS} -config=${WEBSITE} -output -staticlinks > $FOLDER/awstats.${WEBSITE}.html
  ## Parts
  ${AWSTATS} -config=${WEBSITE} -output=alldomains -staticlinks > $FOLDER/awstats.${WEBSITE}.alldomains.html
  # ${AWSTATS} -config=${WEBSITE} -output=allhosts -staticlinks > $FOLDER/awstats.${WEBSITE}.allhosts.html
  # ${AWSTATS} -config=${WEBSITE} -output=lasthosts -staticlinks > $FOLDER/awstats.${WEBSITE}.lasthosts.html
  ${AWSTATS} -config=${WEBSITE} -output=unknownip -staticlinks > $FOLDER/awstats.${WEBSITE}.unknownip.html
  ${AWSTATS} -config=${WEBSITE} -output=alllogins -staticlinks > $FOLDER/awstats.${WEBSITE}.alllogins.html
  ${AWSTATS} -config=${WEBSITE} -output=lastlogins -staticlinks > $FOLDER/awstats.${WEBSITE}.lastlogins.html
  ${AWSTATS} -config=${WEBSITE} -output=allrobots -staticlinks > $FOLDER/awstats.${WEBSITE}.allrobots.html
  ${AWSTATS} -config=${WEBSITE} -output=lastrobots -staticlinks > $FOLDER/awstats.${WEBSITE}.lastrobots.html
  ${AWSTATS} -config=${WEBSITE} -output=urldetail -staticlinks > $FOLDER/awstats.${WEBSITE}.urldetail.html
  ${AWSTATS} -config=${WEBSITE} -output=urlentry -staticlinks > $FOLDER/awstats.${WEBSITE}.urlentry.html
  ${AWSTATS} -config=${WEBSITE} -output=urlexit -staticlinks > $FOLDER/awstats.${WEBSITE}.urlexit.html
  ${AWSTATS} -config=${WEBSITE} -output=browserdetail -staticlinks > $FOLDER/awstats.${WEBSITE}.browserdetail.html
  ${AWSTATS} -config=${WEBSITE} -output=osdetail -staticlinks > $FOLDER/awstats.${WEBSITE}.osdetail.html
  ${AWSTATS} -config=${WEBSITE} -output=unknownbrowser -staticlinks > $FOLDER/awstats.${WEBSITE}.unknownbrowser.html
  ${AWSTATS} -config=${WEBSITE} -output=unknownos -staticlinks > $FOLDER/awstats.${WEBSITE}.unknownos.html
  ${AWSTATS} -config=${WEBSITE} -output=refererse -staticlinks > $FOLDER/awstats.${WEBSITE}.refererse.html
  ${AWSTATS} -config=${WEBSITE} -output=refererpages -staticlinks > $FOLDER/awstats.${WEBSITE}.refererpages.html
  ${AWSTATS} -config=${WEBSITE} -output=keyphrases -staticlinks > $FOLDER/awstats.${WEBSITE}.keyphrases.html
  ${AWSTATS} -config=${WEBSITE} -output=keywords -staticlinks > $FOLDER/awstats.${WEBSITE}.keywords.html
  ${AWSTATS} -config=${WEBSITE} -output=errors404 -staticlinks > $FOLDER/awstats.${WEBSITE}.errors404.html
done

This is... one way to do it, since there's no -all option for some odd reason.
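The same thing can be written as a loop over the report names, which makes it easier to add or remove a report. This is a sketch and not an official AWStats tool: the function name build_static_pages and the REPORTS variable are made up, the report names are the ones used in the script above (with allhosts and lasthosts left out, as they are commented out there), and it assumes awstats.pl lives where this article says it does.

```shell
#!/bin/sh
# AWSTATS can be overridden from the environment; the default is the
# path used earlier in this article.
AWSTATS="${AWSTATS:-/usr/share/awstats/wwwroot/cgi-bin/awstats.pl}"

# Report names taken from the script above.
REPORTS="alldomains unknownip alllogins lastlogins allrobots lastrobots
urldetail urlentry urlexit browserdetail osdetail unknownbrowser unknownos
refererse refererpages keyphrases keywords errors404"

# Hypothetical helper: build the main report plus one page per
# sub-report. $1 = site name (as in -config=), $2 = output folder.
build_static_pages() {
  site=$1
  folder=$2
  mkdir -p "$folder" || return 1
  "$AWSTATS" -config="$site" -output -staticlinks > "$folder/awstats.$site.html"
  for report in $REPORTS; do
    "$AWSTATS" -config="$site" -output="$report" -staticlinks \
      > "$folder/awstats.$site.$report.html"
  done
}

# Example:
# build_static_pages linuxreviews.org /home/httpd/vhosts/linuxreviews.org/httpdocs/webstat/awstats
```

Put the function in a script, call it once per website and run that script from a cron job or a systemd timer after the statistics file has been updated.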

Alternatives

AWStats really is the best log analyzer there is. Webalizer is an alternative that works. It shows less information, but it does work. Webalizer has the ability to parse logs and store the results in a history file so you can throw the logs away, just like AWStats. analog is, in theory, another log analyzer you could use. It requires you to keep old logs forever and re-parse them all each time it creates a statistics page, and it hasn't seen any real updates in a decade, so it's not really an alternative.

Open Web Analytics is an alternative if you want web statistics. It's not a log-analyzer, it is a full analytics server written in PHP. It requires you to either call it from PHP or embed JavaScript in your web pages. It's an alright alternative to using a log analyzer. It does set browser cookies - something AWStats doesn't do.

TODO

  • Write something about Installation / Configuration

Links
