Open Web Analytics

From LinuxReviews
Original author(s): Peter Adams
Initial release: 1.0rc1, June 1, 2006
Stable release: 1.7.0, September 16, 2020
Repository: github.com/Open-Web-Analytics/Open-Web-Analytics/
Type: Web traffic analytics
License: GNU General Public License version 2
Documentation: github.com/Open-Web-Analytics/Open-Web-Analytics/wiki
Website: www.openwebanalytics.com

Open Web Analytics (OWA) is a free software web traffic analysis package written in PHP. It can be used to track visitors by placing a JavaScript snippet on a web page, by calling it in a PHP file or by using a special WordPress plug-in. There's also, in theory, a MediaWiki plugin, but it's broken. OWA will, once it is set up and configured, provide a lot of detailed and useful information about a website's visitors.

OWA can be configured to use modules for additional functionality. It comes bundled with a few. One, Domstream, is described as "Logs the users mouse and other DOM movements". That's, quite frankly, a bit creepy. The right thing for a website owner to do is to resist the urge to use that module. And the right thing to do as someone visiting a website using Open Web Analytics is to assume that this privacy-invading piece of spyware is activated and actively spying on you and your mouse movements.

Open Web Analytics can be used to track visitors by calling it from a PHP file using a few lines of PHP if it is installed on the same host/server as the site you want to track. The PHP method does not require client-side JavaScript, though it can insert JavaScript tags if a module requires client-side JavaScript. OWA can also be enabled using a WordPress plugin. A third option is to add a general-purpose JavaScript block to all the pages you want to track. That method can be used with any content management system and it does not require Open Web Analytics to be installed on the same host as the website(s) you want to track. An OWA installation can be used to track one website or many sites if you go with the JavaScript option. You can set it up on a dedicated server if you have a lot of different sites served from different locations and you want to have statistics for all your sites available in one location.

At A Glance

Open Web Analytics at a glance

Advantages:
  • Provides a detailed, informative overview of a website's visitors
  • Can be used to track visitors by calling it from PHP without the use of client-side JavaScript (it does need to insert a JavaScript tag when invoked from PHP if you want to use Domstream tracking or other features requiring client-side JS)
  • OWA can, alternatively, be used to track any website, even those not on the same host, by embedding JavaScript within a site's HTML code
  • Can be set up on a dedicated host or cloud instance independent of the site(s) it tracks if you use the JS method
  • One OWA installation can be used to track multiple sites; adding sites to track is easy once it's set up and configured
  • Configurable modules can provide additional functionality

Disadvantages:
  • Assigns an owa_visitor_id to every visitor and tracks them forever
  • Sets browser cookies
  • Relies on client-side JavaScript unless you call it from a PHP script on the same host
  • Privacy-invasive plugins like Domstream that "Logs the users mouse and other DOM movements" can make it very Orwellian
  • Does not care about Do Not Track web browser headers (you can wrap the invocation in a check on $_SERVER['HTTP_DNT'] yourself)

Features And Usability

Open Web Analytics provides a nice Dashboard where you can see how many visits, unique visitors and page views your site has had. It includes a section with "Top Content" where you can see your site's most viewed pages and two pie-chart graphs showing "Visitor Types" and "Traffic Sources". There's also a list of "Latest Visits" at the bottom of the page with all kinds of incriminating information about your site's latest visitors such as their IP address, the page(s) they viewed, their visit length (if they viewed multiple pages) and their web browser type.

The Open Web Analytics dashboard in OWA 1.7.0.

A side-bar on the left side of a site's statistics view can be used to get more information. The sections and sub-sections are:

  • Content
    • Top Pages
    • Page Types
    • Feeds
    • Entry Pages
    • Exit Pages
  • Action Tracking
    • Action Groups
  • Visitors
    • Geo-location
    • Domains
    • Visitor Loyalty
    • Visitor Recency
    • Visitor Age
    • Browser Types
    • Operating Systems
  • Traffic
    • Search Terms
    • Inbound Link Text
    • Search Engines
    • Referring Web Sites
    • Campaigns
    • Ad Performance
    • Ad Types
    • Creative Performance
    • Attribution History
  • Goals
    • Funnel Visualization

Visitor ID & User Tracking

Open Web Analytics creates a unique visitor ID for each visitor. It uses this to show information about a visitor's prior visits, among other things. This kind of data collection may raise some eyebrows if you're a privacy-concerned person and you don't like the idea of being tracked when you visit websites. It may be useful, but it may be a bit much.

The dashboard lets you click on each unique visitor and view that person's click-stream, visit length, browser type and prior visits.

The single-user "Visit Clickstream" page in Open Web Analytics 1.7.0. Notice how this person has been assigned a special 1602372927941738560 user-ID.

OWA's ability to track users across time relies on two browser cookies.

The cookies OWA sets viewed in the Falkon web browser (because Mozilla Firefox decided to remove that functionality and Chromium never had it to begin with). Notice how Immortal Poetry is showing a "cookie warning" even though it's already set two browser cookies.

How OWA sets and uses cookies may be a non-issue for you or the corporation you work for. It could also be a big deal: if you show EU users (or everyone) a cookie warning then you will have to ensure that OWA is not invoked until/unless the "cookie warning" has been accepted. Take a good look at the screenshot of the Immortal Poetry website above: that's what happens if you just add OWA tags and don't care, not even a little, about how it fits in with your site's privacy policy (if you have one), cookie warning functionality, advertisement filtering based on DNT headers and things like that.

The way Immortal Poetry shows a cookie warning after OWA cookies are set in the above screenshot is something worth thinking about before you add OWA to a website. OWA was intentionally and temporarily set up like that for screenshot purposes. You may want to avoid permanently using a configuration like that.

GDPR compliance

You may want to consider whether you want to use Open Web Analytics if you are in the European Union or your web server is located within EU borders. We are not lawyers; we do not know whether the data Open Web Analytics collects violates the EU's GDPR or not.

We can tell, without a law degree, that Open Web Analytics does use cookies. That means that you may want to not invoke it (not give users the JavaScript or not call it using PHP) if a visitor either rejects a cookie warning (if your site has one) or a visitor's web browser sends the Do Not Track header.

Any "privacy policy" document your site may or may not have is also a concern. You may want to disclose OWA tracking in your site's "privacy policy" (if it has one).

Content Reports

The "Content" section in OWA has the sub-pages "Top Pages", "Page Types", "Feeds", "Entry Pages" and "Exit Pages". It will optionally show "Domstreams" if you enable that module.

"Top Pages" gives you a nice overview of the most visited pages along with the number of visits each of those pages has had.

The "Page Types" and "Feeds" views showed "(not set)" and nothing respectively. It may be possible to get those views to show something by tweaking the default configuration.

Open Web Analytics 1.7.0 showing the Entry Pages on a website.

"Entry Pages" and "Exit Pages" show you the first and last pages visitors viewed. For each page they list the number of visits, the average visit duration, the number of unique visitors, the number of pages viewed per visit by those who entered or exited through that page, and the bounce rate.

"Action Tracking"

The "Action Tracking" menu has, by default, a single sub-menu: "Action Groups". It shows nothing by default. You can define certain actions that should be tracked. It is possible to configure it to show events like a visitor clicking a button to start playing a video and things like that. This functionality relies on client-side JavaScript, so it cannot be used if you want to invoke OWA using PHP and you do not want to serve any client-side JS.

The Visitors View

The "Visitors" section offers the following reports:

  • Visitors: Shows a list of your site's latest visitors along with their IP address, OWA tracking ID, browser type, pages viewed, visit length, prior visits and the traffic source (search engines, etc.).
  • Geo-location: Shows a list of the countries your visitors came from and how many came from each country (assuming you installed the GeoIP database during installation).
  • Domains: Shows a list of host-names your visitors came from (relies on reverse IP look-ups).
  • Visitor Loyalty: Shows how many of your visitors were new versus visitors who have previously visited your site.
  • Visitor Recency: Shows the number of days between visits by regular visitors.
  • Visitor Age: Tells you how many days it has been since the first visit from people who re-visit your site.
  • Browser Types: Shows a nice clean list of the web browsers your visitors were using.
  • Operating Systems: Shows a list of operating systems used by your site's visitors.

The Traffic View

The "Search Engine Report" in Open Web Analytics indicating that everyone uses Google. Nobody uses Bing. This view is very misleading in OWA 1.7.0: DuckDuckGo, Qwant and other search engines are listed as "Referrals".

The "Traffic" section gives you a list of "Traffic Sources" with "Top Sources" (like search engines), "Top Referrals" (sites linking to your site) and "Top Keywords".

That section has the following sub-menus:

  • Search Terms: This is supposed to show you the search terms people typed into search engines when they visited your sites. It will show some search keywords, but that will be the exception. The vast majority of search engine visitors will be grouped as "(not provided)". This isn't OWA's fault; most web browsers stopped providing full referrer URLs years ago.
  • Inbound Link Text: This shows the number of links from other sites but it doesn't actually show the text other sites use for inbound links. Perhaps it can do that if you configure it correctly. It seems obvious that OWA would have to fetch remote pages and analyze them, so it could be that link texts don't show up unless OWA or the web server it's running on is configured to allow that.
  • Search Engines: Shows a limited list of search engines and the number of visitors from each search engine. This could be useful for figuring out where your traffic is coming from if it was anywhere near accurate. It's not; everything but Google and Bing is shown under "Referring Web Sites", not as a search engine.
  • Referring Web Sites: Shows a list of sites people used before they found yours. OWA will show DuckDuckGo and other search engines along with regular non-search-engine websites. That seems like a bug/problem: the list of search engines OWA knows about is lacking.
  • Campaigns: Doesn't show anything. At all. Probably something that requires configuration.
  • Ad Performance: Doesn't show anything. At all. Probably something that requires configuration.
  • Ad Types: Shows you the total number of visits your site has had during the period you've used OWA, wrapped in a text that says "There were 456 visits from ads". Some configuration is likely required to make this report show anything meaningful. It seems unlikely that 100% of your site's visitors came from ads. You'll probably love the default report if you're the head of marketing at a medium-sized corporation as it will show that you're very successful at managing ad campaigns.
  • Creative Performance: Same as "Ad Types", shows the total number of visitors. Supposedly all through ads.
  • Attribution History: Same as "Ad Types" and "Creative Performance", it will by default show the total number of visitors. All through ads, of course.

The list of search engines OWA knows about is defined in modules/base/classes/client.php. It does know about ['d' => 'duckduckgo', 'q' => 'q'] so it is a bit strange that DuckDuckGo shows up as a "Referrer" and not a "Search Engine" in the reports. It does not know about Qwant, Gigablast, Exalead or other lesser-known search engines.

Notable Shortcomings

OWA has a very limited list of search engines and a very limited list of web crawlers. The lack of web crawler awareness means that the numbers it presents will be inflated. The reported numbers may be 2-3x or more the amount of actual visitors on a site with little traffic, while the effect is a small overall bump on a site with a lot of traffic.
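A rough way to gauge how much crawler traffic a site gets is to look at the web server's own access log. The sketch below assumes the common "combined" log format, where the user agent is the last quoted field, and it uses a short, incomplete list of user-agent patterns picked for illustration. bot_share is a hypothetical helper name, not something OWA provides. It counts requests, not visits, so treat the result as an indication only.

```shell
# Report how many requests in a combined-format access log come from
# user agents that look like crawlers. bot_share is a hypothetical helper.
bot_share() {
    # Splitting on double quotes makes the user agent field number 6:
    # prefix, request, status/size, referrer, separator, user agent.
    awk -F'"' '
        { total++ }
        tolower($6) ~ /bot|crawl|spider|slurp/ { bots++ }
        END { printf "%d of %d requests look like crawlers\n", bots, total }
    ' "$1"
}
```

It could be called as bot_share /var/log/httpd/access_log, where the log path is an assumption that depends on your distribution and web server.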

Verdict And Conclusion

Open Web Analytics provides a nice dashboard where you can see very detailed information about your website's visitors. It's very compelling. There are a lot of sub-reports with useful information.

Privacy issues are a real concern with this software. Your visitors may not like the level of tracking it provides and the fact that it uses cookies to set a unique ID that's tracked seemingly forever (until the cookie expires or is deleted, anyway). It is not worse than similar commercial solutions like Google Analytics, and user tracking using JavaScript and cookies is very common on the web. It is probably better, since most of the commercial alternatives use various web browser fingerprinting techniques. That doesn't make the user tracking it does morally acceptable. The legality of assigning a unique ID to each visitor and storing it forever is also a concern.

Setting up an Open Web Analytics server is probably the better choice if you're already using Google Analytics or a similar proprietary solution, or you're considering one. Data on your server isn't automatically shared with who knows who for who knows what purpose. You (or the corporation you work for) get to control the data OWA collects, so OWA is better than any third-party solution in that regard.

If you think invasive tracking is acceptable and you want very detailed statistics about your users presented in a fairly nice way then Open Web Analytics may be for you. If you are privacy-minded yourself and you have any kind of respect for your website's visitors then you may want to think twice before you deploy Open Web Analytics.

Installation

One OWA installation can be used to track multiple websites. You can install it on its own domain or sub-domain or install it in a sub-folder on any virtual host. Separating it to its own virtual host may be a good idea.

OWA requires a MySQL database to function. You can use the same DB as your website's content management system uses but you should, if possible, let it have its own database.

The GitHub releases page offers .tar tarballs with no sub-folder in them. All the files, and some sub-folders, will just unpack to the current folder if you just tar xf that file. Be very aware of this; don't just extract it in a folder like your website's root.

Open Web Analytics expects to be in a folder on a web host that is accessible to the world, and it expects that folder to be /owa/. Going with that location is the simplest choice. You can place the actual PHP files elsewhere and create an alias in your web server's configuration file.

See the GitHub OWA installation page for WordPress instructions. We have not tested the WordPress plugin, perhaps it works.

OWA comes with a MediaWiki plugin that's plain broken. Don't use that even if you use MediaWiki, just ignore it. There are two general options you can use on any content management system including MediaWiki. Use one of those.

Downloading and extracting the tarball can be done like this:

wget https://github.com/Open-Web-Analytics/Open-Web-Analytics/releases/download/1.7.0/owa_1.7.0_packaged.tar
mkdir owa
cd owa
tar xf ../owa_1.7.0_packaged.tar

Notice that owa_1.7.0_packaged.tar does not extract its content into a folder, which is why you should create a folder called owa first and extract it there.
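If you want to guard against unpacking a tarball like this into the wrong place, something along these lines may help. This is a minimal sketch: top_level_entries and extract_into are hypothetical helper names, and the only tar options used are the standard -t (list), -x (extract), -f (file) and -C (change directory).

```shell
# Count the distinct top-level names inside a tarball without extracting it.
# More than one means the archive has no single wrapping folder.
top_level_entries() {
    tar -tf "$1" | sed 's|^\./||' | cut -d/ -f1 | sort -u | grep -c .
}

# Extract a tarball into a target folder, refusing to touch a folder
# that already has content in it.
extract_into() {
    tarball=$1
    target=$2
    if [ -d "$target" ] && [ -n "$(ls -A "$target")" ]; then
        echo "refusing to extract: $target is not empty" >&2
        return 1
    fi
    mkdir -p "$target" && tar -xf "$tarball" -C "$target"
}
```

With the file downloaded above, top_level_entries owa_1.7.0_packaged.tar should print a number larger than 1, and extract_into owa_1.7.0_packaged.tar owa does the same job as the mkdir, cd and tar commands.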

OWA expects to be able to write a configuration file to that folder, so you may want to use chown to make the owa/ folder owned by the user the web server is running as.

GeoIP location requires a GeoLite2-City.mmdb GeoIP database from Maxmind. They changed the license for it on December 30, 2019. Many distributions have a package named something like geolite2-city available with a version that's something like geolite2-city-20191217. Just install geolite2-city if you don't want to register with Maxmind and accept the new GeoIP license. That file is also included as part of most elasticsearch packages for some reason (it's placed in /usr/share/elasticsearch/modules/ingest-geoip/GeoLite2-City.mmdb). The GeoLite2-City.mmdb file needs to be placed in /owa/owa-data/maxmind/GeoLite2-City.mmdb. See the OWA "Maxmind GeoIP Module" wiki page if you have problems getting GeoIP working with OWA.
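Copying the database into place can be scripted if you do this on more than one machine. The sketch below is only an illustration: install_geoip is a hypothetical helper name, and the candidate paths you pass to it are your own guesses about where your distribution put the file.

```shell
# Copy the first GeoLite2-City.mmdb found among the candidate paths into
# the owa-data/maxmind/ folder of an OWA installation.
# Usage: install_geoip <owa folder> <candidate path>...
install_geoip() {
    owa_dir=$1
    shift
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            mkdir -p "$owa_dir/owa-data/maxmind" &&
                cp "$candidate" "$owa_dir/owa-data/maxmind/GeoLite2-City.mmdb"
            return $?
        fi
    done
    echo "no GeoLite2-City.mmdb found in the given locations" >&2
    return 1
}
```

A call could look like install_geoip /path/to/owa /usr/share/GeoIP/GeoLite2-City.mmdb /usr/share/elasticsearch/modules/ingest-geoip/GeoLite2-City.mmdb. The first candidate path is an assumption about where a geolite2-city package might install the file; the second is the elasticsearch location mentioned above.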

OWA's configuration is done by navigating to the location where you installed it (yoursite.tld/owa/) using a web browser. It will show an Open Web Analytics Installer page the first time you, or anyone, loads that location. Clicking Let's Get Started takes you to a page where you have to enter the database to use, and it has to be MySQL. The next step is to enter the domain name you'd like to track, your "admin name", an e-mail address (perhaps a real one) and a password.

You'll be able to configure one or more websites to track once you've set it up and logged in with the admin account. You can either copy and paste a JavaScript block into the HTML pages of the site(s) you want to track or call it using PHP. Using JavaScript may be easier and it is the only option if you want to run OWA on one host or cloud instance and your site on another. However, the PHP method is more discreet since it can be used without any JavaScript. All you need is something similar to:

<?php
// Track only when the visitor has not sent "DNT: 1" (Do Not Track).
if ( !isset($_SERVER['HTTP_DNT']) || $_SERVER['HTTP_DNT'] !== '1' ) {
  require_once('/path/to/owa/owa_env.php');
  require_once('/path/to/owa/owa_php.php');
  $owa = new owa_php();
  // Replace with the site ID shown in your OWA dashboard.
  $owa->setSiteId('eaa6909389305d0d2775881461de8234');
  // $mytitle must be set by your CMS or page template.
  $owa->setPageTitle($mytitle);
  $owa->trackPageView();
} ?>

You will have to change the site ID passed to setSiteId() to your own (the Open Web Analytics dashboard will provide that for you). You can optionally add a tag that inserts JavaScript if you want to log mouse movements. Note that your PHP CMS needs to provide a page title for setPageTitle() or OWA won't actually display any title. You can do that with MediaWiki with something like:

<?php
require_once('/path/to/owa/owa_env.php');
require_once('/path/to/owa/owa_php.php');

$wgHooks['BeforePageDisplay'][] ='startOWAtracking';
function startOWAtracking( OutputPage &$out, Skin &$skin ){
  global $wgTitle;
  $mytitle=$wgTitle->getPrefixedText();

  $owa = new owa_php();
  $owa->setSiteId('2ee60595e76911be59e70bbd7bd17f00');
  $owa->setPageTitle($mytitle);
  $owa->trackPageView();
}
?>

The PHP method is described in more depth in the OWA wiki under "PHP Invocation".

Disabling Cookies

OWA lacks any web interface setting to enable/disable cookies. It will happily set them. The only way to fix this is to change owa_coreAPI.php where you will find this code-block starting at line 1162:

        // set compact privacy header
        header(sprintf('P3P: CP="%s"', owa_coreAPI::getSetting('base', 'p3p_policy')));
        // owa_coreAPI::debug('time: '.$expires);
        setcookie($cookie_name, $cookie_value, $expires, $path, $domain);
        return;

OWA sets cookies for the admin interface login in another file; this code is just for user tracking. Commenting out the header() and setcookie() calls shown above stops OWA from setting any cookies when it counts visitors. You will lose some functionality by doing this.
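One way to check whether a change like this took effect is to look at the response headers of a tracked page. The sketch below assumes OWA is invoked from PHP, so that any tracking cookie would arrive as a Set-Cookie header in the page response. count_set_cookie is a hypothetical helper name, and it counts every Set-Cookie header, including ones your CMS sets for its own purposes.

```shell
# Count Set-Cookie headers in HTTP response headers read from stdin.
# count_set_cookie is a hypothetical helper, not part of OWA.
count_set_cookie() {
    # -c prints the number of matching lines, -i ignores case.
    grep -ci '^set-cookie:'
}

# Example (not run here): print only the response headers of a tracked
# page and count the cookies it tries to set.
# curl -s -D - -o /dev/null https://yoursite.tld/ | count_set_cookie
```

Comparing the count before and after the edit shows whether the tracking cookies are gone.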

Bugs

OWA 1.7.0 logs two "Undefined variable: remote_host in /owa/modules/base/classes/trackingEventHelpers.php" errors for every single visitor who uses IPv6 to connect. The statistics dashboard does show visitors who use IPv6, so it does not appear to have any practical implications beyond the web server's error log being filled with errors for every IPv6-using visitor. Without looking too deeply into why, it's easy to fix:

--- a/modules/base/classes/trackingEventHelpers.php  2020-10-11 06:24:37.371145337 +0100
+++ b/modules/base/classes/trackingEventHelpers.php  2020-10-11 06:24:26.342072043 +0100
@@ -604,7 +604,9 @@
                  if ( is_array( $result ) && isset( $result[0] ) && isset( $result[0]['host'] ) ) {
 
                      $remote_host = $result[0]['host'];
-                 }
+                 } else {
+                    $remote_host = $ip_address;
+                }
 
             } else {
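To see whether the errors are present, and whether they stop after a change like the one above, you could count them in the web server's error log. This is a small sketch: count_remote_host_errors is a hypothetical helper name, and the log location is something you will have to look up for your own web server.

```shell
# Count the "Undefined variable: remote_host" errors in an error log.
# count_remote_host_errors is a hypothetical helper, not part of OWA.
count_remote_host_errors() {
    # -c prints the number of matching lines instead of the lines.
    grep -c 'Undefined variable: remote_host' "$1"
}
```

Running it against the log before and after applying the fix, with some IPv6 visits in between, should show whether the count keeps growing.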

Alternatives

Webalizer and Analog are log file analyzers capable of showing some of the information Open Web Analytics can show you using nothing but web server logs. Webalizer is the better of those two.

Links

OWA is being developed at github.com/Open-Web-Analytics/Open-Web-Analytics/. There's also a WordPress website at www.openwebanalytics.com.

