Matomo

From LinuxReviews
Matomo
The default Matomo dashboard
Developer(s): Innocraft
Repository: github.com/matomo-org/matomo
Written in: PHP
Available in: 54 languages
Type: Open Web Analytics
License: GNU GPL v3
Documentation: matomo.org/docs/

Matomo is a highly advanced, feature-rich web traffic analytics server written in PHP and licensed as free software under the GNU GPL. You will have to install it on a server, preferably one capable of handling a bit of a load if you plan on using it to do web traffic analysis using JavaScript tags on one or more websites with high traffic volume. There's also an HTTP tracking API available.

Matomo can track users using JavaScript, an HTTP tracking API and to some degree using an image tag. It can also, in theory but not in practice, analyze traffic using web server logs.

There are numerous Matomo plugins available for commonly used content management systems. Most of them do little more than add the required JavaScript tags. There are also numerous plugins for Matomo itself that can extend its functionality. Some of those are freely available under free software licenses and some are commercial.

Matomo is developed by a for-profit corporation that would like to sell you "cloud" hosting services for the Matomo software. The actual software is free software available under the GNU General Public License version 3. Their website refers to downloading the software as "Install Matomo On-Premise". You don't have to install it "On-Premise"; it's free software you can install on a server on premise in the basement or on a remote cloud or bare metal server.

At A Glance

Matomo at a glance

Pros:
  • Customizable professional-looking traffic dashboard
  • Easy installation with a nice guide
  • Supports four different options for web traffic tracking:
    • JavaScript, with a configurable tag
    • Image tracking tag. Practically useless, most visits will be seen as your site's root page
    • HTTP API
    • In theory, server logs
  • Supports Do Not Track web browser headers (configurable, default is on)
  • Supports "anonymizing" IP addresses down to their /16 (10.0.X.X).

Cons:
  • Very resource-heavy compared to other (simpler) alternatives like Open Web Analytics and pure log analyzers like AWStats.
  • The Matomo web portal is riddled with advertisements for "Premium Features and Services".
  • Server log import is flawed in so many ways it's practically useless
  • Using JavaScript for tracking is basically a requirement
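
To make the "/16" anonymization mentioned above concrete, here is a minimal Python sketch of what masking an IPv4 address down to its first two bytes means. It only illustrates the idea; it is not Matomo's own code, and the function name anonymize_ip and its masked_bytes parameter are made up for this example.

```python
import ipaddress


def anonymize_ip(address: str, masked_bytes: int = 2) -> str:
    """Zero the last `masked_bytes` bytes of an IPv4 address.

    masked_bytes=2 keeps only the /16 network part (10.0.X.X becomes 10.0.0.0).
    """
    ip = ipaddress.IPv4Address(address)
    # Build a 32-bit mask with the lowest `masked_bytes` bytes cleared.
    mask = (0xFFFFFFFF << (8 * masked_bytes)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


print(anonymize_ip("10.0.12.34"))  # prints 10.0.0.0
```

Every visitor from the same /16 ends up with the same stored address, which is why reports based on individual IP addresses become less precise with this setting.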

Features

The default Matomo web portal.

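Matomo can in theory track a website's visitors using:

  • JavaScript embedded in your site's pages
  • An image tag
  • The HTTP tracking API
  • Imported web server logs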

JavaScript is the only viable option if you want to deploy Matomo quickly and don't want to write a proper plug-in for your CMS yourself. That is what you will be using even if you download and install a plugin for your CMS; all the Matomo "plugins" for WordPress, MediaWiki and other commonly used content management systems are basically foolproof methods of inserting the required JavaScript tags. Matomo does offer a very powerful HTTP tracking API you could use to silently add statistics without the need for any JavaScript if you are willing to develop something for your existing web software stack yourself.

The Matomo analytics portal has a really nice and professional-looking interface with floating widgets you can move around. There are a lot of different widgets you can add to the dashboard in order to see daily, weekly or monthly statistics at a glance.

The Matomo dashboard slightly customized to have a three-column layout where the third column is broader than the first two and the default spam widgets offering commercial services are eradicated.

Most of the Matomo widgets are highly configurable. As an example, the Channel Types widget can show a simple table, a simple table with visitor engagement metrics, a vertical bar graph, a pie chart or a tag cloud.

The Matomo dashboard offers more than just an overview with widgets. There are three out-of-the-box usable menus with sub-menus that can provide a lot of useful information: Visitors, Behaviour and Acquisition. The overviews in these sub-menus can be changed to show a day, a week, a month or a set number of days.

The Visitors menu lets you see an Overview, Visits Log, Real-time list of visitors, a Real-time Map, Locations people visited from (if you enabled GeoIP), Devices people used, Software (shows OS and web browsers in different columns), Times people visited and it can also show User IDs and Custom Variables if you configured it to track those things.

The Behaviour menu lets you see the most viewed Pages (just URLs), Entry pages (the first page a visitor viewed), Exit pages (the last page a visitor viewed), Page titles (same as Pages but with titles instead of URLs), Site Search (Matomo can be configured to track site-internal search engines), Outlinks, Downloads, Events (has to be configured), Content (doesn't show anything by default, perhaps it's configurable), Engagement (which will only work if you track using browser cookies) and Transitions. Transitions will show you the most popular pages people went through to get to other pages.

The Matomo Transitions View in the Behaviour menu.

Acquisition can show you a lot of detailed information about where visitors came from or how they found your website. The sub-menus under Acquisition let you see an Overview, "All channels" (search engines and referring websites), Search Engines & Keywords, Websites, Social Networks (excluded from Websites) and, if you configure any, Campaigns. There is a Campaign URL Builder you can use to track specific URLs. This can be used to track how well advertising campaigns or e-mail spam ("newsletters") perform at bringing traffic to a specified page.

There are also Goals and Marketplace entries in the main Dashboard menu. Goals can be used to configure and track.. goals. They can be configured in a number of ways. Matomo's website has a goals documentation page explaining how to use this feature and a promotional Goals video in very Oxford British. It's informative if you speak Oxford British.

Marketplace has two sub-menus. Browse lets you see a list of plugins you can install to add more functionality. Some of them are free software and some of them are proprietary software sold at a price. The Premium Features sub-menu is just a list of the proprietary software plugins offered by Innocraft, the corporation that maintains Matomo.

Tracking Methods

Matomo offers four different methods of tracking web visitors:

  • Embedded client-side JavaScript within your site's pages
  • A highly configurable image tag
  • An HTTP API (actually the same API the image tag uses)
  • Importing web server logs

There are plenty of plugins for various content management systems available. Most of them are utterly useless jokes that do little or nothing beyond inserting the JavaScript tags used by the JavaScript method.

Inserting JavaScript that tracks your website's users into every page is the easiest and most straightforward method. This requires copy-pasting a code block from Matomo's web interface into your website's pages. The JavaScript snippet can have a configuration option that prevents cookies from being set. Some big brother total surveillance features in the interface will not work if you disable cookies.

The image tag method is only for those who are willing to spend some time implementing a proper way of calling it (in which case you might as well use it as a pure HTTP API, it's the same API). The "default" image tag proposed by Matomo's web interface is borderline useless. The basic image tracking tag proposed by the admin interface under Websites ▸ Tracking Code is:

<img src="https://yoursite.tld/matomo/matomo.php?idsite=5&amp;rec=1" style="border:0" alt="" />

Modern browsers tend to strip the HTTP referrer field. Other data is obviously not available to a PHP file being opened from an IMG tag in a web browser. The result is that the default image tag makes every page a visitor views appear to be your website's front page. It is, by default, utterly useless and you should ignore it as an option unless you are willing to spend time customizing it. The API supports a lot of arguments and it is well documented on the Matomo "Tracking HTTP API" page. The image tag can be useful if you make your content management system call it with the page's title, the page URL, referring URLs (if any) and so on.

The same matomo/matomo.php file used to serve an image tag can be called directly from a content management system, or any piece of software (like mobile "apps").
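
As a rough illustration of what "calling it with the page's title, the page URL and the referring URL" could look like, here is a minimal Python sketch that only builds such a tracking URL. The parameter names idsite, rec, apiv, url, action_name, urlref and rand are taken from Matomo's Tracking HTTP API documentation; the function name build_tracking_url, the yoursite.tld host and the example page are assumptions made up for this example, and nothing is actually sent.

```python
import random
from urllib.parse import urlencode


def build_tracking_url(matomo_base, site_id, page_url, page_title, referrer=""):
    """Return a matomo.php URL that records one page view.

    matomo_base is the public URL of the Matomo folder (an assumption here),
    site_id is the site id configured in Matomo.
    """
    params = {
        "idsite": site_id,           # which tracked website this hit belongs to
        "rec": 1,                    # required for the hit to be recorded
        "apiv": 1,                   # tracking API version
        "url": page_url,             # full URL of the page being viewed
        "action_name": page_title,   # page title shown in the reports
        "rand": random.randint(0, 2**31),  # cache buster
    }
    if referrer:
        params["urlref"] = referrer  # where the visitor came from, if known
    return f"{matomo_base.rstrip('/')}/matomo.php?{urlencode(params)}"


print(build_tracking_url(
    "https://yoursite.tld/matomo/", 5,
    "https://yoursite.tld/wiki/Matomo", "Matomo",
    referrer="https://example.org/",
))
```

The resulting URL can be placed in the src attribute of an image tag generated by the CMS, or requested server-side by any HTTP client, which is the "pure HTTP API" use mentioned above.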

Matomo can in theory import server logs and use those without any need for JavaScript or image tags or anything else on your website. The log importer is very slow and utterly stupid. There does not appear to be any way to prevent it from importing URLs used to load things like CSS and JavaScript (like /load.php) and the data generated is borderline useless. It works by parsing the logs using a Python program called import_logs.py which posts the data from the logs to the HTTP API. There is a log analytics README.md but it's not very useful.

CMS Plugins

There are Matomo "plugins" for a lot of content management systems out there, most of which are very basic and essentially garbage-tier. They are all fine if you expect a plug-in to insert the required JavaScript tag for you. Don't expect them to be able to actually seamlessly integrate with a Matomo instance on the same server using the HTTP API or PHP directly.

The Matomo website has an "Integrate" page that lists them all. The one you're using is probably listed on that page. The short list below is limited to the ones we've tested.

MatomoAnalytics

(MediaWiki)

MatomoAnalytics is a MediaWiki "extension" that places the required JavaScript on pages and that's all it does. It has a $wgMatomoAnalyticsDisableJS configuration option. The extension doesn't actually track anything if you disable JavaScript with that setting.

Matomo does have an HTTP API available so this, and other "integration" plug-ins, could use that and not rely on JavaScript. This one does not. You might as well just write five lines of code and insert the JS with $out->addHeadItem("HeaderAdvertisement", $matomoTracking ); yourself.

It's fine if you have never written a line of PHP and you just want to install something that inserts Matomo's JavaScript tag for you.

Anti-Features

It's hard not to notice the unacceptably high amount of subtle advertisements in Matomo's interface. Many of them are in the form of misleading "Help texts" that are actually advertisements for proprietary software plugins. Most of this garbage is in files named promoForTopic.twig in plugins/ProfessionalServices/templates/promo.

You can put an end to most of that crap by running this fine command from the matomo/ folder:

for f in plugins/ProfessionalServices/templates/promo*; do echo > "$f"; done

The advertising in Matomo isn't all that bad; there's free software with worse anti-features. We just note it in the minus column since it does have advertisements as an anti-feature.

Verdict And Conclusion

Matomo showing a real-time map of a website's visitors

Matomo is a fully featured and very mature web traffic analytics platform. It's got a nice configurable dashboard, all the reports you would wish for with configurable time-frames and a really nice and fully featured API for developers.

Matomo is a completely fine alternative to solutions like Google Analytics and other proprietary software solutions. It is developed like a commercial product, and it sort-of is, but the product itself is fully free software (though many of the plugins available for it are not).

There are some downsides to it worth considering:

  • JavaScript tracking is the only viable out-of-the-box method that just works. Using an image tag or the HTTP API requires you to develop something for it (or hire someone to do that) since all the "integration" plugins seem to be incapable of doing anything beyond serving the JavaScript tag it uses.
  • It is very resource-heavy. It's not something you can just put on a shared server with a dozen websites and expect it to have zero impact on the average server load - depending on how much traffic you have. You won't really notice it if you've got less than 5k visitors/day. You'll probably

System Requirements

Matomo's website claims that you need an "app" server with "2 CPU, 2 GB RAM, 50GB SSD disk" to handle 100,000 page views per month (3,300 per day). The supposed requirements go up to "4 CPU, 8 GB RAM, 250GB SSD disk" if you want to track one million page views per month (33k/day) (from matomo.org: MATOMO REQUIREMENTS). While the clueless zoomer who calls web server software "app" is wrong about the storage space requirements and one CPU with 4 cores isn't "4 CPU", it is an indication of something worth being aware of: Matomo is a rather heavy and resource-hungry PHP application which puts a noticeable load on the MySQL server.

Matomo will add noticeable load to a server even if you're just tracking 10,000 visitors per day. It is not like adding an hourly cron job generating statistics from logs using AWStats. A mere 10,000/day won't break your average bare metal server, but it does add enough of a load to make it show up in system monitoring tools like Munin as increased I/O utilization.

You should put it on a dedicated server or a dedicated cloud instance, not on a shared server hosting your websites, if you have something north of 20k visitors per day on average.

Innocraft's Cloud Offerings

Innocraft, the company that makes Matomo, is a for-profit corporation that offers commercial hosting for the Matomo analytics server software. The prices for their various solutions can be seen at matomo.org/pricing. Their offerings have, like so many other "cloud" offerings, small print that says there's an increased cost if you pay for an amount of pageviews and you go beyond that number (it's $5 per 10,000 additional page-views if you go for 100,000). Using that service seems as risky as all cloud offerings are in terms of gradually or rapidly increasing costs. Put slightly differently: if you pay $80/month for a dedicated bare metal server that's more than capable of handling five million pageviews a month, instead of paying Innocraft/Matomo about $250/month for a million pageviews, you're not risking having to pay an additional $26 if the monthly number of pageviews ends up at 1,000,010 or $78 extra if it's 1,200,010.
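
The overage figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic under one assumption: that extra pageviews are billed per started block, at $26 per 100,000 on the one million tier. That block size is inferred from the $26 and $78 figures quoted above, not taken from Innocraft's price list, and the function name overage_cost is made up for this example.

```python
import math


def overage_cost(pageviews, included, block_size, price_per_block):
    """Extra cost in dollars when pageviews exceed the included amount.

    Assumes every started block of `block_size` extra pageviews is billed in full.
    """
    extra = max(0, pageviews - included)
    return math.ceil(extra / block_size) * price_per_block


# One million tier, assumed $26 per started block of 100,000 extra pageviews.
for views in (1_000_000, 1_000_010, 1_200_010):
    print(views, overage_cost(views, 1_000_000, 100_000, 26))
# prints:
# 1000000 0
# 1000010 26
# 1200010 78
```

With the 100,000 tier and the "$5 per 10,000" small print quoted above, overage_cost(100_001, 100_000, 10_000, 5) gives 5 under the same assumption: a single pageview over the limit costs a full block.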

Matomo's cloud hosting products seem to be overly expensive and really risky in terms of cost. It's hard to not recommend against using them if you are an individual or a small company.

All of the above being said: Innocraft's offerings may be worth it even though they are wildly overpriced from a pure server-cost perspective, with increasing costs as pageviews increase. Having an employee set up, install and maintain Matomo isn't free. Matomo's prices could be perfectly reasonable if you work for a medium-sized or big corporation and you would have to hire a consultant or add an employee to get Matomo installed on your own bare-metal server. You can look at matomo.org/pricing if that's the case.

Installation

The installation package can be acquired from builds.matomo.org/matomo.zip. You will need to have a MySQL server with a database and a user-name and password you can use.

Download it and extract it to a web server folder where you want it to be installed and become publicly accessible. Matomo works by serving JavaScript embedded in the pages of the website(s) you want to track, so it needs to be publicly accessible.

wget https://builds.matomo.org/matomo.zip
unzip matomo.zip

Matomo unpacks to a folder named matomo/. Matomo expects the web server to be able to write to that folder tree so you will need to chown -R to make the files be owned by the user the web server is running as.

The next step is to navigate to the matomo folder using a web browser without any ad-blocking extensions like uBlock Origin (/matomo/ is on several anti-tracker blacklists). That takes you to an installation guide with 8 steps, two of which require nothing more than clicking "Next" (unless you encounter an error). The first is just a "system check". The next step requires you to enter a MySQL database. That's followed by creating a super user login name and password. Filling in some e-mail address is required. There are two check-boxes below it asking if you want spam or not. We used a unique e-mail address just for Matomo; we'll update this page if they send spam even though we didn't sign up for it. Step 6 is to configure a website you'd like to spy on/analyze. It proposes using a Website URL beginning with http:// like it's the 1990s, so make sure you change that if you're using https://. There is an option to enable "Ecommerce" features. Next, you'll get a JavaScript tracking code snippet. This should go inside the <head></head> element.

How you need to add the required JavaScript block to your site's pages will depend on the CMS you are using. There are plenty of plugins that mostly do nothing beyond inserting the JavaScript for you at matomo.org/integrate/.

Enabling GeoIP

You will need to make a GeoIP database available. Using the Maxmind City database is probably the easiest way if you're running it on a Linux box. Maxmind changed their Terms Of Service on December 30th, 2019. You can still get a special "download URL" for the database if you sign up with them and accept their new terms. The easier way is to install one of the many Linux programs that bundle the database, or the package geolite2-city-20191217 which has an older, but still fine, GeoLite2-City.mmdb file. That file needs to be placed in /matomo/misc.

Go to Administration ▸ System ▸ Geolocation and enable DBIP / GeoIP 2 (Php) once GeoLite2-City.mmdb (or another similar file if you choose DBIP) is in place.

Configuring A Practically Required Cron Job

Matomo can create reports from the data in its database when you visit the web portal. That's really slow and it gets slower as the database fills with more data. You really should set up an hourly or bi-hourly cron job for it. The job needs to run matomo/console with a lot of parameters including the URL to your Matomo installation. The cron job should look something like:

nice -n 19 /path/to/matomo/console core:archive --force-all-websites --force-all-periods=315576000 --force-date-last-n=1000 --url='https://your.analytics.server.tld/matomo/'

You don't strictly need the cron job, and you can probably do without it if you have 500 visitors per day or less.

Generating reports takes some time (a minute or five depending on how many sites you have and how much traffic they have) and it does add some load to the server.

Disabling Useless Plugins

One of the first things you should do after installing Matomo is to go to Administration ▸ Settings ▸ Plugins and look over all the plugins listed there. You will be overwhelmed by a list of 50 plugins, 46 of them active by default, in 3.14.

All the plugins provide some kind of functionality someone likes and/or needs. There is a fair chance that you don't need or want half of them for your particular use-case. You can make Matomo a bit more light-weight if you disable things like IntranetMeasurable if it's not used for an intranet website and Ecommerce if you're not running an e-commerce site.

Using Web Server Logs To Track Visitors

First, a word of warning. Matomo is clearly not designed to be used as a log analyzer. It is really bad at it. See Log Analyzers for better alternatives.

The main issues with using it as a log analyzer are:

  • Matomo will happily re-import the same log time and time again. You will have to reset your server log after it has been imported if you choose to import logs with a cron job/systemd timer.
  • Matomo supports excluding URL parameters but it does not support excluding URLs. There is no way to make it not track URLs like /feed.rss or /load.php (something MediaWiki uses to serve JavaScript and CSS). That makes URLs loading JavaScript and CSS your site's "most popular" pages.
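
One way to work around the second issue is to strip unwanted requests from the log before handing it to the importer. The following is a minimal Python sketch of that idea, not something Matomo ships. It assumes a common/combined log format where the request appears as "GET /path HTTP/1.1"; the names EXCLUDED_PREFIXES, keep_line and filter_log and the sample log lines are made up for this example.

```python
import re

# Hypothetical list of URL path prefixes to keep out of the import.
EXCLUDED_PREFIXES = ("/load.php", "/feed.rss")

# Matches the quoted request in a common/combined log line, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')


def keep_line(line, excluded=EXCLUDED_PREFIXES):
    """Return False for log lines whose request path starts with an excluded prefix."""
    match = REQUEST_RE.search(line)
    if match is None:
        return True  # leave lines we cannot parse for the importer to deal with
    return not match.group(1).startswith(excluded)


def filter_log(lines, excluded=EXCLUDED_PREFIXES):
    """Return only the log lines that should be imported."""
    return [line for line in lines if keep_line(line, excluded)]


sample = [
    '10.0.0.1 - - [15/Sep/2020:10:00:00 +0000] "GET /Matomo HTTP/1.1" 200 5120',
    '10.0.0.1 - - [15/Sep/2020:10:00:01 +0000] "GET /load.php?modules=site.styles HTTP/1.1" 200 900',
]
for line in filter_log(sample):
    print(line)  # only the /Matomo request is printed
```

The kept lines could be written to a temporary file and that file, not the original access log, passed to the importer. It does nothing about the re-import problem in the first bullet.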

Server logs can be imported by calling the misc/log-analytics/import_logs.py Python script bundled with Matomo. It requires some options:

--idsite= The site id set in the Matomo installation
--url= The URL of your Matomo installation (not the site you're importing logs for)
--recorders= The number of CPU threads to allow
--enable-http-errors Import errors
--enable-http-redirects Import redirects
--enable-static Import static files
--enable-bots Import information about web crawlers

The log importer can be used this way:

python /path/to/matomo/misc/log-analytics/import_logs.py --url=http://analytics.example.com \
  --idsite=1234 --recorders=2 --enable-http-errors --enable-http-redirects --enable-static \
  --enable-bots /path/to/access.log

Importing logs is really slow. The Python program parses the log file and dumps the result into /matomo/piwik.php at a few requests per second. It puts a fairly high load on the server. System monitoring programs like Munin will start showing a very clear and very visible spike if you add an hourly job that imports logs to Matomo.

matomo.org/docs/log-analytics-tool-how-to/ and the log analytics README.md file can provide some help if you want to use this method.

Alternatives

Open Web Analytics is a somewhat similar free software web traffic analysis package. It's nowhere near as advanced as Matomo, it has fewer features and there's no floating dashboard with movable and configurable widgets. It's like Matomo but a lot simpler. It's also way more light-weight.

Links

The Matomo homepage is at matomo.org. There's ample documentation at matomo.org/docs/ and developer.matomo.org/guides/integrate-introduction.
