Web browser fingerprinting

From LinuxReviews

Web browser fingerprinting is the art of uniquely identifying website visitors by their web browser fingerprints so they can be tracked across browser sessions and other websites loading the same or similar scripts. Fingerprinting is done by checking a broad variety of factors including installed web browser extensions, permissions, audio format support and graphics capabilities. The combination of fingerprint data readily available in all modern web browsers allows for accurate user identification across the web without browser cookies or other voluntary identification.

About 30% of the web's top 1000 websites used web browser fingerprinting as of August 2020.

Pervasiveness

Figure: A heavily obfuscated piece of JavaScript used to do web browser fingerprinting.

Most of the major web browser vendors began phasing out support for third-party cross-site tracking cookies in early 2020, much to the ad-tech industry's frustration. This led to an explosion in web browser fingerprinting as an alternative way to uniquely identify users across the web without relying on web browser cookies.

The sum of unique features present in someone's web browser can be used to create a unique fingerprint using simple JavaScript. This fingerprint can then be sent to a server and stored in a database. A web browser fingerprint is in many ways as good as a cross-site web browser cookie and it is, in some ways, better since it does not rely on storing anything client-side.

A study titled "Fingerprinting the Fingerprinters: Learning to Detect Browser Fingerprinting Behaviors" (Fpinspector-sp2021.pdf), published on August 4th 2020 by Umar Iqbal of The University of Iowa, Zubair Shafiq of the University of California and Steven Englehardt from the Mozilla Corporation, found that 30% of the world's top 1000 websites were using web browser fingerprinting techniques as of August 2020.

Distribution of Alexa top-100K websites that deploy web browser fingerprinting, August 2020.

Rank          Websites (count)   Websites (%)
1 to 1K       266                30.60%
1K to 10K     2,010              24.45%
10K to 20K    981                11.10%
20K to 50K    2,378              8.92%
50K to 100K   3,405              7.70%
1 to 100K     9,040              10.18%

The larger and more profitable a website is, the more likely it is to use web browser fingerprinting techniques. Sites operated by the American government are an exception: they deploy tracking based on web browser fingerprinting even though the sites themselves are not all that profitable.

"Fascism should more appropriately be called Corporatism because it is a merger of state and corporate power"

Benito Mussolini

Sites operated by the US government that deploy web browser fingerprinting include fbi.gov (Federal Bureau of Investigation), weather.gov (National Weather Service), uspto.gov (US Patent Office), nhtsa.gov (National Highway Traffic Safety Administration), irs.gov (The Internal Revenue Service) and sec.gov (US Securities and Exchange Commission). All of those sites embed the exact same JavaScript spyware libraries provided by an American outfit called ForeSee, who describe themselves as providing products that let their clients "Listen to all customer signals - across web, mobile, location, and contact center — then connect data for deeper insights". The US government can use this cross-site web browser fingerprinting to match a visit to weather.gov with a later visit to fbi.gov without the use of web browser cookies.

Techniques

Modern web browsers allow scripts to collect a huge variety of data about the web browser they are running on. It is the wide variety of data that makes web browser fingerprinting possible. It is also what makes it hard to resist or obfuscate web browser fingerprinting as long as a web browser allows JavaScript execution.

Most fingerprinting scripts do not begin fingerprinting until a few milliseconds after a web page has loaded. This is usually accomplished by calling setTimeout and/or requestIdleCallback.
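
The delayed start described above can be sketched roughly as follows. This is an illustration, not code taken from any particular tracking script: the helper name deferUntilIdle, the 50 ms default and the env parameter (which only exists so the function can be tried outside a browser) are all assumptions.

```javascript
// Hypothetical helper: run `task` once the page is idle, or after a short delay.
// `env` defaults to the global object (window in a browser).
function deferUntilIdle(task, delayMs = 50, env = globalThis) {
  if (typeof env.requestIdleCallback === "function") {
    // The browser calls the callback when the main thread is idle;
    // `timeout` forces the call after at most delayMs milliseconds.
    env.requestIdleCallback(() => task(), { timeout: delayMs });
    return "requestIdleCallback";
  }
  // Fallback for browsers without requestIdleCallback: a plain timer.
  env.setTimeout(() => task(), delayMs);
  return "setTimeout";
}
```

A script would call something like deferUntilIdle(() => collectEverything()), where collectEverything is whatever data collection the script performs. The return value only reports which mechanism was used.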

Commonly Used Metrics

Metric                      Stable between browser upgrades   Stable between browsers on the same device
userAgent                   Mostly                            No
language                    Yes                               Yes
colorDepth                  Yes                               Yes
deviceMemory                Yes                               Yes
pixelRatio                  Yes                               Yes
hardwareConcurrency         Yes                               Yes
screenResolution            Yes                               Yes
availableScreenResolution   Yes                               Yes
timezoneOffset              Yes                               Yes
timezone                    Yes                               Yes
sessionStorage              Yes                               No
localStorage                Yes                               No
indexedDb                   Yes                               No
addBehavior                 Yes                               No
openDatabase                Yes                               No
cpuClass                    Yes                               Yes
platform                    Yes                               Yes
doNotTrack                  Yes                               No
plugins                     Yes                               No
canvas                      Yes (most of the time)            No
webgl                       Yes (most of the time)            No
webglVendorAndRenderer      Yes                               Yes (most of the time)
adBlock                     Yes                               No
touchSupport                Yes                               Yes
fonts                       Yes (most of the time)            Yes (most of the time)
audio support               Yes                               Yes
enumerateDevices            Only on mobile devices            Unknown

Several web browser fingerprinting data-points can also be used for device fingerprinting. Data-points that differ depending on the device/OS, yet remain the same regardless of which web browser is used on that device, identify the device rather than the browser. What this means, concretely, is that a website can use techniques that allow it to see that you are still you if you switch from one web browser to another.
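
A few of the metrics from the table can be read with a handful of standard properties. The following is a minimal sketch, not any vendor's code: the function names collectMetrics and deviceSubset and the key list are made up for illustration, and the win parameter exists only so the code can be tried with a stand-in object outside a browser.

```javascript
// A subset of the metrics the table marks as stable across browsers on one device.
const CROSS_BROWSER_KEYS = [
  "language", "colorDepth", "deviceMemory", "pixelRatio", "hardwareConcurrency",
  "screenResolution", "timezoneOffset", "timezone", "platform",
];

// Read some commonly used metrics; `win` defaults to the global object (window).
function collectMetrics(win = globalThis) {
  const nav = win.navigator || {};
  const scr = win.screen || {};
  return {
    userAgent: nav.userAgent,
    language: nav.language,
    colorDepth: scr.colorDepth,
    deviceMemory: nav.deviceMemory, // not exposed by every browser
    pixelRatio: win.devicePixelRatio,
    hardwareConcurrency: nav.hardwareConcurrency,
    screenResolution: [scr.width, scr.height],
    timezoneOffset: new Date().getTimezoneOffset(),
    timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
    platform: nav.platform,
    doNotTrack: nav.doNotTrack,
  };
}

// Keep only the cross-browser metrics: the raw material for device fingerprinting.
function deviceSubset(metrics) {
  return Object.fromEntries(CROSS_BROWSER_KEYS.map((key) => [key, metrics[key]]));
}
```

Two different browsers on the same machine would report different userAgent strings, but deviceSubset(collectMetrics()) would be expected to come out largely the same in both.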

Audio/Video Formats

Web browsers let scripts check which audio and video formats a web browser supports. The supported formats vary depending on the browser vendor, the browser version and the operating system it is running on.

File: https://services.nofraud.com/js/device.js
function(){var e=['video/mp4; codecs="avc1.42c00d"','video/ogg; codecs="theora"','video/webm; codecs="vorbis,vp8"','video/webm; codecs="vorbis,vp9"','video/mp2t; codecs="avc1.42E01E,mp4a.40.2"'],t=["audio/mpeg",'audio/mp4; codecs="mp4a.40.2"','audio/ogg; codecs="vorbis"','audio/ogg; codecs="opus"','audio/webm; codecs="vorbis"','audio/wav; codecs="1"'],n=function(e){for(var t={},n=0;n<e.length;n++){var r=e[n];window.MediaSource?t[r]=window.MediaSource.isTypeSupported(r):window.WebKitMediaSource&&(t[r]=window.WebKitMediaSource.isTypeSupported(r))}return t};
return{audio:n(t),video:n(e)}}
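
The minified snippet above is hard to read. A more readable sketch of the same kind of check could look like the following. It is an illustration under assumptions, not the vendor's code: the function name probeFormats, the shortened candidate list and the mediaSource parameter (there so the function can be tried with a stand-in object) are all made up.

```javascript
// A shortened, illustrative list of MIME types to probe.
const CANDIDATE_TYPES = [
  'video/mp4; codecs="avc1.42c00d"',
  'video/webm; codecs="vorbis,vp9"',
  'audio/ogg; codecs="opus"',
];

// Ask MediaSource which of the given types it can handle.
function probeFormats(types, mediaSource = globalThis.MediaSource) {
  const result = {};
  for (const type of types) {
    // Without MediaSource the answer is unknown, so record null rather than false.
    result[type] = mediaSource ? mediaSource.isTypeSupported(type) : null;
  }
  return result;
}
```

The resulting object of true/false values differs between browsers, versions and operating systems, which is what makes it usable as a fingerprinting factor.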

Canvas Fingerprinting

Canvas fingerprinting is one of the oldest and most well-known web browser fingerprinting techniques. The Mozilla Firefox web browser has "fingerprinting protection" specific to this particular technique. Do not be fooled by this: it is just one of many ways scripts identify users by their web browser's properties. Protection against canvas fingerprinting is nice, but on its own it makes zero practical difference.
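
The general idea behind canvas fingerprinting is to draw something onto a hidden canvas and read the pixels back, since the rendered result varies slightly with fonts, graphics drivers and the browser's rendering code. Below is a minimal sketch of that idea. The drawn text, the colors, the function names and the choice of a 32-bit FNV-1a hash to shorten the output are all assumptions made for illustration; real scripts differ in the details.

```javascript
// 32-bit FNV-1a hash, used here only to shorten the long data URL.
// It works on UTF-16 code units, which is fine for ASCII data URLs.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Draw a fixed scene and hash the encoded pixels.
// `doc` defaults to the page's document object.
function canvasFingerprint(doc = globalThis.document) {
  const canvas = doc.createElement("canvas");
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");
  ctx.textBaseline = "top";
  ctx.font = "16px Arial";
  ctx.fillStyle = "#f60";
  ctx.fillRect(10, 10, 100, 30);
  ctx.fillStyle = "#069";
  ctx.fillText("Fingerprint <canvas> 1.0", 4, 20);
  // The data URL encodes the rendered pixels, which vary between systems.
  return fnv1a(canvas.toDataURL());
}
```

Calling canvasFingerprint() in a browser returns a short hexadecimal string that would be expected to stay the same across visits from the same browser on the same machine.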

WebGL

WebGL features present in modern web browsers allow scripts to get a lot of browser-specific information. WebGL features are typically not browser-independent, so they cannot identify someone using different browsers on the same device/OS as being the same person. Within a single browser, however, they can identify users with high precision. Here is one example of code deployed by the NoFraud corporation:

File: https://services.nofraud.com/js/device.js
function(e,t){
  var n="attribute vec2 attrVertex; varying vec2 varyinTexCoordinate; uniform vec2 uniformOffset; void main() {   varyinTexCoordinate = attrVertex + uniformOffset;   gl_Position = vec4(attrVertex, 0, 1); }",r="precision mediump float; varying vec2 varyinTexCoordinate; void main() {   gl_FragColor = vec4(varyinTexCoordinate, 0, 1); }",o=e.createBuffer();
  e.bindBuffer(e.ARRAY_BUFFER,o);
  var a=new Float32Array([-.2,-.9,0,.4,-.26,0,0,.732134444,0]);
  e.bufferData(e.ARRAY_BUFFER,a,e.STATIC_DRAW),o.itemSize=3,o.numItems=3;
  var l=e.createProgram(),c=e.createShader(e.VERTEX_SHADER);e.shaderSource(c,n),e.compileShader(c);
  var u=e.createShader(e.FRAGMENT_SHADER);
  return e.shaderSource(u,r),e.compileShader(u),e.attachShader(l,c),
e.attachShader(l,u),e.linkProgram(l),e.useProgram(l),
l.vertexPosAttrib=e.getAttribLocation(l,"attrVertex"),
l.offsetUniform=e.getUniformLocation(l,"uniformOffset"),
e.enableVertexAttribArray(l.vertexPosArray),
e.vertexAttribPointer(l.vertexPosAttrib,o.itemSize,e.FLOAT,!1,0,0),
e.uniform2f(l.offsetUniform,1,1),e.drawArrays(e.TRIANGLE_STRIP,0,o.numItems),
i(t.toDataURL())}
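
Apart from rendering a scene and reading it back, as the snippet above does, scripts can also query WebGL directly. The webglVendorAndRenderer entry in the metrics table presumably refers to something like the following sketch, which uses the WEBGL_debug_renderer_info extension. The function name is made up for illustration, and the gl argument is assumed to be a WebGL rendering context.

```javascript
// Read the graphics vendor and renderer strings from a WebGL context.
function webglVendorAndRenderer(gl) {
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (!ext) return null; // the browser hides or lacks the extension
  return {
    vendor: gl.getParameter(ext.UNMASKED_VENDOR_WEBGL),
    renderer: gl.getParameter(ext.UNMASKED_RENDERER_WEBGL),
  };
}
```

In a browser the context would come from something like document.createElement("canvas").getContext("webgl"). The strings describe the graphics hardware and driver, which helps explain why the table lists this metric as mostly stable between browsers on the same device.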

Obfuscation

A lot of the somewhat malicious scripts on the web, like those used for user tracking using web browser fingerprinting, are heavily obfuscated. Take a long look at this example from a 200 kB JavaScript file from a semi-popular tracking vendor used by many top 10,000 websites:

function(_0x38108e,_0x25f99e){var _0x5138c2=function(_0x40649a){while(--_0x40649a){_0x38108e['\x70\x75\x73\x68'](_0x38108e['\x73\x68\x69\x66\x74']());}};_0x5138c2(++_0x25f99e);}(_0xd604,0x15b));var _0x4d60=function(_0x4f0c2d,_0x181af7){_0x4f0c2d=_0x4f0c2d-0x0;var _0x5e4665=_0xd604[_0x4f0c2d];if(_0x4d60['\x69\x6e\x69\x74\x69\x61\x6c\x69\x7a\x65\x64']===undefined){(function(){var _0x2e8507=Function('\x72\x65\x74\x75\x72\x6e\x20\x28\x66\x75\x6e\x63\x74\x69\x6f\x6e\x20\x28\x29\x20'+'\x7b\x7d\x2e\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72\x28\x22\x72\x65\x74\x75\x72\x6e\x20\x74\x68\x69\x73\x22\x29\x28\x29'+'\x29\x3b');var _0x5dc16e=_0x2e8507();

A function like:

function(_0x40649a){
  while(--_0x40649a){
    _0x38108e['\x70\x75\x73\x68'](_0x38108e['\x73\x68\x69\x66\x74']());
  }
}

makes zero sense at first glance. There are some tools that can be used to make slightly more sense of it, but even then it's a mess. This kind of obfuscation is pretty common.
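
Part of the trick is plain string escaping. The hex escapes in the function above decode to ordinary method names, so the function can be rewritten by hand into something readable. In the sketch below the names rotate, list and count are made up, list stands in for the outer _0x38108e array, and the return statement is added purely for convenience.

```javascript
// The hex escapes are ordinary strings in disguise.
const pushName = "\x70\x75\x73\x68";      // "push"
const shiftName = "\x73\x68\x69\x66\x74"; // "shift"

// Readable equivalent of the obfuscated function: rotate the array to the left.
// Because of the pre-decrement, the loop body runs count - 1 times.
function rotate(list, count) {
  while (--count) {
    list.push(list.shift()); // move the first element to the end
  }
  return list;
}
```

In other words, the function merely shuffles the order of a string table so that the indexes used elsewhere in the script cannot be read straight from the source.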

Fingerprinting Software

Most of the more advanced fingerprinting scripts are only available as either commercial software or pay-as-you-go services. There are some ready-to-be-deployed free software libraries available for those who want to engage in this pure evil behavior.

Fingerprint Libraries

Package          Story                             Type                 npm                         cdn                               License
fingerprintjs2   "99.5% identification accuracy"   JavaScript Library   Yes (npm: fingerprintjs2)   Yes (cdnjs.com: fingerprintjs2)   MIT Software License

End-User Countermeasures (or "Is Resistance Futile?")

All modern web browsers allow scripts to do detailed browser fingerprinting, and device fingerprinting, as long as they allow JavaScript to run. It is an unavoidable possibility when JavaScript is executed.

A few web browsers, like Mozilla Firefox, are marketed as having "Enhanced Tracking Protection" against "Fingerprinters". That's just marketing. It means absolutely nothing to anyone who employs data collection scripts. Let's say an actor is using 20 fingerprinting data-points to uniquely identify people. Let's also say Mozilla had done a much better job of building fingerprinting resistance into Firefox and their browser had built-in "protection" against 15 of those. The 15 protected data-points would then collapse into a single factor (the fact that the browser protects them is itself a distinguishing trait), which together with the five unprotected data-points leaves six factors. That is more than enough to make one in a million stand out.
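
The arithmetic behind that claim can be illustrated with a rough calculation. The sketch below is a back-of-the-envelope model, not a measurement: it assumes the six factors are independent and that each takes a handful of equally likely values, and the specific value counts are made up. Real-world factors are neither independent nor uniform, so the numbers only show the order of magnitude.

```javascript
// Bits of information needed to single out one user among `population` users.
function bitsNeeded(population) {
  return Math.log2(population);
}

// Combined bits from independent factors, each with the given number of
// equally likely values (both assumptions are simplifications).
function combinedBits(valuesPerFactor) {
  return valuesPerFactor.reduce((sum, values) => sum + Math.log2(values), 0);
}

const needed = bitsNeeded(1e6);                          // just under 20 bits
const available = combinedBits([4, 8, 16, 16, 32, 64]);  // made-up value counts
console.log(needed.toFixed(2), available.toFixed(2), available > needed);
```

With these made-up numbers the six factors carry about 24 bits, while telling one user apart from a million others takes just under 20.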

Resisting web browser fingerprinting without breaking functionality is very hard. Checking what audio and video formats a web browser happens to support is useful for those who make custom video players embedded in websites. It can also be used as one of many factors in web browser fingerprinting. Lying so a web browser which does not support AV1 video claims it does is not a good or viable solution. Removing JavaScript functionality that has been a standard feature for years is also not a good solution. There is not much web browser vendors can do without breaking many existing websites, and that is not something they are willing to do.

The only halfway good solution for end-users, as of late 2020, is to ensure that scripts doing web browser fingerprinting don't run. This can, to some degree, be accomplished by using a web browser content filter like the uBlock Origin extension with a frequently-updated privacy-focused filter list like EasyList's easyprivacy.txt. Using a filter helps, but it does not prevent scripts that are not on that list from running.

The only sure way to avoid web browser fingerprinting is to disable JavaScript. That breaks several sites, notably larger websites. That is not a great solution and it is not a very practical solution. It is, however, the only actual solution due to one simple truth: You can uniquely identify someone by their web browser if you are able to run JavaScript in their web browser.
