Comparison of HOST file blacklists
Linux and other operating systems support using a hosts file for matching domains to IPs. Linux systems use the file
/etc/hosts. This functionality can be used to filter out bad hostnames used for malware, spam and other undesirable content by mapping them to
0.0.0.0. There are many huge lists floating around the Internet that you can download and use. Here's a comparison of some of the more popular ones.
Hosts entries are placed in the file
/etc/hosts. This file can be used to point to valid hosts such as machines on your LAN. It can also be used as a blacklist. This can be done in two ways, either by an entry with
127.0.0.1 domain.tld or
0.0.0.0 domain.tld. The latter is preferable. Using
127.0.0.1 will result in time-outs if nothing is listening on
0.0.0.0 will not. A locally running webserver will be hit with requests either way. You may want to convert host files using
sed or, if you prefer,
awk (why not Perl, you may wonder? No reason; if that works for you then great!). The majority of the hosts files listed below use the
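As a rough illustration of the sed conversion mentioned above, the sketch below rewrites 127.0.0.1 entries to 0.0.0.0 while leaving localhost lines alone. The function name `to_null_route` is made up for this example, and it assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do).

```shell
#!/bin/sh
# Hypothetical helper: read a hosts file on stdin, write the converted file to stdout.
to_null_route() {
    # Lines whose first hostname is localhost (or localhost.something) are skipped;
    # every other line starting with 127.0.0.1 gets 0.0.0.0 instead.
    sed -E '/^[[:space:]]*127\.0\.0\.1[[:space:]]+localhost([[:space:].]|$)/!s/^[[:space:]]*127\.0\.0\.1[[:space:]]+/0.0.0.0 /'
}
```

It can be used as a filter, for example `to_null_route < downloaded-hosts.txt > converted-hosts.txt`, where both file names are placeholders.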
The hosts files below do not contain the typical
localhost entries GNU/Linux systems require. Linux machines should have valid localhost entries like the following in
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
The off-the-shelf blacklists listed below do not include these entries. A simple trick is to make a file called
/etc/hosts.local with the required localhost entries and any other entries you'd like and a
/etc/cron.weekly/update-hosts.sh file that puts your
/etc/hosts.local and a blacklist you'd like to use in your
Don't forget to
chmod a+rx /etc/cron.weekly/update-hosts.sh
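A minimal sketch of what such an update script might look like follows. It is not the script this article was written around. The function names `merge_hosts` and `update_hosts` and the variable names are hypothetical. It assumes curl is installed and that Yoyo's list (reviewed below) is reachable over https. Any other blacklist URL can be substituted.

```shell
#!/bin/sh
# Sketch of /etc/cron.weekly/update-hosts.sh (function and variable names are hypothetical).
set -eu

LOCAL_FILE="/etc/hosts.local"   # your own localhost and LAN entries
TARGET="/etc/hosts"
# Assumption: Yoyo's list fetched over https; swap in the blacklist you prefer.
BLACKLIST_URL="https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext&useip=0.0.0.0"

# merge_hosts LOCAL BLACKLIST TARGET
# Writes LOCAL followed by BLACKLIST to a temporary file next to TARGET,
# then moves it into place so TARGET is never left half-written.
merge_hosts() {
    tmp="$(mktemp "$3.XXXXXX")"
    cat "$1" "$2" > "$tmp"
    chmod 644 "$tmp"
    mv "$tmp" "$3"
}

# Downloads the blacklist and merges it; keeps the old TARGET if the download fails.
update_hosts() {
    download="$(mktemp)"
    if curl -fsSL -o "$download" "$BLACKLIST_URL" && [ -s "$download" ]; then
        merge_hosts "$LOCAL_FILE" "$download" "$TARGET"
    else
        echo "update-hosts: download failed, keeping existing $TARGET" >&2
    fi
    rm -f "$download"
}

# Only do the real work when invoked under the cron script's name.
case "${0##*/}" in
    update-hosts.sh) update_hosts ;;
esac
```

Putting the local entries first keeps the required localhost lines at the top of the generated file. An empty or failed download leaves the existing hosts file untouched.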
Host Blacklists Reviewed
This is a rather short list (1286 entries) of subdomains used by the notorious 2o7 tracking service, which has plagued the Internet for more than a decade. It's maintained by GitHub user "FadeMind" and it can be downloaded from https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.2o7Net/hosts
This is a rather short and specialized list limited to one tracking service. Still, it's worth adding if you are making your own
/etc/hosts using multiple sources.
AdAway is an Android app for blocking advertisements and is, as such, mainly targeting advertisements served to mobile users. It is available for free in the F-Droid app store. This blacklist is rather short with about 500 entries.
This is one of the few blacklists which can safely be used with no issues or risk of false positives. It targets the most common and annoying trackers and advertisement servers and that's it. The raw blacklist can be acquired from raw.githubusercontent.com: hosts.txt
hpHosts is a service run by Malwarebytes which offers a rather broad range of small host files covering different classifications. It's no "one stop shop" where you can grab some huge fits-the-bill hosts file. And that is probably a good thing since a lot of Malwarebytes' choices are questionable at best.
hpHosts' advertisement list (ATS) appears to have advertisement-related sites on it. It gets dicey very quickly when moving on to their other lists such as the "warez/piracy sites" (WRZ) blacklist. The site thelinuxcode.com, as an example, does explain "HOW TO DOWNLOAD TORRENT WITH COMMAND LINE IN UBUNTU". An article explaining how one can use BitTorrent software to download Fedora ISOs on Ubuntu Linux does not make a site a "warez site" - but if you think it does then Malwarebytes products and their hpHosts files are for you.
You can download the various hpHosts blacklists from http://hosts-file.net/?s=Download
KADhosts is a Polish blacklist which is supposedly limited to "Fraud/adware/scam websites". It is primarily a uBlock Origin and AdGuard filter which can be found at github.com/PolishFiltersTeam/KAD. It is also available in the standard hosts file format in two varieties at github.com/PolishFiltersTeam/KADhosts.
KADhosts offers three different hosts files:
|KADhosts_without_controversies.txt||Identical to KADhosts.txt, apart from a comment, as of 2020-07-14|
|KADhosts.txt||Identical to KADhosts_without_controversies.txt, apart from a comment, as of 2020-07-14|
|KADfakeHosts.txt||"Fake" means supposedly "fake news". Interpret that as you wish.|
KADhosts has a homepage, in Polish, at kadantiscam.netlify.app/.
The MVPS hosts blocklist is one of the oldest hosts blocklists on the Internet. It has been available from the website winhelp2002.mvps.org since 1998.
MVPS is one of a few host blocklists which are available as pre-configured options in uBlock Origin.
The MVPS list is described as one which includes "major parasites, hijackers and unwanted Adware/Spyware programs". The list appears to block only those things. It can safely be used; it appears to list the things it claims to block and nothing more.
The raw MVPS list is available at winhelp2002.mvps.org/hosts.txt.
Steven Black's "Unified hosts file"
Steven Black's hosts files are made from collections of other hosts files such as adaway.org, mvps.org, malwaredomainlist.com and someonewhocares.org. The shortest "adware + malware" list weighs in at 1.2 MB and holds 40k entries. It seems fine. Things get rather strange very quickly when examining the rest of the categories this person provides blacklists for. There is a "fakenews" blacklist which lists just about every single site where real journalists do investigative and objective reporting. Not a single known propaganda site which continuously peddles fake news has made it into this list. This raises questions regarding this person's ability to understand the difference between up and down, left and right as well as good and bad sites. Using anything from this source without checking every domain is inadvisable, and checking 40-55k domains for false positives would be a full-time job for a month. The better option is to avoid it. The various hosts files can be found at https://github.com/StevenBlack/hosts
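Checking every domain by hand is impractical, but checking whether the handful of sites you actually rely on appear in a downloaded list is easy to script. The sketch below is one way to do it. The function name `check_domains` is hypothetical. It only matches exact hostnames and ignores commented-out lines.

```shell
#!/bin/sh
# Hypothetical helper: check_domains HOSTS_FILE DOMAIN...
# Prints every DOMAIN that appears as a hostname in HOSTS_FILE.
check_domains() {
    hosts_file="$1"
    shift
    for domain in "$@"; do
        # Scan the hostname fields of each non-comment line for an exact match.
        if awk -v d="$domain" '
            $1 !~ /^#/ {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break      # trailing comment, stop scanning this line
                    if ($i == d) found = 1
                }
            }
            END { exit !found }
        ' "$hosts_file"; then
            printf '%s\n' "$domain"
        fi
    done
}
```

A call such as `check_domains downloaded-hosts.txt example.org example.net` (placeholder names) prints only the domains that the list would block. Empty output means none of them are listed.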
The "Ultimate" hosts blacklist is probably the biggest HOST file blacklist on the Internet, yet it is small compared to how huge it used to be.
The "Ultimate" hosts blacklist was 43 MiB in size and contained 1,860,653 different domains the first time we looked at it. It used to contain a lot of false positives. This site was in it along with Norway's biggest newspaper VG, Russia's pravda.ru, the Free Software Foundation's Free Software Directory, Linux Today and a very long list of other totally legitimate websites.
The "Ultimate" hosts blacklist has since been trimmed down to, as of V1.2042.2020.07.13, 463,466 domains (less than a fourth of what it used to contain).
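Domain counts for a hosts file can be approximated with a short awk filter. The sketch below is not necessarily how the figures in this article were obtained. The function name `count_domains` is hypothetical, and it assumes one hostname per line, mapped to either 0.0.0.0 or 127.0.0.1.

```shell
#!/bin/sh
# Hypothetical helper: count unique blocked hostnames in a hosts file read from stdin.
count_domains() {
    awk '
        ($1 == "0.0.0.0" || $1 == "127.0.0.1") && $2 != "" && $2 !~ /^#/ {
            seen[$2] = 1                  # duplicates collapse into one key
        }
        END {
            n = 0
            for (d in seen) n++
            print n
        }
    '
}
```

Running `count_domains < downloaded-hosts.txt` (a placeholder file name) prints a single number. Localhost entries, if a list ships them, are counted too, so the result can be off by a few.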
It is hard to recommend this blacklist after we discovered that we were randomly put on it along with too many other sites that were put on it for no apparent reason. It has been cleaned up and reduced to less than a quarter of its original size and this site is no longer on it, so current versions of this list may be an acceptable choice.
|Note: Avoid this list if you do any kind of cryptocurrency mining. All the major mining pools for all the bigger cryptocurrencies are on this list. They are likely on it because of all the stealth crypto mining malware in the wild. It is, of course, a problem if you actually do want to mine some cryptocurrency.|
The "Ultimate" hosts blacklist can be acquired from github.com/mitchellkrogza/Ultimate.Hosts.Blacklist.
Yoyo's Adservers is a small-ish list of about 3000 advertisement servers. That's what it focuses on and that's what it is. There are no useful sites thrown in by mistake as far as we can tell; it's just advertisement servers. This list looks safe to use with no risk of false positives or issues. Its homepage is at pgl.yoyo.org/adservers and the raw list can be acquired from pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext&useip=0.0.0.0 - yes, that's a very long URL. But it's good; it is simply Yoyo's way of telling you that you can change the format by altering the link's variables.
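To illustrate how the variables in that URL fit together, here is a small sketch that assembles the link from its parts. The function name `yoyo_url` is hypothetical and the https scheme is an assumption. Only the values shown in the URL above are taken from this article; treat any other value as something to verify on Yoyo's site.

```shell
#!/bin/sh
# Hypothetical helper: yoyo_url [HOSTFORMAT] [USEIP]
# Builds the download URL; defaults reproduce the link quoted in this article.
yoyo_url() {
    hostformat="${1:-hosts}"
    useip="${2:-0.0.0.0}"
    printf 'https://pgl.yoyo.org/adservers/serverlist.php?hostformat=%s&mimetype=plaintext&useip=%s\n' \
        "$hostformat" "$useip"
}
```

With curl installed, `curl -fsSL "$(yoyo_url)"` would fetch the list in hosts format with 0.0.0.0 as the target address.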
If you are looking for a general host blacklist to use, you're best off sticking with a shorter one which focuses on advertisement and tracking servers and only advertisement and tracking servers. Every blacklist which attempts to go beyond that gets some sites, or in the case of "fake news" all sites, wrong. It should also be mentioned that browser-based content filters are better suited for removing advertisements anyway; they can remove sub-folders like
/ads/ without making the whole site unavailable.