SearX

From LinuxReviews
Developed by: asciimoo
Written in: Python
OS: GNU/Linux
Type: Metasearch engine
License: AGPLv3
Website: github.com/asciimoo/searx

SearX is a fast and user-friendly privacy-oriented metasearch engine with many different public installations which can be freely used. It is written in Python and it is easy to install on standard Linux desktops and servers.

What it is and isn't

A metasearch engine like SearX takes search queries from users and forwards them to multiple search-engines. The results returned are then sorted and presented. The SearX instance will see who's making the requests; the search-engines it uses as sources will not. There are two main advantages to this:

  1. You get more results when many sources are mixed than you do when you ask just one source.
  2. There are no tracking cookies and likely no long-term logging of your behavior.

That last point is up to the person(s) who configured the SearX site you are using. You can easily tell if a SearX instance tries to track you with cookies. Whether or not there are server logs is just a question of trust.

SearX itself is not a search-engine. It does not have an index; it knows nothing about anything. It has to ask others and it is helpless without external sources.
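
The merge step described above can be illustrated with a small shell sketch. To be clear, this is not SearX's actual ranking code. It only assumes that each source returns a list of URLs, one per line in its own file, and that a URL found by more sources should be listed first. The function name merge_results is made up for this example.

```shell
#!/bin/sh
# Illustrative only: this is NOT how SearX ranks results internally.
# merge_results FILE...
#   Each FILE holds the result URLs of one source, one URL per line.
merge_results() {
    for source_file in "$@"; do
        # Count each URL at most once per source.
        LC_ALL=C sort -u "$source_file"
    done |
        LC_ALL=C sort |
        uniq -c |                       # prefix each URL with the number of sources
        LC_ALL=C sort -k1,1nr -k2,2 |   # most sources first, ties in alphabetical order
        awk '{ print $2 }'              # drop the count again
}
```

Calling merge_results with one file per source prints the combined list with the URLs most sources agree on at the top.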

Features and usability compared to "regular" search-engines

You can use the SearX software to search by simply visiting one of the many publicly listed installations.

The results and experience you get from public SearX instances vary a lot. Many of them are blacklisted by one or more search-engines due to too much traffic or too "suspicious" traffic coming from one IP or IP-range. Some are just slow because the person who set it up thought it was a good idea to put it on a Pi and publicly announce its existence even though the hardware can't handle 2 concurrent users. You can find a SearX instance which works really well for you on the list of public instances, but it will take some time. It is also worth knowing that things change: a SearX site you love and use daily could become horribly slow or disappear tomorrow.

The SearX software is quite rich in features when it comes to searching for specific kinds of content like music, pictures, videos and so on.

Skins

SearX supports skins and each skin can be themed with different CSS styles. See SearX skins for screenshots of those that are included in the default installation.

Languages

It is possible to configure a default language in the interface. You can also restrict a given search to a specific language by using a : with a language-code like :fr for results in French.
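
As a rough sketch of the language prefix in use from the command line: the helper names lang_query and search_instance are made up for this example, the instance URL is a placeholder, and the /search path with a q parameter is an assumption about how your instance accepts queries. Check your own instance before relying on it.

```shell
#!/bin/sh
# Build a query restricted to one language, e.g. ":fr linux kernel".
lang_query() {
    lang=$1
    shift
    printf ':%s %s' "$lang" "$*"
}

# Hypothetical helper: the instance URL and the /search?q= endpoint are
# assumptions. curl's --data-urlencode takes care of escaping ":" and spaces.
search_instance() {
    instance=$1
    shift
    curl --silent --get --data-urlencode "q=$(lang_query "$@")" "$instance/search"
}

# Example (placeholder instance):
#   search_instance https://searx.example.org fr linux kernel
```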

Combining SearX with YaCy

It's quite possible to combine SearX with the YaCy peer-to-peer search engine software. We did this, tested it, and even wrote a howto called "HOWTO setup your own YaCy-backed SearX search-engine". Testing this combination for a few days revealed that it is a bad idea. YaCy will reach its timeout for all searches of more than two words. Setting YaCy's limit higher than the default doesn't help very much. The results YaCy does produce the few times it does not time out are mostly irrelevant. There is no practical benefit to having YaCy as a source; the net effect is a slower SearX. Thus, we never published our HOWTO for combining them even though we spent quite a lot of time on it.

Installation and configuration

You can do it: you can configure your own SearX instance. We believe in you.

Setting up SearX on your own box requires three pieces of software. The first is SearX itself. The second is the uWSGI application server, which, despite the name, isn't going to serve anyone - not directly, anyway. uWSGI makes your SearX instance available via a socket. The third is a front-end web server such as Apache or Nginx, which uses that socket. Thus, you need to configure and set up SearX -> uWSGI -> Web Server.

This may sound complicated but it really isn't. It's quite possible to set up your own SearX instance if you have a bare minimum of command-line skills. You can do it, we believe in you.

Installing SearX with Apache as a front-end on Fedora

You will need to install these packages (if you don't have them already):

Installing the required packages

dnf -y install git python3-virtualenv httpd uwsgi uwsgi-plugin-python3
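
If you want to double-check that the install worked before moving on, a small helper like the one below can list what is still missing. The function name missing_commands is made up here, and the command names in the example are our assumption about what the packages above put on your PATH.

```shell
#!/bin/sh
# Print every command from the argument list that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example (command names assumed from the packages installed above):
#   missing_commands git virtualenv httpd uwsgi
```

No output means everything you asked about was found.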

Now you are ready to install SearX. We'll do it in /opt to make things easy with SELinux. First, clone SearX from git, create a user for it, and give that user ownership of the files:

Installing SearX from git

cd /opt
git clone https://github.com/asciimoo/searx.git
useradd --system searx -d /opt/searx
chown searx:searx -R /opt/searx

Install the dependencies, and there are a lot of Python dependencies, in a virtualenv:

sudo -u searx -i
virtualenv searx-ve
. ./searx-ve/bin/activate
./manage.sh update_packages

The last step will take some time as it will install a lot of Python packages.
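
If you are unsure whether the virtualenv is actually active in your current shell, you can check the VIRTUAL_ENV variable, which the activate script sets. A minimal sketch (the function name in_virtualenv is made up for this example):

```shell
#!/bin/sh
# Succeeds if a virtualenv is active in this shell.
# The activate script exports VIRTUAL_ENV; it is unset otherwise.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Example:
#   in_virtualenv || echo "not active, run: . ./searx-ve/bin/activate" >&2
```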

Configuring SearX

First, change the default key in the configuration file to something random:

sed -i -e "s/ultrasecretkey/`openssl rand -hex 16`/g" searx/settings.yml
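
It doesn't hurt to confirm that the placeholder key really is gone. A small sketch, assuming the placeholder is the literal string ultrasecretkey used in the sed command above (the function name has_default_key is made up):

```shell
#!/bin/sh
# Succeeds (exit status 0) if the placeholder key is still in the file.
has_default_key() {
    grep -q 'ultrasecretkey' "$1"
}

# Example:
#   has_default_key searx/settings.yml && echo "key was NOT changed" >&2
```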

Then, optionally (and you do want to do this), open the configuration file searx/settings.yml in an editor like nano and review the settings.

Now you can start it and see that it runs stand-alone.

python searx/webapp.py

This should produce a message telling you it's running on localhost on the default port 8888. Press ctrl+c to terminate it.

Now you're done in the virtual environment. Press ctrl-d or type exit to leave it.

Installing uWSGI so SearX can run as a service accessible from a socket

Create a .ini file for uwsgi which tells it how to use SearX:

File: /etc/uwsgi.d/searx.ini
[uwsgi]
# Who will run the code
uid = searx
gid = searx

# disable logging for privacy
disable-logging = true

# Number of workers (usually CPU count)
workers = 4

# The right granted on the created socket
chmod-socket = 666

# Plugin to use and interpreter config
single-interpreter = true
master = true
plugin = python
lazy-apps = true
enable-threads = true

# Module to import
module = searx.webapp

# Virtualenv and python path
virtualenv = /opt/searx/searx-ve/
pythonpath = /opt/searx/
chdir = /opt/searx/searx/

socket = /run/uwsgi/searx.socket
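
The workers value of 4 above is just an example. If you want to follow the "usually CPU count" rule of thumb, something like this sketch prints a matching line for your machine. It assumes nproc from GNU coreutils is available, and the function name workers_line is made up.

```shell
#!/bin/sh
# Print a "workers = N" line where N is the number of available CPUs.
# Assumes nproc (GNU coreutils) is installed.
workers_line() {
    printf 'workers = %s\n' "$(nproc)"
}

# Example: compare the output with the workers line in your searx.ini.
#   workers_line
```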

Change the ownership of the file:

chown uwsgi:uwsgi /etc/uwsgi.d/searx.ini

The next step is required even though it shouldn't be. On Fedora 30 you will get "!!! UNABLE to load uWSGI plugin: /usr/lib64/uwsgi/python_plugin.so: cannot open shared object file: No such file or directory !!!" when trying to start uWSGI, which you should pre-emptively solve with:

ln -s /usr/lib64/uwsgi/python3_plugin.so /usr/lib64/uwsgi/python_plugin.so

Now you can start it and enable it as a permanent system service:

systemctl start uwsgi.service
systemctl enable uwsgi.service

Setting up Apache so you can use SearX through the uWSGI socket

You now have a working SearX and access to it through uWSGI at the socket /run/uwsgi/searx.socket. This isn't very helpful if you'd like to use it from a web browser and you probably do.
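
Before moving on to the web server, you may want to confirm that the socket really exists. A minimal sketch (the function name socket_ready is made up; the path in the example is the one from the uWSGI configuration above):

```shell
#!/bin/sh
# Succeeds if the given path exists and is a Unix socket.
socket_ready() {
    [ -S "$1" ]
}

# Example:
#   socket_ready /run/uwsgi/searx.socket || systemctl status uwsgi.service
```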

Thus, you need a web server. Many people love nginx and if you do then good for you. We know that web server is light and popular but we've always been fans of Apache and its long track record and versatile features.

Distributions tend to have their own special ways of managing Apache configuration files. Red Hat likes to use /etc/httpd/conf.d for per-service configuration files.

This VirtualHost configuration file is enough for Apache:

File: /etc/httpd/conf.d/searx.conf
<VirtualHost *:80>
    ServerName localhost
    <Location />
        ProxyPass unix:/run/uwsgi/searx.socket|uwsgi://searx/
    </Location>
</VirtualHost>

Now you can start httpd and make it a permanently running service:

systemctl enable httpd.service
systemctl start httpd.service

However, it will not yet work. SELinux is not about to let you or your web server have access to that socket without some serious convincing.

You can try using SearX now and study the SELinux audit log file /var/log/audit/audit.log for clues as to why you get Service Unavailable from Apache. It is a good way to learn a bit about SELinux.
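
The audit log is noisy, so it helps to filter it. The sketch below is one way to do that under the assumption that the denials show up as type=AVC records with comm="httpd" in them. The function name httpd_denials is made up, and you will need to be root to read the real log.

```shell
#!/bin/sh
# Print the AVC denial records that mention httpd.
# Takes an optional log file; defaults to the audit log named above.
httpd_denials() {
    grep 'type=AVC' "${1:-/var/log/audit/audit.log}" |
        grep 'denied' |
        grep 'comm="httpd"'
}

# Example (as root):
#   httpd_denials
```

The lines it prints should name the source type, the target type and the class that was denied, which is what the policy file further down has to allow.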

Note: You could, and some pages will advise you to, run semanage permissive -a httpd_t to let Apache have access to sockets. That does work, but only because it disables ALL SELinux enforcement for Apache. You should absolutely NOT do that unless you are absolutely sure it's something you want. Somewhat related, you do NOT need to enable httpd_can_network_connect with setsebool -P httpd_can_network_connect 1, since that allows TCP connections, not Unix sockets.

What you need to do is this:

  • Make a "type enforcement" policy file.
  • Compile that into a policy module, which is then packaged into a SELinux module package.
  • Insert the SELinux module.

First, create a temporary "type enforcement" policy file:

echo 'module httpd-searx-socket 1.0;
require {
        type var_run_t;
        type httpd_t;
        class sock_file write;
}
allow httpd_t var_run_t:sock_file write;'> /tmp/httpd-searx-socket.te

Next you need to compile that into a policy module with:

checkmodule -M -m -o /tmp/httpd-searx-socket.mod /tmp/httpd-searx-socket.te

That module then needs to be packaged into a SELinux module package[1]:

semodule_package -o /tmp/httpd-searx-socket.pp -m /tmp/httpd-searx-socket.mod

Then install this into the SELinux framework:

semodule -i /tmp/httpd-searx-socket.pp
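
If you want to confirm that the module was picked up, you can look for it in the output of semodule -l. A small sketch that reads a module list on standard input, assuming one module per line with the name in the first column (the function name module_listed is made up):

```shell
#!/bin/sh
# Succeeds if the named module appears in the list read from stdin.
# Assumes one module per line with the module name in the first column.
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example (as root):
#   semodule -l | module_listed httpd-searx-socket && echo "module is installed"
```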

That's it. You do not need to restart httpd to make it work. You can access your personal SearX by pointing your browser at http://127.0.0.1/.


It works, now what?

If you installed SearX on your desktop just to use it on that desktop, you're done. However, if you will be using it remotely from outside the comfort of your own home, there's at least one more step you should take: install certbot[2] and get a Let's Encrypt certificate. They are free, but they are only valid for 90 days, so you have to renew them regularly. You may also want to get yourself a free sub-domain and set it up to update dynamically when your IP changes. Those things are a bit out of the scope of this particular howto. You can probably figure it out if you got this far. We believe in you! Good luck.

Notes