The visible web

From LinuxReviews

The visible web is the part of the World Wide Web you can find using search engines.

It's not the whole world wide web

The visible web is limited by the indexable web, which is the part of the World Wide Web that web spiders are not barred from crawling by rules such as those webmasters can apply in robots.txt.
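As a rough illustration of how a robots.txt rule takes a page out of the indexable web, the sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the example.org URLs, the ExampleBot user agent and the is_indexable helper are all made up for this example; real spiders may interpret rules somewhat differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a webmaster might publish (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def is_indexable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow user_agent to crawl url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines,
    # so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.org/articles/visible-web",
        "https://example.org/private/notes.html",
    ):
        print(url, is_indexable(url))
```

A page under /private/ is off limits to every well-behaved spider here, and the hypothetical ExampleBot is shut out of the whole site, so neither can end up in that spider's index.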

The indexable web is further limited by the resources of search engines (and their spiders) and by censorship applied by search engines. The fact that a search engine is allowed to crawl a site does not mean it will do so. It may censor the content simply because someone working for the search engine doesn't like it, or because a government secretly (like within NATO) or openly (like in China) forces search engines to remove websites from their indexes.

What's left, after crawling restrictions and censorship, is the visible web: the parts of the web you can expect to find using commercial search engines.
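The relationship between these layers can be pictured as plain set subtraction. The following is only a toy model: the page names and the visible_web function are invented for illustration, and real search engines do not expose their indexes this way.

```python
def visible_web(whole_web, blocked_by_robots, not_crawled, censored):
    """Toy model: what remains after each limitation is applied in turn."""
    # The indexable web is everything spiders are not barred from crawling.
    indexable = whole_web - blocked_by_robots
    # Limited resources and censorship shrink it further.
    return indexable - not_crawled - censored


if __name__ == "__main__":
    # Hypothetical pages, named only for this example.
    whole = {"a.html", "b.html", "c.html", "d.html", "e.html"}
    result = visible_web(
        whole_web=whole,
        blocked_by_robots={"a.html"},
        not_crawled={"b.html"},
        censored={"c.html"},
    )
    print(sorted(result))
```

Swapping in a different censored or not_crawled set gives a different result from the same whole_web, which is why two search engines can show different visible webs.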

What's visible depends on where you look

Distributed Free Software search engines like YaCy are not subject to the same censorship as commercial search engines, but they are, as of now, limited in their crawling ability.

This means that YaCy will show you parts of the Deep Web (the whole web, including the invisible parts) that commercial search engines won't show you. It will not show you the whole deep web, only a different (and still limited) visible web.