The deep web (also called the invisible or hidden web) is the part of the Internet that is not indexed by search engines and does not appear in search results.
For example, the deep web includes private profiles on social networks and other services, e-mail inboxes, internal corporate websites, password-protected documents, and paid content. Meanwhile, online content that does get indexed by search engines is called the surface web (or visible web).
How Web pages can become part of the deep web
Online content can evade search engine indexing in several ways:
- Using a noindex meta tag in a page’s HTML code, which tells search robots not to index that page;
- Adding an exclusion rule to the site’s robots.txt file, which tells search engine crawlers to skip certain site content;
- Using dynamic content generation to show each visitor a different version of a page, such as a personalized recommendations page in an online store;
- Password-protecting content, as is common for private online planners and corporate cloud storage;
- Hosting a website on a domain that requires specialized software for access (for example, a normal browser cannot open .onion sites; users need Tor Browser).
In addition, search engines generally do not index pages that are not linked from any publicly accessible resource, because their crawlers have no way to discover them.
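The first two mechanisms in the list above, the noindex meta tag and robots.txt exclusions, can be checked programmatically. The following is a minimal sketch that uses only the Python standard library. The sample page, the sample robots.txt rules, the example.com URLs, and the helper names has_noindex and allowed_by_robots are all hypothetical and chosen for illustration. It shows how a well-behaved crawler might interpret these signals, not how any particular search engine works.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def allowed_by_robots(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


# Hypothetical sample data, for illustration only.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body>Private page</body></html>"
)
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(has_noindex(SAMPLE_PAGE))
    print(allowed_by_robots(SAMPLE_ROBOTS, "https://example.com/private/report.html"))
    print(allowed_by_robots(SAMPLE_ROBOTS, "https://example.com/index.html"))
```

Both signals are requests rather than access controls: they rely on the crawler choosing to honor them, whereas password protection blocks access outright.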
Deep web, dark web, and darknet
The term deep web is often confused with the terms dark web and darknet. In reality, they are three different, albeit overlapping, concepts.
- A darknet is an overlay network (i.e., it’s built on top of another network) that requires specialized software for access. One example of such software is Tor Browser; SecureDrop, a free software platform for secure communication between journalists and sources who require anonymity, is in turn accessed over Tor. Darknets allow the exchange of information without revealing any personal information, which is why they are popular with criminals.
- The dark web is the content hosted on darknets. It forms part of the deep web.