Gathering information is the first step in identifying security vulnerabilities and analyzing risks. To collect data, security professionals use advanced, specialized search engines. This article compiles ten of the search engines most used by pentesters and bug bounty hunters.
1. Shodan
Shodan (the name comes from SHODAN, the Sentient Hyper-Optimized Data Access Network in the System Shock games) is a search engine that indexes internet-connected assets. It collects information about systems and devices connected to the Internet, from baby monitors to traffic signal lights, and scans for vulnerabilities.
This gathered data can be used for various purposes, including network security monitoring, identifying cyber risks, conducting market research, and measuring the popularity of IoT devices.
To get the most out of Shodan, it is essential to use the right query syntax and keywords to pinpoint the information being sought. Some of the most common search parameters are hostname:, os:, port:, ip_str:, country:, city:, geo:, net:, before:/after:, org:, product:, version:, data:, and vuln:.
Search parameters can be combined to narrow or broaden the results and obtain more specific matches.
In one example search, the term "default password" is combined with country:"US"; the search lists 175 assets that are located in the US and use default passwords. Opening the first of these listed assets shows its open ports.
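To illustrate how parameters combine, here is a minimal Python sketch that assembles a query string from free-text terms and filter:value pairs. The helper name `build_shodan_query` and its quoting rule (quote anything containing a space) are assumptions made for this example, not part of Shodan itself.

```python
def build_shodan_query(*terms: str, **filters: object) -> str:
    """Join free-text terms and filter:value pairs into one query string.

    Hypothetical helper for illustration; it only builds text and
    never contacts Shodan.
    """
    parts = []
    for term in terms:
        # Quote multi-word phrases so they stay together as one term.
        parts.append(f'"{term}"' if " " in term else term)
    for name, value in filters.items():
        text = str(value)
        # Quote filter values that contain spaces, e.g. org:"Example Corp".
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{name}:{text}")
    return " ".join(parts)


query = build_shodan_query("default password", country="US")
print(query)  # "default password" country:US
```

The resulting string can be pasted into the Shodan search box. With the official `shodan` Python package, it could presumably be passed to `shodan.Shodan(api_key).search(query)`, which requires a valid API key.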
2. IntelligenceX
IntelligenceX (IntelX) is a search engine that also serves as a data archive.
The site can search the following selector types:
- Email address
- IP address and CIDR range (both IPv4 and IPv6)
- Phone number
- Bitcoin address
- Ethereum address
- MAC address
- IPFS Hash
- Credit Card Number
- Social Security Number
- IBAN (International Bank Account Number).
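As a rough illustration of what a "selector type" means, the Python sketch below guesses the type of a search term before it is submitted. The function name `guess_selector_type` and the patterns are assumptions for this example; they are simple heuristics covering only a few of the types above and do not reflect how IntelligenceX itself classifies input.

```python
import ipaddress
import re

# Deliberately loose patterns; good enough for a first guess only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MAC_RE = re.compile(r"^(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}$")
ETHEREUM_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")


def guess_selector_type(term: str) -> str:
    """Return 'email', 'ethereum', 'mac', 'cidr', 'ip', or 'unknown'."""
    term = term.strip()
    if EMAIL_RE.match(term):
        return "email"
    if ETHEREUM_RE.match(term):
        return "ethereum"
    if MAC_RE.match(term):
        return "mac"
    try:
        if "/" in term:
            # strict=False accepts ranges written with host bits set.
            ipaddress.ip_network(term, strict=False)
            return "cidr"
        ipaddress.ip_address(term)  # handles both IPv4 and IPv6
        return "ip"
    except ValueError:
        return "unknown"


for sample in ["alice@example.com", "192.0.2.0/24", "2001:db8::1"]:
    print(sample, "->", guess_selector_type(sample))
```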
The selectors listed above can be searched within buckets such as paste sites, the darknet (Tor and I2P), Wikileaks and Cryptome, North Korean and Russian government sites, data leaks, Whois data, and the public web.
Furthermore, IntelligenceX maintains a historical data repository of the results.
The IntelX interface makes it easy to restrict a search to specific categories or data sources. Results can also be sorted by ‘most relevant,’ ‘oldest,’ or ‘newest.’
3. PublicWWW
PublicWWW is a source code search engine. It indexes the content of over 500 million websites, searches for an alphanumeric snippet, signature, or keyword within the HTML, JS, and CSS code of web pages, and lets users download a list of the websites that contain it.
Here are some query syntax examples for searching in PublicWWW:
- site: TLD operator; Get results from the specific top-level domain (site:edu bootstrap)
- filetype:css and filetype:js; Search in CSS and JS files (“stackoverflow.com/questions” filetype:css)
- ip: IP address; Search by IP address or class C subnet (ip:160.153.136.* bootstrap)
- Phrase search operator; put a word or phrase in quotes (“math.min.js”)
- Combine phrases; Combining multiple phrases or keywords allows for more specific searches (“<html lang=\”fr\”>” bootstrap)
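The idea behind a source code search, including combined phrases, can be sketched locally in Python. The example below is only an illustration on made-up pages: the site names, the page sources, and the function `sites_containing` are all hypothetical, and PublicWWW's real index works at a very different scale.

```python
def sites_containing(pages: dict, *snippets: str) -> list:
    """Return the sites whose source contains every given snippet.

    `pages` maps a site name to its raw HTML/JS/CSS source.
    """
    return sorted(
        site
        for site, source in pages.items()
        if all(snippet in source for snippet in snippets)
    )


# Made-up page sources, for demonstration only.
sample_pages = {
    "a.example": '<script src="/js/math.min.js"></script>',
    "b.example": '<link rel="stylesheet" href="bootstrap.css">',
    "c.example": '<html lang="fr"><script src="math.min.js"></script>',
}

print(sites_containing(sample_pages, "math.min.js"))
# ['a.example', 'c.example']
print(sites_containing(sample_pages, '<html lang="fr">', "math.min.js"))
# ['c.example']
```

Passing several snippets mirrors the "combine phrases" case: each extra phrase can only shrink the result set.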
4. BinaryEdge
BinaryEdge is a cybersecurity and data science company that collects, analyzes, and categorizes data from the internet. It analyzes data from across the web using a custom-built platform that combines machine learning and cybersecurity techniques. Based on this analysis, it maps an organization’s attack surface in detail and provides a threat intelligence service.
The API provides access to the scanning platform and databases for querying and analyzing BinaryEdge’s globally collected (constantly updated) data.
Some of the query fields BinaryEdge supports, with their value types, are listed below:
- as_name: (string)
- asn: (int)
- country: (string)
- created_at: (date)
- IP: (string)
- IPv4: (boolean)
- GeoIP.city_name: (string)
- has_screenshot: (boolean)
- port: (int)
- protocol: (string)
- rdns: (string)
- rdns_parent: (string)
- type: (string)
- tag: (string)
- device: (string)
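Because each field above carries a value type, a query can be checked before it is sent. The Python sketch below validates a handful of the listed fields and renders them as field:value terms. The function name `build_binaryedge_query`, the space-separated output, and the lowercase true/false rendering of booleans are assumptions for illustration, not confirmed BinaryEdge behavior.

```python
# Subset of the fields listed above, mapped to the Python type expected.
FIELD_TYPES = {
    "as_name": str,
    "asn": int,
    "country": str,
    "has_screenshot": bool,
    "port": int,
    "protocol": str,
    "tag": str,
}


def build_binaryedge_query(**fields: object) -> str:
    """Validate field types and join them into one query string."""
    parts = []
    for name, value in fields.items():
        expected = FIELD_TYPES.get(name)
        if expected is None:
            raise KeyError(f"unknown field: {name}")
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or wrong_bool:
            raise TypeError(f"{name} expects {expected.__name__}")
        if expected is bool:
            text = "true" if value else "false"  # assumed rendering
        elif expected is str and " " in value:
            text = f'"{value}"'
        else:
            text = str(value)
        parts.append(f"{name}:{text}")
    return " ".join(parts)


print(build_binaryedge_query(country="US", port=22, has_screenshot=True))
# country:US port:22 has_screenshot:true
```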
5. SOCRadar
SOCRadar is a cloud-based, artificial intelligence-powered Digital Risk Protection Platform with cyber threat intelligence capabilities. The SOCRadar platform is a cybersecurity early warning system that combines Cyber Threat Intelligence, Digital Risk Protection, and External Attack Surface Management into a single solution.
SOCRadar collects data that could lead to a cyberattack from thousands of sources, including the Deep Web, Dark Web, Black Market, Bot Market, PasteBin Sites, Github, and social media. It analyzes this data using artificial intelligence and big data technologies and converts it into intelligence.
ThreatFusion provides a search environment where all critical threat information can be securely accessed using the SOCRadar Threat Hunting module. With its advanced search capabilities, the Threat Hunting module allows users to access deep web data on a common threat-sharing platform.
Threat Hunting is a valuable tool for obtaining trends about threat actors, such as APT groups and their tactics, techniques, and procedures (TTPs), that can be used with the MITRE ATT&CK framework.
Through integrations with SOAR, firewall, SIEM, and EDR products, ThreatHose can provide compatible protection.
6. Google Dorks
Google is a well-known and widely used search engine that most people rely on for everyday lookups. It also supports advanced queries that give easier and faster access to the most accurate results. Google Dorking refers to these advanced Google search techniques.
Google Dorking assists in the discovery of not only difficult-to-find information but also sensitive but poorly protected data. Exploitable vulnerabilities, usernames and passwords, corporate mailing lists, financial information, and more confidential data and documents are available in public sources but cannot be found with a simple search.
Some of the best-known operators for Google Dorking are filetype (filetype:log), intext, ext, inurl, intitle, site, cache, and allintext. Search parameters can be combined to narrow down the results (allintext:username filetype:log).
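A combined dork is just a query string, so it can be turned into a search URL with the Python standard library. The sketch below is a minimal example; the helper name `dork_url` is made up for this article, and it only builds the URL without sending any request.

```python
from urllib.parse import urlencode


def dork_url(*operators: str) -> str:
    """Join dork operators with spaces and build a Google search URL."""
    query = " ".join(operators)
    # urlencode percent-encodes ':' and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(dork_url("allintext:username", "filetype:log"))
# https://www.google.com/search?q=allintext%3Ausername+filetype%3Alog
```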
7. ONYPHE
ONYPHE is a cyber defense search engine that scans various internet sources to provide open-source and cyber threat intelligence data. It actively scans the internet for connected devices and correlates the scanned data with information gathered from website URLs. The data is then made available through an API and its query language.
In the ONYPHE database, users can search by category. These categories include geolocation data, IP addresses, inetnum information (netnames, subnets, and descriptions of netblocks), active internet scanning, passive DNS, threat list lookups, and paste site lookups.
Some query syntax examples for ONYPHE are below:
- category:vulnscan ?cve:CVE-2018-13379 ?tag:fortigate2018
- category:datascan cve:"CVE-1999-0017"
- category:datascan device.class:"3D Printer"
- category:datascan tag:"admin"
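To make the structure of these queries explicit, the Python sketch below splits a query into (prefix, field, value) triples. It is a hypothetical helper written for this article, not ONYPHE's parser. It keeps the `?` prefix as-is; ONYPHE's query language is understood to use it for OR-style matching, but that reading is an assumption here and should be checked against the ONYPHE documentation.

```python
import re

# Optional '?' prefix, a dotted field name, then a quoted or bare value.
TOKEN_RE = re.compile(r'(?P<prefix>\??)(?P<field>[\w.]+):(?P<value>"[^"]*"|\S+)')


def parse_filters(query: str) -> list:
    """Split a query into (prefix, field, value) tuples."""
    filters = []
    for match in TOKEN_RE.finditer(query):
        value = match["value"].strip('"')  # drop surrounding quotes
        filters.append((match["prefix"], match["field"], value))
    return filters


for item in parse_filters('category:datascan device.class:"3D Printer"'):
    print(item)
```

Quoted values such as "3D Printer" stay together as one value, while unquoted values end at the next space.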
8. grep.app
grep.app is a search engine for code on GitHub. GitHub is the most popular code host and is home to some of the most critical open-source projects. grep.app searches code across more than 500,000 public GitHub repositories.
grep.app searches for the exact typed string, including any punctuation or other characters. Users can also specify the programming language they require. The RE2 syntax can also be used to search for regular expressions.
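The difference between an exact-string search and a regular expression search can be shown locally. The sketch below uses Python's `re` module rather than RE2; the two engines differ (RE2 has no backreferences or lookaround), but the simple patterns here should behave the same in both. The sample lines and function names are made up for this example.

```python
import re

# Made-up lines of code to search through.
lines = [
    "load('config.yaml')",
    "load('config-yaml')",
    "passw0rd = 'hunter2'",
]


def literal_search(needle: str, lines: list) -> list:
    """Exact-string search: every character, including '.', is literal."""
    pattern = re.compile(re.escape(needle))
    return [line for line in lines if pattern.search(line)]


def regex_search(expression: str, lines: list) -> list:
    """Regular expression search: '.' matches any single character."""
    pattern = re.compile(expression)
    return [line for line in lines if pattern.search(line)]


print(literal_search("config.yaml", lines))  # only the first line
print(regex_search("config.yaml", lines))    # first and second lines
print(regex_search(r"passw[o0]rd", lines))   # only the third line
```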
9. LeakIX
LeakIX is a leak search engine for security and research professionals. LeakIX collects data on the most common security misconfigurations from the Internet.
The platform’s primary goal is to provide insight into compromised devices and servers. Furthermore, LeakIX alerts its users about actively monitored ransomware campaigns. It also provides information on indexed leaks from various network operators. LeakIX indexes potentially ‘leaked’ or ‘compromised’ company data.
Some examples of LeakIX query syntax are below:
- ip (ip:188.8.131.52 or range ip:184.108.40.206/16)
- port (port:443)
- dataset.rows (dataset.rows:>100)
- dataset.size (dataset.size:>1024)
- dataset.infected (dataset.infected)
- tags (+tags:printer)
- plugin (plugin:NucleiPlugin)
- leak_count (leak_count:>3)
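To show what an ip filter with a range selects, the Python sketch below evaluates such a filter locally with the standard `ipaddress` module. The function name `ip_matches` is hypothetical, and this is only a local model of the idea, not LeakIX's implementation. Since the range in the list above is written with host bits set, `strict=False` is used so that it is read as the enclosing /16 network.

```python
import ipaddress


def ip_matches(ip_filter: str, candidate: str) -> bool:
    """Check whether `candidate` falls inside an address or CIDR filter."""
    # A bare address becomes a single-host network (/32 for IPv4).
    network = ipaddress.ip_network(ip_filter, strict=False)
    return ipaddress.ip_address(candidate) in network


print(ip_matches("184.108.40.206/16", "184.108.1.1"))   # True
print(ip_matches("184.108.40.206/16", "184.109.0.1"))   # False
print(ip_matches("188.8.131.52", "188.8.131.52"))       # True
```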
LeakIX aims to provide a platform for fixing misconfigurations that lead to leaks and security issues by bridging the source, CERTs, hosting companies, and researchers.
10. Vulners
Vulners is a security database that contains descriptions of a large number of software vulnerabilities. It brings together more than 100 data sources, and its cross-referenced bulletins and constantly updated database keep users up to date on the latest security threats.
Vulners collects vendor security bulletins, lists of vulnerabilities discovered by researchers, the content of vulnerability and exploit databases, posts on hacking forums, and vulnerability scanner detection rules. It displays the connections between these entities with an efficient search interface.
Vulners uses Lucene-based queries. Here are some examples of search snippets:
- type:openbugbounty AND title:"your-domain-here.com" AND openbugbounty.patchStatus:unpatched
- affectedSoftware.name:nginx AND affectedSoftware.version:"1.11.0"
- type:openwrt AND cvelist:CVE-2016-0799
- h1team.handle:yahoo order:bounty
- bulletinFamily:ioc AND tags:* AND -iocType:ip
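Queries like the AND-joined examples above can be composed programmatically. The Python sketch below builds such a query from field/value pairs; the helper name `lucene_and` is an assumption for this example. It always quotes values, which Lucene treats as an exact phrase, so it is not suitable for wildcard terms such as tags:*.

```python
def lucene_and(terms: dict) -> str:
    """Join field/value pairs into a Lucene-style query with AND."""
    clauses = []
    for field, value in terms.items():
        # Escape backslashes and double quotes inside the quoted phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"')
    return " AND ".join(clauses)


query = lucene_and({
    "affectedSoftware.name": "nginx",
    "affectedSoftware.version": "1.11.0",
})
print(query)
# affectedSoftware.name:"nginx" AND affectedSoftware.version:"1.11.0"
```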