theHarvester – Open Source Intelligence (OSINT)

Here we examine how theHarvester can be used to carry out OSINT (Open Source INTelligence) in order to gather emails, names, subdomains, IP addresses and URLs.

In a previous article, Network Scanning Tools (part 2), we briefly covered theHarvester, which is described as:

… a very simple to use, yet powerful and effective tool designed to be used in the early stages of a penetration test or red team engagement. Use it for open source intelligence (OSINT) gathering to help determine a company’s external threat landscape on the internet. The tool gathers emails, names, subdomains, IPs and URLs using multiple public data sources …

theHarvester offers both passive and active functionality.

Passive

This includes using the following modules:

  • baidu – dominant search engine in China.
  • bing – Microsoft’s search engine.
  • bingapi* – Microsoft’s search engine via the API.
  • bufferoverun – uses data from Rapid7’s Project Sonar.
  • censys* – uses certificate searches to enumerate subdomains and gather emails from the Censys search engine.
  • certspotter – monitors Certificate Transparency logs.
  • crtsh – Comodo Certificate search.
  • dnsdumpster – DNSdumpster search engine.
  • duckduckgo – DuckDuckGo search engine.
  • exalead – Exalead search engine.
  • github-code* – GitHub code search engine.
  • google – Google search engine via Google Dorks.
  • hackertarget – online vulnerability scanners and network intelligence to help organizations.
  • hunter* – Hunter search engine.
  • intelx* – Intelligence X search engine.
  • linkedin – Google search engine, specific search for LinkedIn users.
  • linkedin_links – specific search for LinkedIn users for target domain.
  • netcraft – Internet Security and Data Mining.
  • omnisint – Project Crobat, A Centralised Searchable Open Source Project Sonar DNS Database.
  • otx – AlienVault Open Threat Exchange.
  • pentesttools* – Powerful Penetration Testing Tools.
  • projectdiscovery* – actively collects and maintains internet-wide assets’ data.
  • qwant – Qwant search engine.
  • rapiddns – DNS query tool that makes it easy to query subdomains or sites sharing the same IP.
  • rocketreach* – Access real-time verified personal/professional emails, phone numbers, social media links.
  • securityTrails* – Security Trails search engine, the world’s largest repository of historical DNS data.
  • shodan* – Shodan search engine, will search for ports and banners from discovered hosts.
  • spyse* – find Internet assets by digital fingerprints.
  • sublist3r – Fast subdomains enumeration tool for penetration testers.
  • threatcrowd – a Search Engine for Threats.
  • threatminer – Data mining for Threat Intelligence.
  • trello – Search trello boards using Google search.
  • twitter – Twitter accounts related to a specific domain using Google search.
  • urlscan – URL and website scanner.
  • virustotal – VirusTotal domain search.
  • yahoo – Yahoo search engine.

* These modules require API keys to be configured.

Active

This includes the following functions:

  • DNS brute force – dictionary brute force enumeration.
  • Screenshots – take screenshots of subdomains that were found.

Tool Usage

The synopsis or syntax of theHarvester is as follows:

theHarvester [-h] -d DOMAIN [-l LIMIT] [-S START] [-g] [-p] [-s] [--screenshot SCREENSHOT] [-v] [-e DNS_SERVER] 
[-t DNS_TLD] [-r] [-n] [-c] [-f FILENAME] [-b SOURCE]

Below are the main command line settings:

  • -h
    • show help message.
  • -d DOMAIN
    • company name or domain to search.
  • -l LIMIT
    • limit the number of search results, default=500.
  • -S START
    • start with result number X, default=0.
  • -g
    • use Google Dorks for Google search.
  • -p
    • use proxies for requests, enter proxies in ‘proxies.yaml’.
  • -s
    • use Shodan to query discovered hosts.
  • --screenshot SCREENSHOT
    • take screenshots of resolved domains & specify screenshot output directory.
  • -v
    • verify host name via DNS resolution and search for virtual hosts.
  • -e DNS_SERVER
    • DNS server to use for lookup.
  • -t DNS_TLD
    • Perform a DNS TLD expansion discovery, default False.
  • -r
    • check for takeovers.
  • -n
    • enable DNS server lookup, default False.
  • -c
    • perform a DNS brute force on the domain.
  • -f FILENAME
    • save the results to an HTML and/or XML file.
  • -b SOURCE
    • the source to be searched is one of the following:
      • baidu, bing, bingapi, bufferoverun, censys, certspotter, crtsh, dnsdumpster, duckduckgo, exalead, github-code, google, hackertarget, hunter, intelx, linkedin, linkedin_links, netcraft, omnisint, otx, pentesttools, projectdiscovery, qwant, rapiddns, rocketreach, securityTrails, spyse, sublist3r, threatcrowd, threatminer, trello, twitter, urlscan, virustotal, yahoo
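
To illustrate how these options combine, here is a minimal sketch of a single invocation that searches Bing, adds a DNS brute force, uses a specific DNS server and saves the results. The domain example.com, the DNS server 8.8.8.8 and the output name example_report are placeholders chosen for illustration, not values from this article's test environment.

```shell
# Placeholder target (assumption): replace with a domain you are authorised to assess.
domain="example.com"

# -b bing           : search using the bing module
# -c                : also perform a DNS brute force on the domain
# -e 8.8.8.8        : DNS server to use for lookups (placeholder)
# -f example_report : save the results under this name (placeholder)
cmd="theHarvester -d $domain -b bing -c -e 8.8.8.8 -f example_report"
echo "Command: $cmd"

# Run only when theHarvester is on the PATH; $cmd is left unquoted on
# purpose so the shell splits it into separate arguments.
if command -v theHarvester >/dev/null 2>&1; then
    $cmd
fi
```

Note that -c makes this an active scan, so it should only be pointed at domains you have permission to test.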

Test Environment

For the purposes of this article we will be using theHarvester v3.2.4 installed on a Kali Linux 2021.2 virtual machine (IP = 10.0.2.15) running within VirtualBox 6.1.

Demonstrations

1) Search Google for a domain & limit results to 10
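
The original output for this demonstration is not reproduced here. Going by the options listed under Tool Usage, a command matching this description might look like the following sketch, where example.com is a placeholder domain rather than the one the author used.

```shell
# Placeholder target (assumption): replace with a domain you are authorised to assess.
domain="example.com"

# -l 10     : limit the search to 10 results
# -b google : use the google module as the source
cmd="theHarvester -d $domain -l 10 -b google"
echo "Command: $cmd"

# Run only when theHarvester is on the PATH; $cmd is left unquoted on
# purpose so the shell splits it into separate arguments.
if command -v theHarvester >/dev/null 2>&1; then
    $cmd
fi
```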

2) Search Google using Google Dorks for a domain & limit results to 10
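
The original output for this demonstration is not reproduced here. As a sketch, adding the -g option to a Google search should enable Google Dorks; example.com is again a placeholder domain, not the author's target.

```shell
# Placeholder target (assumption): replace with a domain you are authorised to assess.
domain="example.com"

# -l 10     : limit the search to 10 results
# -g        : use Google Dorks for the Google search
# -b google : use the google module as the source
cmd="theHarvester -d $domain -l 10 -g -b google"
echo "Command: $cmd"

# Run only when theHarvester is on the PATH; $cmd is left unquoted on
# purpose so the shell splits it into separate arguments.
if command -v theHarvester >/dev/null 2>&1; then
    $cmd
fi
```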

3) Search LinkedIn for a domain & limit results to 20 starting at #5
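
The original output for this demonstration is not reproduced here. A command along these lines, assuming the placeholder domain example.com, would combine the -l and -S options with the linkedin module.

```shell
# Placeholder target (assumption): replace with a domain you are authorised to assess.
domain="example.com"

# -l 20       : limit the search to 20 results
# -S 5        : start with result number 5
# -b linkedin : use the linkedin module as the source
cmd="theHarvester -d $domain -l 20 -S 5 -b linkedin"
echo "Command: $cmd"

# Run only when theHarvester is on the PATH; $cmd is left unquoted on
# purpose so the shell splits it into separate arguments.
if command -v theHarvester >/dev/null 2>&1; then
    $cmd
fi
```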

4) Search OTX for a domain & limit results to 15
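
The original output for this demonstration is not reproduced here. A sketch of a matching command, with example.com standing in as a placeholder domain, selects the otx module (AlienVault Open Threat Exchange) and caps the results at 15.

```shell
# Placeholder target (assumption): replace with a domain you are authorised to assess.
domain="example.com"

# -l 15  : limit the search to 15 results
# -b otx : use the otx module as the source
cmd="theHarvester -d $domain -l 15 -b otx"
echo "Command: $cmd"

# Run only when theHarvester is on the PATH; $cmd is left unquoted on
# purpose so the shell splits it into separate arguments.
if command -v theHarvester >/dev/null 2>&1; then
    $cmd
fi
```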

5) Run URL & website scan for a domain & perform takeover check
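
The original output for this demonstration is not reproduced here. Assuming the urlscan module is the "URL & website scan" meant in the heading, a command of roughly this shape would pair it with the -r takeover check; example.com is a placeholder domain.

```shell
# Placeholder target (assumption): replace with a domain you are authorised to assess.
domain="example.com"

# -b urlscan : use the urlscan module as the source
# -r         : check for takeovers
cmd="theHarvester -d $domain -b urlscan -r"
echo "Command: $cmd"

# Run only when theHarvester is on the PATH; $cmd is left unquoted on
# purpose so the shell splits it into separate arguments.
if command -v theHarvester >/dev/null 2>&1; then
    $cmd
fi
```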

Do you have experience of using theHarvester? If so please share how useful you found the tool in the comments below.