
08. Email Harvesting with theHarvester

What is theHarvester?

theHarvester is an open-source reconnaissance tool that gathers email addresses, subdomains, IP addresses, and employee names from public sources. It is commonly used in the information-gathering phase of ethical hacking and penetration testing.


Purpose of theHarvester

  • To collect publicly available email addresses, subdomains, and other information from various search engines and services.
  • Useful for OSINT (Open Source Intelligence) in ethical hacking.

Syntax

theHarvester -d <target_domain> -b <data_sources>

Explanation of Parameters

  1. -d (Domain): Specifies the target domain for gathering information.

    • Example: example.com

  2. -b (Data Sources): Defines the sources from which data will be fetched, as a comma-separated list.
     Supported sources include:

    • shodan
    • crtsh
    • hackertarget
    • yahoo
    • duckduckgo
    • Many more…

  3. Other Options:

    • -l: Limits the number of results to be fetched.
    • -f: Outputs the results to a file (e.g., HTML or JSON format).
    • -h: Displays help and usage information.
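
The parameters above can also be assembled from a script. The following is a minimal Python sketch, not part of theHarvester itself: build_harvester_command is a hypothetical helper that only builds the argument list from the -d, -b, -l, and -f options described above, and it assumes the executable is named theHarvester and is on the PATH.

```python
from typing import List, Optional, Sequence


def build_harvester_command(
    domain: str,
    sources: Sequence[str],
    limit: Optional[int] = None,
    output_file: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a theHarvester argument list.

    Hypothetical helper for illustration; it only uses the
    -d, -b, -l, and -f options described in this section.
    """
    if not domain:
        raise ValueError("a target domain is required")
    if not sources:
        raise ValueError("at least one data source is required")

    # -b takes the sources as a single comma-separated value.
    command = ["theHarvester", "-d", domain, "-b", ",".join(sources)]
    if limit is not None:
        command += ["-l", str(limit)]
    if output_file is not None:
        command += ["-f", output_file]
    return command


print(" ".join(build_harvester_command("example.com", ["crtsh", "duckduckgo"], limit=100)))
# prints: theHarvester -d example.com -b crtsh,duckduckgo -l 100
```

Returning a list rather than a single string keeps each argument separate, which avoids shell-quoting problems if the list is later passed to a process launcher.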

Basic Example

Command:

theHarvester -d example.com -b shodan

Description:

  • Collects hosts, IP addresses, and related information from Shodan for the domain example.com.

Advanced Example

Command:

theHarvester -d example.com -b shodan,linkedin,bing -l 100 -f results.html

Description:

  • -d example.com: Targets the domain example.com.
  • -b shodan,linkedin,bing: Uses Shodan, LinkedIn, and Bing as data sources.
  • -l 100: Limits the results to 100 entries per source.
  • -f results.html: Saves the output to an HTML file named results.html.
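
A command like the one above could also be launched from a script. The following is a sketch under stated assumptions, using only Python's standard shutil and subprocess modules: run_command is a hypothetical helper, and the argument list in the usage comment simply mirrors the advanced example. It checks that the executable exists before running it and returns whatever the tool prints to standard output.

```python
import shutil
import subprocess
from typing import List


def run_command(argv: List[str], timeout: float = 300.0) -> str:
    """Run a command and return its standard output as text.

    Hypothetical helper for illustration. Raises FileNotFoundError if the
    executable is not on the PATH, subprocess.CalledProcessError on a
    non-zero exit code, and subprocess.TimeoutExpired on timeout.
    """
    executable = shutil.which(argv[0])
    if executable is None:
        raise FileNotFoundError(f"{argv[0]} was not found on the PATH")

    completed = subprocess.run(
        [executable, *argv[1:]],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        timeout=timeout,
        check=True,           # raise on a non-zero exit code
    )
    return completed.stdout


# Usage (only against a domain you are authorized to test):
# output = run_command([
#     "theHarvester", "-d", "example.com",
#     "-b", "shodan,linkedin,bing", "-l", "100", "-f", "results.html",
# ])
```

The timeout is there because some sources can be slow to respond; the value of 300 seconds is an arbitrary choice for this sketch.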

Common Use Cases

  1. Email Harvesting:

    • Collects email addresses for security testing or authorized phishing simulations.
    • Example:

      theHarvester -d example.com -b bing
      
  2. Subdomain Discovery:

    • Identifies subdomains of a target for further analysis.
    • Example:

      theHarvester -d example.com -b crtsh
      
  3. Employee Name Gathering:

    • Extracts employee names or contact information from sources like LinkedIn.
    • Example:

      theHarvester -d example.com -b linkedin
      

Output

The output typically includes:

  • Emails: user@example.com, admin@example.com
  • Subdomains: sub.example.com
  • IP Addresses: Associated IPs for the domain.
  • Other Information: Depending on the source, additional metadata may be included.

Best Practices

  1. Use Ethically: Always have permission before running theHarvester against a domain.
  2. Combine with Other Tools: Use theHarvester with tools like Nmap or dnsenum for more comprehensive reconnaissance.
  3. Limit Sources: Choose relevant sources to minimize irrelevant data.
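
The first practice can be enforced in a script with a simple scope check before any command is built or run. This is a minimal Python sketch: AUTHORIZED_DOMAINS and is_in_scope are hypothetical names, and the allowlist would come from your written authorization, not from this example.

```python
from typing import Iterable

# Hypothetical allowlist; replace with the domains named in your authorization.
AUTHORIZED_DOMAINS = {"example.com"}


def is_in_scope(domain: str, authorized: Iterable[str] = AUTHORIZED_DOMAINS) -> bool:
    """Return True if domain equals, or is a subdomain of, an authorized domain."""
    # Normalize case, surrounding whitespace, and a trailing root dot.
    candidate = domain.strip().lower().rstrip(".")
    for allowed in authorized:
        allowed = allowed.strip().lower().rstrip(".")
        # Require a dot boundary so "notexample.com" does not match "example.com".
        if candidate == allowed or candidate.endswith("." + allowed):
            return True
    return False


for target in ("example.com", "mail.example.com", "notexample.com"):
    print(target, is_in_scope(target))
```

A script would call is_in_scope on the target and stop if it returns False.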

Common Data Sources for theHarvester

  • Bing: General-purpose search engine; useful for finding indexed email addresses and hosts.
  • Shodan: Identifies exposed services and devices.
  • crtsh: Certificate Transparency logs for subdomains.
  • Yahoo: General-purpose search engine; its results can differ from Bing's.
  • DuckDuckGo: Privacy-focused search engine.

Command to See All Supported Sources

theHarvester -h

Example Output

[-] Emails found:
    admin@example.com
    contact@example.com

[-] Hosts found:
    sub.example.com
    mail.example.com

[-] IPs found:
    192.168.1.1
    192.168.2.2
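
Output in this shape can be turned into structured data for later steps. The following Python sketch assumes the output looks exactly like the example above (a bracketed section header followed by one indented value per line); real output varies between theHarvester versions, so treat the pattern and the parse_harvester_output name as illustrative assumptions.

```python
import re
from typing import Dict, List

# Matches headers such as "[-] Emails found:" from the example above.
SECTION_HEADER = re.compile(r"^\[[-*]\]\s*(Emails|Hosts|IPs) found")

SAMPLE_OUTPUT = """\
[-] Emails found:
    admin@example.com
    contact@example.com

[-] Hosts found:
    sub.example.com
    mail.example.com

[-] IPs found:
    192.168.1.1
    192.168.2.2
"""


def parse_harvester_output(text: str) -> Dict[str, List[str]]:
    """Group the values under each known section header."""
    results: Dict[str, List[str]] = {"emails": [], "hosts": [], "ips": []}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = SECTION_HEADER.match(line)
        if match:
            current = match.group(1).lower()  # "emails", "hosts", or "ips"
            continue
        if line.startswith("["):
            current = None  # unknown section: skip its values
            continue
        if current is not None:
            results[current].append(line)
    return results


print(parse_harvester_output(SAMPLE_OUTPUT))
```

If the tool was run with -f, reading the saved file is likely more robust than parsing console text.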

Important Notes

  • Ensure you comply with legal and ethical guidelines when using theHarvester.
  • Some sources, such as Shodan and LinkedIn, may require API keys or additional authentication before they return data.