01. Website Recon and Footprinting
Website recon and footprinting involves gathering as much information as possible about a target website to prepare for further analysis or penetration testing. This guide covers what to look for and the tools and techniques to use, with an explanation of each.
Key Objectives
- IP Addresses: Identify the hosting server and IP range.
- Hidden Directories: Detect restricted or hidden content.
- Names, Emails, Phone Numbers: Extract personal or organizational data.
- Physical Addresses: Locate the entity's geographical presence.
- Web Technologies: Identify technologies, frameworks, and CMS used.
Tools and Techniques
1. Basic DNS and Host Lookup
- Command: host <domain_name>
  - Finds the IP address associated with a domain name.
- Example: host example.com
  - Output includes the IP address and aliases, if any.
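When a domain resolves to several records, it can help to reduce the host output to bare addresses before recording them. The sketch below assumes output lines of the form "<name> has address <ip>"; the function name extract_ipv4 is illustrative and not part of any tool.

```shell
# Print only the IPv4 addresses found in `host` output read on stdin.
# Assumes lines of the form "<name> has address <ip>".
extract_ipv4() {
    awk '/ has address / { print $NF }'
}

# Typical use (needs network access, so it is left commented out):
#   host example.com | extract_ipv4
```

Mail-handler and IPv6 lines are skipped because they do not contain the exact phrase "has address".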
2. Accessing Robots.txt and Sitemap.xml
- Robots.txt:
  - Found at https://example.com/robots.txt
  - Reveals disallowed directories and crawler instructions.
  - Example content:
    User-agent: *
    Disallow: /private/
- Sitemap.xml:
  - Found at https://example.com/sitemap.xml
  - Provides structured information about website URLs.
  - Useful for discovering less obvious pages or sections.
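Both files are plain text or XML, so the interesting parts can be pulled out with standard tools. The following is a minimal sketch, not a full parser: list_disallowed and list_sitemap_urls are hypothetical helper names, and the sitemap helper assumes URLs sit in plain <loc> elements with no attributes.

```shell
# Print the path of every Disallow rule in a robots.txt read on stdin.
list_disallowed() {
    # Strip carriage returns, match the field name case-insensitively,
    # and keep only non-empty values after the colon.
    tr -d '\r' | awk -F': *' 'tolower($1) == "disallow" && $2 != "" { print $2 }'
}

# Print every <loc> URL from a sitemap.xml read on stdin.
list_sitemap_urls() {
    # Put each tag on its own line, then keep the text that follows "<loc>".
    tr '<' '\n' | sed -n 's/^loc>//p'
}

# Typical use (needs network access, so it is left commented out):
#   curl -s https://example.com/robots.txt  | list_disallowed
#   curl -s https://example.com/sitemap.xml | list_sitemap_urls
```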
3. Browser Plugins
- BuiltWith: Analyzes the website's backend technologies.
- Wappalyzer: Detects CMS, web servers, frameworks, and plugins in use.
4. WhatWeb
- Command: whatweb <domain_name>
  - Identifies web technologies, such as server software, frameworks, or CMS.
- Example: whatweb example.com
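As a rough cross-check of what WhatWeb reports, a server often names its own software in its response headers. This is a minimal sketch under two assumptions: the headers come from curl -sI, and the site sends Server or X-Powered-By at all (many sites strip them). The helper name tech_headers is hypothetical.

```shell
# Print the response headers that most often name the server software.
# Reads raw HTTP response headers on stdin.
tech_headers() {
    tr -d '\r' | grep -i -E '^(server|x-powered-by):'
}

# Typical use (needs network access, so it is left commented out):
#   curl -sI https://example.com | tech_headers
```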
5. WebHTTrack
- Description:
  - A website copier tool to download and browse websites offline.
- Usage:
  - Install the tool, then run it against the target site.
  - Downloads the site's content to analyze for hidden files or directories.
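The original install and run commands are not preserved here, so the commented lines below are assumptions: a Debian-based system where the package is named webhttrack, and the httrack command-line client with its -O output option. Once a mirror exists, a simple search can surface file names that often indicate leftover or sensitive content. The extension list and the helper name find_interesting_files are illustrative only.

```shell
# Assumed install on a Debian-based system (not run here):
#   sudo apt install webhttrack
# Assumed mirror command using the httrack command-line client:
#   httrack "https://example.com" -O ./mirror

# List mirrored files whose names often hint at backups or data dumps.
# $1 is the directory that holds the downloaded copy.
find_interesting_files() {
    find "$1" -type f \( -name '*.bak' -o -name '*.old' \
        -o -name '*.sql' -o -name '*.zip' \) | sort
}

# Typical use after mirroring:
#   find_interesting_files ./mirror
```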
6. Checking for security.txt
A security.txt file is a standardized method for website owners to publicly share security contact information.
It is similar to robots.txt but specifically for vulnerability reporting.
Purpose
Helps ethical hackers and security researchers find out how to report security issues responsibly.
Location Requirements
A valid security.txt file must be hosted at:
- https://example.com/.well-known/security.txt (preferred and standardized path), or
- https://example.com/security.txt (legacy fallback location)
Format Requirements
- Must be served over HTTPS
- Must be in plaintext
- Must be easily readable by both humans and automated tools
Common Fields Inside security.txt
Example:
Contact: mailto:security@example.com
Expires: 2025-12-31T23:59:59z
Encryption: https://example.com/pgp-key.txt
Acknowledgments: https://example.com/security-hall-of-fame/
Preferred-Languages: en, fr
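Because each line is a "Name: value" pair, single fields are easy to extract for the notes. The sketch below assumes one field per line and matches field names case-insensitively; security_txt_field is a hypothetical helper, not part of any standard tool.

```shell
# Print the value(s) of one security.txt field from text read on stdin.
# $1 is the field name, for example Contact or Expires.
security_txt_field() {
    tr -d '\r' | awk -v name="$1" '
        BEGIN { prefix = tolower(name) ":" }
        tolower(substr($0, 1, length(prefix))) == prefix {
            value = substr($0, length(prefix) + 1)
            sub(/^[ \t]+/, "", value)   # drop spaces after the colon
            print value
        }'
}

# Typical use (needs network access, so it is left commented out):
#   curl -s https://example.com/.well-known/security.txt | security_txt_field Contact
```

A field that appears more than once, such as Contact, is printed once per occurrence.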
Why It Matters in Recon
- Reveals security contact emails
- May expose bug bounty programs or responsible disclosure policies
- Could leak data handling practices or organizational structure
Real-World Adoption
Major companies using security.txt include:
- Google
- GitHub
- LinkedIn
- Facebook
How to Check
Use direct URL access in a browser or curl:
curl -I https://example.com/.well-known/security.txt
curl https://example.com/.well-known/security.txt
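Checking both locations by hand gets repetitive across many targets, so the two requests can be wrapped in a small loop. This is a sketch under stated assumptions: the helper names are hypothetical, only an HTTP 200 status counts as a hit, and redirects are not followed.

```shell
# Print the two locations where a security.txt may live, preferred first.
security_txt_candidates() {
    printf 'https://%s/.well-known/security.txt\n' "$1"
    printf 'https://%s/security.txt\n' "$1"
}

# Print the first candidate URL that answers with HTTP 200.
# Needs network access; returns 1 when neither location responds with 200.
find_security_txt() {
    for url in $(security_txt_candidates "$1"); do
        status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
        if [ "$status" = "200" ]; then
            printf '%s\n' "$url"
            return 0
        fi
    done
    return 1
}

# Typical use:
#   find_security_txt example.com
```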
Note-Taking
While using the above commands and tools, maintain proper documentation:
- Command Executed:
  - Mention the tool and the syntax used.
- Output Summary:
  - Record useful results like IPs, detected technologies, or URLs.
- Interpretation:
  - Describe what the output indicates and how it can be leveraged.
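One way to keep these three items together is to append them to a single notes file as each command is run. The sketch below is only one possible layout: log_note and the NOTES_FILE variable are hypothetical names, and the default file name recon-notes.txt is an assumption.

```shell
# Append one note entry (command, output summary, interpretation) to a file.
# NOTES_FILE selects the file; it defaults to recon-notes.txt.
log_note() {
    {
        printf 'Command: %s\n' "$1"
        printf 'Output: %s\n' "$2"
        printf 'Interpretation: %s\n\n' "$3"
    } >> "${NOTES_FILE:-recon-notes.txt}"
}

# Typical use:
#   log_note "host example.com" \
#            "example.com has address 93.184.216.34" \
#            "This is the IP address of the hosting server."
```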
Example Notes
Host Lookup
Command: host example.com
Output: example.com has address 93.184.216.34
Interpretation: This is the IP address of the hosting server.
Robots.txt
URL: https://example.com/robots.txt
Content:
User-agent: *
Disallow: /private/
Interpretation: The directory `/private/` is restricted for crawlers and may contain sensitive data.
WhatWeb
Command: whatweb example.com
Output:
WordPress [5.7.2], Apache [2.4.41], PHP [7.4.3]
Interpretation: The website uses WordPress CMS, Apache server, and PHP.
Security.txt
URL Checked:
https://example.com/.well-known/security.txt
Findings:
- Contact email: security@example.com
- Public PGP key available
- Bug bounty acknowledgment page
Interpretation:
The organization actively supports responsible disclosure,
which may indicate a mature security posture.