GitHub Dorking
GitHub dorking is the practice of using GitHub’s search and its advanced query operators to locate specific files, patterns, and exposed content across public repositories. Security researchers and maintainers use it for reconnaissance and discovery, both defensive and offensive. The technique is dual-use: it can reveal accidentally leaked data, and attackers can abuse it. Use it only for authorized, ethical testing and defensive discovery. (InfoSec Write-ups)
Table of contents
- What is GitHub dorking?
- Why it matters / risks
- Legal & ethical rules (short)
- Common GitHub search operators (cheat sheet)
- Safe usage example (web UI + gh CLI)
- Remediation & mitigation (what to do when you find something)
- Further reading & resources
What is GitHub dorking?
GitHub dorking means crafting targeted search queries (sometimes called “dorks”) with GitHub’s advanced search operators to find files, configuration snippets, keys, endpoints, or patterns of interest across repositories. Security researchers use it to discover exposed assets or misconfigurations; attackers may use it to find secrets or vulnerabilities. (InfoSec Write-ups)
Why it matters / risks
- Public repositories can unintentionally contain API keys, credentials, private configuration, or insecure code patterns. Large studies and industry reports show a significant volume of accidentally committed secrets, and attackers may scan public code for them. (arXiv)
- Finding secrets in public repos can lead to account compromise, data breaches, or supply-chain attacks. That’s why platforms and vendors provide secret scanning and other protections. (GitHub Docs)
Legal & ethical rules (short)
- Only search public repositories unless you have explicit authorization to test private repositories.
- If you discover active credentials or secrets, do not use them. Follow responsible disclosure: notify repository owners and relevant providers, and recommend revocation/rotation. Many providers and GitHub offer formal reporting or removal processes. (GitHub Docs)
Common GitHub search operators (cheat sheet)
Use these in the GitHub search box, in the Code Search endpoint (cs.github.com), or with the gh CLI:
- `org:ORG`: restrict to an organization.
- `user:USERNAME`: restrict to a user.
- `repo:OWNER/REPO`: restrict to a single repository.
- `filename:NAME`: search by file name.
- `path:PATH`: match file path.
- `in:file` / `in:path` / `in:readme`: choose where to match terms.
- `language:LANG` or `extension:ext`: filter by language or extension.
- `created:YYYY-MM-DD` / `pushed:YYYY-MM-DD`: filter repositories by creation or last-push date.
- `stars:>N`: filter repositories by stars.
- Boolean operators: `AND`, `OR`, `NOT`, and parentheses for grouping.
These operators are part of GitHub’s code search and advanced search features. Not every qualifier applies to every search type, so check the documentation for the one you are using. (GitHub)
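As a minimal sketch of how the qualifiers above combine, the following Python snippet joins qualifier/value pairs into a single query string and builds a browser URL for it. The helper names `build_query` and `search_url` are hypothetical, and the `type=code` URL parameter is an assumption about the current github.com search page; nothing here contacts GitHub.

```python
from urllib.parse import urlencode


def build_query(terms=(), **qualifiers):
    """Join free-text terms and qualifier:value pairs into one search string."""
    parts = list(terms)
    for name, value in qualifiers.items():
        # Quote values containing spaces so they stay attached to the qualifier.
        text = f'"{value}"' if " " in str(value) else str(value)
        parts.append(f"{name}:{text}")
    return " ".join(parts)


def search_url(query, search_type="code"):
    """Build a github.com search URL (to open in a browser; nothing is fetched)."""
    # Assumption: the search page accepts `q` and `type` query parameters.
    return "https://github.com/search?" + urlencode({"q": query, "type": search_type})


query = build_query(org="octocat", filename="Dockerfile")
print(query)
print(search_url(query))
```

Keeping query construction in one small function makes it easier to review exactly what is being searched for before anything is run.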
Safe usage example
Goal (safe / non-sensitive): list Dockerfiles inside an organization to review container configurations.
Web UI (quick)
- Open https://github.com and sign in.
- In the top search box, enter a query such as `org:octocat filename:Dockerfile`.
- Switch the results tab to Code (or use Code Search at https://cs.github.com/ to widen scope).
The results show `Dockerfile` occurrences across repos in `octocat`. This is harmless and useful for auditing container build practices. (GitHub)
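gh CLI (code search)
If you prefer the CLI, gh provides the `gh search code` command. It accepts a query string, and some qualifiers are also available as dedicated flags (for example `--owner` and `--filename`). See gh search code in the GitHub CLI manual for flags and JSON output options. (GitHub CLI)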
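The CLI call can also be scripted. The sketch below assembles an argument list for `gh search code` and, separately, offers a function that runs it. The helper names are hypothetical, the search term `FROM` and the owner `octocat` are illustrative, and the JSON field names are an assumption to verify against `gh search code --help` for your installed version.

```python
import shutil
import subprocess


def gh_search_code_args(term, owner=None, filename=None, limit=20):
    """Build the argument list for `gh search code` (nothing is executed here)."""
    args = ["gh", "search", "code", term, "--limit", str(limit)]
    if owner:
        args += ["--owner", owner]
    if filename:
        args += ["--filename", filename]
    # Assumed JSON field names; confirm them with `gh search code --help`.
    args += ["--json", "repository,path,url"]
    return args


def run_search(args):
    """Run the command; requires an installed, authenticated gh CLI."""
    if shutil.which("gh") is None:
        raise RuntimeError("gh CLI not found on PATH")
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


args = gh_search_code_args("FROM", owner="octocat", filename="Dockerfile")
print(" ".join(args))
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a search term contains spaces or special characters.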
What to do if you find sensitive data (remediation & responsible next steps)
- Assume the secret is compromised. Immediately recommend rotation/revocation of the key or credential. (Blog | iamluminousmen)
- Inform the repository owner via a private channel (issues are public, so prefer email or the project’s security contact). If the leak is high risk, use GitHub’s private information removal process or contact the vendor. (GitHub Docs)
- Remove the secret from history (if you have permission to fix the repo). Use `git-filter-repo` (recommended) or BFG to purge secrets from all commits, then force-push and coordinate with contributors. Document the steps so contributors can rebase or reclone. (GitHub Docs)
- Enable prevention: turn on GitHub’s secret scanning / push protection, and adopt pre-commit hooks (Talisman, pre-commit, etc.) and automated scanners (GitHub Advanced Security, GitGuardian) to catch secrets before they’re committed. (GitHub Docs)
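For the history-rewrite step, one possible approach is to prepare a replacement-rules file for `git filter-repo --replace-text`. The sketch below writes one rule per leaked value and builds the command without running it. The helper names, the file name `replacements.txt`, and the sample token are hypothetical; the sketch assumes you work on a fresh clone of a repository you are authorized to rewrite, and it does not replace rotating the credential.

```python
import tempfile
from pathlib import Path


def write_replacements(secrets, path, placeholder="***REMOVED***"):
    """Write one `SECRET==>PLACEHOLDER` rule per leaked value and return the path."""
    path = Path(path)
    lines = [f"{secret}==>{placeholder}" for secret in secrets]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def filter_repo_command(rules_file):
    """Build (but do not run) the history-rewrite command."""
    return ["git", "filter-repo", "--replace-text", str(rules_file)]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical leaked value, used only to show the file format.
    rules = write_replacements(["hunter2-example-token"], Path(tmp) / "replacements.txt")
    print(rules.read_text(encoding="utf-8"), end="")
    print(" ".join(filter_repo_command(rules)))
```

Delete the rules file after the rewrite, since it contains the secret values in plain text.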
Defensive recommendations (for maintainers)
- Add pre-commit checks and secret detection (e.g., pre-commit, Talisman).
- Load secrets from environment configuration or a secrets manager instead of hardcoding them.
- Rotate keys regularly and use least privilege for tokens.
- Enable GitHub secret scanning and push protection for organization repos. (GitHub Docs)
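To illustrate what a pre-commit style check does, here is a minimal sketch of a pattern-based scanner. The three patterns and the `scan_text` helper are illustrative assumptions, not the rule set of Talisman, pre-commit, or GitHub secret scanning, and a real tool covers far more formats with fewer false positives. The sample key is the placeholder value used in AWS documentation.

```python
import re

# Illustrative patterns only; real scanners ship much larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat_classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


sample = "region = us-east-1\naws_key = AKIAIOSFODNN7EXAMPLE\n"
print(scan_text(sample))  # → [(2, 'aws_access_key_id')]
```

A hook built on this idea would run `scan_text` over staged files and block the commit when the list of findings is not empty.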
Further reading & curated resources
- Practical write-ups and walkthroughs on GitHub dorking and discovery: InfoSec Writeups and Intigriti blog posts. (InfoSec Write-ups)
- Public collections of GitHub dorks (used for research / auditing): techgaun/github-dorks, Proviesec/github-dorks, jcesarstef/ghhdb. Use these responsibly (they contain search patterns and lists). (GitHub)
- GitHub Secret Scanning docs (how to enable and what it detects). (GitHub Docs)
- Removing sensitive data from a repository (official GitHub guide). (GitHub Docs)
Short checklist (for reviewers)
- Only run searches against public data or with explicit permission.
- If you find secrets: do not use them; notify the owner; recommend rotation. (GitHub Docs)
- Add automated secret scanning to CI and pre-commit. (GitHub Docs)