Git Leak Recovery Overview This skill guides the process of recovering sensitive data (secrets, credentials, API keys) that have been removed from Git history through history-rewriting operations, extracting the data, and then securely cleaning the repository to ensure complete removal. Key Concepts Git Object Persistence When commits are "removed" via operations like git reset , git rebase , or git commit --amend , the underlying Git objects are not immediately deleted. They become "unreachable" but persist in the repository until garbage collection occurs. This behavior enables recovery but also means secrets remain accessible until explicit cleanup. Common Hiding Places for Secrets When searching for removed secrets, check these locations in order of likelihood: Reflog - Most common location for rewritten history ( git reflog ) Dangling commits - Commits with no branch reference ( git fsck --unreachable ) Stashes - Often overlooked ( git stash list ) Other branches - May contain the original commits Tags - May reference old commits Git notes - Annotations attached to commits Workflow Phase 1: Reconnaissance Before attempting recovery, gather information about the repository state:
View current commit history
git log --all --oneline
Check reflog for all references
git reflog show --all
Find unreachable objects
git fsck --unreachable
List stashes
git stash list
List all branches
git branch -a
List tags
git tag -l Phase 2: Recovery Once a target commit is identified:
View commit contents without checking out
git show < commit-hash
View specific file from commit
git show < commit-hash
: < path/to/file
Extract file to working directory
git show < commit-hash
: < path/to/file
recovered_file.txt Phase 3: Cleanup To completely remove secrets from the repository, perform cleanup in this specific order:
Step 1: Expire all reflog entries
git reflog expire --expire = now --all
Step 2: Run aggressive garbage collection
git gc --prune = now --aggressive The order matters: reflog must be expired first, otherwise GC will not remove the objects since they are still referenced. Phase 4: Verification Verify cleanup was successful using multiple approaches:
Attempt to access the old commit (should fail)
git show < old-commit-hash
Search for secret patterns in repository
grep -r "secret_pattern" . .git 2
/dev/null
Check for unreachable objects
git fsck --unreachable
Count loose objects (should decrease after GC)
find .git/objects -type f | wc -l
Verify working tree is clean
- git
- status
- Verification Strategies
- Confirming Recovery Success
- Verify the recovered data matches expected format/content
- Write recovered data to the designated output location
- Confirm existing commits and history remain intact after recovery
- Confirming Cleanup Success
- Old commit hash should return error when accessed via
- git show
- grep -r
- for secret patterns should return no matches
- git fsck --unreachable
- should show no objects containing the secret
- Compare object count before and after GC to confirm removal
- Common Pitfalls
- Investigation Pitfalls
- Only checking recent history
- - Use
- git reflog show --all
- not just
- git reflog
- to see all references
- Forgetting stashes
- - Stashes are a common place for accidentally stored secrets
- Missing other branches
- - Always check all branches with
- git branch -a
- Cleanup Pitfalls
- Wrong order of operations
- - Always expire reflog before running GC
- Missing the
- --all
- flag
- -
- git reflog expire --expire=now
- without
- --all
- only affects HEAD
- Using
- --prune
- without
- =now
- - Default prune time is 2 weeks, use
- --prune=now
- for immediate effect
- Not using
- --aggressive
- - Standard GC may not remove all unreachable objects
- Verification Pitfalls
- Only checking working directory
- - Secrets in
- .git
- directory require explicit checks
- Not verifying object removal
- - Always confirm the commit hash is inaccessible
- Incomplete grep patterns
- - Search for multiple variations of the secret pattern
- Decision Tree
- Is the task about recovering lost data from Git?
- ├── Yes → Check reflog first (git reflog show --all)
- │ ├── Found in reflog → Use git show
to view/extract - │ └── Not in reflog → Check fsck, stashes, branches, tags
- │
- └── Is the task about cleaning up secrets from Git?
- ├── Yes → Follow cleanup sequence:
- │ 1. git reflog expire --expire=now --all
- │ 2. git gc --prune=now --aggressive
- │ 3. Verify with multiple methods
- │
- └── Both recovery AND cleanup needed?
- → Complete recovery first, verify data saved,
- then proceed with cleanup
- Important Considerations
- Backup first
-
- Before any cleanup operations, ensure recovered data is saved outside the repository
- Remote repositories
-
- This cleanup only affects the local repository; if secrets were pushed to a remote, additional steps are needed
- Cloned copies
-
- Any cloned copies of the repository may still contain the secrets
- Credential rotation
- After recovering exposed secrets, rotate them immediately regardless of cleanup success