OSINT Techniques for Web, API, Cloud, and Mobile Penetration Testing

Security Research

Quick ethics note (because we're grownups): OSINT is a scalpel, not a sledgehammer. Use it only on engagements you're authorized for (Rules of Engagement signed, scope defined, coffee brewed). Unauthorized probing = drama + felony. Now that we've agreed to be lawful, onward.

1. Introduction — What OSINT Actually Buys You (Besides Bragging Rights)

Open-Source Intelligence (OSINT) is the disciplined collection, enrichment, and analysis of publicly available signals so you can build a realistic attack surface model before you touch an exploit. Across web, API, cloud, and mobile, the philosophy is identical: find everything the target didn't intend to expose, prioritize what's actionable, and map that to risk. The tactics and data shapes change—so you bring different tools and mental models to each domain.

2. Web OSINT — The Classic Reconnaissance Buffet

Goal: find pages, assets, config files, forgotten admin portals, and credentials that were posted to the internet as if by accident.

Techniques & Tools

Search Operators / Dorking

site:example.com inurl:env "DB_PASSWORD" — rapid way to find leak patterns.
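If you run the same dorks on every engagement, a tiny generator keeps them consistent. This is a minimal sketch; the pattern list and the function name build_dorks are illustrative assumptions, not a canonical dork set.

```python
def build_dorks(domain: str) -> list[str]:
    """Return search-engine queries scoped to one authorized domain."""
    # Hypothetical pattern list; extend it per engagement scope.
    patterns = [
        'inurl:env "DB_PASSWORD"',
        "ext:sql OR ext:bak OR ext:old",
        'intitle:"index of"',
        "inurl:admin OR inurl:login",
    ]
    return [f"site:{domain} {pattern}" for pattern in patterns]


if __name__ == "__main__":
    for query in build_dorks("example.com"):
        print(query)
```

The queries are only strings; paste them into a search engine by hand so you stay inside its terms of service.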

Archive Crawling

Wayback Machine + waybackurls + gau to find historical files and endpoints that may still be accessible.
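Archive output is mostly noise, so a first-pass filter helps. The sketch below assumes you already have a list of URLs (for example, lines from waybackurls or gau output); the suffix list is a guess at what is worth a manual look and should be tuned per target.

```python
from urllib.parse import urlsplit

# Assumed list of suffixes worth a manual look; tune per engagement.
INTERESTING_SUFFIXES = (".env", ".bak", ".sql", ".zip", ".config", ".json")


def interesting_urls(urls):
    """Deduplicate URLs and keep those whose path ends in a flagged suffix."""
    seen = set()
    keep = []
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        # Compare the path only, so query strings do not hide the extension.
        path = urlsplit(url).path.lower()
        if path.endswith(INTERESTING_SUFFIXES):
            keep.append(url)
    return keep
```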

Repo Leakage

GitHub + gitrob/truffleHog/gitleaks to scan public repos for keys, secrets, and config files.
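To see what these scanners do under the hood, here is a deliberately small regex pass over text. It is a sketch of the idea, not a replacement for gitleaks or truffleHog; the two rules and the names SECRET_PATTERNS and scan_text are assumptions chosen for illustration.

```python
import re

# Two well-known shapes: AWS access key IDs and PEM private key headers.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every line matching a rule."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

Real scanners add entropy checks and hundreds of rules; expect false negatives from anything this small.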

Subdomain & DNS Enumeration

subfinder, amass, dnsdumpster, passive DNS (SecurityTrails). Combine results, then probe with httpx/naabu.
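"Combine results" usually means normalizing and deduplicating before probing. A minimal sketch, assuming each tool's output has already been read into a list of hostnames (merge_subdomains is a hypothetical helper name):

```python
def merge_subdomains(root, *sources):
    """Union several tool outputs, keeping only names under the root domain."""
    root = root.lower().strip(".")
    merged = set()
    for source in sources:
        for raw in source:
            name = raw.strip().lower().rstrip(".")
            # Drop wildcard prefixes such as "*.example.com".
            if name.startswith("*."):
                name = name[2:]
            # Require an exact match or a real subdomain boundary.
            if name == root or name.endswith("." + root):
                merged.add(name)
    return sorted(merged)
```

The boundary check matters: without it, a lookalike such as evil-example.com would slip into scope.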

Example Commands (Lab-Friendly)

# mass-harvest URLs from wayback + live-check
waybackurls example.com | tee wayback-urls.txt
cat wayback-urls.txt | httpx -status-code -o alive-urls.txt

# quick git leakage scan (local clone required)
gitleaks detect --source=/path/to/repo --report-path=gitleaks-report.json

What you typically find: credentials, old API endpoints, forgotten admin interfaces, misconfigured CORS, and files intended for devops but left public.

Defense notes: monitor public repos for exposed keywords, deny access to .env and .git at the web server or via ACLs (robots.txt is advisory and only advertises the paths), rotate any creds that show up in CI logs, and monitor Cloudflare/hosting for new subdomains.

3. API OSINT — The Soap Opera of Endpoints, Tokens, and Weird Headers

Goal: discover endpoints, understand auth models, and locate tokens, documentation, or parameter behavior that reveal misconfigurations or privilege gaps.

Techniques & Tools

API Discovery

Harvest OpenAPI/Swagger files, Postman collections, and developer docs (GitHub, dev portals, SDKs). Tools: swagger-cli, openapi-grabber.
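Once you have a spec file, the first question is which operations declare no auth. The sketch below assumes an OpenAPI 3 document already parsed into a dict (for example with json.load); list_operations is a hypothetical helper, and a missing security declaration is a lead to verify, not proof of an open endpoint.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec):
    """Yield (METHOD, path, requires_auth) for each operation in an OpenAPI 3 dict."""
    global_security = spec.get("security", [])
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip keys such as "parameters" or "summary"
            # An operation-level "security" overrides the global one;
            # an empty list means the operation is explicitly unauthenticated.
            security = operation.get("security", global_security)
            yield method.upper(), path, bool(security)
```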

Intercept & Replay

Burp Suite/OWASP ZAP/Postman + extensions to view live traffic, extract bearer tokens, cookie flags, and dangerous headers.

Parameter Fuzzing

ffuf, fuzzapi, or Burp Intruder to push unexpected input and trigger verbose error messages revealing stack traces or hidden endpoints.
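Fuzzing produces a pile of responses, and the verbose ones are the payoff. A small sketch for flagging them, assuming you have response bodies as strings; the three signatures are illustrative and far from complete.

```python
import re

# Hypothetical starter set of error fingerprints; extend per tech stack.
ERROR_SIGNATURES = {
    "python_traceback": re.compile(r"Traceback \(most recent call last\)"),
    "java_stack_trace": re.compile(r"\bat [\w.$]+\([\w$]+\.java:\d+\)"),
    "sql_error": re.compile(r"SQL syntax|ORA-\d{5}|SQLSTATE\[", re.IGNORECASE),
}


def detect_verbose_errors(body):
    """Return the sorted names of every signature found in a response body."""
    return sorted(
        name for name, pattern in ERROR_SIGNATURES.items() if pattern.search(body)
    )
```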

Static Discovery in Code

grep for apiKey, Authorization: Bearer, or fetch('https://api.example.com') in public repos, SDKs, or mobile app source.

Lab Example (Intercepting a Token)

  1. Configure Burp as proxy
  2. Trigger a client action that calls an API endpoint
  3. Inspect Authorization header or JSON body for tokens; test token replay against other endpoints
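If the intercepted token turns out to be a JWT, you can read its claims before replaying it anywhere. A minimal sketch using only the standard library; it decodes the payload and does not verify the signature, so treat the output as untrusted.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWTs use unpadded base64url; restore padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Claims such as scope, aud, and exp tell you quickly whether the token is broader or longer-lived than the client action needed.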

What you typically find: weak auth flows, tokens with too-broad scopes, predictable pagination that leaks records, and excessive trust in client-side input.

Defense notes: enforce least-privilege tokens, token rotation, strict CORS, robust rate limiting, input validation, and central API gateway logging.

4. Cloud OSINT — Where an ls Can Become a Full-Blown Identity Crisis

Goal: find misconfigured buckets, exposed metadata endpoints, weak IAM policies, and anything that turns public files into living credentials.

Techniques & Tools

Bucket Enumeration

aws s3 ls s3://<bucket> --no-sign-request (unauthenticated), s3scanner, bucket_finder, or gcloud equivalents. Look for public ACLs or listable buckets.
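When you probe bucket names over plain HTTP, the status code already tells you most of the story. The sketch below covers S3 only and leaves the actual request out; classify_bucket_response is a hypothetical helper and the labels are my own.

```python
def classify_bucket_response(status_code, body=""):
    """Map an unauthenticated S3 GET on a bucket root to a coarse state."""
    if status_code == 200 and "<ListBucketResult" in body:
        return "public-listable"
    if status_code == 403:
        return "exists-private"       # AccessDenied: bucket exists, listing blocked
    if status_code == 404:
        return "not-found"            # NoSuchBucket
    if status_code in (301, 307):
        return "exists-other-region"  # redirect to the bucket's own endpoint
    return "unknown"
```

A 403 is still useful: it confirms the name is taken, which helps when mapping a naming scheme.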

Metadata Harvesting

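SSRF → IMDS calls. On EC2, IMDSv1 answers a plain GET, while IMDSv2 first requires a session token obtained with a PUT request and a custom header, which most simple SSRF primitives cannot send. Also check for cloud metadata exposed via misconfigured endpoints.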
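To make the v1/v2 difference concrete, here are the two requests an IMDSv2 lookup needs, built but not sent. It is a sketch for a lab instance you control; the function name and the placeholder token value are assumptions.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def imdsv2_requests(path="/latest/meta-data/", ttl_seconds=60,
                    token="<token-from-step-1>"):
    """Build (not send) the two requests an IMDSv2 lookup needs."""
    # Step 1: a PUT with a TTL header returns a session token.
    token_request = urllib.request.Request(
        IMDS_BASE + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    # Step 2: every metadata GET must carry that token in a header.
    data_request = urllib.request.Request(
        IMDS_BASE + path,
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )
    return token_request, data_request
```

An SSRF bug that can only issue header-less GETs satisfies neither step, which is the practical reason IMDSv2 blunts this class of attack.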

IAM Mapping

Collect service principal names, role ARNs, and attached policies; use aws iam get-role (when authorized) or infer from exposed artifacts.
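Once you hold a policy document, a quick pass for wildcards narrows down what to read closely. A minimal sketch, assuming the policy is already parsed JSON; risky_statements is a hypothetical helper and its two heuristics are intentionally crude.

```python
def risky_statements(policy):
    """Return (action, resource) pairs from Allow statements that look too broad."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    findings = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # Both fields may be a single string or a list of strings.
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        for action in actions:
            wildcard_action = action == "*" or action.endswith(":*")
            pass_role_anywhere = action.lower() == "iam:passrole" and "*" in resources
            if wildcard_action or pass_role_anywhere:
                findings.append((action, ",".join(resources)))
    return findings
```

It ignores Condition blocks, NotAction, and explicit Deny statements, so treat hits as leads rather than confirmed escalation paths.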

Automated Enumeration

cloud_enum.py, custom scripts to test S3, GCS, Azure Blob permissions, and enumerate publicly reachable resources.

Example Scenario

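SSRF from a web app fetching http://169.254.169.254/latest/meta-data/iam/security-credentials/ (IMDSv1) → role name returned → append the role name to the path → temporary credentials returned → run aws sts get-caller-identity to confirm which identity they belong to, then enumerate that role's privileges within scope.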
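For the report, you want evidence that credentials came back without pasting live secrets into a PDF. The sketch below summarizes the JSON document the security-credentials endpoint returns and masks the key ID; summarize_imds_credentials is a hypothetical helper name.

```python
import json


def summarize_imds_credentials(raw_json):
    """Return a report-safe summary of an IMDS security-credentials document."""
    doc = json.loads(raw_json)
    key_id = doc.get("AccessKeyId", "")
    return {
        "type": doc.get("Type"),
        # Keep the prefix (AKIA/ASIA) and mask the rest.
        "access_key_id": key_id[:4] + "*" * max(len(key_id) - 4, 0),
        "has_secret": bool(doc.get("SecretAccessKey")),
        "has_session_token": bool(doc.get("Token")),
        "expiration": doc.get("Expiration"),
    }
```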

What you typically find: public buckets with PII, long-lived keys in repos, permissive IAM roles attached to compute instances, and overly broad policies that grant s3:* or unrestricted iam:PassRole.

Defense notes: enforce IMDSv2, require MFA for privilege escalation paths, least-privilege IAM, block public bucket listing, enable bucket-level logging and object versioning, and use monitoring/alarms for unusual API calls.

5. Mobile OSINT — APKs, Plists, and the Glorious Reveal of Hardcoded Secrets

Goal: decompile apps, harvest embedded API keys and endpoints, and analyze network behavior for weak crypto or insecure patterns.

Techniques & Tools

Static Analysis

apktool and jadx for Android, class-dump for iOS. Scan the unpacked binaries for strings like api_key, secret, or https://api.
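The string scan itself is easy to reproduce when you want it inside a pipeline rather than a terminal. A rough sketch of a strings-plus-grep pass over raw bytes; the keyword list mirrors the examples above and suspicious_strings is a hypothetical name.

```python
import re

KEYWORDS = (b"api_key", b"secret", b"https://api.")


def suspicious_strings(blob, min_length=6):
    """Extract printable ASCII runs from bytes and keep those containing a keyword."""
    runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_length, blob)
    return [
        run.decode("ascii")
        for run in runs
        if any(keyword in run.lower() for keyword in KEYWORDS)
    ]
```

It only sees ASCII runs, so UTF-16 strings and anything obfuscated or built at runtime will be missed.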

Dynamic Analysis

Set up Burp/mitmproxy with a test device or emulator, install custom CA cert (if allowed) and inspect requests/responses.

Configuration Review

Analyze AndroidManifest.xml and iOS entitlements; check for exported components (android:exported="true") and backup exposure (android:allowBackup="true").
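The manifest check is mechanical enough to script. This sketch assumes a manifest already decoded to plain XML (apktool does that) and passed in as a string; review_manifest is a hypothetical helper and covers Android only.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def review_manifest(manifest_xml):
    """Report allowBackup and explicitly exported components from a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    application = root.find("application")
    allow_backup = None
    exported = []
    if application is not None:
        allow_backup = application.get(ANDROID_NS + "allowBackup")
        for tag in COMPONENT_TAGS:
            for component in application.findall(tag):
                if component.get(ANDROID_NS + "exported") == "true":
                    exported.append((tag, component.get(ANDROID_NS + "name")))
    return {"allowBackup": allow_backup, "exported": exported}
```

It only catches components that set the attribute explicitly. On older target SDKs a component with an intent filter may be exported by default, so a clean result here is not the final word.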

Reverse Engineering

Find logic that constructs auth tokens client-side, insecure token storage (SharedPreferences without encryption), or debug endpoints left in release builds.

Typical Commands

# decode resources and disassemble to smali
apktool d app-release.apk -o app_src
# decompile DEX bytecode to Java source for reading
jadx -d jadx_out app-release.apk

What you typically find: hardcoded API keys, weak or missing certificate pinning, debug endpoints, and secrets stored without encryption.

Defense notes: avoid shipping keys in binaries, use secure storage (Keychain/Keystore), implement proper certificate pinning, and remove debug backdoors before release.

6. Cross-Domain Similarities

So you can be lazy and reuse good patterns:

  • Recon discipline: all domains reward automation and chaining small findings into a bigger picture.
  • Scripting is your friend: build parsers that normalize results into one format (jq, python, pandas) so you can triage faster.
  • Actionability > volume: millions of URLs are fun to collect; a single usable credential is what pays the bills.
  • Logging & detection: anything that produces credentials should be covered by detection (SIEM alerts, anomaly detection).
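The normalization and "actionability > volume" points can be sketched in a few lines. Everything here is an assumption for illustration: the Finding fields, the severity ranking, and the JSON Lines output are one possible shape, not a standard.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical ranking: credentials first, bulk URLs last.
SEVERITY_ORDER = {"credential": 0, "endpoint": 1, "url": 2}


@dataclass
class Finding:
    domain: str  # "web", "api", "cloud", or "mobile"
    kind: str    # "credential", "endpoint", "url", ...
    value: str
    source: str  # tool that produced the finding


def triage(findings):
    """Drop duplicates and sort so credentials come before bulk URL noise."""
    unique = {(f.domain, f.kind, f.value): f for f in findings}
    return sorted(
        unique.values(),
        key=lambda f: (SEVERITY_ORDER.get(f.kind, 99), f.value),
    )


def to_jsonl(findings):
    """Serialize findings as JSON Lines so jq or pandas can pick them up."""
    return "\n".join(json.dumps(asdict(f)) for f in findings)
```

One line per finding keeps the output friendly to jq filters and to pandas.read_json with lines=True.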

7. Quick Cheat-Sheet

Domain | First Tools to Run                    | Quick Win to Look For
Web    | amass, waybackurls, httpx, gitleaks   | .env, /admin, .git
API    | Burp, openapi-grabber, ffuf           | exposed swagger, bearer token reuse
Cloud  | s3scanner, passive DNS, cloud_enum.py | public buckets, SSRF → IMDS
Mobile | apktool, jadx, mitmproxy              | hardcoded keys, debug endpoints

8. Closing (Practical Takeaways)

OSINT is the art of turning noise into vectors. Be methodical, automate what's repetitive, and focus on what moves the box (tokens, credentials, IAM roles). Keep proof-of-concept steps minimal and non-destructive, and always recommend remediation (rotate keys, lock buckets, tighten policies). Also, complement your work with logging rules so defenders can spot the same reconnaissance the next time.

Comparative Analysis: Cloud Pentesting vs Web, API, and Mobile

More Meat, Less Flair

1. Intro — Short Version

Cloud pentesting is different because identity = access. In web/API/mobile, you often snag data; in cloud, a single misconfiguration can hand over an identity that actually controls the infrastructure. That's why a cloud OSINT finding can carry far more strategic impact than an equivalent leak elsewhere.

2. Methodology Differences

Cloud Pentest

Primary vectors: object storage, metadata endpoints, IAM roles/policies, serverless functions

Impact: Live credentials and role assumptions leading to infrastructure control

Web Pentest

Primary vectors: exposed files, directories, subdomains, backup artifacts

Impact: Data exposure, rarely immediate cloud-level control

API Pentest

Primary vectors: authentication flaws, token leakage, endpoint logic flaws

Impact: Scoped access, authorization bypass if tokens are privileged

Mobile Pentest

Primary vectors: embedded secrets, weak local storage, insecure network calls

Impact: Data exposure or user account compromise

5. Practical Comparison Matrix

Aspect            | Web                  | API                        | Cloud                             | Mobile
Primary Artifacts | HTML, JS, backups    | JSON, tokens, headers      | Buckets, roles, metadata          | APK/IPA, local storage
Typical Outcome   | Data exposure, pages | Scoped access, logic flaws | Live creds, privilege escalation  | App secrets, compromised users
Attack Speed      | Medium               | Fast                       | Fast → explosive                  | Slow (decompilation)
Detection Signals | Web logs, WAF alerts | API gateway logs           | CloudTrail, CloudWatch, Audit logs | Mobile analytics, app logs

6. Recommendations for Defenders

Cloud

Enforce IMDSv2, least-privilege IAM, enable CloudTrail + alerts, disallow public listable buckets

API

Centralize auth at API Gateway, token scoping, strong rate limiting, validate JWT claims server-side

Web

CI/CD hygiene, server configs that deny access to sensitive paths (robots.txt is advisory only), remove debug artifacts, monitor archives

Mobile

Avoid hardcoded keys, use Keychain/Keystore, enforce pinning, remove debug modes

7. Closing Thought — Playing the Long Game

Cloud pentesting requires a slightly different mindset: think identity graphs and trust boundaries rather than just files and endpoints. Web/API/mobile OSINT often leaves you with intelligence and screenshots. Cloud OSINT can leave you with credentials that actually move bits and bytes in production. Prioritize detection and least-privilege, and treat any public artifact as "sensitive until proven otherwise."