Use this skill at the start of any authorized web application security engagement — before any vulnerability probing, scanning, or exploitation. The mapping phase determines which areas of the application are worth investing time in and which vulnerability classes are most likely.
Invoke it when:
Do not invoke it for unauthorized access to systems you do not own or have permission to test.
- .har, Burp export), or API spec files
- routes.py, urls.py, web.xml, routes.rb, *.controller.ts), config files (application.properties, .env), and template directories.
- .har files, Burp Suite exports, or traffic captures in the working directory.
- package.json, pom.xml, Gemfile, requirements.txt, composer.json to infer the stack without HTTP traffic.

ACTION: Walk through the application manually using a browser proxied through an interception tool (Burp Suite, OWASP ZAP, or equivalent), visiting every linked page and submitting every form. Simultaneously, run an automated spider against the already-visited content.
WHY: Automated spiders miss content hidden behind JavaScript navigation, Flash/Java applets, forms with validation, and authentication-protected areas. User-directed spidering with a proxy captures everything the human can see — including JavaScript-triggered navigation — while the proxy parses all server responses for additional content automatically. The combination is more thorough than either technique alone.
Procedure:
- robots.txt: it frequently lists directories the application owner does not want indexed, which are often the most sensitive areas worth testing

AGENT: EXECUTES (when source code is available). Grep for all route definitions, template links, and form action attributes to produce a complete URL list without live access.
HANDOFF TO HUMAN (when live access is required) — the human browses while the proxy captures traffic; the agent analyzes the captured site map.
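When source code is available, the grep step described for the agent can be approximated with a few regular expressions. The sketch below is illustrative only: the two patterns assume Flask-style `@app.route(...)` decorators and Django-style `path(...)` calls, and the `sample` source text is hypothetical. Real projects often register routes in other ways, so treat the output as a starting list rather than a complete one.

```python
import re

# Assumed patterns for two common Python frameworks; real projects may
# register routes in other ways (blueprints, routers, class-based views).
ROUTE_PATTERNS = [
    re.compile(r"""@\w+\.route\(\s*['"]([^'"]+)['"]"""),      # Flask-style decorator
    re.compile(r"""\b(?:re_)?path\(\s*r?['"]([^'"]*)['"]"""),  # Django-style path()/re_path()
]


def extract_routes(source: str) -> list[str]:
    """Return route strings found in one source file, in order, without duplicates."""
    found: list[str] = []
    for line in source.splitlines():
        for pattern in ROUTE_PATTERNS:
            for match in pattern.finditer(line):
                route = match.group(1)
                if route not in found:
                    found.append(route)
    return found


# Hypothetical source text mixing both styles, for illustration only.
sample = '''
@app.route("/login", methods=["POST"])
def login(): ...

urlpatterns = [
    path("api/admin/export-all-employees", views.export_all),
    path("reports/<int:year>/", views.report),
]
'''

for route in extract_routes(sample):
    print(route)
```

Comparing this list against the links that actually appear in templates is one way to spot endpoints that exist in code but are never referenced by the frontend.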
WARNING: Never run an automated spider against an application without first identifying and excluding dangerous endpoints (admin delete functions, data-erasure operations, logout URLs). An automated spider that follows all links can cause real damage — defacing content, deleting users, or breaking sessions.
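One way to act on this warning is to filter the spider's scope before it runs. The following is a minimal sketch, assuming a hand-maintained deny-list of regular expressions; the patterns and the `shop.example.test` URLs are hypothetical and should be replaced with what the manual walkthrough actually revealed.

```python
import re
from urllib.parse import urlsplit

# Hypothetical deny-list; extend it after reviewing the manually browsed site map.
DANGEROUS_PATTERNS = [
    re.compile(r"logout|signout", re.IGNORECASE),
    re.compile(r"delete|remove|erase|purge", re.IGNORECASE),
    re.compile(r"/admin/", re.IGNORECASE),
]


def is_safe_to_spider(url: str) -> bool:
    """Return False when the path or query matches a deny-list pattern."""
    parts = urlsplit(url)
    target = parts.path + "?" + parts.query
    return not any(p.search(target) for p in DANGEROUS_PATTERNS)


def filter_spider_scope(urls: list[str]) -> tuple[list[str], list[str]]:
    """Split URLs into (allowed, excluded) so exclusions can be reviewed by a human."""
    allowed = [u for u in urls if is_safe_to_spider(u)]
    excluded = [u for u in urls if not is_safe_to_spider(u)]
    return allowed, excluded


urls = [
    "https://shop.example.test/products?page=2",
    "https://shop.example.test/account/logout",
    "https://shop.example.test/admin/BulkDelete",
    "https://shop.example.test/cart/remove?item=4",
]
allowed, excluded = filter_spider_scope(urls)
print("spider:", allowed)
print("manual review only:", excluded)
```

Keyword matching of this kind is deliberately conservative: it will exclude some harmless URLs, and it cannot recognize a destructive endpoint with an innocuous name, so the excluded list still needs human review.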
ACTION: Enumerate content not linked from the visible application using three complementary techniques: brute-force enumeration, inference from naming patterns, and mining public sources.
WHY: Applications routinely contain unlinked content — debug pages left from development, old versions not removed from the server, functionality visible only to higher-privilege users, configuration files with credentials, and backup copies of live pages. None of these appear in a spider's site map. Finding them can reveal critical vulnerabilities that the main application surface does not expose.
2a. Brute-Force Enumeration
Use a tool with a wordlist (Burp Intruder, DirBuster, ffuf, gobuster, or a custom script) to request common directory names and file names within every known directory:
- 200 OK with a custom "not found" page rather than 404. Record the response fingerprint for genuinely missing resources so you can filter it from results.
- 302 redirect to the login page indicates an authenticated-only resource that exists; a 401/403 indicates an existing but access-restricted resource; a 500 often indicates a resource that exists and expects specific parameters

Response code interpretation guide:
- 200 OK: Resource exists and is accessible (verify it is not a custom "not found" page)
- 302 to login page: Resource exists, authentication required
- 302 to error page: May indicate a different condition; investigate further
- 401 Unauthorized / 403 Forbidden: Resource exists but access is restricted regardless of privilege level
- 400 Bad Request: May indicate nonstandard naming conventions or invalid wordlist entries
- 500 Internal Server Error: Resource likely exists and expects specific parameters

2b. Inference from Naming Patterns
- Add, Edit, View, Delete)
- ForgotPassword exists in /auth, look for ResetPassword, ChangePassword, UpdatePassword
- /pub/media/117), probe adjacent values in the observed range
- .bak, .src, .inc, .old, .tmp, .php-1, .DS_Store
- .java, .cs), request the source extension: misconfigured servers may serve raw source code

2c. Mine Public Sources
- site:target.com finds all indexed pages
- site:target.com admin finds pages containing specific keywords (administrative areas, login functions)
- link:target.com finds pages on other sites that link to the target (may reveal partner-only URLs)
- archive.org) for historical content that may no longer be linked but is still live on the server

AGENT: EXECUTES. Analyzes source code for hardcoded paths, commented-out links, disabled form fields, and server-side include references. Produces a candidate URL list.
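The naming-pattern inference in 2b lends itself to a small candidate generator that the agent can run without touching the target. This is a rough sketch under stated assumptions: `AddDocument` and `index.php` are hypothetical names, the extension list is a subset of the one given in 2b, and `spread` is an arbitrary window around an observed numeric identifier.

```python
import re

BACKUP_EXTENSIONS = [".bak", ".src", ".inc", ".old", ".tmp"]  # subset of the list in 2b
VERBS = ["Add", "Edit", "View", "Delete"]


def verb_siblings(name: str) -> list[str]:
    """For a name such as 'AddDocument', propose the same noun with the other verbs."""
    for verb in VERBS:
        if name.startswith(verb) and len(name) > len(verb):
            noun = name[len(verb):]
            return [v + noun for v in VERBS if v != verb]
    return []


def backup_variants(filename: str) -> list[str]:
    """Append each backup extension to an observed filename."""
    return [filename + ext for ext in BACKUP_EXTENSIONS]


def adjacent_ids(path: str, spread: int = 2) -> list[str]:
    """For a path ending in a number, propose nearby numeric identifiers."""
    match = re.search(r"(\d+)$", path)
    if not match:
        return []
    value = int(match.group(1))
    prefix = path[: match.start()]
    return [
        f"{prefix}{n}"
        for n in range(max(0, value - spread), value + spread + 1)
        if n != value
    ]


print(verb_siblings("AddDocument"))
print(backup_variants("index.php"))
print(adjacent_ids("/pub/media/117"))
```

The generated names are only candidates; each one still has to be requested against the live target and interpreted using the response code guide.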
HANDOFF TO HUMAN — wordlist-based brute-force and live HTTP probing require tool execution against a live target.
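Once brute-force results come back, the response code interpretation guide from 2a can be applied mechanically. The sketch below assumes responses have already been captured by some client or proxy export; the `Response` class, the `/login` path, and the exact-hash fingerprint are all simplifying assumptions rather than a prescribed method.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Minimal stand-in for an HTTP response captured by any client or proxy export."""
    status: int
    body: str
    location: Optional[str] = None


def fingerprint(body: str) -> str:
    """Hash the body so a custom 'not found' page can be recognized later."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def classify(resp: Response, not_found_fp: str, login_path: str = "/login") -> str:
    """Map a response to the categories of the interpretation guide."""
    if resp.status == 200:
        if fingerprint(resp.body) == not_found_fp:
            return "missing (custom not-found page)"
        return "exists, accessible"
    if resp.status == 302:
        if resp.location and login_path in resp.location:
            return "exists, authentication required"
        return "redirect elsewhere, investigate"
    if resp.status in (401, 403):
        return "exists, access restricted"
    if resp.status == 400:
        return "bad request, check naming conventions or wordlist entry"
    if resp.status == 500:
        return "likely exists, expects parameters"
    if resp.status == 404:
        return "missing"
    return "unclassified"


# Baseline taken by requesting a resource that certainly does not exist.
baseline = fingerprint("<h1>Sorry, page not found</h1>")
print(classify(Response(200, "<h1>Sorry, page not found</h1>"), baseline))
print(classify(Response(302, "", "/login?next=/admin/ExportUsers"), baseline))
print(classify(Response(500, "error"), baseline))
```

An exact hash only works when the "not found" page is byte-for-byte identical on every request. If the page echoes the requested path or a timestamp, a looser comparison (body length, or a stable marker string) is the more practical fingerprint.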
ACTION: Catalog every location where user-controlled data enters the application — including non-obvious channels that are frequently overlooked.
WHY: Vulnerability testing is only as comprehensive as the entry point catalog. Missed entry points mean missed vulnerabilities. Many critical flaws (SQL injection, cross-site scripting, path traversal) are discovered at entry points that automated scanners miss because they do not appear in HTML forms — they appear in HTTP headers, URL path segments, or out-of-band channels.
Entry point categories to enumerate:
| Category | What to collect |
|---|---|
| URL path segments | Every segment in REST-style URLs (e.g., electronics and iPhone3G in /shop/browse/electronics/iPhone3G/) |
| URL query string parameters | Every name=value pair, including non-standard separators (;, $, %3d) |
| POST body parameters | Every field in every form, including hidden fields |
| Cookies | Every cookie name and value |
| Standard HTTP request headers | User-Agent, Referer, Accept, Accept-Language, Host — all may be logged or processed |
| Custom HTTP headers | X-Forwarded-For, X-Real-IP, and any application-specific headers — often processed for IP-based access control or geolocation |
| Out-of-band channels | Email content processed by a mail-parsing function, HTTP content fetched by server-side URL retrievers, data from APIs consumed by the application |
Special attention — HTTP headers: Many applications trust the X-Forwarded-For header for the client's IP address when running behind a proxy. If this header is processed without validation, injecting SQL or scripting content into it can trigger injection vulnerabilities. Similarly, spoofing User-Agent to a mobile device string often reveals a separate mobile-optimized code path that has received less security review.
Special attention (non-standard parameter formats): If the application does not use the standard name=value&name2=value2 format, understand the actual encoding before testing. Treating a URL like /dir/file?data=%3cfoo%3ebar%3c%2ffoo%3e%3cfoo2%3ebar2%3c%2ffoo2%3e as a single parameter called data will miss the injection points inside the embedded XML, where foo and foo2 each carry their own value.
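To make the embedded-format point concrete, the sketch below decodes a query string and, when a value looks like XML, lists each inner element as its own entry point. The URL is a hypothetical well-formed example and the `name/tag` labelling is an arbitrary convention chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qsl
import xml.etree.ElementTree as ET


def nested_entry_points(url: str) -> list[tuple[str, str]]:
    """List (location, value) pairs, descending into XML-looking parameter values."""
    points: list[tuple[str, str]] = []
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        points.append((name, value))
        if value.lstrip().startswith("<"):
            try:
                # Wrap in a synthetic root so multiple sibling elements parse.
                root = ET.fromstring(f"<root>{value}</root>")
            except ET.ParseError:
                continue  # not well-formed XML; keep only the outer parameter
            for element in root.iter():
                if element is not root:
                    points.append((f"{name}/{element.tag}", element.text or ""))
    return points


# Hypothetical URL with URL-encoded XML inside a single parameter.
url = "/dir/file?data=%3cfoo%3ebar%3c%2ffoo%3e%3cfoo2%3ebar2%3c%2ffoo2%3e"
for location, value in nested_entry_points(url):
    print(location, "=", value)
```

The same idea applies to other embedded encodings such as JSON or Base64-wrapped data: decode first, then enumerate the inner fields as separate entry points.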
ACTION: Determine the technology stack — web server software and version, application framework, programming language, database, and third-party components — from the available indicators.
WHY: Technology identification directly predicts which vulnerability classes to prioritize. A PHP application on Apache has a different vulnerability profile than a Java application on WebSphere. Known third-party components may have published Common Vulnerabilities and Exposures (CVEs) that are directly exploitable. Version information enables precise vulnerability lookup.
Fingerprinting sources:
| Indicator | Where to look | What it reveals |
|---|---|---|
| Server HTTP header | Every HTTP response | Web server software and version |
| X-Powered-By header | Application responses | Framework (e.g., PHP/7.4, ASP.NET) |
| Custom headers | Non-standard headers in responses | Application-specific platform details |
| HTML source comments | Page source, especially error pages | Developer notes, framework version, build info |
| File extensions | URLs across the site map | Programming language (.jsp=Java, .aspx=ASP.NET, .php=PHP, .py=Python, .rb=Ruby, .cfm=ColdFusion) |
| Directory names | URL structure | Servlet containers (/servlet/), ColdFusion (/cfdocs/, /cfide/), Rails (/rails/) |
| Session token names | Cookie names in HTTP responses | Platform (JSESSIONID=Java, ASPSESSIONID=IIS, PHPSESSID=PHP, CFID/CFTOKEN=ColdFusion) |
| Error page format | 404, 500 responses | Framework-generated error pages are distinctive |
| URL patterns with comma-separated numbers | URL structure | Vignette content management platform |
HTTP fingerprinting: Even when the Server header is suppressed or falsified, behavior differences in how the server handles invalid requests, the ordering of response headers, and the exact formatting of error messages can identify the underlying software. Run a behavioral fingerprinting tool (httprecon, WhatWeb) against the target when banner-based identification is inconclusive.
Third-party component identification: Search for the names of unusual cookies, custom HTTP headers, or distinctive JavaScript library calls. Locate other applications using the same component to understand its full feature set and known vulnerabilities. Check CVE databases for the identified component and version.
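Several of the banner-level indicators in the fingerprinting table can be checked with simple lookups. The sketch below is a minimal illustration, not a replacement for a behavioral fingerprinting tool: the function name, the input shapes, and the example values are assumptions, and every hint it returns can be suppressed or falsified by the server.

```python
from urllib.parse import urlsplit
import posixpath

COOKIE_HINTS = {
    "JSESSIONID": "Java",
    "ASPSESSIONID": "IIS",
    "PHPSESSID": "PHP",
    "CFID": "ColdFusion",
    "CFTOKEN": "ColdFusion",
}
EXTENSION_HINTS = {
    ".jsp": "Java", ".aspx": "ASP.NET", ".php": "PHP",
    ".py": "Python", ".rb": "Ruby", ".cfm": "ColdFusion",
}


def fingerprint_stack(
    headers: dict[str, str], cookie_names: list[str], urls: list[str]
) -> dict[str, set[str]]:
    """Collect technology hints, keyed by the indicator that produced them."""
    hints: dict[str, set[str]] = {}
    lowered = {k.lower(): v for k, v in headers.items()}
    for header in ("server", "x-powered-by"):
        if header in lowered:
            hints.setdefault(header, set()).add(lowered[header])
    for name in cookie_names:
        for prefix, platform in COOKIE_HINTS.items():
            # Prefix match because IIS appends a random suffix to ASPSESSIONID.
            if name.upper().startswith(prefix):
                hints.setdefault("cookie", set()).add(platform)
    for url in urls:
        ext = posixpath.splitext(urlsplit(url).path)[1].lower()
        if ext in EXTENSION_HINTS:
            hints.setdefault("extension", set()).add(EXTENSION_HINTS[ext])
    return hints


hints = fingerprint_stack(
    {"Server": "Apache/2.4.41", "X-Powered-By": "PHP/7.3"},
    ["PHPSESSID"],
    ["https://shop.example.test/shop/cart.php?id=1"],
)
print(hints)
```

Keeping the hints grouped by indicator makes disagreements visible: if the cookie name suggests one platform and the file extensions another, that mismatch is itself worth investigating.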
ACTION: Reason about the server-side implementation by analyzing request structure, parameter names, and application behavior — treat every observable artifact as a clue about how the server processes requests.
WHY: Understanding what the server is doing enables identification of vulnerability classes that are not yet visible from the application's surface. Parameters named OrderBy suggest database queries where the value may be used directly in an SQL ORDER BY clause. Parameters named template or loc suggest file retrieval that may be vulnerable to path traversal. Boolean parameters set to false may control functionality that attackers benefit from setting to true.
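The parameter-name reasoning in this step can be captured as a lookup from names to candidate vulnerability classes. The following sketch is a heuristic only: the name sets follow the examples given in this step, the `/search` URL is hypothetical, and a match means "worth testing", not "vulnerable". Names such as to, from, and subject are only meaningful in mail-sending functions, which a name-based lookup cannot tell.

```python
from urllib.parse import urlsplit, parse_qsl

# Name lists follow the heuristics in this step; they are starting points, not proof.
PARAM_HINTS = {
    "SQL ORDER BY injection": {"orderby", "sort", "sortfield"},
    "path traversal or server-side include": {"template", "page", "include", "file", "path", "loc"},
    "open redirect": {"redirect", "url", "next", "returnurl"},
    "email header injection": {"to", "from", "subject"},
    "access control bypass via toggling": {"isexpired", "isadmin", "edit", "debug"},
}


def candidate_classes(url: str) -> dict[str, list[str]]:
    """Map each query parameter name to the vulnerability classes its name suggests."""
    result: dict[str, list[str]] = {}
    for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        matches = [cls for cls, names in PARAM_HINTS.items() if name.lower() in names]
        if matches:
            result[name] = matches
    return result


print(candidate_classes("/search?q=shoes&sort=price&debug=false"))
```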
Analysis approach:
- OrderBy, sort, sortField parameters → SQL ORDER BY injection candidates
- template, page, include, file, path parameters → path traversal or server-side include candidates
- redirect, url, next, returnUrl parameters → open redirect candidates
- to, from, subject parameters in mail-sending functions → email header injection candidates
- isExpired, isAdmin, edit, debug Boolean parameters → access control bypass candidates by toggling the value

ACTION: For each functional area identified, assign the most likely vulnerability classes based on the behavior and technology patterns observed.
WHY: Attack surface mapping is only actionable when it produces a prioritized test plan. A behavior-to-vulnerability matrix translates the reconnaissance findings into specific things to test, preventing both the unfocused "test everything" approach and the risk of missing high-probability vulnerability areas.
Apply this mapping:
| Observed Behavior or Functionality | Primary Vulnerability Classes to Investigate |
|---|---|
| Client-side input validation in forms | Server-side validation bypass (checks may not be replicated on server) |
| Database interaction (search, filtering, ordering) | SQL injection (CWE-89) |
| File upload or download functionality | Path traversal (CWE-22), stored cross-site scripting |
| Display of user-supplied data | Cross-site scripting (CWE-79, reflected and stored) |
| Dynamic redirects (redirect, next, returnUrl parameters) | Open redirect (CWE-601), header injection |
| Social features (user profiles, messaging) | Username enumeration, stored cross-site scripting |
| Login functionality | Username enumeration, weak credential policies, brute-force susceptibility |
| Multi-step login or checkout workflows | Business logic flaws, step-skipping |
| Session tokens issued by server | Predictable token generation, insecure token handling |
| Access control (privilege levels, roles) | Horizontal privilege escalation (CWE-639), vertical privilege escalation (CWE-269) |
| User impersonation or "act as" functionality | Privilege escalation |
| HTTP-only communication (no TLS) | Session hijacking, credential interception |
| Off-site links (third-party resources in page) | Query string parameter leakage via Referer header |
| Integration with external systems (payment processors, APIs) | Session shortcutting, access control bypass at integration boundaries |
| Verbose error messages | Information leakage (CWE-209) — internal structure, stack traces, SQL errors |
| Email interaction (contact forms, notification triggers) | Email injection (CWE-93), command injection |
| Native code components or plugins | Buffer overflow (CWE-121) |
| Third-party application components | Known CVEs for identified component and version |
| Identifiable web server software | Configuration weaknesses, known software bugs for identified version |
Prioritize the map by combining two factors: likelihood (how often this vulnerability class appears in this technology) and impact (what an attacker gains if it is present). Authentication bypass, SQL injection, and access control flaws are typically highest priority.
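One simple way to combine the two factors is to score each finding and sort by the result. The text does not specify how likelihood and impact should be combined, so the multiplication, the 1 to 5 scales, and the example scores below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    area: str
    vulnerability_class: str
    likelihood: int  # 1 (rare for this stack) to 5 (common); assumed scale
    impact: int      # 1 (minor) to 5 (full compromise); assumed scale

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings by descending score; ties keep their input order."""
    return sorted(findings, key=lambda f: f.score, reverse=True)


# Hypothetical scores for three areas of an e-commerce application.
plan = prioritize([
    Finding("search sort parameter", "SQL injection", likelihood=4, impact=5),
    Finding("off-site links", "Referer leakage", likelihood=3, impact=2),
    Finding("/admin/BulkDelete", "access control bypass", likelihood=5, impact=5),
])
for rank, item in enumerate(plan):
    print(f"P{rank} {item.area}: {item.vulnerability_class} (score {item.score})")
```

The numbers matter less than the ordering they force: writing down a likelihood and an impact for every row of the matrix makes it harder to skip a high-impact area just because it is tedious to test.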
- links will miss large portions of the application.
- Referer, User-Agent, X-Forwarded-For, and Host headers are processed by many applications for logging, analytics, access control, and content personalization. Treating them as read-only is a testing oversight. WHY: headers that are logged are often concatenated into queries or log entries without sanitization, so the same classes of injection that affect form fields apply to header values that are processed server-side.

Scenario: External penetration test of an e-commerce platform
Trigger: "I'm starting a pentest on a client's shopping site. I have authorization and they've given me a staging environment URL."
Process:
- robots.txt discloses /admin/ and /staging-api/, both unlinked from the visible application
- /admin/: finds Admin, AdminLogin, Dashboard, ExportUsers, BulkDelete; ExportUsers returns 302 to login, BulkDelete returns 200 directly
- Server: Apache/2.4.41, X-Powered-By: PHP/7.3, session cookie named PHPSESSID, /shop/ uses REST-style URLs
- X-Forwarded-For header processed for free-shipping threshold check
- /admin/BulkDelete (no auth required, 200 response) → access control bypass; REST URL product IDs → path traversal; X-Forwarded-For processed → injection; checkout discount_code parameter → business logic; search sort parameter → SQL injection candidate

Output: Prioritized test plan with BulkDelete access control bypass as P0, SQL injection in sort parameter as P1, X-Forwarded-For injection as P2.
Scenario: Security review of an internal HR application source code
Trigger: "Can you review our internal HR application for security issues? Here's the repo."
Process:
- routes.py and urls.py: identifies 84 endpoints including /api/admin/export-all-employees not referenced in the frontend
- manager_id parameter that accepts arbitrary integers and appears in a raw string format SQL query
- requirements.txt for outdated dependencies; find django==2.2.0 (end-of-life, multiple known CVEs)
- X-Employee-Level header used in authorization logic without validation
- # TODO: add auth check comment above three API endpoints
- manager_id, missing authentication on three endpoints, authorization bypass via X-Employee-Level spoofing, Django version vulnerabilities

Output: Written security assessment with four critical/high findings and remediation guidance.
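The manager_id finding in this scenario is the kind of pattern a source review can demonstrate directly. The sketch below is not the application's code: it uses an in-memory `sqlite3` database as a stand-in, and the `employees` table, its rows, and both function names are hypothetical. It contrasts string-formatted SQL with a parameterized query.

```python
import sqlite3

# Hypothetical schema and data standing in for the HR application's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", 10), (2, "Ben", 10), (3, "Cy", 20)],
)


def reports_unsafe(manager_id: str) -> list[tuple]:
    # Vulnerable pattern: the value is formatted straight into the SQL text.
    query = "SELECT name FROM employees WHERE manager_id = %s" % manager_id
    return conn.execute(query).fetchall()


def reports_safe(manager_id: str) -> list[tuple]:
    # Parameterized: the driver binds the value, so it is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM employees WHERE manager_id = ?", (manager_id,)
    ).fetchall()


payload = "10 OR 1=1"
print(len(reports_unsafe(payload)))  # 3: the injected condition matches every row
print(len(reports_safe(payload)))    # 0: the whole string is compared as one value
print(len(reports_safe("10")))       # 2: a legitimate value still works
```

A short reproduction like this, attached to the finding, tends to make the remediation guidance easier for developers to act on than a description alone.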
Scenario: Bug bounty reconnaissance on a SaaS product
Trigger: "I want to map the attack surface of this SaaS product before I start testing — it's in my bug bounty scope."
Process:
- site:target.com in Google; Wayback Machine reveals /v1/api/ endpoints from 3 years ago; test whether they still respond
- User-Agent triggers a different code path with less aggressive rate limiting on the login endpoint

Output: Attack surface map covering 6 subdomains, 312 endpoints, 4 distinct user roles, legacy API still live, and login brute-force vector on mobile code path.
This skill is licensed under CC-BY-SA-4.0.
Source: BookForge — The Web Application Hacker's Handbook: Finding and Exploiting Security Flaws by Dafydd Stuttard, Marcus Pinto.
This skill is standalone.