1. Insight
Insight
The problem this article addresses and why it matters.
URLs are JSON in disguise
A URL like https://example.com/oauth/callback?state=eyJrIjoidXNlcl81NDIxIiwiZXhwIjoxNzQ3OTI5ODkyfQ&redirect_uri=https%3A%2F%2Fapp.example.com%2Fhome%3Ftab%3Ddashboard&code=auth_xyz789 is a serialised structure containing a base64-encoded JSON state token, a URL-encoded redirect URI that itself has a query parameter, and an opaque authorisation code. To understand what just happened in an OAuth flow, you need to decode every layer.
Standard URL parsing (new URL(href)) handles the surface: protocol, host, path, query parameters as a flat key-value map. It doesn't decode the contents of query parameter values. A debugging session that starts with "why isn't the redirect working?" ends with 20 minutes of manually base64-decoding the state parameter, URL-decoding the redirect_uri, then realising the inner redirect_uri has its own query parameter that needed a separate decode.
Why a deep-parsing tool
The tool in this article does the deep work automatically. Pass deepQueryParse: true and the tool recursively decodes every query parameter value, detecting whether each value is plain text, base64, URL-encoded, JWT, or a nested JSON / URL. The output gives you the surface-level parse (protocol, host, path) plus a deepQuery map where each parameter is annotated with its detected type and parsed contents.
It also runs security flags across the whole URL: credentials in the query string (a known anti-pattern that exposes secrets in log files), open-redirect patterns (redirect_uri pointing at a different host), excessively long parameter values (a DoS vector for parsers that don't bound their input).
What this article delivers
End-to-end walks of parsing an OAuth callback URL, a webhook delivery URL with embedded tracking parameters, and a URL containing credentials in the query string. We cover the security-flag output, the cases where deep parsing surfaces structure the developer didn't realise was there, and the cases where the deep parser is the wrong tool (a query parameter value that's structurally similar to JSON but is actually a serialised proprietary format).
2. Intent
Intent
What you will be able to do after reading.
By the end of this article you will be able to:
- Parse any URL into protocol, host, path segments, flat query parameters, and hash fragment
- Use deep-query mode to recursively decode base64-encoded, URL-encoded, JWT, JSON-shaped, and nested-URL query parameter values
- Read the security flags output that surfaces credentials in query strings, open-redirect patterns, and excessively long parameter values
- Trace OAuth callbacks, webhook deliveries, and analytics-instrumented URLs back to their original semantic structure
- Recognise the cases where the deep parser misinterprets a value (a base64-shaped opaque ID that isn't actually base64-encoded JSON)
The Examples section walks through an OAuth callback, a webhook URL with credentials, and a deeply-nested redirect chain.
3. Examples
Examples
Annotated code and worked scenarios.
Before / after: parsing an OAuth callback
urlParser({
url: 'https://example.com/oauth/callback?state=eyJrIjoidXNlcl81NDIxIiwiZXhwIjoxNzQ3OTI5ODkyfQ&redirect_uri=https%3A%2F%2Fapp.example.com%2Fhome%3Ftab%3Ddashboard&code=auth_xyz789',
decode: true,
deepQueryParse: true,
});
// valid: true
// protocol: 'https:'
// host: 'example.com'
// port: null
// path: '/oauth/callback'
// pathSegments: ['oauth', 'callback']
// query: {
// state: 'eyJrIjoidXNlcl81NDIxIiwiZXhwIjoxNzQ3OTI5ODkyfQ',
// redirect_uri: 'https://app.example.com/home?tab=dashboard',
// code: 'auth_xyz789',
// }
// hash: ''
// deepQuery: {
// state: {
// rawValue: 'eyJrIjoidXNlcl81NDIxIiwiZXhwIjoxNzQ3OTI5ODkyfQ',
// decodedValue: '{"k":"user_5421","exp":1747929892}',
// type: 'base64',
// parsed: { k: 'user_5421', exp: 1747929892 },
// },
// redirect_uri: {
// rawValue: 'https%3A%2F%2Fapp.example.com%2Fhome%3Ftab%3Ddashboard',
// decodedValue: 'https://app.example.com/home?tab=dashboard',
// type: 'url',
// parsed: { protocol: 'https:', host: 'app.example.com', path: '/home', query: { tab: 'dashboard' } },
// },
// code: {
// rawValue: 'auth_xyz789',
// decodedValue: 'auth_xyz789',
// type: 'plain',
// },
// }
// securityFlags: []The flat parse shows the surface URL. The deepQuery field is where the value is — the state parameter is base64-decoded and shown to be a JSON object with a user ID and expiry; the redirect_uri is URL-decoded and recursively parsed (it has its own query.tab parameter).
Debugging "why isn't the redirect working?" now takes 30 seconds: read the deep parse output and check if the inner redirect URL's host is what you expect.
Before / after: credentials in query string
urlParser({
url: 'https://api.example.com/v1/users?api_key=sk_live_abc123def456&limit=100',
decode: true,
});
// query: { api_key: 'sk_live_abc123def456', limit: '100' }
// securityFlags: [
// {
// issue: 'API key detected in query string. Query parameters appear in server access logs, browser history, and Referer headers — credentials should be in Authorization header instead.',
// severity: 'critical',
// param: 'api_key',
// },
// ]The security flag catches the anti-pattern: API keys in query strings get logged by every HTTP middleware on the path. The fix is Authorization: Bearer ... headers. The flag surfaces immediately rather than waiting for the credential to leak into a log aggregator.
Before / after: open redirect detection
urlParser({
url: 'https://example.com/redirect?to=https%3A%2F%2Fevil.com%2Fphishing',
decode: true,
deepQueryParse: true,
});
// deepQuery: {
// to: {
// rawValue: 'https%3A%2F%2Fevil.com%2Fphishing',
// decodedValue: 'https://evil.com/phishing',
// type: 'url',
// parsed: { protocol: 'https:', host: 'evil.com', ... },
// },
// }
// securityFlags: [
// {
// issue: 'Redirect URI points at a different host than the URL origin (example.com → evil.com). Verify this is intentional; open-redirect patterns enable phishing.',
// severity: 'warning',
// param: 'to',
// },
// ]The cross-host redirect surfaces as a warning. Not every cross-host redirect is malicious — OAuth callbacks legitimately redirect to a different host — but it's a flag worth surfacing for review.
Before / after: webhook URL with embedded JWT
urlParser({
url: 'https://webhook.example.com/event?token=eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted',
deepQueryParse: true,
});
// deepQuery: {
// token: {
// rawValue: 'eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted',
// decodedValue: '<JWT token>',
// type: 'jwt',
// parsed: {
// header: { alg: 'HS256' },
// payload: { user: 1, exp: 1747929892 },
// signaturePresent: true,
// },
// },
// }
// securityFlags: [
// {
// issue: 'JWT detected in query string. Token expiry: 2026-05-22T15:24:52Z (4 days remaining). Like other credentials, JWTs in query parameters leak via logs and Referer headers.',
// severity: 'high',
// param: 'token',
// },
// ]The parser decodes the JWT — alg, payload claims, signature presence — and the security flag surfaces the leakage risk. Useful for incident response when investigating "did this JWT actually fire?" without manually base64-decoding the segments.
When humans use this
The dominant workflow is debugging an unexpected URL. An OAuth callback that's redirecting wrong, a webhook URL with an unfamiliar state parameter, a tracking URL that turned out to encode an entire JSON payload. Paste the URL, scan the deepQuery output for the layer that holds the actual answer.
When agents use this
Three patterns:
- OAuth-flow debugging agent. An agent diagnosing why an OAuth integration fails calls the parser on the callback URL, reads the
stateandredirect_urifrom thedeepQueryoutput, and verifies them against the expected values. Most OAuth issues are visible at this layer. - Security-audit agent. A scheduled agent scans log files for URLs and runs each through the parser with
deepQueryParse: true. URLs withsecurityFlagsare aggregated into a daily report. Credentials in query strings, open-redirect patterns, and unusually long parameter values all surface. - Webhook-test harness. An agent generating test webhook deliveries constructs URLs with intentionally-structured
stateparameters and verifies the receiving handler parses them correctly via this tool.
Edge cases
Internationalised domain names (IDN)
URLs with non-ASCII characters in the host (https://日本.example) are decoded to Unicode in the host field. The Punycode-encoded form (https://xn--wgv71a.example) is preserved separately. Both are valid representations; the tool's output gives both so the consumer picks the right one.
Fragment-encoded auth tokens
OAuth 2.0 implicit flow places the access token in the URL fragment (#access_token=...). The tool parses the fragment as if it were a query string when decodeFragment: true (default false). Useful for capturing tokens delivered via the implicit flow when you control the client-side parsing.
Percent-encoding in the path
Path segments can contain percent-encoded characters (/products/screws%2Fbolts). The tool decodes them by default; pass decode: false to preserve the raw form. Some path schemes use percent-encoding as a meaningful separator (the %2F was meant to be a literal slash, not a path separator).
Maximum length
URLs over 8192 characters are rejected by most servers and parsers. The tool surfaces a warning at 4096 (the practical limit) and returns INPUT_TOO_LARGE at 65536 (the spec ceiling for some implementations).
4. Documentation
Documentation
Reference signatures, edge cases, and lookup tables.
Input parameters
Field | Type | Required | Default | Description |
|---|---|---|---|---|
|
| ✓ | — | The URL to parse |
|
| ✗ |
| Percent-decode path segments and query values |
|
| ✗ |
| Recursively decode query parameter values |
|
| ✗ |
| Parse the URL fragment as a query string (OAuth implicit flow) |
Output shape
{
valid: boolean;
protocol: string; // 'https:'
host: string;
port: string | null;
path: string;
pathSegments: string[];
query: Record<string, string>; // flat decoded params
hash: string;
deepQuery?: Record<string, { // when deepQueryParse: true
rawValue: string;
decodedValue: string;
type: 'plain' | 'base64' | 'jwt' | 'url' | 'json';
parsed?: object; // structured parse when type is jwt / url / json / base64-of-json
}>;
securityFlags: Array<{
issue: string;
severity: 'critical' | 'high' | 'warning' | 'info';
param?: string;
}>;
}Deep-parse type detection heuristics
Type | Trigger |
|---|---|
| Three base64url segments separated by dots, decodes to JSON object in segment 1 |
| All base64url chars, decodes to readable text or JSON |
| Decoded value parses as a URL (starts with |
| Decoded value parses as JSON (starts with |
| None of the above |
Common security flags
Severity | Trigger |
|---|---|
| Known credential patterns in query ( |
| URL contains basic auth ( |
| JWT in query parameter (token leakage via logs / Referer) |
| API key or Bearer token in query parameter |
| Open-redirect pattern (cross-host redirect_uri) |
| Parameter value over 2KB (potential DoS or abuse) |
| Hash fragment contains tokens (implicit-flow tokens leak via Referer) |
Error codes
Code | When it fires | Recovery |
|---|---|---|
|
| Provide a URL |
| URL parse failed (no scheme, invalid host) | Verify the URL is well-formed (try in a browser) |
| URL exceeds 65536 chars | Truncate or use a URL shortener (e.g. |
| URL uses a scheme this tool doesn't parse ( | Use a scheme-specific parser |
When NOT to use this tool
For URL building (taking parameters and producing a URL), use the runtime's URL constructor (new URL() in Node / browsers). The parser is for the reverse direction.
For URL canonicalisation (resolving relative URLs, removing duplicate slashes, sorting query parameters), use a canonicalisation library. The parser preserves the input form; it doesn't normalise.
For very high-throughput URL parsing (millions per second, e.g. log processing), inline the URL parsing in your processor with a streaming-friendly library. This tool's HTTP overhead dominates for short URLs.
Performance notes
Typical execution: under 3ms for URLs under 1KB. Deep query parsing adds 5-20ms depending on parameter count and nesting depth. The tool is deterministic — same input + same parameters always produce byte-identical output — so REST responses are Edge-Cache eligible.
The deep-query heuristics are tuned for common patterns (OAuth state, webhook tokens, redirect URIs). Domain-specific encoded values (a proprietary serialisation that looks base64-shaped but isn't) may be misclassified; the type field surfaces the heuristic decision so the consumer can override.