1. Insight
The problem this article addresses and why it matters.
Base64 strings are opaque until they aren't
You receive a base64-encoded string from an API response, a webhook payload, or a log entry. What's inside? Standard base64 decoders just give you bytes. The bytes might be JSON, might be a JWT, might be a PEM-encoded certificate, might be the contents of an image file. Without knowing what's inside, you can't know what to do next.
The tool in this article adds an identify mode on top of standard encode/decode. Paste any base64 string, get back: which variant it uses (standard, URL-safe, MIME), what the decoded content actually is (JSON, JWT, image, PEM certificate, protobuf, plain text), and a preview of the decoded content. If it's a JWT, the header and payload come back parsed.
This solves the daily "what is this opaque blob?" question that comes up in API debugging, log analysis, and forensic work.
What this article delivers
Three modes walked end-to-end: standard encode/decode against UTF-8 text, the identify mode against representative payloads (JSON, JWT, PEM, image), and the variant detection that handles standard vs URL-safe vs MIME base64 automatically.
2. Intent
What you will be able to do after reading.
By the end of this article you will be able to:
- Encode and decode UTF-8 strings to and from base64 in standard, URL-safe, or MIME variants
- Use identify mode to determine which variant an opaque base64 string uses, what the decoded content is, and a preview of the content
- Recognise JWT-shaped base64 inputs and parse the header / payload into structured objects
- Distinguish JSON, JWT, image, PEM certificate, protobuf, and plain-text content automatically from decoded bytes
- Handle the URL-safe variant (- / _ instead of + / /) without per-call manual translation
The Examples section walks through encode, decode, and identify against representative payloads.
3. Examples
Annotated code and worked scenarios.
Before / after: identify an opaque blob
You found this in a log file:
eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted
base64Codec({
input: 'eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted',
mode: 'identify',
});
// output: '<JWT — decoded below>'
// variant: 'url-safe'
// contentType: 'jwt'
// contentPreview: '{"alg":"HS256"}{"user":1,"exp":1747929892}<signature>'
// jwtParts: {
// header: { alg: 'HS256' },
// payload: { user: 1, exp: 1747929892 },
// signaturePresent: true,
// }
Three pieces of information come back from one call: the variant (URL-safe), the content type (JWT), and the structured parse. There is no manual base64 decode, then JSON parse, then header-and-payload extraction; the tool collapses all three steps.
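For comparison, the three manual steps can be sketched as follows. This is an illustrative sketch, not the tool's implementation: `decodeBase64Url` and `parseJwtManually` are hypothetical names, and the code assumes a runtime that provides the global `atob` and `TextDecoder` (modern browsers, recent Node.js). It does not verify the signature.

```typescript
// Hypothetical helper: decode one unpadded base64url segment to a UTF-8 string.
function decodeBase64Url(segment: string): string {
  // Map the URL-safe alphabet back to the standard one.
  const standard = segment.replace(/-/g, "+").replace(/_/g, "/");
  // Restore the padding that base64url normally omits.
  const padded = standard + "=".repeat((4 - (standard.length % 4)) % 4);
  const binary = atob(padded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new TextDecoder().decode(bytes);
}

interface ManualJwtParts {
  header: unknown;
  payload: unknown;
  signaturePresent: boolean;
}

// Hypothetical helper: split, decode, parse. The signature is NOT verified.
function parseJwtManually(token: string): ManualJwtParts {
  const segments = token.split(".");
  if (segments.length !== 3) {
    throw new Error("expected three dot-separated segments");
  }
  const [headerSeg, payloadSeg, signatureSeg] = segments as [string, string, string];
  return {
    header: JSON.parse(decodeBase64Url(headerSeg)),
    payload: JSON.parse(decodeBase64Url(payloadSeg)),
    signaturePresent: signatureSeg.length > 0,
  };
}

const manualParts = parseJwtManually(
  "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted",
);
console.log(manualParts.header, manualParts.payload);
```

Parsing a token this way only reveals its claims; it says nothing about whether the token is authentic.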
Before / after: standard encode/decode
base64Codec({
input: 'Hello, world!',
mode: 'encode',
variant: 'standard',
});
// output: 'SGVsbG8sIHdvcmxkIQ=='
// variant: 'standard'
// inputBytes: 13
// outputBytes: 20
Same content, URL-safe variant:
base64Codec({
input: 'Hello, world!',
mode: 'encode',
variant: 'url-safe',
});
// output: 'SGVsbG8sIHdvcmxkIQ' // no padding, no '+' or '/'
URL-safe drops the = padding characters and uses - and _ in place of + and /. It is used in JWTs, OAuth state tokens, and anywhere a base64 string lands in a URL.
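The relationship between the two alphabets is a two-character swap plus padding, which can be sketched as below. The helper names `toUrlSafe` and `toStandard` are hypothetical and are not part of the tool's API.

```typescript
// Hypothetical helper: standard base64 -> unpadded URL-safe base64.
function toUrlSafe(standard: string): string {
  return standard.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Hypothetical helper: URL-safe base64 -> padded standard base64.
function toStandard(urlSafe: string): string {
  const swapped = urlSafe.replace(/-/g, "+").replace(/_/g, "/");
  // Pad back up to a multiple of four characters.
  const missing = (4 - (swapped.length % 4)) % 4;
  return swapped + "=".repeat(missing);
}

console.log(toUrlSafe("SGVsbG8sIHdvcmxkIQ==")); // SGVsbG8sIHdvcmxkIQ
console.log(toStandard("SGVsbG8sIHdvcmxkIQ")); // SGVsbG8sIHdvcmxkIQ==
```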
Before / after: decoding without knowing the variant
base64Codec({
input: 'SGVsbG8sIHdvcmxkIQ', // could be either variant
mode: 'decode',
});
// output: 'Hello, world!'
// variant: 'url-safe' (auto-detected from the absence of '+', '/', '=')
The decoder auto-detects the variant from the input characters, which is useful when consuming base64 from an unknown source.
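One way such detection could work is a character-based heuristic like the sketch below. `detectVariant` is a hypothetical name and the rules are assumptions modelled on the example above; the tool's actual detection logic may differ. The sketch does not validate the input.

```typescript
type Base64Variant = "standard" | "url-safe" | "mime";

// Hypothetical heuristic; assumes the input is otherwise valid base64.
function detectVariant(input: string): Base64Variant {
  // MIME base64 is wrapped into lines, so embedded line breaks are the signal.
  if (/[\r\n]/.test(input.trim())) return "mime";
  // '+', '/' and '=' belong to the standard alphabet and its padding.
  if (/[+\/=]/.test(input)) return "standard";
  // '-' or '_' present, or nothing distinguishing at all: report URL-safe,
  // mirroring the example above.
  return "url-safe";
}

console.log(detectVariant("SGVsbG8sIHdvcmxkIQ")); // url-safe
console.log(detectVariant("SGVsbG8sIHdvcmxkIQ==")); // standard
```

A string with no distinguishing characters is valid in more than one variant, so any heuristic of this kind has to pick a default for the ambiguous case.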
Before / after: identifying decoded content types
Identify mode classifies the decoded bytes:
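Decoded content | Classified as |
|---|---|
Starts with a JSON opening character | JSON |
Three base64 segments separated by . | JWT |
Starts with a PEM header line | PEM certificate |
PNG, JPEG, GIF magic bytes | Image |
Looks like protobuf wire format | Protobuf |
Printable ASCII / UTF-8 | Plain text |
Everything else | Unclassified binary |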
When the content is a PEM certificate, the preview includes the PEM headers so the consumer can identify the cert type. When it's an image, the preview shows the image dimensions and format.
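A classification pass over decoded bytes can be sketched as a chain of prefix checks. Everything here is an assumption for illustration: `classifyBytes` and `ContentGuess` are hypothetical, the labels are not the tool's actual contentType values, text detection is limited to ASCII for brevity, and the JWT and protobuf checks are omitted (JWT detection operates on the encoded string, before decoding).

```typescript
type ContentGuess = "json" | "pem" | "png" | "jpeg" | "gif" | "text" | "unknown";

// True when `bytes` begins with the given byte sequence.
function startsWithBytes(bytes: Uint8Array, prefix: number[]): boolean {
  if (bytes.length < prefix.length) return false;
  return prefix.every((value, index) => bytes[index] === value);
}

// Returns the bytes as a string if they are all printable ASCII or whitespace.
function asPrintableAscii(bytes: Uint8Array): string | null {
  const codes = Array.from(bytes);
  const printable = codes.every(
    (b) => (b >= 0x20 && b <= 0x7e) || b === 0x09 || b === 0x0a || b === 0x0d,
  );
  return printable ? codes.map((c) => String.fromCharCode(c)).join("") : null;
}

// Hypothetical classifier: binary signatures first, then text-based formats.
function classifyBytes(bytes: Uint8Array): ContentGuess {
  if (bytes.length === 0) return "unknown";
  if (startsWithBytes(bytes, [0x89, 0x50, 0x4e, 0x47])) return "png"; // \x89PNG
  if (startsWithBytes(bytes, [0xff, 0xd8, 0xff])) return "jpeg"; // JPEG start marker
  if (startsWithBytes(bytes, [0x47, 0x49, 0x46, 0x38])) return "gif"; // "GIF8"
  const text = asPrintableAscii(bytes);
  if (text === null) return "unknown";
  if (text.startsWith("-----BEGIN ")) return "pem";
  const trimmed = text.trim();
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) return "json";
  return "text";
}

console.log(classifyBytes(new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a]))); // png
```

Checking binary signatures before text formats keeps the cheap, unambiguous tests first; a stricter JSON check would attempt a full parse rather than look at the first character.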
When humans use this
The dominant use is the "what is this?" workflow on opaque payloads — pasting from a log, an API response, or a captured webhook. The JWT detection is the highest-frequency hit (most modern systems use JWT-shaped base64 for auth tokens).
When agents use this
Two patterns:
- Payload classification. An agent receiving an opaque base64 blob from an upstream system runs identify mode first to determine what to do with the content. Branch on contentType: JSON gets parsed, JWT gets validated, image gets passed to a vision tool.
- Log analysis. An agent processing log entries with embedded base64 strings classifies each and extracts the meaningful structure (JWT payload, JSON object). The same pipeline handles heterogeneous logs.
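The payload-classification pattern amounts to a small dispatch on contentType, sketched below. `IdentifyResult` mirrors a subset of the documented output shape; `routeByContentType` and the `NextStep` labels are hypothetical. Only the 'jwt' value appears in this article's example output, so 'json' and the 'image' prefix are assumed values.

```typescript
// Subset of the output shape documented in this article.
interface IdentifyResult {
  output: string;
  contentType?: string;
  jwtParts?: { header: object; payload: object; signaturePresent: boolean };
}

type NextStep = "parse-json" | "validate-jwt" | "send-to-vision" | "log-and-skip";

// Hypothetical router: 'json' and the 'image' prefix are assumed values.
function routeByContentType(result: IdentifyResult): NextStep {
  const kind = result.contentType ?? "";
  if (kind === "jwt") return "validate-jwt";
  if (kind === "json") return "parse-json";
  if (kind.startsWith("image")) return "send-to-vision";
  // Anything unrecognised is recorded rather than guessed at.
  return "log-and-skip";
}

console.log(routeByContentType({ output: "<JWT>", contentType: "jwt" })); // validate-jwt
```

Keeping an explicit fallback branch means new content types degrade to logging instead of being misrouted.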
Edge cases
Padding mismatch
Standard base64 pads the encoded string with = to a multiple of four characters; URL-safe omits the padding. Decoders should accept both. The tool's decoder is lenient: it accepts padded and unpadded input in either variant. Encoded output is canonical for the chosen variant.
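A lenient decoder of this kind can be sketched by normalising every input to padded standard base64 before decoding. `decodeLenient` is a hypothetical name, not the tool's internal function, and the sketch assumes a runtime with the global `atob`.

```typescript
// Hypothetical lenient decoder: padded or unpadded input, either alphabet.
function decodeLenient(input: string): Uint8Array {
  // Normalise to the standard alphabet and drop whatever padding is present.
  const standard = input.replace(/-/g, "+").replace(/_/g, "/").replace(/=+$/, "");
  // Re-add exactly the padding the length requires.
  const padded = standard + "=".repeat((4 - (standard.length % 4)) % 4);
  const binary = atob(padded); // throws on characters outside the alphabet
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// All four spellings decode to the same two bytes, 0xFF 0xEF.
for (const form of ["/+8=", "/+8", "_-8=", "_-8"]) {
  console.log(form, Array.from(decodeLenient(form)));
}
```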
Truncated input
Truncated base64 (the producer cut the string before the end of an encoded byte) returns INPUT_MALFORMED. The decoder doesn't attempt partial-byte recovery.
Non-base64 characters
Whitespace inside the input, including the line breaks that the MIME variant explicitly allows, is stripped before decoding. Any other character outside the base64 alphabet triggers INPUT_MALFORMED.
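The truncation and invalid-character rules can be sketched as a pre-decode check. `precheckBase64` and `PrecheckResult` are hypothetical; only the INPUT_MALFORMED code comes from this article, and the `reason` strings are illustrative.

```typescript
type PrecheckResult =
  | { ok: true; cleaned: string }
  | { ok: false; code: "INPUT_MALFORMED"; reason: string };

// Hypothetical pre-decode check following the rules described above.
function precheckBase64(input: string): PrecheckResult {
  // MIME output is line-wrapped, so whitespace is stripped rather than rejected.
  const cleaned = input.replace(/\s+/g, "");
  // Accept both alphabets; padding may only appear at the end (at most two '=').
  if (!/^[A-Za-z0-9+\/_-]*={0,2}$/.test(cleaned)) {
    return { ok: false, code: "INPUT_MALFORMED", reason: "character outside the base64 alphabet" };
  }
  // An unpadded length of 1 mod 4 cannot encode whole bytes: the string was cut.
  const unpadded = cleaned.replace(/=+$/, "");
  if (unpadded.length % 4 === 1) {
    return { ok: false, code: "INPUT_MALFORMED", reason: "truncated input" };
  }
  return { ok: true, cleaned };
}

console.log(precheckBase64("SGVs\r\nbG8=")); // ok, cleaned to "SGVsbG8="
console.log(precheckBase64("SGVsb")); // INPUT_MALFORMED, truncated input
```

A length check only catches the unambiguous case; a string cut at a point that still leaves a decodable length looks like valid, shorter data.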
4. Documentation
Reference signatures, edge cases, and lookup tables.
Input parameters
Field | Type | Required | Default | Description |
|---|---|---|---|---|
input | string | ✓ | — | The string to encode, decode, or identify |
mode | 'encode', 'decode', or 'identify' | ✓ | — | Workflow selector |
variant | 'standard', 'url-safe', or 'mime' | ✗ | auto-detect for decode | Base64 variant |
| | | ✗ | | Encoding of the source string for encode mode |
Output shape
{
output: string;
variant: 'standard' | 'url-safe' | 'mime';
inputBytes: number;
outputBytes: number;
contentType?: string; // identify mode
contentPreview?: string; // identify mode — first 200 chars
jwtParts?: { // identify mode + JWT input
header: object;
payload: object;
signaturePresent: boolean;
};
}
Variant differences
Variant | Padding | Special chars |
|---|---|---|
standard | required (=) | + and / |
url-safe | omitted | - and _ |
mime | required (=) | + and /, with line breaks allowed |
Identify mode content classification
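Trigger | contentType |
|---|---|
First byte is a JSON opening character | JSON |
Three base64 segments separated by . | JWT |
Starts with a PEM header line (-----BEGIN) | PEM certificate |
Magic bytes 89 50 4E 47 | PNG image |
Magic bytes FF D8 FF | JPEG image |
Magic bytes 47 49 46 38 | GIF image |
Looks like protobuf wire format | Protobuf |
Printable UTF-8 / ASCII | Plain text |
None of the above | Unclassified binary |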
Base64 vs hex vs base32: which encoding to use
The three common binary-to-text encodings differ in compactness, alphabet, and case sensitivity. The choice rarely matters for tiny payloads but compounds at scale and affects which channels the encoded data can safely travel through.
Property | Base64 | Hex | Base32 |
|---|---|---|---|
Bits per character | 6 | 4 | 5 |
Overhead vs raw bytes | +33% | +100% | +60% |
Alphabet size | 64 | 16 | 32 |
Case-sensitive | Yes (A and a are different symbols) | No (most parsers) | No |
URL-safe by default | No (+, / and = need escaping) | Yes | Yes |
Manual transcription | Hard (case + similar chars) | Easy | Easy |
Typical use | API payloads, JWT, embeds | Hashes, color codes, MAC addresses | TOTP secrets, ULIDs |
Use base64 when you want maximum compactness and the payload travels through systems that preserve case (HTTP headers, JSON bodies, XML). Use the URL-safe variant (- / _ instead of + / /) when the payload sits in a URL path or query parameter.
Use hex when the data is transcribed or read by humans (cryptographic hashes, debugging output, error codes) or when the parsing layer is case-insensitive (CSS colors, MAC addresses, IPv6 literals). The 2× size cost is irrelevant for short fingerprints and the human-readability is worth it.
Use base32 when you need URL-safety + case-insensitivity + better-than-hex compactness. TOTP shared secrets (RFC 6238) are the canonical example — they get printed in QR codes that may degrade and typed by humans into authenticator apps. ULID ids use Crockford-base32 for the same reasons.
For payloads larger than a few KB the size difference becomes meaningful: a 1MB blob is 1.33MB in base64 vs 2MB in hex. For payloads under 100 bytes the overhead is irrelevant — pick the alphabet that fits the channel.
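The size figures follow directly from the bits-per-character row, and the arithmetic can be sketched as below. The function names are hypothetical; the lengths include padding for base64 and base32.

```typescript
// Encoded length in characters for n raw bytes.
function base64Length(n: number): number {
  return 4 * Math.ceil(n / 3); // 3 bytes -> 4 characters
}

function hexLength(n: number): number {
  return 2 * n; // 1 byte -> 2 characters
}

function base32Length(n: number): number {
  return 8 * Math.ceil(n / 5); // 5 bytes -> 8 characters
}

console.log(base64Length(13)); // 20, matching 'Hello, world!' above
console.log(base64Length(1_000_000)); // 1333336
console.log(hexLength(1_000_000)); // 2000000
console.log(base32Length(1_000_000)); // 1600000
```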
Error codes
Code | When it fires | Recovery |
|---|---|---|
| | Input is empty | Provide a non-empty string |
INPUT_MALFORMED | Decode mode and input is not valid base64 | Verify the input is a base64 string |
| | Variant is not one of the supported values | Use one of the three documented variants |
When NOT to use this tool
For binary file encoding (image → base64 for embedding), use image_to_base64 — it handles format conversion, resizing, and LLM-vision message blocks alongside the encoding. This tool is the generic base64 codec; the image-specific tool is the right surface for image workflows.
For non-base64 binary-to-text encodings (base32, base58, base85, hex), use dedicated encoders. Base64 is the most common but not the only option.
Performance notes
Typical execution: under 2ms for inputs under 100KB. Identify mode adds 1-3ms for the content-type detection. Deterministic — same input + same variant produce byte-identical output, so REST responses are Edge-Cache eligible.