Base64 with auto-detect: identifying JSON, JWTs, PEM certs, and image data in opaque payloads

The Base64 Codec encodes and decodes UTF-8 text in standard, URL-safe, or MIME variants. Identify mode auto-detects the variant and classifies the decoded content as JSON, JWT (with parsed header and payload), PEM certificate, image (with detected MIME type), protobuf, plain text, or binary. Because the variant is auto-detected, the decoder accepts any common base64 form.

1. Insight

The problem this article addresses and why it matters.

Base64 strings are opaque until they aren't

You receive a base64-encoded string from an API response, a webhook payload, or a log entry. What's inside? Standard base64 decoders just give you bytes. The bytes might be JSON, might be a JWT, might be a PEM-encoded certificate, might be the contents of an image file. Without knowing what's inside, you can't know what to do next.

The tool in this article adds an identify mode on top of standard encode/decode. Paste any base64 string, get back: which variant it uses (standard, URL-safe, MIME), what the decoded content actually is (JSON, JWT, image, PEM certificate, protobuf, plain text), and a preview of the decoded content. If it's a JWT, the header and payload come back parsed.

This solves the daily "what is this opaque blob?" question that comes up in API debugging, log analysis, and forensic work.

What this article delivers

Three workflows walked end-to-end: standard encode/decode against UTF-8 text, identify mode against representative payloads (JSON, JWT, PEM, image), and variant detection that handles standard, URL-safe, and MIME base64 automatically.

2. Intent

What you will be able to do after reading.

By the end of this article you will be able to:

  • Encode and decode UTF-8 strings to and from base64 in standard, URL-safe, or MIME variants
  • Use identify mode to determine which variant an opaque base64 string uses, what the decoded content is, and a preview of the content
  • Recognise JWT-shaped base64 inputs and parse the header / payload into structured objects
  • Distinguish JSON, JWT, image, PEM certificate, protobuf, and plain-text content automatically from decoded bytes
  • Handle the URL-safe variant (- and _ in place of + and /) without per-call manual translation

The Examples section walks through encode, decode, and identify against representative payloads.

3. Examples

Annotated code and worked scenarios.

Before / after: identify an opaque blob

You found this in a log file:

eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted

base64Codec({
  input: 'eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted',
  mode:  'identify',
});

// output:          '<JWT — decoded below>'
// variant:         'url-safe'
// contentType:     'jwt'
// contentPreview:  '{"alg":"HS256"}{"user":1,"exp":1747929892}<signature>'
// jwtParts: {
//   header:  { alg: 'HS256' },
//   payload: { user: 1, exp: 1747929892 },
//   signaturePresent: true,
// }

Three pieces of information in one call: the variant (URL-safe), the content type (JWT), and the structured parse. No manual base64-decode then JSON-parse then header-and-payload-extract — the tool collapses all three steps.
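
For comparison, the manual path that the tool collapses might look like the sketch below. The helper names (decodeBase64Url, utf8BytesToString, parseJwtParts) are hypothetical and are not part of the tool's API. The sketch only decodes and parses; it does not verify the signature.

```typescript
// Hypothetical helpers for illustration; not the tool's implementation.
const B64URL_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Step 1: decode a padded or unpadded base64url segment into byte values.
function decodeBase64Url(segment: string): number[] {
  const clean = segment.replace(/=+$/, "");
  const bytes: number[] = [];
  let buffer = 0; // bit accumulator
  let bits = 0; // number of unread bits in the accumulator
  for (let i = 0; i < clean.length; i++) {
    const value = B64URL_ALPHABET.indexOf(clean.charAt(i));
    if (value < 0) throw new Error(`invalid base64url character at index ${i}`);
    buffer = ((buffer << 6) | value) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff);
    }
  }
  return bytes;
}

// Interpret bytes as UTF-8 using only standard ECMAScript functions.
function utf8BytesToString(bytes: number[]): string {
  let encoded = "";
  for (let i = 0; i < bytes.length; i++) {
    encoded += "%" + ("0" + bytes[i].toString(16)).slice(-2);
  }
  return decodeURIComponent(encoded);
}

interface JwtParts {
  header: unknown;
  payload: unknown;
  signaturePresent: boolean;
}

// Steps 2 and 3: JSON-parse the first two segments and report the third.
function parseJwtParts(token: string): JwtParts {
  const segments = token.split(".");
  if (segments.length !== 3) {
    throw new Error("expected three dot-separated segments");
  }
  return {
    header: JSON.parse(utf8BytesToString(decodeBase64Url(segments[0]))),
    payload: JSON.parse(utf8BytesToString(decodeBase64Url(segments[1]))),
    signaturePresent: segments[2].length > 0,
  };
}

const parts = parseJwtParts(
  "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoxLCJleHAiOjE3NDc5Mjk4OTJ9.signature_redacted"
);
```

Here parts.header and parts.payload hold the same objects shown in jwtParts above.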

Before / after: standard encode/decode

base64Codec({
  input:    'Hello, world!',
  mode:     'encode',
  variant:  'standard',
});

// output:      'SGVsbG8sIHdvcmxkIQ=='
// variant:     'standard'
// inputBytes:  13
// outputBytes: 20

Same content, URL-safe variant:

base64Codec({
  input:    'Hello, world!',
  mode:     'encode',
  variant:  'url-safe',
});

// output: 'SGVsbG8sIHdvcmxkIQ'  // no padding, no '+' or '/'

In this tool's output, the URL-safe variant drops the = padding characters and uses - and _ in place of + and /. It is used in JWT segments, OAuth state tokens, and anywhere a base64 string lands in a URL.
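
The relationship between the two alphabets is a plain character substitution plus padding handling. A minimal sketch, assuming hypothetical helper names toUrlSafe and toStandard (the tool performs this translation internally and exposes no such functions):

```typescript
// Hypothetical helpers; shown only to illustrate the alphabet mapping.
function toUrlSafe(standard: string): string {
  return standard
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, ""); // drop trailing padding
}

function toStandard(urlSafe: string): string {
  const swapped = urlSafe.replace(/-/g, "+").replace(/_/g, "/");
  const remainder = swapped.length % 4;
  // A remainder of 1 cannot occur in valid base64, so nothing is added then.
  const padding = remainder === 2 ? "==" : remainder === 3 ? "=" : "";
  return swapped + padding;
}

const urlSafeHello = toUrlSafe("SGVsbG8sIHdvcmxkIQ==");
const standardHello = toStandard("SGVsbG8sIHdvcmxkIQ");
```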

Before / after: decoding without knowing the variant

base64Codec({
  input: 'SGVsbG8sIHdvcmxkIQ',  // could be either variant
  mode:  'decode',
});

// output:  'Hello, world!'
// variant: 'url-safe'  (auto-detected from the absence of '+', '/', '=')

The decoder auto-detects the variant from the input characters. Useful when consuming base64 from an unknown source.
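
One way such detection could work is a character scan. The sketch below is an assumption about the heuristic, not the tool's actual logic: detectVariant is a hypothetical name, line breaks are taken as a sign of MIME wrapping, and input with none of the distinguishing characters is reported as 'url-safe' to mirror the example above.

```typescript
type Variant = "standard" | "url-safe" | "mime";

// Hypothetical heuristic; the tool's real detection rules are not documented here.
function detectVariant(input: string): Variant {
  if (/[\r\n]/.test(input)) return "mime"; // wrapped lines suggest MIME
  if (/[-_]/.test(input)) return "url-safe"; // only the URL-safe alphabet has - and _
  if (/[+\/=]/.test(input)) return "standard"; // +, /, or padding present
  // No distinguishing characters: report 'url-safe', matching the example above.
  return "url-safe";
}
```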

Before / after: identifying decoded content types

Identify mode classifies the decoded bytes:

| Decoded content | contentType |
| --- | --- |
| Starts with { or [, parses as JSON | 'json' |
| Three base64 segments separated by . | 'jwt' |
| -----BEGIN CERTIFICATE----- ... | 'pem-certificate' |
| PNG, JPEG, GIF magic bytes | 'image/png', 'image/jpeg', 'image/gif' |
| Looks like protobuf wire format | 'protobuf' |
| Printable ASCII / UTF-8 | 'plain-text' |
| Everything else | 'binary' |

When the content is a PEM certificate, the preview includes the PEM headers so the consumer can identify the cert type. When it's an image, the preview shows the image dimensions and format.
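
A rough sketch of how byte-level classification along these lines could be written. Everything here is hypothetical (classifyBytes, hasPrefix, tryUtf8 are illustrative names) and the tool's real heuristics may differ. JWT detection is left out because it applies to the raw input rather than the decoded bytes, and protobuf detection is left out because its rule is not specified.

```typescript
// Hypothetical classifier for illustration; not the tool's implementation.
function hasPrefix(bytes: number[], prefix: number[]): boolean {
  if (bytes.length < prefix.length) return false;
  for (let i = 0; i < prefix.length; i++) {
    if (bytes[i] !== prefix[i]) return false;
  }
  return true;
}

// Decode bytes as UTF-8; decodeURIComponent throws on invalid sequences.
function tryUtf8(bytes: number[]): string | null {
  let encoded = "";
  for (let i = 0; i < bytes.length; i++) {
    encoded += "%" + ("0" + bytes[i].toString(16)).slice(-2);
  }
  try {
    return decodeURIComponent(encoded);
  } catch (e) {
    return null;
  }
}

function classifyBytes(bytes: number[]): string {
  // Magic-byte checks come first because they are unambiguous.
  if (hasPrefix(bytes, [0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (hasPrefix(bytes, [0xff, 0xd8, 0xff])) return "image/jpeg";
  if (hasPrefix(bytes, [0x47, 0x49, 0x46, 0x38])) return "image/gif";

  const text = tryUtf8(bytes);
  if (text === null) return "binary";
  if (text.indexOf("-----BEGIN") === 0) return "pem-certificate";

  const first = text.charAt(0);
  if (first === "{" || first === "[") {
    try {
      JSON.parse(text);
      return "json";
    } catch (e) {
      // Not valid JSON; fall through to the text check.
    }
  }

  // Printable here means no control characters other than tab, LF, and CR.
  const printable = !/[\u0000-\u0008\u000b\u000c\u000e-\u001f\u007f]/.test(text);
  return printable ? "plain-text" : "binary";
}
```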

When humans use this

The dominant use is the "what is this?" workflow on opaque payloads: pasting from a log, an API response, or a captured webhook. JWT detection is the highest-frequency hit, since many modern systems use JWTs as auth tokens.

When agents use this

Two patterns:

  • Payload classification. An agent receiving an opaque base64 blob from an upstream system runs identify mode first to determine what to do with the content. Branch on contentType: JSON gets parsed, JWT gets validated, image gets passed to a vision tool.
  • Log analysis. An agent processing log entries with embedded base64 strings classifies each and extracts the meaningful structure (JWT payload, JSON object). The same pipeline handles heterogeneous logs.
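
The branch-on-contentType pattern could be sketched as below. The IdentifyResult interface is a subset of the documented output shape; the NextStep labels and the routeByContentType function are hypothetical placeholders for whatever handlers an agent actually has.

```typescript
// Subset of the tool's documented output; everything else here is hypothetical.
interface IdentifyResult {
  output: string;
  contentType?: string;
}

type NextStep =
  | "parse-json"
  | "validate-jwt"
  | "send-to-vision"
  | "inspect-certificate"
  | "treat-as-text"
  | "store-raw";

function routeByContentType(result: IdentifyResult): NextStep {
  const type = result.contentType || "binary";
  // All image MIME types go to the same handler.
  if (type.indexOf("image/") === 0) return "send-to-vision";
  switch (type) {
    case "json":
      return "parse-json";
    case "jwt":
      return "validate-jwt";
    case "pem-certificate":
      return "inspect-certificate";
    case "plain-text":
      return "treat-as-text";
    default:
      // protobuf, binary, and anything unrecognized are kept as raw bytes.
      return "store-raw";
  }
}
```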

Edge cases

Padding mismatch

Standard base64 pads its output with = to a multiple of 4 characters; URL-safe output omits the padding. Decoders should accept both. The tool's decoder is lenient: it accepts padded and unpadded input in either variant. Output encoding is canonical for the chosen variant.

Truncated input

Truncated base64 (the producer cut the string before the end of an encoded byte) returns INPUT_MALFORMED. The decoder doesn't attempt partial-byte recovery.

Non-base64 characters

Whitespace inside the input, including the line breaks that the MIME variant explicitly allows for wrapped lines, is stripped before decoding. Any other character outside the base64 alphabet triggers INPUT_MALFORMED.
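
The three edge cases in this section can be sketched as a single normalization step that runs before decoding. This is an illustrative assumption about how a lenient decoder might be structured: normalizeForDecode and its result type are hypothetical, and only the error code names come from this article.

```typescript
// Hypothetical pre-decode step; the error codes mirror the ones documented here.
type NormalizeResult =
  | { ok: true; body: string }
  | { ok: false; code: "INPUT_EMPTY" | "INPUT_MALFORMED" };

function normalizeForDecode(input: string): NormalizeResult {
  // Non-base64 characters: whitespace and MIME line wraps are stripped.
  const compact = input.replace(/\s+/g, "");
  if (compact.length === 0) return { ok: false, code: "INPUT_EMPTY" };

  // Padding mismatch: trailing '=' is removed so both forms decode alike.
  const body = compact.replace(/=+$/, "");

  // Anything outside the standard or URL-safe alphabet is rejected.
  if (!/^[A-Za-z0-9+\/_-]*$/.test(body)) {
    return { ok: false, code: "INPUT_MALFORMED" };
  }

  // Truncated input: a length of 4k + 1 characters carries only 6 bits in
  // its last character, which cannot complete a byte.
  if (body.length % 4 === 1) return { ok: false, code: "INPUT_MALFORMED" };

  return { ok: true, body };
}
```

The returned body is unpadded and may still use either alphabet; mapping - and _ to + and / (or the reverse) would happen in the decode step itself.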

4. Documentation

Reference signatures, edge cases, and lookup tables.

Input parameters

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| input | string | yes | | The string to encode, decode, or identify |
| mode | 'encode' \| 'decode' \| 'identify' | yes | | Workflow selector |
| variant | 'standard' \| 'url-safe' \| 'mime' | no | auto-detect for decode | Base64 variant |
| inputEncoding | 'utf-8' \| 'ascii' \| 'binary' | no | 'utf-8' | Encoding of the source string for encode mode |

Output shape

{
  output:          string;
  variant:         'standard' | 'url-safe' | 'mime';
  inputBytes:      number;
  outputBytes:     number;
  contentType?:    string;     // identify mode
  contentPreview?: string;     // identify mode — first 200 chars
  jwtParts?: {                 // identify mode + JWT input
    header:  object;
    payload: object;
    signaturePresent: boolean;
  };
}

Variant differences

| Variant | Padding | Special chars |
| --- | --- | --- |
| standard | = to a 4-character boundary | + and / |
| url-safe | omitted | - and _ |
| mime | = to a 4-character boundary, plus 76-char line wrap | + and / |

Identify mode content classification

| Trigger | contentType |
| --- | --- |
| First byte is { or [, parses as JSON | json |
| Three base64 segments separated by . (header + payload + signature) | jwt |
| Starts with -----BEGIN | pem-certificate |
| Magic bytes 89 50 4E 47 | image/png |
| Magic bytes FF D8 FF | image/jpeg |
| Magic bytes 47 49 46 38 | image/gif |
| Looks like protobuf wire format | protobuf |
| Printable UTF-8 / ASCII | plain-text |
| None of the above | binary |

Base64 vs hex vs base32: which encoding to use

The three common binary-to-text encodings differ in compactness, alphabet, and case sensitivity. The choice rarely matters for tiny payloads but compounds at scale and affects which channels the encoded data can safely travel through.

| Property | Base64 | Hex | Base32 |
| --- | --- | --- | --- |
| Bits per character | 6 | 4 | 5 |
| Overhead vs raw bytes | +33% | +100% | +60% |
| Alphabet size | 64 | 16 | 32 |
| Case-sensitive | Yes (Aa) | No (most parsers) | No |
| URL-safe by default | No (+, /) | Yes | Yes |
| Manual transcription | Hard (case + similar chars) | Easy | Easy |
| Typical use | API payloads, JWT, embeds | Hashes, color codes, MAC addresses | TOTP secrets, ULIDs |

Use base64 when you want maximum compactness and the payload travels through systems that preserve case (HTTP headers, JSON bodies, XML). Use the URL-safe variant (- and _ in place of + and /) when the payload sits in a URL path or query parameter.

Use hex when the data is transcribed or read by humans (cryptographic hashes, debugging output, error codes) or when the parsing layer is case-insensitive (CSS colors, MAC addresses, IPv6 literals). The 2× size cost is irrelevant for short fingerprints and the human-readability is worth it.

Use base32 when you need URL-safety, case-insensitivity, and better-than-hex compactness together. TOTP shared secrets (RFC 6238) are the canonical example: they get printed in QR codes that may degrade and typed by humans into authenticator apps. ULIDs use Crockford's base32 for the same reasons.

For payloads larger than a few KB the size difference becomes meaningful: a 1 MB blob becomes about 1.33 MB in base64 versus 2 MB in hex. For payloads under 100 bytes the overhead is irrelevant, so pick the alphabet that fits the channel.
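
The size figures follow directly from the bits-per-character row in the table. A small sketch of the arithmetic, using hypothetical function names:

```typescript
// Encoded length in characters for n raw bytes (helper names are illustrative).
function base64Length(n: number, padded = true): number {
  // 3 bytes map to 4 characters; padding rounds up to a full group of 4.
  return padded ? 4 * Math.ceil(n / 3) : Math.ceil((4 * n) / 3);
}

function hexLength(n: number): number {
  return 2 * n; // each byte becomes two hex digits
}

function base32Length(n: number, padded = true): number {
  // 5 bytes map to 8 characters; padding rounds up to a full group of 8.
  return padded ? 8 * Math.ceil(n / 5) : Math.ceil((8 * n) / 5);
}

const oneMegabyte = 1000000;
const base64Size = base64Length(oneMegabyte); // 1333336 characters
const hexSize = hexLength(oneMegabyte); // 2000000 characters
```

For the 13-byte "Hello, world!" example, base64Length(13) gives 20 and base64Length(13, false) gives 18, matching the padded and unpadded outputs shown earlier.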

Error codes

| Code | When it fires | Recovery |
| --- | --- | --- |
| INPUT_EMPTY | input empty | Provide a non-empty string |
| INPUT_MALFORMED | Decode mode and input is not valid base64 | Verify the input is a base64 string |
| INPUT_INVALID_TYPE | variant value outside the supported set | Use one of the three documented variants |

When NOT to use this tool

For binary file encoding (image → base64 for embedding), use image_to_base64 — it handles format conversion, resizing, and LLM-vision message blocks alongside the encoding. This tool is the generic base64 codec; the image-specific tool is the right surface for image workflows.

For non-base64 binary-to-text encodings (base32, base58, base85, hex), use dedicated encoders. Base64 is the most common but not the only option.

Performance notes

Typical execution: under 2ms for inputs under 100KB. Identify mode adds 1-3ms for the content-type detection. Deterministic — same input + same variant produce byte-identical output, so REST responses are Edge-Cache eligible.

FAQ

What's the difference between standard and URL-safe?

Standard uses + and / as the special characters and = for padding. URL-safe uses - and _ instead, and typically omits padding. URL-safe is required when the base64 string lands in a URL path or query parameter; standard works everywhere else.

How does identify mode classify content?

Magic-byte detection for images, JSON.parse for JSON, three-segment dot structure for JWT, BEGIN-marker detection for PEM, printable-character ratio for plain text. The classification is heuristic; an opaque blob that happens to look like JSON gets classified as JSON.

Can I encode binary?

For arbitrary binary input the tool accepts inputEncoding: 'binary' and treats the string as raw bytes. For image-specific binary encoding (PNG, JPEG, vision-API formatting), use image_to_base64 — it handles format conversion and provider-specific output blocks.
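
To make "treats the string as raw bytes" concrete, the sketch below assumes that each character of the input carries one byte value (0 to 255). That reading of inputEncoding: 'binary' is an assumption, and binaryStringToBytes and encodeBase64 are hypothetical helpers, not the tool's internals.

```typescript
const STANDARD_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Assumption: one character per byte, so char codes above 0xFF are rejected.
function binaryStringToBytes(s: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    if (code > 0xff) throw new Error(`character at index ${i} is not a single byte`);
    bytes.push(code);
  }
  return bytes;
}

// Standard base64 with padding: every 3 bytes become 4 characters.
function encodeBase64(bytes: number[]): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    const chunk = (b0 << 16) | (b1 << 8) | b2;
    out += STANDARD_ALPHABET.charAt((chunk >> 18) & 63);
    out += STANDARD_ALPHABET.charAt((chunk >> 12) & 63);
    out += i + 1 < bytes.length ? STANDARD_ALPHABET.charAt((chunk >> 6) & 63) : "=";
    out += i + 2 < bytes.length ? STANDARD_ALPHABET.charAt(chunk & 63) : "=";
  }
  return out;
}

// The four PNG magic bytes (89 50 4E 47) as a binary string.
const pngMagic = encodeBase64(binaryStringToBytes("\x89PNG"));
```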

Why does my JWT identify as base64 instead of JWT?

JWT detection needs the three-segment dot structure. A token with the signature removed (header.payload only) is classified as base64. Pass the full three-segment token for JWT-specific parsing.
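
A quick pre-check for the three-segment shape could look like the sketch below. The regular expression and the hasJwtShape name are illustrative assumptions: the sketch assumes each segment uses the URL-safe alphabet and that an empty third segment after a trailing dot still counts as a segment.

```typescript
// Hypothetical shape check; the tool's actual JWT detection may be stricter.
const JWT_SHAPE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*$/;

function hasJwtShape(token: string): boolean {
  return JWT_SHAPE.test(token.trim());
}
```

A header.payload string with no second dot fails this check, which matches the behaviour described above.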