1. Insight
The problem this article addresses and why it matters.
The "respond in JSON" tax
Every team using an LLM in production has hit the same wall: you prompt the model to "respond in JSON only, no prose," and it does, until the day it doesn't. Sometimes the model wraps the JSON in a ```json fence. Sometimes it adds a single explanatory sentence above. Sometimes (and this is the one that pages you at 2am) it returns an otherwise well-formed object with a trailing comma that breaks JSON.parse. OpenAI's Structured Outputs and Anthropic's tool-use schema cover the happy path. They don't cover legacy models, fine-tuned variants, fallback paths, or the dozen smaller providers that don't expose schema-enforced decoding.
The naïve fix is try { JSON.parse(raw) } catch { ... retry the LLM call ... }. That works most of the time. It also doubles your token cost on a non-trivial fraction of requests and pushes a one-shot parse error into a multi-second retry loop that compounds when chained agents are sharing the same upstream provider's rate limit.
Why surface-level regex stripping isn't enough
The common quick-fix is raw.replace(/^```json\n/, '').replace(/```$/, ''). It catches the markdown fence case. It does nothing for trailing commas, single-quoted strings, unquoted property names, smart quotes (“ ” instead of "), comments (// like this), or truncated responses (model hit max_tokens mid-object). Each of those failure modes is rare on its own. Together they are not: across a production volume of 100k calls/day, you'll see every variant before the end of the week.
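To make the gap concrete, here is a minimal sketch of the quick-fix regex applied to two inputs: one that is only fenced, and one that is fenced and also has a trailing comma. The helper names `stripFence` and `tryParse` are illustrative, not part of any library, and the backtick is written as `\x60` purely so the example can sit inside a fenced block.

```typescript
// The quick-fix from the paragraph above. \x60 is the backtick character,
// so /^\x60{3}json\n/ matches an opening json fence at the start of the input.
function stripFence(raw: string): string {
  return raw.replace(/^\x60{3}json\n/, "").replace(/\x60{3}$/, "");
}

// Returns whether JSON.parse accepted the text, without throwing.
function tryParse(text: string): { ok: true; value: unknown } | { ok: false; error: string } {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

const FENCE = "\x60\x60\x60";
const fenced = FENCE + 'json\n{"a": 1}\n' + FENCE;
const fencedWithTrailingComma = FENCE + 'json\n{"a": 1,}\n' + FENCE;

const fencedResult = tryParse(stripFence(fenced));
const commaResult = tryParse(stripFence(fencedWithTrailingComma));

console.log(fencedResult.ok, commaResult.ok); // true false
```

The fence is gone in both cases, but the second input still fails to parse, which is the point: stripping the wrapper does not repair the content.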
What this article delivers
The llm_to_json_cleaner tool handles the full set: markdown fence stripping, trailing-comma repair, comment removal, smart-quote normalisation, unquoted-key promotion, and (in aggressive mode) structural completion of truncated responses. We'll walk through the repair pipeline, show how to wire the optional Zod schema validation pass to make the cleaner double as a contract enforcement step, and explain the repairConfidence score that tells you how much the tool had to guess.
2. Intent
What you will be able to do after reading.
By the end of this article you will be able to:
- Parse LLM output that fails JSON.parse without a round-trip retry to the model
- Choose between conservative and aggressive repair strategies based on confidence in the upstream response
- Layer Zod schema validation onto the cleaner so a single call gives you both parsed JSON and structured field errors
- Use the repairConfidence score to gate auto-acceptance vs human review in agent pipelines
- Identify the failure modes that the cleaner cannot recover from and what to do when you hit them
The Examples section walks through each repair category with the before/after that triggers it.
3. Examples
Annotated code and worked scenarios.
Before / after: the markdown-fence wrapper
The most common output from a chat-style model:
Before:
Sure, here's the JSON you asked for:
```json
{
"user_id": 4291,
"email": "alice@example.com",
"verified": true
}
```
Let me know if you need anything else!
JSON.parse throws Unexpected token S in JSON at position 0.
After:
const result = llmToJsonCleaner({
raw: llmResponse,
strict: false,
repairStrategy: 'conservative',
});
result.json
// { user_id: 4291, email: 'alice@example.com', verified: true }
result.transformations
// [
// { type: 'strip_prose_preamble', position: 0 },
// { type: 'strip_markdown_fence', position: 27 },
// { type: 'strip_prose_trailer', position: 145 },
// ]
result.repairConfidence // 1.0: only wrapping removed, content untouched
The transformations array names every change so you can audit what was stripped. The repairConfidence: 1.0 signals that the inner content was preserved exactly.
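For intuition about what removing the wrapping involves, the following is a rough sketch, not the cleaner's actual implementation. The hypothetical helper `extractJsonCandidate` keeps everything from the first opening brace or bracket to the last closing one, which is enough for the example above but would be fooled by prose that itself contains braces.

```typescript
// Hypothetical helper for illustration only: returns the substring from the
// first '{' or '[' to the last '}' or ']', or null when no such range exists.
function extractJsonCandidate(raw: string): string | null {
  const start = raw.search(/[{\[]/);
  if (start === -1) return null;
  const end = Math.max(raw.lastIndexOf("}"), raw.lastIndexOf("]"));
  if (end < start) return null;
  return raw.slice(start, end + 1);
}

// The chat-style response from the example, with the fence written as \x60 x 3.
const FENCE = "\x60\x60\x60";
const llmResponse =
  "Sure, here's the JSON you asked for:\n" +
  FENCE + "json\n" +
  '{\n"user_id": 4291,\n"email": "alice@example.com",\n"verified": true\n}\n' +
  FENCE + "\n" +
  "Let me know if you need anything else!";

const candidate = extractJsonCandidate(llmResponse);
const parsed: unknown = candidate === null ? null : JSON.parse(candidate);
console.log(parsed);
```

Because only surrounding text is discarded and no character inside the object changes, this kind of step is consistent with a confidence of 1.0.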
Before / after: trailing commas + comments + numeric separators
Sometimes a model emits JSON that looks fine to a human and breaks on the parser:
Before:
{
// Q4 results
"revenue": 1_240_000,
"growth": "0.23",
"regions": ["NA", "EU", "APAC",],
}
Three kinds of issue: the // comment, the underscore numeric separator (valid JS, not JSON), and the trailing commas after "APAC" and after the regions array. The model produced something a Node REPL would accept.
After:
llmToJsonCleaner({
raw: malformed,
strict: false,
repairStrategy: 'conservative',
});
// json: {
// revenue: 1240000,
// growth: "0.23",
// regions: ["NA", "EU", "APAC"],
// }
// transformations: [
// { type: 'strip_line_comment', position: 6 },
// { type: 'normalise_numeric_sep', position: 27 },
// { type: 'strip_trailing_comma', position: 88 },
// ]
// repairConfidence: 0.94
Confidence is below 1.0 because the cleaner changed characters (normalising the numeric separator) rather than only stripping wrapping.
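The three repairs in this example can be approximated with a single string-aware pass. This is a sketch under stated assumptions, not the tool's pipeline: `repairConservative` is a hypothetical name, it handles only // comments, digit-separating underscores, and commas directly before a closing bracket, and it does not record a transformations log.

```typescript
// Sketch of a conservative repair pass. Characters inside double-quoted
// strings are copied verbatim, so "http://x" or "1_2" in a value is untouched.
function repairConservative(input: string): string {
  let out = "";
  let inString = false;
  const isDigit = (c: string): boolean => c >= "0" && c <= "9";

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (inString) {
      out += ch;
      if (ch === "\\") {
        // Copy the escaped character too, so \" does not end the string.
        i++;
        out += input.charAt(i);
      } else if (ch === '"') {
        inString = false;
      }
    } else if (ch === '"') {
      inString = true;
      out += ch;
    } else if (ch === "/" && input.charAt(i + 1) === "/") {
      // Line comment: skip up to (not including) the newline.
      while (i + 1 < input.length && input.charAt(i + 1) !== "\n") i++;
    } else if (ch === "_" && isDigit(input.charAt(i - 1)) && isDigit(input.charAt(i + 1))) {
      // Numeric separator such as 1_240_000: drop the underscore.
    } else if (ch === "}" || ch === "]") {
      // If the last non-whitespace character emitted so far is a comma,
      // it is a trailing comma: remove it before writing the closer.
      const lastSignificant = out.search(/\s*$/) - 1;
      if (out.charAt(lastSignificant) === ",") {
        out = out.slice(0, lastSignificant) + out.slice(lastSignificant + 1);
      }
      out += ch;
    } else {
      out += ch;
    }
  }
  return out;
}

const malformed = [
  "{",
  "  // Q4 results",
  '  "revenue": 1_240_000,',
  '  "growth": "0.23",',
  '  "regions": ["NA", "EU", "APAC",],',
  "}",
].join("\n");

const repaired: unknown = JSON.parse(repairConservative(malformed));
console.log(repaired);
```

Tracking string state is the design choice that matters here: a regex that strips `//` or `_` globally would also corrupt URLs and identifiers inside string values.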
Before / after: truncated response
When the model hits max_tokens mid-output, you get a response that ends abruptly:
Before:
{
"users": [
{ "id": 1, "name": "Alice" },
{ "id": 2, "name": "Bob" },
{ "id": 3, "name": "Charlie
repairStrategy: 'conservative' returns json: null with transformations: [] and repairConfidence: 0.0, because the truncation is too risky to guess at.
repairStrategy: 'aggressive' attempts structural completion:
llmToJsonCleaner({
raw: truncated,
strict: false,
repairStrategy: 'aggressive',
});
// json: {
// users: [
// { id: 1, name: 'Alice' },
// { id: 2, name: 'Bob' },
// ] // <-- Charlie dropped, the truncated entry is removed
// }
// transformations: [
// { type: 'drop_truncated_element', position: 79 },
// { type: 'close_array', position: 109 },
// { type: 'close_object', position: 110 },
// ]
// repairConfidence: 0.41
repairConfidence: 0.41 is the signal that the result is significantly synthesised. Agents typically gate on < 0.6 to route to a re-query or a human-in-the-loop step.
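One simple way to approximate structural completion is to cut the input back to the last fully closed object or array and then append closers for whatever is still open. The sketch below assumes that strategy; `completeTruncated` is a hypothetical name, and the real tool may keep more of the input than this does (for example, complete scalar values that follow the last closed container are dropped here).

```typescript
// Sketch: truncate to the last fully closed container, then close the rest.
// Returns the input unchanged if it is already balanced, or null when there
// is nothing complete to keep or the brackets are mismatched.
function completeTruncated(input: string): string | null {
  const stack: string[] = []; // expected closers for currently open containers
  let inString = false;
  let cut = -1;               // index just past the last closed container
  let closersAtCut = "";      // closers still owed at that point, innermost first

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (inString) {
      if (ch === "\\") i++;            // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{" || ch === "[") {
      stack.push(ch === "{" ? "}" : "]");
    } else if (ch === "}" || ch === "]") {
      if (stack.pop() !== ch) return null;
      cut = i + 1;
      closersAtCut = stack.slice().reverse().join("");
    }
  }

  if (!inString && stack.length === 0) return input;
  if (cut === -1) return null;
  return input.slice(0, cut) + closersAtCut;
}

const truncated = [
  "{",
  '  "users": [',
  '    { "id": 1, "name": "Alice" },',
  '    { "id": 2, "name": "Bob" },',
  '    { "id": 3, "name": "Charlie',
].join("\n");

const completed = completeTruncated(truncated);
console.log(completed === null ? null : JSON.parse(completed));
```

As in the example output above, the partial third entry is dropped rather than guessed at, which is why a result like this deserves a low confidence score.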
Before / after: validation pass alongside repair
The targetSchema parameter accepts a Zod-compatible JSON Schema. The cleaner repairs the input, then validates the cleaned object against your schema in the same call:
const result = llmToJsonCleaner({
raw: llmResponse,
strict: true,
repairStrategy: 'conservative',
targetSchema: {
type: 'object',
properties: {
user_id: { type: 'number' },
email: { type: 'string', format: 'email' },
verified: { type: 'boolean' },
},
required: ['user_id', 'email', 'verified'],
},
});
result.json // parsed + cleaned
result.schemaErrors // [{ path: 'email', message: 'invalid email format' }, ...]
A single round-trip replaces what would otherwise be clean → JSON.parse → Zod.parse as three separate steps in a pipeline, each with its own error path.
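To show the shape of a validation pass that reports `{ path, message }` entries, here is a deliberately small sketch. It is not Zod and not a JSON Schema implementation: the `FlatSchema` type, `validateFlat`, and `parseAndValidate` are hypothetical, and only top-level `type` and `required` checks are covered (no `format: 'email'`, no nesting).

```typescript
type FieldType = "number" | "string" | "boolean";

// A tiny subset of the schema shape used above: flat properties plus required.
interface FlatSchema {
  properties: Record<string, { type: FieldType }>;
  required: string[];
}

interface SchemaError {
  path: string;
  message: string;
}

function validateFlat(value: unknown, schema: FlatSchema): SchemaError[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return [{ path: "", message: "expected an object" }];
  }
  const record = value as Record<string, unknown>;
  const errors: SchemaError[] = [];
  // Missing required fields are reported first.
  for (const key of schema.required) {
    if (!(key in record)) errors.push({ path: key, message: "required field is missing" });
  }
  // Then type mismatches on fields that are present.
  for (const [key, spec] of Object.entries(schema.properties)) {
    if (key in record && typeof record[key] !== spec.type) {
      errors.push({ path: key, message: "expected " + spec.type + ", got " + typeof record[key] });
    }
  }
  return errors;
}

// Parse and validate in one call, returning both results together.
function parseAndValidate(cleaned: string, schema: FlatSchema): { json: unknown; schemaErrors: SchemaError[] } {
  const json: unknown = JSON.parse(cleaned);
  return { json, schemaErrors: validateFlat(json, schema) };
}

const userSchema: FlatSchema = {
  properties: {
    user_id: { type: "number" },
    email: { type: "string" },
    verified: { type: "boolean" },
  },
  required: ["user_id", "email", "verified"],
};

const checkedResult = parseAndValidate('{"user_id": "4291", "verified": true}', userSchema);
console.log(checkedResult.schemaErrors);
```

Returning the parsed value and the error list together is what lets a caller treat "parsed but wrong shape" differently from "could not parse at all".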
When humans use this
A developer iterating on a prompt sees the cleaner's transformations log and uses it to spot which output patterns the model produces most often. That informs prompt tweaks (e.g. "do not wrap your response in markdown") that reduce the rate of repair-needed responses over time. The repairConfidence score also surfaces in the web UI as a coloured badge — green for ≥ 0.95, amber for 0.6-0.94, red for < 0.6 — giving an at-a-glance sense of how trustworthy a given response was.
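Both ideas in this paragraph are easy to sketch. The code below tallies transformation types across a batch of results to show which repair fires most often, and maps a score to the badge colours described above. The function names are illustrative, and the `Transformation` type mirrors only the fields used in the examples.

```typescript
interface Transformation {
  type: string;
  position?: number;
}

// Count how often each transformation type appears across many cleaner results.
function tallyTransformations(logs: Transformation[][]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const log of logs) {
    for (const t of log) {
      counts.set(t.type, (counts.get(t.type) ?? 0) + 1);
    }
  }
  return counts;
}

// Badge thresholds as stated above: green for >= 0.95, amber for 0.6 up to
// (but not including) 0.95, red below 0.6.
function badgeColour(repairConfidence: number): "green" | "amber" | "red" {
  if (repairConfidence >= 0.95) return "green";
  if (repairConfidence >= 0.6) return "amber";
  return "red";
}

const counts = tallyTransformations([
  [{ type: "strip_prose_preamble" }, { type: "strip_markdown_fence" }, { type: "strip_prose_trailer" }],
  [{ type: "strip_markdown_fence" }, { type: "strip_trailing_comma" }],
]);

console.log(counts.get("strip_markdown_fence")); // 2
console.log(badgeColour(0.94));                  // amber
```

A high count for `strip_markdown_fence` would be the cue to add a "do not wrap your response in markdown" instruction to the prompt.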
When agents use this
This is the highest-value tool in the grid for agentic workflows because LLM output failures are the most expensive failure mode in an agent pipeline. A single re-query to a frontier model can cost 10-100× a llm_to_json_cleaner call. The pattern in production:
- Agent receives raw LLM output it expects to be JSON.
- Agent calls llm_to_json_cleaner with strict: false, repairStrategy: 'conservative', and the expected schema.
- If repairConfidence ≥ 0.95 AND schemaErrors is empty: accept and continue.
- If repairConfidence ≥ 0.6 AND schemaErrors is empty: log and continue (the cleaner had to repair, but the schema check passed, so the result is trustworthy).
- If repairConfidence < 0.6 OR schemaErrors is non-empty: retry the LLM call once (only once), now with the schema errors injected into the prompt.
This converts a 5-10% LLM-output retry rate into a 0.5-1% retry rate, with the savings going straight to the bottom-line token budget.
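The gating steps in the list above can be sketched as a small decision function. The `Decision` labels and the `alreadyRetried` flag are assumptions made for illustration; in particular, the article does not say what happens after the single retry, so the `give_up` outcome is a placeholder for whatever your pipeline does next.

```typescript
interface CleanerResult {
  repairConfidence: number;
  schemaErrors?: Array<{ path: string; message: string }>;
}

type Decision = "accept" | "accept_and_log" | "retry_once" | "give_up";

// Apply the thresholds from the list: 0.95 for silent acceptance, 0.6 for
// logged acceptance, and a single retry for everything else.
function gate(result: CleanerResult, alreadyRetried: boolean): Decision {
  const schemaOk = (result.schemaErrors ?? []).length === 0;
  if (schemaOk && result.repairConfidence >= 0.95) return "accept";
  if (schemaOk && result.repairConfidence >= 0.6) return "accept_and_log";
  // Low confidence or schema errors: retry only if we have not done so yet.
  return alreadyRetried ? "give_up" : "retry_once";
}

console.log(gate({ repairConfidence: 1.0, schemaErrors: [] }, false));  // accept
console.log(gate({ repairConfidence: 0.94, schemaErrors: [] }, false)); // accept_and_log
console.log(gate({ repairConfidence: 0.41, schemaErrors: [] }, false)); // retry_once
```

Keeping the retry budget as an explicit argument makes the "only once" rule hard to violate by accident when the gate is called from inside a loop.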
Edge cases
JSON inside markdown inside JSON
Nested wrapping (markdown fence inside a JSON string value) is unsupported. The cleaner unwraps the outer fence. The inner content is preserved verbatim, including any inner fence the model emitted as data. If you need recursive unwrapping, pipe the inner value back through the cleaner with the same parameters.
Mixed-case keys
If the model returns {"User_ID": 1} and your schema expects user_id, the cleaner does not rename keys. That's a schema transformation, not a JSON repair. Use the schema validation pass to surface the mismatch as a schemaErrors entry, then handle it explicitly in your agent logic.
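If you decide that case differences are safe to ignore for a given schema, the explicit handling can be as small as the sketch below. `lowerCaseKeys` is a hypothetical helper in your own agent code, not something the cleaner provides; it touches top-level keys only, and if two keys collide after lower-casing, the later one wins.

```typescript
// Lower-case the top-level keys of a parsed object before validating it.
function lowerCaseKeys(value: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, v] of Object.entries(value)) {
    out[key.toLowerCase()] = v;
  }
  return out;
}

const normalised = lowerCaseKeys({ User_ID: 1 });
console.log(normalised); // { user_id: 1 }
```

Doing this as a named step after the cleaner keeps the rename visible in your own logs instead of hiding it inside JSON repair.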
Empty input
llmToJsonCleaner({ raw: '' }) returns INPUT_EMPTY. Empty input is not "missing JSON" — the cleaner won't synthesise structure from nothing.
4. Documentation
Reference signatures, edge cases, and lookup tables.
Input parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| raw | string | ✓ | — | The LLM's response, exactly as received |
| strict | boolean | ✗ | | When true, return error on any unrepairable input. When false, do best-effort and surface repairConfidence |
| targetSchema | object | ✗ | — | Zod-compatible JSON Schema. When provided, the cleaner runs a validation pass and surfaces structured schemaErrors |
| repairStrategy | 'conservative' or 'aggressive' | ✗ | | Conservative: strip wrapping, fix commas/comments only. Aggressive: also infer closing brackets and complete truncated responses |
Output shape
{
json: object | null; // parsed result; null when the input could not be repaired (strict mode returns an error instead)
cleaned: string; // the post-repair JSON string
transformations: Array<{
type: string; // 'strip_markdown_fence' | 'strip_trailing_comma' | ...
description: string;
position?: number; // character offset in raw input
}>;
schemaErrors?: Array<{ // only when targetSchema provided
path: string; // e.g. 'data.users[0].email'
message: string;
}>;
repairConfidence: number; // 0.0 to 1.0 — see below
}
Repair confidence scale
| Score | Meaning |
|---|---|
| 1.0 | Only wrapping (markdown fence, prose) stripped; inner content untouched |
| 0.95 – 0.99 | Minor non-destructive fixes (trailing commas, comments) |
| 0.8 – 0.94 | Character-level changes (smart quotes, numeric separators) |
| 0.6 – 0.79 | Structural changes (unquoted key promotion, single-quoted string conversion) |
| < 0.6 | Aggressive structural completion (truncation repair, inferred closures); gate before auto-accepting |
Error codes
| Code | When it fires | Recovery |
|---|---|---|
| INPUT_EMPTY | raw is empty | Provide a non-empty input |
| | Input too large | Pre-truncate; LLM responses rarely need to be this large |
| | Strict mode + unrepairable input | Either switch to non-strict mode or retry the LLM call with a clearer prompt |
| | Prompt-injection pattern detected | Sanitise the LLM response source upstream; do not retry with the same input |
When NOT to use this tool
If your model supports schema-enforced decoding (OpenAI Structured Outputs, Anthropic tool-use with input_schema), use that first. Native schema enforcement is structurally better than repair-after-the-fact. The cleaner is for: legacy models without schema support, fallback paths when the schema-aware call fails, and provider switching where you can't guarantee the new provider has the same enforcement.
If the input might contain a JSON payload mixed with non-JSON content the user actually wants to read (e.g. an LLM-generated email with a JSON block at the bottom), the cleaner's prose-stripping behaviour is too aggressive. Extract the JSON range manually before passing to the cleaner.
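Manual extraction for that mixed-content case might look like the sketch below, assuming the JSON block is wrapped in a json-tagged markdown fence. `splitFencedJson` is a hypothetical helper; it returns the prose and the JSON text separately so the human-readable part is preserved, and only the first fenced block is considered.

```typescript
// Split a mixed document into its prose and the body of its first json fence.
// Returns null when no complete json-tagged fence is found.
function splitFencedJson(raw: string): { prose: string; jsonText: string } | null {
  const fence = "\x60\x60\x60"; // three backticks
  const open = raw.indexOf(fence + "json");
  if (open === -1) return null;
  const bodyStart = raw.indexOf("\n", open);
  if (bodyStart === -1) return null;
  const close = raw.indexOf(fence, bodyStart);
  if (close === -1) return null;
  return {
    // Everything outside the fence, joined and trimmed.
    prose: (raw.slice(0, open) + raw.slice(close + fence.length)).trim(),
    // Everything between the opening fence line and the closing fence.
    jsonText: raw.slice(bodyStart + 1, close).trim(),
  };
}

const email =
  "Hi Bob,\nSummary attached below.\n" +
  "\x60\x60\x60json\n" +
  '{"a": 1}\n' +
  "\x60\x60\x60\n" +
  "Regards";

const parts = splitFencedJson(email);
console.log(parts);
```

Only `jsonText` would then be passed to the cleaner, so its prose-stripping never sees the text the user wants to keep.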
Performance notes
Typical execution: under 5ms for inputs below 50KB. Strict-mode validation pass adds 2-10ms depending on schema depth. Memory usage is bounded by the input size (no streaming yet — single-pass parser). The repair is deterministic: same input + same parameters always produces byte-identical output, which means the result is eligible for Edge Cache when called via the REST endpoint.