Markdown tables from CSV, TSV, JSON, and HTML — bidirectional — obfus.link

1. Insight

The problem this article addresses and why it matters.

Tables are the worst-formatted part of every markdown file

Markdown tables have a syntax that looks easy until you write one. The header row is separated from the body by a row of pipes and dashes. Columns can carry optional alignment indicators (:---, :---:, ---:). Cell content can't contain unescaped pipes or literal newlines, and because cell widths vary, the pipes only line up in the raw text if every cell is padded by hand. The result is that almost every markdown table in a README, a documentation page, or a Slack message is either hand-formatted (and slightly misaligned) or generated by another tool that did the formatting work.

The problem is that "another tool" is usually a spreadsheet or a database export — and the conversion path from CSV / TSV / JSON / HTML to a properly-aligned markdown table is one of those small, repetitive jobs developers do over and over without thinking about whether it should be automated.

Why a bidirectional converter

The tool in this article handles both directions. Import mode reads CSV, TSV, JSON arrays of objects, or HTML <table> elements and produces a properly-aligned markdown table. Export mode reverses the direction: paste a markdown table, get CSV / TSV / JSON output for downstream consumption.

The bidirectional design matters for round-trip workflows: pull data from a database to CSV, convert to markdown for documentation, then six months later convert the markdown back to CSV when someone needs to regenerate the dashboard the docs were summarising. For plain single-line cells the round trip preserves cell contents and is idempotent on repeated passes. The Edge cases section lists the conversions that lose information (value types, multi-line cells, spanned cells).

What this article delivers

End-to-end walks of importing from CSV (with quoted fields containing commas), JSON arrays of objects (with different keys per object), and HTML tables (with cell-spanning). Export mode walks the reverse — markdown to CSV with proper quoting, markdown to JSON with column-keys-as-object-keys. We cover the edge cases (uneven rows, missing headers, cell content with markdown formatting) and the formats not supported (Excel .xlsx, structured tables like LaTeX \tabular).

2. Intent

What you will be able to do after reading.

By the end of this article you will be able to:

Import CSV, TSV, JSON arrays of objects, or HTML tables into properly-aligned markdown tables
Export markdown tables back to CSV, TSV, or JSON for downstream consumption
Choose alignment per column when generating from headers + rows directly
Read the import warnings that flag uneven rows, missing headers, or malformed quoting
Identify the table-format edge cases (cell-spanning, multi-line cells, markdown-formatted cell content) and how the converter handles them

The Examples section walks through each format conversion direction against a representative table.

3. Examples

Annotated code and worked scenarios.

Before / after: CSV to markdown

A CSV export from a spreadsheet:

name,role,start_date,salary
"Alice Chen",Engineer,2024-03-15,$120000
"Bob Williams",Manager,2023-08-01,"$140,000"
"Carol O'Brien",Designer,2025-01-10,$95000

markdownTableGenerator({
  mode:         'import',
  importData:   csv,
  importFormat: 'csv',
});

// markdown:
// | name           | role     | start_date  | salary    |
// |----------------|----------|-------------|-----------|
// | Alice Chen     | Engineer | 2024-03-15  | $120000   |
// | Bob Williams   | Manager  | 2023-08-01  | $140,000  |
// | Carol O'Brien  | Designer | 2025-01-10  | $95000    |
//
// stats: { columns: 4, rows: 3 }
// importWarnings: []

The CSV parser handles the quoted "$140,000" salary on Bob Williams' row (a comma inside a quoted field), the apostrophe in "Carol O'Brien", and the variable column widths. The output is column-aligned for readability: even if the markdown renderer doesn't need the alignment, humans reading the raw markdown do.
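To make the quoting rules concrete, here is a minimal sketch of an RFC 4180-style parser. It is not the tool's actual parser; `parseCsv` is a hypothetical name used only for illustration, and it skips details such as validation of malformed quotes.

```javascript
// Hypothetical helper (not the tool's parser): a minimal RFC 4180-style CSV reader.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // a doubled quote inside a quoted field is a literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // commas and newlines are plain content while quoted
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      row.push(field);
      field = '';
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text[i + 1] === '\n') i++; // treat CRLF as one line break
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else {
      field += ch;
    }
  }
  // Flush the last record when the input has no trailing newline.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

const sample = 'name,salary\n"Bob Williams","$140,000"\n';
console.log(JSON.stringify(parseCsv(sample)));
// [["name","salary"],["Bob Williams","$140,000"]]
```

The key design point is the `inQuotes` flag: a comma only ends a field when the parser is outside a quoted section.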

Before / after: JSON array to markdown

[
  { "user_id": 1, "email": "alice@example.com", "verified": true,  "tier": "pro" },
  { "user_id": 2, "email": "bob@example.com",   "verified": false, "tier": "free" },
  { "user_id": 3, "email": "carol@example.com", "verified": true                  }
]

markdownTableGenerator({
  mode:         'import',
  importData:   JSON.stringify(arr),
  importFormat: 'json',
});

// markdown:
// | user_id | email               | verified | tier |
// |---------|---------------------|----------|------|
// | 1       | alice@example.com   | true     | pro  |
// | 2       | bob@example.com     | false    | free |
// | 3       | carol@example.com   | true     |      |
//
// stats: { columns: 4, rows: 3 }
// importWarnings: ['Row 3 missing "tier" field — rendered as empty cell.']

The third row is missing tier. The converter detects this against the union of all keys seen in the array, renders an empty cell, and surfaces the omission as a warning so the developer can decide whether to fix the source data or accept the empty cell.
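The key-union step can be sketched as follows. This is an illustration under assumptions, not the converter's source: `objectsToRows` is a hypothetical name, columns are ordered by first appearance, and the warning text is simplified.

```javascript
// Hypothetical helper: build columns from the union of keys across all records.
function objectsToRows(records) {
  const headers = [];
  for (const record of records) {
    for (const key of Object.keys(record)) {
      if (!headers.includes(key)) headers.push(key); // keep first-seen order
    }
  }

  const warnings = [];
  const rows = records.map((record, index) =>
    headers.map((key) => {
      if (!(key in record)) {
        warnings.push(`Row ${index + 1} missing "${key}" field`);
        return ''; // render a missing field as an empty cell
      }
      return String(record[key]);
    })
  );
  return { headers, rows, warnings };
}

const users = [
  { user_id: 1, email: 'alice@example.com', verified: true, tier: 'pro' },
  { user_id: 3, email: 'carol@example.com', verified: true },
];
console.log(objectsToRows(users).warnings);
```

Collecting the headers in a first pass matters: a key that first appears in the last record still becomes a column for every earlier row.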

Before / after: HTML table to markdown

<table>
  <thead>
    <tr><th>Tier</th><th>Per call</th><th>Limit</th></tr>
  </thead>
  <tbody>
    <tr><td>Tier 3</td><td>$0.025</td><td>3/day</td></tr>
    <tr><td>Tier 2</td><td>$0.015</td><td>5/day</td></tr>
    <tr><td>Tier 1</td><td>$0.008</td><td>8/day</td></tr>
  </tbody>
</table>

markdownTableGenerator({
  mode:         'import',
  importData:   html,
  importFormat: 'html',
});

// markdown:
// | Tier   | Per call | Limit |
// |--------|----------|-------|
// | Tier 3 | $0.025   | 3/day |
// | Tier 2 | $0.015   | 5/day |
// | Tier 1 | $0.008   | 8/day |

HTML tables with <thead> and <tbody> are handled. Cell content that includes inline HTML formatting (<strong>, <em>, <code>) is converted to the corresponding markdown syntax (**...**, *...*, `...`).
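As a rough illustration of the inline-formatting mapping, the sketch below rewrites the three tags with regular expressions. This is an assumption-laden simplification: `inlineHtmlToMarkdown` is a hypothetical name, it ignores tag attributes, and a real converter would more likely walk a parsed DOM than use regexes.

```javascript
// Hypothetical helper: map <strong>, <em>, <code> to markdown syntax.
// Regexes stand in for a real HTML parser here; attributes are not handled.
function inlineHtmlToMarkdown(cellHtml) {
  return cellHtml
    .replace(/<strong>(.*?)<\/strong>/gi, '**$1**')
    .replace(/<em>(.*?)<\/em>/gi, '*$1*')
    .replace(/<code>(.*?)<\/code>/gi, '`$1`');
}

console.log(inlineHtmlToMarkdown('Use <strong>Tier 3</strong> with <code>limit=3</code>'));
// Use **Tier 3** with `limit=3`
```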

Before / after: markdown to JSON (export)

| user_id | email               | verified |
|---------|---------------------|----------|
| 1       | alice@example.com   | true     |
| 2       | bob@example.com     | false    |

markdownTableGenerator({
  mode:          'export',
  markdownInput: mdTable,
  exportFormat:  'json',
});

// exported: '[
//   {"user_id":"1","email":"alice@example.com","verified":"true"},
//   {"user_id":"2","email":"bob@example.com","verified":"false"}
// ]'

Note: every cell becomes a string in the JSON output. The converter doesn't coerce types because markdown is type-less — "1" could be a string-shaped ID or a number, and the wrong guess corrupts data. Post-process the JSON in the consumer where the schema is known.
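One way to do that post-processing is a per-column converter map applied in the consumer. The sketch below assumes a hypothetical schema and helper name (`columnTypes`, `coerceRows`); neither is part of the tool.

```javascript
// Hypothetical consumer-side schema: column name -> converter function.
const columnTypes = {
  user_id: (s) => Number(s),
  verified: (s) => s === 'true',
  // email has no entry, so it stays a string
};

// Apply the converters to every row; unknown columns pass through unchanged.
function coerceRows(rows, types) {
  return rows.map((row) =>
    Object.fromEntries(
      Object.entries(row).map(([key, value]) => {
        const convert = types[key];
        return [key, convert ? convert(value) : value];
      })
    )
  );
}

const exported = '[{"user_id":"1","email":"alice@example.com","verified":"true"}]';
const typed = coerceRows(JSON.parse(exported), columnTypes);
console.log(typed[0].user_id + 1, typed[0].verified);
// 2 true
```

Keeping the schema in the consumer means a string-shaped ID column can be left out of the map on purpose and stay a string.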

Before / after: generate from headers + rows directly

markdownTableGenerator({
  mode:    'generate',
  headers: ['Effect', 'Best for', 'Browser support'],
  rows: [
    ['Glassmorphism', 'Floating cards', 'All modern, Safari needs prefix'],
    ['Neumorphism',   'Tactile controls', 'All modern'],
    ['Aurora',        'Hero backgrounds', 'All modern; animation costs CPU'],
  ],
  alignment: ['left', 'left', 'left'],
});

// markdown:
// | Effect        | Best for          | Browser support                  |
// |:--------------|:------------------|:---------------------------------|
// | Glassmorphism | Floating cards    | All modern, Safari needs prefix  |
// | Neumorphism   | Tactile controls  | All modern                       |
// | Aurora        | Hero backgrounds  | All modern; animation costs CPU  |

Useful when generating tables programmatically — the alignment, column widths, and pipe placement are handled automatically.
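For readers curious what "handled automatically" involves, here is a minimal sketch of the padding and separator logic. It is an assumption about how such a generator could work, not the tool's code: `renderTable` is a hypothetical name, cells are assumed to be strings, and width is measured with `String.length`, which miscounts wide or combined Unicode characters.

```javascript
// Hypothetical helper: pad cells to a common width and emit the separator row.
function renderTable(headers, rows, alignment = []) {
  // Column width = longest cell in that column, with a floor of 3.
  const widths = headers.map((header, col) =>
    Math.max(3, header.length, ...rows.map((row) => (row[col] ?? '').length))
  );
  const renderRow = (cells) =>
    '| ' + widths.map((w, col) => (cells[col] ?? '').padEnd(w)).join(' | ') + ' |';
  // Each separator cell spans the padded width plus the two surrounding spaces.
  const separator =
    '|' +
    widths
      .map((w, col) => {
        switch (alignment[col]) {
          case 'left': return ':' + '-'.repeat(w + 1);
          case 'right': return '-'.repeat(w + 1) + ':';
          case 'center': return ':' + '-'.repeat(w) + ':';
          default: return '-'.repeat(w + 2);
        }
      })
      .join('|') +
    '|';
  return [renderRow(headers), separator, ...rows.map(renderRow)].join('\n');
}

console.log(renderTable(['Tier', 'Limit'], [['Tier 3', '3/day']], ['left', 'right']));
// | Tier   | Limit |
// |:-------|------:|
// | Tier 3 | 3/day |
```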

When humans use this

The most common workflow is import mode: paste CSV or JSON from another tool, copy the resulting markdown table into a README or documentation file. The export mode is used less frequently but matters for round-trip: re-extracting structured data from documentation when the original source is lost.

When agents use this

Two production patterns:

Documentation-generation agent. An agent that auto-generates README files for a tool / library / data set pulls structured data from the source (database, JSON file, API), runs it through import mode to produce markdown tables, and embeds the tables in the generated markdown. The alignment guarantees readability without per-tool format logic.
Data-extraction agent. An agent that needs to extract a table from existing documentation (a competitor's docs, a third-party API reference) pulls the markdown table, runs export mode to JSON, and processes the structured data downstream.

Edge cases

Cells containing markdown formatting

A cell like **bold** text is preserved literally — the converter doesn't try to render it. When the markdown is rendered (by GitHub, MDN, or a static site generator), the bold renders correctly. When converting back to CSV the markdown syntax is preserved as part of the cell content; consumers that need plain text should strip markdown afterward.

Cells containing pipes

A cell containing | would break the markdown table syntax. The converter escapes pipes as \| in the output. The reverse direction (markdown to CSV) unescapes them.
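The escape and unescape steps can be sketched like this. The helper names are hypothetical, and `splitRow` assumes the row starts and ends with a pipe, which markdown itself does not require.

```javascript
// Hypothetical helpers for pipe handling in both directions.
const escapeCell = (text) => text.replace(/\|/g, '\\|');
const unescapeCell = (text) => text.replace(/\\\|/g, '|');

// Split one markdown table row on unescaped pipes only.
function splitRow(line) {
  const source = line.trim();
  const cells = [];
  let current = '';
  for (let i = 0; i < source.length; i++) {
    if (source[i] === '\\' && source[i + 1] === '|') {
      current += '\\|'; // keep the escaped pipe inside the current cell
      i++;
    } else if (source[i] === '|') {
      cells.push(current); // an unescaped pipe ends the cell
      current = '';
    } else {
      current += source[i];
    }
  }
  cells.push(current);
  // Drop the empty pieces before the leading pipe and after the trailing pipe.
  return cells.slice(1, -1).map((cell) => unescapeCell(cell.trim()));
}

console.log(escapeCell('a | b'));          // a \| b
console.log(splitRow('| a \\| b | c |'));  // [ 'a | b', 'c' ]
```

A naive `line.split('|')` would cut the first cell in two, which is why the splitter has to look at the preceding backslash.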

Multi-line cells

Markdown tables can't represent multi-line cells natively. Some renderers (GFM in particular) support <br> inside cells. The converter preserves <br> from HTML imports and translates newlines in JSON / CSV imports to <br> in the markdown output. Round-trip is lossy: the <br> becomes a literal <br> in CSV / JSON export, not a real newline.
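A short sketch of the newline translation and why the reverse step is left to the consumer. Both helper names are hypothetical, and the restore step is an optional consumer-side addition, not something the converter is described as doing.

```javascript
// Import direction: newlines in CSV / JSON cell values become <br>.
const toMarkdownCell = (value) => value.replace(/\r?\n/g, '<br>');

const original = 'line one\nline two';
const cell = toMarkdownCell(original);
console.log(cell); // line one<br>line two

// Optional consumer-side step after export (an assumption, not tool behaviour).
// It cannot tell a translated newline from a <br> that was in the source all
// along, which is why the round trip is described as lossy.
const restoreNewlines = (value) => value.replace(/<br\s*\/?>/gi, '\n');
console.log(restoreNewlines(cell) === original); // true
```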

Cell-spanning (rowspan / colspan)

HTML tables can have rowspan / colspan. Markdown can't. The converter flattens spanning cells by duplicating the content into each spanned position and surfaces the lossy conversion as a warning. For markdown-incompatible tables, use HTML inside the markdown (most renderers accept inline HTML).
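The flattening can be pictured as filling a grid. The sketch below assumes each parsed cell is an object of the form `{ text, colspan, rowspan }`; that shape and the name `flattenSpans` are hypothetical, chosen only to show the duplication step.

```javascript
// Hypothetical helper: expand rowspan / colspan by copying the cell text
// into every grid position the cell covers.
function flattenSpans(rows) {
  const grid = [];
  rows.forEach((cells, r) => {
    grid[r] = grid[r] || [];
    let c = 0;
    for (const cell of cells) {
      while (grid[r][c] !== undefined) c++; // skip slots filled by a rowspan above
      const colspan = cell.colspan || 1;
      const rowspan = cell.rowspan || 1;
      for (let dr = 0; dr < rowspan; dr++) {
        grid[r + dr] = grid[r + dr] || [];
        for (let dc = 0; dc < colspan; dc++) {
          grid[r + dr][c + dc] = cell.text;
        }
      }
      c += colspan;
    }
  });
  return grid;
}

const spanned = [
  [{ text: 'Plan', rowspan: 2 }, { text: 'Monthly' }],
  [{ text: 'Yearly' }],
];
console.log(JSON.stringify(flattenSpans(spanned)));
// [["Plan","Monthly"],["Plan","Yearly"]]
```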

Empty tables

A markdown table with no body rows still has the header row and the separator row. The converter accepts this and returns the headers with an empty row list. Export mode on such a table returns an empty array for JSON and only the header line for CSV.

4. Documentation

Reference signatures, edge cases, and lookup tables.

Input parameters

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `mode` | `'generate' \| 'import' \| 'export'` | ✓ | — | Workflow selector |
| `headers` | `string[]` | for generate mode | — | Column headers |
| `rows` | `string[][]` | for generate mode | — | Row data; each row is an array of cells |
| `alignment` | `('left' \| 'center' \| 'right')[]` | ✗ | left | Per-column alignment for generate mode |
| `importData` | `string` | for import mode | — | Source CSV / TSV / JSON / HTML |
| `importFormat` | `'csv' \| 'tsv' \| 'json' \| 'html' \| 'auto'` | ✗ | `'auto'` | Source format (auto-detected when set to `auto`) |
| `exportFormat` | `'csv' \| 'tsv' \| 'json'` | for export mode | — | Target format |
| `markdownInput` | `string` | for export mode | — | Markdown table to convert |

Output shape

{
  markdown:    string;        // generated / imported table
  exported?:   string;        // when mode: 'export'
  stats: {
    columns:   number;
    rows:      number;
  };
  importWarnings?: string[];  // uneven rows, missing fields, format-specific issues
}

Import format detection (auto mode)

| Heuristic | Detected format |
|---|---|
| Starts with `[` or `{` | `json` |
| Starts with `<table` (case-insensitive) | `html` |
| First line contains tabs but no commas | `tsv` |
| First line contains commas | `csv` |
| Otherwise | error: pass `importFormat` explicitly |
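The heuristics above translate almost line for line into code. This sketch is a reading of the table, not the tool's implementation: `detectFormat` is a hypothetical name, and trimming leading whitespace before the checks is an assumption.

```javascript
// Hypothetical helper mirroring the auto-detection table, checked in order.
function detectFormat(input) {
  const trimmed = input.trimStart(); // assumption: leading whitespace is ignored
  if (trimmed.startsWith('[') || trimmed.startsWith('{')) return 'json';
  if (/^<table/i.test(trimmed)) return 'html';
  const firstLine = trimmed.split(/\r?\n/, 1)[0];
  if (firstLine.includes('\t') && !firstLine.includes(',')) return 'tsv';
  if (firstLine.includes(',')) return 'csv';
  throw new Error('UNSUPPORTED_FORMAT: pass importFormat explicitly');
}

console.log(detectFormat('name\trole\nAlice\tEngineer')); // tsv
console.log(detectFormat('name,role\nAlice,Engineer'));   // csv
```

Order matters: a single-column CSV has no commas and a TSV cell may contain commas, so passing `importFormat` explicitly is the safer choice for those inputs.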

Alignment indicators

Markdown alignment uses these patterns in the separator row:

| Alignment | Pattern |
|---|---|
| `left` | `:---` |
| `center` | `:---:` |
| `right` | `---:` |
| default (no preference) | `---` |
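Reading these patterns back out of an existing table (the export direction) can be sketched as below. `parseAlignment` and the `'default'` return label are hypothetical; the rule itself is just "where are the colons".

```javascript
// Hypothetical helper: classify one separator-row cell by its colons.
function parseAlignment(separatorCell) {
  const cell = separatorCell.trim();
  if (!/^:?-+:?$/.test(cell)) {
    throw new Error(`Not a separator cell: ${separatorCell}`);
  }
  const left = cell.startsWith(':');
  const right = cell.endsWith(':');
  if (left && right) return 'center';
  if (right) return 'right';
  if (left) return 'left';
  return 'default';
}

console.log([':---', ':---:', '---:', '---'].map(parseAlignment));
// [ 'left', 'center', 'right', 'default' ]
```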

Error codes

| Code | When it fires | Recovery |
|---|---|---|
| `INPUT_EMPTY` | Empty input for the chosen mode | Provide the required input |
| `INPUT_MALFORMED` | JSON / HTML parse failed | Verify the source is well-formed |
| `INPUT_INVALID_TYPE` | Generate mode with mismatched row / header column counts | Ensure every row has the same column count as headers |
| `UNSUPPORTED_FORMAT` | Auto-detect failed to identify the format | Pass `importFormat` explicitly |

When NOT to use this tool

For Excel .xlsx files, use a dedicated library (xlsx, exceljs). The .xlsx format is a ZIP archive of XML files; markdown converters work on plain text inputs.

For very wide tables (more than 20 columns), markdown is the wrong rendering target — the table won't fit on most screens. Use HTML tables with horizontal scrolling or a table component in a real UI framework.

For tables with rich formatting (cell backgrounds, conditional styling, headers spanning multiple rows), use HTML directly inside the markdown. The converter handles HTML import but the inverse (markdown to styled-HTML) isn't part of the scope.

Performance notes

Typical execution: under 5ms for tables under 100 rows. The HTML parser is the most expensive case (10-30ms for 500-row tables). The tool is deterministic — same input + same parameters always produce byte-identical output — so REST responses are Edge-Cache eligible.

The CSV parser handles RFC 4180-compliant inputs including embedded commas, quotes, and newlines (when quoted). Non-standard CSV variants (TSV with quoted tabs, pipe-separated values) may need pre-processing.