Bytes, CSS units, LLM tokens: developer-specific unit conversion

Convert data sizes (with IEC vs SI base toggle), CSS units (with configurable base font and viewport), LLM tokens (with per-model tokenizer ratios for GPT-4, Claude, Llama 3, Gemini), time, and frequency.

The Dev Unit Converter handles developer-specific unit categories that generic converters don't cover: data sizes with an IEC (1024-based) vs SI (1000-based) toggle, CSS units with a configurable base font size and viewport, and LLM tokens with model-specific tokenizer ratios for GPT-4, Claude, Llama 3, and Gemini. It also converts time and frequency units.

1. Insight

The problem this article addresses and why it matters.

Developer units don't fit generic converters

A standard unit converter handles miles to kilometres, Fahrenheit to Celsius, ounces to grams. Developers don't need any of that. Developers need: bytes between IEC binary (KiB, MiB, 1024-based) and SI decimal (KB, MB, 1000-based), CSS units between px / rem / em / vh / vw with a configurable base font size and viewport, and — increasingly — LLM tokens between models with different tokenizer ratios.

The unit-conversion question developers actually have at 11pm on a deploy night is "how many tokens is this prompt for GPT-4 vs Claude vs Llama 3?" — and no generic converter answers it.

Why a developer-specific converter

The tool in this article ships three unit categories generic converters don't cover:

  • Data sizes with IEC vs SI base toggle. The infamous 1 KB = 1000 bytes vs 1 KiB = 1024 bytes ambiguity is a parameter, not an argument.
  • CSS units with configurable root font size and viewport dimensions. 1rem is 16px at the browser default (which Tailwind keeps) and 14px in some design systems; the tool takes the system's base size as a parameter.
  • LLM tokens with model-specific tokenizer ratios. 1000 tokens is about 750 English words for GPT-4 but 680 for Claude — the ratios are different and matter for cost estimation.

What this article delivers

This article walks through conversions across the three categories, with attention to the parameters that matter (binaryBase for IEC vs SI, baseFontSize for CSS, tokenModel for tokens). We cover the LLM token math that is the most-used feature, the CSS viewport conversions for responsive design, and the IEC/SI confusion that costs every storage team time.

2. Intent

What you will be able to do after reading.

By the end of this article you will be able to:

  • Convert data sizes between bytes / KB / MB / GB / TB / PB in either IEC (1024-based) or SI (1000-based) form
  • Convert CSS units between px / rem / em / vh / vw / % / pt / ch with configurable base font size and viewport
  • Convert between tokens / characters / words / pages with model-specific tokenizer ratios for GPT-4, Claude, Llama 3, Gemini
  • Convert time units (ms / seconds / minutes / hours / days) with human-readable output
  • Pick the right base (IEC vs SI) for data sizes based on which side of "GB or GiB" your downstream expects

The Examples section walks through each category against representative developer scenarios.

3. Examples

Annotated code and worked scenarios.

Before / after: LLM tokens to words for cost estimation

A 1500-token GPT-4 prompt — how many words is that?

devUnitConverter({
  value:      1500,
  from:       'tokens',
  to:         'words',
  category:   'tokens',
  tokenModel: 'gpt-4',
});

// result:    1125
// formatted: '≈ 1,125 words'
// formula:   '1500 tokens × 0.75 words/token (GPT-4) = 1125 words'
// note:      'Token-to-word ratios are approximate estimates'

Same input, Claude:

devUnitConverter({
  value:      1500,
  from:       'tokens',
  to:         'words',
  category:   'tokens',
  tokenModel: 'claude-sonnet',
});

// result: 1020  (Claude yields fewer words per token than GPT-4: 0.68 vs 0.75 words/token)

The per-model ratio matters for cost estimation. If your prompt is 1500 words, that's 2000 GPT-4 tokens but ~2200 Claude tokens — different cost at different rates per million tokens.
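
The words-to-tokens direction described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the ratios are the words-per-token figures quoted in this article, while `estimateTokens`, `estimateCostUsd`, and any price you pass in are hypothetical.

```typescript
// Words-per-token ratios as quoted in this article (approximate, English prose).
const WORDS_PER_TOKEN: Record<string, number> = {
  "gpt-4": 0.75,
  "claude-sonnet": 0.68,
};

// Hypothetical helper: estimate a token count from a word count, rounded up.
function estimateTokens(words: number, model: string): number {
  const ratio = WORDS_PER_TOKEN[model];
  if (ratio === undefined) throw new Error(`unknown model: ${model}`);
  return Math.ceil(words / ratio);
}

// Hypothetical helper: cost for a token count at a price per million tokens.
// The price is a placeholder argument, not a published rate.
function estimateCostUsd(tokens: number, usdPerMillionTokens: number): number {
  return (tokens / 1e6) * usdPerMillionTokens;
}

const gpt4Tokens = estimateTokens(1500, "gpt-4");           // 2000
const claudeTokens = estimateTokens(1500, "claude-sonnet"); // 2206
```

Rounding up keeps the estimate on the conservative side for budgeting; the same word count then gets multiplied by each provider's own rate.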

Before / after: data sizes with base toggle

A file system reports 32 GB. That's the SI definition (32 × 10^9 bytes). Your container's disk allocation is 32 GiB (32 × 2^30 bytes). They differ by about 7.4%.

devUnitConverter({
  value: 32,
  from:  'GB',
  to:    'bytes',
  category: 'data',
  binaryBase: 'si',
});

// result:    32000000000
// formatted: '32,000,000,000 bytes'

devUnitConverter({
  value: 32,
  from:  'GiB',
  to:    'bytes',
  category: 'data',
  binaryBase: 'iec',
});

// result:    34359738368
// formatted: '34,359,738,368 bytes'

The 7.4% difference matters when you're sizing a 32 GiB volume against a 32 GB disk plan and finding out the volume doesn't fit.
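
The arithmetic behind the two results can be sketched in a few lines. This is an assumption-labelled sketch: `toBytes` and `PREFIX_POWER` are hypothetical names, and the only thing taken from the article is the 1000-based vs 1024-based step.

```typescript
type SizeBase = "si" | "iec";

// Prefix exponents: K = 1 (KB/KiB), M = 2 (MB/MiB), and so on.
const PREFIX_POWER: Record<string, number> = { K: 1, M: 2, G: 3, T: 4, P: 5 };

// Hypothetical helper: convert a prefixed size to bytes in the chosen base.
function toBytes(value: number, prefix: string, base: SizeBase): number {
  const power = PREFIX_POWER[prefix];
  if (power === undefined) throw new Error(`unknown prefix: ${prefix}`);
  const step = base === "iec" ? 1024 : 1000;
  return value * Math.pow(step, power);
}

const siBytes = toBytes(32, "G", "si");   // 32 × 10^9  = 32000000000
const iecBytes = toBytes(32, "G", "iec"); // 32 × 2^30  = 34359738368

// Relative gap between the two readings of "32 G": about 7.4%.
const gapPercent = (iecBytes / siBytes - 1) * 100;
```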

Before / after: CSS rem to px with custom base

devUnitConverter({
  value:    2.5,
  from:     'rem',
  to:       'px',
  category: 'css',
  baseFontSize: 16,
});

// result:    40
// formatted: '40px'

// With a different design system's base:
devUnitConverter({
  value:    2.5,
  from:     'rem',
  to:       'px',
  category: 'css',
  baseFontSize: 14,
});

// result: 35

The baseFontSize parameter is what makes the CSS conversion useful — generic px/rem converters assume 16px, but real codebases vary.

Before / after: viewport-relative units

devUnitConverter({
  value:    100,
  from:     'vh',
  to:       'px',
  category: 'css',
  viewportHeight: 1080,
});

// result: 1080  (100% of viewport height at 1080px tall)

devUnitConverter({
  value:    50,
  from:     'vw',
  to:       'px',
  category: 'css',
  viewportWidth: 1440,
});

// result: 720  (50% of viewport width at 1440px wide)

Useful when verifying a design layout works at specific viewport sizes — render 50vw mentally as 720px on a 1440px viewport without manually multiplying.
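
The rem and viewport conversions above reduce to simple multiplications. The sketch below shows that math under stated assumptions: `cssToPx` and `CssContext` are hypothetical names, and only the units with an unambiguous spec-level formula are covered.

```typescript
interface CssContext {
  baseFontSize: number;   // root font size in px
  viewportWidth: number;  // viewport width in px
  viewportHeight: number; // viewport height in px
}

// Hypothetical helper: convert a CSS length to px given an explicit context.
function cssToPx(
  value: number,
  unit: "px" | "rem" | "vw" | "vh",
  ctx: CssContext
): number {
  switch (unit) {
    case "px":
      return value;
    case "rem":
      return value * ctx.baseFontSize;           // 1rem = root font size
    case "vw":
      return (value / 100) * ctx.viewportWidth;  // 1vw = 1% of viewport width
    case "vh":
      return (value / 100) * ctx.viewportHeight; // 1vh = 1% of viewport height
    default:
      throw new Error(`unsupported unit: ${unit}`);
  }
}

const desktop: CssContext = { baseFontSize: 16, viewportWidth: 1440, viewportHeight: 1080 };
const halfWidthPx = cssToPx(50, "vw", desktop); // 720
const headingPx = cssToPx(2.5, "rem", desktop); // 40
```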

When humans use this

Developers estimating LLM prompt costs use the token converter daily to answer questions like "is this prompt close to the 4k token limit?" The CSS converter shows up in design-system discussions when teams switch base font sizes (for example, between a 10px-based system and the 16px default that Tailwind also uses). The data-size converter solves the recurring "wait, is this GB or GiB?" question that appears in every storage planning conversation.

When agents use this

Two patterns:

  • Cost-estimation agent. An agent costing out LLM workflows converts word counts to tokens per model, then multiplies by the per-token cost. Pre-call cost projection prevents surprise bills when the prompt is larger than expected.
  • Design-token converter. An agent generating CSS from design specs (Figma exports, design-system tokens) converts unit values to the target codebase's preferred unit. Rem-based codebases get rem, px-based codebases get px.
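
As an illustration of the design-token pattern, the sketch below converts a set of px tokens to rem for a rem-based codebase. Everything here is hypothetical (the token names, `pxTokensToRem`, and the four-decimal rounding); it shows the shape of the step, not how any particular agent does it.

```typescript
// Hypothetical design tokens exported in px.
const spacingPx: Record<string, number> = { sm: 8, md: 16, lg: 40 };

// Convert every px token to a rem string for a given root font size.
// Values are rounded to 4 decimals to avoid long repeating fractions.
function pxTokensToRem(
  tokens: Record<string, number>,
  baseFontSize: number
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [tokenName, px] of Object.entries(tokens)) {
    const rem = Number((px / baseFontSize).toFixed(4));
    out[tokenName] = `${rem}rem`;
  }
  return out;
}

const spacingRem = pxTokensToRem(spacingPx, 16);
// { sm: "0.5rem", md: "1rem", lg: "2.5rem" }
```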

Edge cases

Tokenizer ratio drift

The per-model ratios are approximations. Actual token counts depend on the specific text — code tokenises differently than prose, non-Latin scripts tokenise differently than English. For accurate token counts on a specific input, use the model's actual tokenizer (tiktoken for OpenAI, Anthropic SDK's count_tokens for Claude). The converter is for estimates, not exact counts.

IEC vs SI in the wild

Marketing materials almost always use SI (GB as 10^9 bytes, which yields the bigger number and looks better). Technical contexts often use IEC (GiB) for clarity. The disk you buy is sized in SI units; what your OS reports depends on the OS: Windows computes sizes in 1024-based units but labels them GB, while macOS reports 1000-based GB. The tool surfaces the base in the output so the consumer can verify.

CSS unit ambiguity

em is relative to the font size of the element it is used on (or, when used on the font-size property itself, to the parent's font size), not the root's. rem is relative to the root only. The tool's CSS conversion treats both as relative to the configured base font: accurate for rem, an approximation for em (where the actual reference depends on the element's position in the DOM). For pixel-exact em conversion, use a browser DevTools inspector.
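
To see why a single-base em conversion is only an approximation, consider nested elements that each set font-size in em. The sketch below (with the hypothetical helper `resolveEmChain`) compounds the sizes through the ancestor chain and compares the result with a flat conversion against the root.

```typescript
// Resolve em-based font sizes through a chain of nested elements.
// fontSizesEm lists each element's font-size in em, outermost first;
// each em value multiplies the font size inherited from its parent.
function resolveEmChain(rootPx: number, fontSizesEm: number[]): number {
  return fontSizesEm.reduce((parentPx, em) => parentPx * em, rootPx);
}

// Two nested elements, each with font-size: 1.5em, under a 16px root.
const nestedPx = resolveEmChain(16, [1.5, 1.5]); // 16 → 24 → 36

// What a conversion against the root alone would report for 1.5em.
const flatPx = 1.5 * 16; // 24
```

The gap between 36px and 24px is the error a context-free converter cannot see, which is why rem conversions are exact and em conversions are not.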

4. Documentation

Reference signatures, edge cases, and lookup tables.

Input parameters

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| value | number | yes | | The amount to convert |
| from | string | yes | | Source unit (e.g. 'tokens', 'GB', 'rem') |
| to | string | yes | | Target unit |
| category | 'data' \| 'tokens' \| 'css' \| 'time' \| 'frequency' | yes | | Conversion category |
| tokenModel | 'gpt-4' \| 'gpt-4o' \| 'claude-sonnet' \| 'claude-opus' \| 'llama-3' \| 'gemini' | for tokens category | | Tokenizer ratio model |
| baseFontSize | number | for CSS px↔rem | 16 | Root font size in px |
| viewportWidth | number | for vw | | Viewport width in px |
| viewportHeight | number | for vh | | Viewport height in px |
| binaryBase | 'iec' \| 'si' | for data | 'si' | 1024-based (IEC: KiB/MiB) vs 1000-based (SI: KB/MB) |

Output shape

{
  result:    number;
  formatted: string;     // '1.44 GB' or '≈ 1,125 words'
  formula:   string;     // human-readable derivation
  note?:     string;     // disclaimers when the conversion is approximate
}

LLM tokenizer ratios (words per token)

| Model | Ratio | Source |
|---|---|---|
| gpt-4 | 0.75 | OpenAI's published estimate (1000 tokens ≈ 750 words English) |
| gpt-4o | 0.77 | Updated tokenizer (BPE with larger vocabulary) |
| claude-sonnet | 0.68 | Anthropic's character-per-token average for English |
| claude-opus | 0.68 | Same tokenizer family as Sonnet |
| llama-3 | 0.72 | tiktoken-based BPE tokenizer, Meta's published average |
| gemini | 0.73 | Google's documented estimate |

Supported units by category

  • data: bytes, KB/KiB, MB/MiB, GB/GiB, TB/TiB, PB/PiB
  • tokens: tokens, characters, words, pages (1 page = 500 words)
  • css: px, rem, em, vh, vw, %, pt, ch
  • time: ms, seconds, minutes, hours, days
  • frequency: Hz, kHz, MHz, GHz, rpm

Error codes

| Code | When it fires | Recovery |
|---|---|---|
| INPUT_EMPTY | value is 0 or missing | Provide a non-zero value |
| INPUT_INVALID_TYPE | from or to unit not in the category | Use a unit from the category's supported set |
| UNSUPPORTED_FORMAT | category value outside supported set | Use one of the five documented categories |
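
The three error conditions can be expressed as a small validation step. This is a sketch under assumptions: `validateInput` is a hypothetical function, the order of the checks is a guess, and the unit lists are copied from the "Supported units by category" section.

```typescript
type ErrorCode = "INPUT_EMPTY" | "INPUT_INVALID_TYPE" | "UNSUPPORTED_FORMAT";

// Supported units per category, as listed in this article.
const UNITS_BY_CATEGORY = new Map<string, string[]>([
  ["data", ["bytes", "KB", "KiB", "MB", "MiB", "GB", "GiB", "TB", "TiB", "PB", "PiB"]],
  ["tokens", ["tokens", "characters", "words", "pages"]],
  ["css", ["px", "rem", "em", "vh", "vw", "%", "pt", "ch"]],
  ["time", ["ms", "seconds", "minutes", "hours", "days"]],
  ["frequency", ["Hz", "kHz", "MHz", "GHz", "rpm"]],
]);

interface ConvertInput {
  value?: number;
  from: string;
  to: string;
  category: string;
}

// Returns the matching error code, or null when the input passes all checks.
// Check order is an assumption; the article does not specify it.
function validateInput(input: ConvertInput): ErrorCode | null {
  // Falsy check: fires for a missing value and for 0 (also for NaN).
  if (!input.value) return "INPUT_EMPTY";
  const units = UNITS_BY_CATEGORY.get(input.category);
  if (units === undefined) return "UNSUPPORTED_FORMAT";
  if (!units.includes(input.from) || !units.includes(input.to)) {
    return "INPUT_INVALID_TYPE";
  }
  return null;
}
```

Note that, per the table, a value of 0 is rejected as empty, so callers that legitimately convert zero need to handle that case before calling.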

When NOT to use this tool

For exact LLM token counts on a specific input, use the model's actual tokenizer (tiktoken for OpenAI, count_tokens in Anthropic SDK). This tool's per-model ratios are approximations calibrated to English prose; code and non-English text tokenise differently.

For real-time pixel-exact CSS layout debugging, use browser DevTools — the actual rendered pixel value depends on the live DOM context (font inheritance, viewport state, transforms). This tool gives the spec-level conversion.

Performance notes

Typical execution: under 1ms. Pure math, no I/O, fully deterministic. REST responses are Edge-Cache eligible.

FAQ

How accurate are the LLM token ratios?

Approximations calibrated to English prose. Code tokenises differently than prose; non-English scripts tokenise differently than English. For exact counts on specific inputs, use the model's actual tokenizer (tiktoken for OpenAI, count_tokens in Anthropic SDK). The converter is for estimates, not exact counts.

When should I use IEC vs SI for data sizes?

Storage hardware and marketing use SI (a "1 TB" drive is 10^12 bytes). Operating systems and memory subsystems often use IEC (a "1 GiB" memory block is 2^30 bytes). They differ by ~7% at the gigabyte scale, ~10% at the terabyte scale. Match the convention of your downstream system.
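
The gap grows with each prefix because the 1024/1000 factor compounds once per level. A short sketch (with the hypothetical helper `iecSiGapPercent`) makes the progression explicit:

```typescript
const PREFIX_PAIRS = ["KB/KiB", "MB/MiB", "GB/GiB", "TB/TiB", "PB/PiB"];

// Percentage by which the IEC unit exceeds the SI unit at a prefix level
// (level 1 = kilo, 2 = mega, 3 = giga, ...).
function iecSiGapPercent(level: number): number {
  return (Math.pow(1024 / 1000, level) - 1) * 100;
}

const gaps = PREFIX_PAIRS.map(
  (pair, i) => `${pair}: ${iecSiGapPercent(i + 1).toFixed(1)}%`
);
// ["KB/KiB: 2.4%", "MB/MiB: 4.9%", "GB/GiB: 7.4%", "TB/TiB: 10.0%", "PB/PiB: 12.6%"]
```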

Why does my rem-to-px conversion differ from my browser?

The tool defaults to 16px base font size. Many design systems use 14px or 10px bases. Pass baseFontSize explicitly to match your codebase's root font size. Browser DevTools shows the actual computed value at the live DOM context, which is the source of truth for rendered pixels.

What's the conversion accuracy?

Pure math, no floating-point drift in the supported ranges. The token estimates are approximate, and em conversions assume the configured base font size rather than the live parent context; every other category produces exact conversions.