Tokenizer
Paste a prompt or document and see an estimated token count for GPT and Claude models, computed in your browser by a bundled tokenizer. No API key, no upload, no sign-up — your prompt never leaves your device.
What is the Tokenizer?
The Tokenizer is a free token counter that estimates how many tokens your text will use with GPT and Claude models. Large language models do not read words the way people do; they read tokens, the sub-word chunks in which API usage is billed and context limits are measured. Paste a prompt, an article, or a whole document and this tool shows the estimated token count instantly, so you know whether it fits a model's context window and roughly what an API call will cost. The counting is done by a tokenizer library bundled into the page, with no API key and no upload.
The token counter runs entirely in your browser, so your prompts never reach OpenAI, Anthropic, or our servers. That matters when the text is a confidential system prompt, an unreleased product description, a customer record, or proprietary source you would never paste into a random online counter. The tokenizer and its data are downloaded to your device and run locally — nothing is uploaded, logged, or saved.
How to use it
- Paste your prompt or document into the box above.
- Read the estimated token count for the GPT family and the Claude family — they appear side by side.
- Compare the count to your model's context window, and to your budget if you pay per token.
- Trim or split the text if you are over the limit, then copy it back out and send it.
The workflow this is built for is "estimate before you send": check the token cost of a prompt locally so you do not hit a context-limit error or an unexpected bill on the real API call. The numbers update as you edit, so you can watch the count drop while you trim.
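The "estimate before you send" check can be sketched in a few lines of Python. This is only an illustration: the `Budget` fields, the 8,000-token window, the 1,000-token reply reserve, and the price per million tokens are all hypothetical values chosen for the example, not figures from any provider.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    context_window: int           # total tokens the model accepts (prompt + reply)
    reply_reserve: int            # tokens set aside for the model's answer
    usd_per_million_input: float  # hypothetical price, not a real quote


def check_prompt(prompt_tokens: int, budget: Budget) -> dict:
    """Report whether a prompt fits and roughly what its input would cost."""
    available = budget.context_window - budget.reply_reserve
    overflow = max(0, prompt_tokens - available)
    cost = prompt_tokens * budget.usd_per_million_input / 1_000_000
    return {
        "fits": overflow == 0,
        "tokens_over": overflow,
        "estimated_input_cost_usd": round(cost, 6),
    }


# Illustrative numbers only: paste the count shown by the tool as prompt_tokens.
budget = Budget(context_window=8_000, reply_reserve=1_000, usd_per_million_input=2.50)
print(check_prompt(6_500, budget))  # fits within the 7,000 tokens left for the prompt
print(check_prompt(7_400, budget))  # 400 tokens over, so trim or split first
```

Reserving part of the window for the reply matters because the prompt and the response share the same context limit.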
The method behind the count
Tokenization splits text into sub-word units using byte-pair encoding (BPE), the same family of algorithms used by the tokenizers of GPT-style models. A few plain rules explain why the token count rarely matches the word count:
- A common English word is usually a single token, but a rare or long word splits into several pieces.
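- Punctuation and whitespace are not free: marks like commas usually take a token of their own, and runs of spaces, indentation, and line breaks add tokens too. A single space before a word is normally folded into that word's token rather than counted separately.
- As a rule of thumb, English text averages about 0.75 words per token (roughly 4 characters per token), so 1,000 tokens is about 750 words, but the real ratio depends on the text.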
- Different model families ship different tokenizers, so the same text produces a different token count on GPT versus Claude. This page reports each.
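To make the rules above concrete, here is a toy character-level BPE in Python. It is a teaching sketch, not the tokenizer bundled in this page: production encodings work on bytes, pre-split the text, and ship tens of thousands of learned merges. The function names `apply_merge`, `train_bpe`, and `encode` are made up for this example.

```python
from collections import Counter


def apply_merge(symbols: list[str], pair: tuple[str, str]) -> list[str]:
    """Fuse every adjacent occurrence of `pair` into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus: list[str], num_merges: int) -> list[tuple[str, str]]:
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    words = Counter(tuple(word) for word in corpus)  # each word starts as characters
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Ties are broken by comparing the pairs themselves, for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        merged_words = Counter()
        for symbols, freq in words.items():
            merged_words[tuple(apply_merge(list(symbols), best))] += freq
        words = merged_words
    return merges


def encode(word: str, merges: list[tuple[str, str]]) -> list[str]:
    """Split a word into characters, then replay the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


corpus = ["the"] * 10 + ["then"] * 2
merges = train_bpe(corpus, num_merges=2)
print(merges)                   # [('t', 'h'), ('th', 'e')]
print(encode("the", merges))    # ['the']
print(encode("then", merges))   # ['the', 'n']
print(encode("zebra", merges))  # ['z', 'e', 'b', 'r', 'a']
```

The frequent word "the" collapses to a single token, while "zebra", which the toy vocabulary never saw, falls back to one token per character. That is the same effect, in miniature, that makes rare or long words cost more tokens than common ones.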
The GPT number uses the o200k_base encoding (current GPT-4o, GPT-4.1, GPT-4o-mini, and o-series models). The Claude number uses the well-known cl100k_base encoding as an approximation, since Anthropic does not publish a browser-ready tokenizer; that same encoding is also what legacy GPT-3.5 and GPT-4 used. Both are estimates, not a billing guarantee, and the Claude figure is the looser of the two because cl100k_base is an OpenAI encoding. Treat them as planning numbers rather than an exact match to a provider's live tokenizer.
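The mapping described above can be written down as a small lookup table. The sketch below is hypothetical: the model-name strings and the `encoding_for` helper are illustrative, and only the encoding names (o200k_base, cl100k_base) come from the text. It also flags when the result is an approximation rather than the model's own encoding.

```python
# Model names here are illustrative keys, not an exhaustive or official list.
ENCODING_BY_MODEL = {
    "gpt-4o": "o200k_base",
    "gpt-4o-mini": "o200k_base",
    "gpt-4.1": "o200k_base",
    "gpt-4": "cl100k_base",          # legacy
    "gpt-3.5-turbo": "cl100k_base",  # legacy
}

# Stand-in used for Claude counts on this page; it is an approximation.
CLAUDE_APPROXIMATION = "cl100k_base"


def encoding_for(model: str) -> tuple[str, bool]:
    """Return (encoding name, is_approximation) for a model name."""
    name = model.strip().lower()
    if name.startswith("claude"):
        return CLAUDE_APPROXIMATION, True
    if name in ENCODING_BY_MODEL:
        return ENCODING_BY_MODEL[name], False
    raise KeyError(f"no encoding registered for model {model!r}")


print(encoding_for("gpt-4o"))         # ('o200k_base', False)
print(encoding_for("Claude (any)"))   # ('cl100k_base', True)
```

Carrying the `is_approximation` flag alongside the encoding name keeps it visible downstream that the Claude number is a stand-in.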
Examples
- "Hello, world!" typically tokenizes to four tokens, twice its word count: "Hello", the comma, " world" (the leading space is folded into the word), and the exclamation mark.
- "antidisestablishmentarianism" is one word but splits into several tokens — a clear case of why token count outruns word count for long or rare words.
- A 750-word article lands at roughly 1,000 tokens, handy for checking that a document plus the model's reply will fit inside an 8K, 32K, or 128K context window.
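The arithmetic behind the last example is easy to reproduce. The sketch below applies the rule-of-thumb ratios (0.75 words per token, 4 characters per token) and picks the smallest window that holds the document plus a reply. The window sizes treat "K" as 1,000 tokens for simplicity, and the helper names are invented for this example; the result is a heuristic, not a tokenizer count.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English prose
CHARS_PER_TOKEN = 4     # rule of thumb for English prose

# Assumption: "K" is taken as 1,000 tokens; real limits vary by model.
WINDOWS = {"8K": 8_000, "32K": 32_000, "128K": 128_000}


def tokens_from_words(word_count: int) -> int:
    """Heuristic token estimate from a word count."""
    return round(word_count / WORDS_PER_TOKEN)


def tokens_from_chars(char_count: int) -> int:
    """Heuristic token estimate from a character count."""
    return round(char_count / CHARS_PER_TOKEN)


def smallest_fitting_window(prompt_tokens: int, reply_tokens: int):
    """Return the label of the smallest window that holds prompt plus reply."""
    needed = prompt_tokens + reply_tokens
    for label, size in WINDOWS.items():  # insertion order is smallest first
        if needed <= size:
            return label
    return None  # does not fit any listed window


article = tokens_from_words(750)
print(article)                                    # 1000
print(smallest_fitting_window(article, 500))      # 8K
print(smallest_fitting_window(tokens_from_words(6_000), 500))  # 32K
```

A 6,000-word document comes to about 8,000 tokens on its own, so once a 500-token reply is added it no longer fits the 8K window.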
Common use cases
- Prompt engineers checking that a system prompt plus the expected response fits a model's context window.
- Developers estimating API cost up front, since pricing is billed per token.
- RAG and pipeline builders trimming retrieved chunks to stay under a token budget per call.
- Anyone comparing models who wants to see how the same text tokenizes differently across GPT and Claude.
- Privacy-sensitive teams who need a token count for confidential text without sending it to a third-party service.
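For the RAG use case, trimming retrieved chunks to a per-call budget might look like the sketch below. It assumes the chunks arrive already ranked by relevance, and it uses a rough 4-characters-per-token counter as a placeholder; `rough_count` and `select_chunks` are hypothetical names, and a real pipeline would pass in an actual tokenizer's count function instead.

```python
from typing import Callable


def rough_count(text: str) -> int:
    """Placeholder counter: about 4 characters per token, rounded up."""
    return max(1, -(-len(text) // 4))


def select_chunks(
    chunks: list[str],
    budget: int,
    count: Callable[[str], int] = rough_count,
) -> list[str]:
    """Keep chunks in ranked order until the next one would exceed the budget."""
    kept, used = [], 0
    for chunk in chunks:
        cost = count(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept


ranked = ["a" * 40, "b" * 40, "c" * 40]   # about 10 tokens each by rough_count
print(len(select_chunks(ranked, budget=25)))  # 2
```

Stopping at the first chunk that does not fit preserves the relevance ordering; skipping it and trying smaller later chunks is an alternative policy with different trade-offs.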
Why use this tokenizer
Many token counters either run a vendor widget or quietly send your text to a server to count it. This one bundles a real BPE tokenizer and runs it client-side: there is no API key to paste, no request that carries your text, and no cost. Because the text is never transmitted, it is safe for confidential prompts and proprietary documents. It also shows tokens right next to words and characters, so the gap between "how many words" and "how many tokens" is visible at a glance instead of being a mystery.
It belongs to a small, focused text toolkit. To analyse plain prose, the Word Counter adds sentences, paragraphs, and reading time. For a strict character limit, the Character Counter is purpose-built. And to tidy a messy prompt before you count it, the Text Formatter strips stray whitespace and line breaks in one step.
Frequently asked questions
What is a token and why isn't it the same as a word?
Large language models break text into tokens, which are sub-word chunks rather than whole words. A common English word is often one token, but longer or rarer words split into several, and spaces and punctuation also consume tokens. As a rough guide, English text averages about 0.75 words per token, so 1,000 tokens is roughly 750 words — but the exact ratio depends on the text.
Are the GPT and Claude token counts exact?
The counts are close estimates produced by a bundled tokenizer running in your browser. Each model family uses its own tokenizer, and vendors occasionally update them, so for billing-critical decisions treat the number as a reliable estimate rather than an exact match to a specific provider's live tokenizer.
Is my prompt sent to OpenAI, Anthropic, or any server?
No. The tokenizer library is bundled into the page and runs entirely in your browser, so your prompt or document is never sent to any AI provider or to our servers. This makes it safe for confidential prompts and proprietary text.
Why should I care about token count?
Token count determines whether your text fits inside a model's context window and is the basis for API pricing, which is billed per token. Estimating tokens before you send a request helps you avoid context-limit errors and control cost.