✂️ Chunks · Paragraphs · Code blocks

Semantic Chunker — Split Long Text for LLMs Locally, Without Cutting Mid-Thought

Split a long document or codebase into LLM-sized chunks that break at paragraph boundaries and keep fenced code blocks whole, rather than cutting mid-paragraph or mid-function.

How it works

Paste a long document or codebase and set a maximum chunk size. The chunker splits on paragraph breaks and never inside a fenced code block. When a single block is larger than the chunk size, it falls back to sentence boundaries, and to a hard split only as a last resort. Each chunk is numbered so you can paste them into an LLM in order, preserving context.
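
The steps above can be sketched in a few functions. This is a minimal illustration in Python, not the tool's actual code (the tool itself runs in JavaScript in the browser). The function names, the `[Chunk i/n]` label format, the sentence regex, and the choice to apply the last-resort split to any oversized block (fenced or not) are all assumptions made for the example.

```python
import re

# Built from single backticks so the marker never appears literally in this listing.
FENCE = "`" * 3

def split_blocks(text):
    """Split on blank lines, but treat a fenced code block as one unbreakable block."""
    blocks, current, in_fence = [], [], False
    for line in text.split("\n"):
        if line.strip().startswith(FENCE):
            in_fence = not in_fence  # a fence line opens or closes a code block
        if line.strip() == "" and not in_fence:
            if current:  # a blank line outside a fence ends the current block
                blocks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks

def split_oversized(block, max_chars):
    """Last resort: split at sentence ends, hard-cutting any sentence still too long."""
    pieces, current = [], ""
    for sentence in re.split(r"(?<=[.!?])\s+", block):
        for start in range(0, len(sentence), max_chars):
            part = sentence[start:start + max_chars]
            candidate = part if not current else current + " " + part
            if len(candidate) <= max_chars:
                current = candidate
            else:
                pieces.append(current)
                current = part
    if current:
        pieces.append(current)
    return pieces

def chunk_text(text, max_chars):
    """Greedily pack blocks into chunks of at most max_chars characters."""
    pieces = []
    for block in split_blocks(text):
        if len(block) <= max_chars:
            pieces.append(block)
        else:
            pieces.extend(split_oversized(block, max_chars))
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + "\n\n" + piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks

def number_chunks(chunks):
    """Prefix each chunk with a hypothetical '[Chunk i/n]' label for in-order pasting."""
    total = len(chunks)
    return [f"[Chunk {i}/{total}]\n{chunk}" for i, chunk in enumerate(chunks, 1)]
```

A call such as `number_chunks(chunk_text(document, 10000))` would return the labelled chunks in order. In this sketch the label is added after packing, so a labelled chunk can be slightly longer than `max_chars`.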

Why chunk text for LLMs?

A prompt longer than a model's context window has to be split, but a naive character cut can land in the middle of a sentence, a JSON object or a function, and the model loses the thread. Semantic chunking breaks at paragraph boundaries wherever it can and keeps fenced code blocks intact, so each numbered chunk reads as a coherent unit. Paste them in order and the model can follow the document as if it had never been split.

FAQ

Is my text uploaded?
No. Chunking runs entirely in your browser in JavaScript — your document never leaves your device. The page only sends an anonymous usage counter (the tool name and the input size), never the content.
How does it keep chunks coherent?
It splits on double line breaks (paragraphs) and keeps fenced code blocks whole. Sentence and hard splits are used only as a last resort for blocks larger than the chunk size.
Is there a size limit?
Only your device's memory. Because there is no server, you can chunk documents of many megabytes, and large inputs are processed without freezing the page.
What chunk size should I use?
Set it below your model's context window, leaving room for the reply: for example, 8,000 to 12,000 characters per chunk for a typical chat model.
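
Context windows are counted in tokens while the chunk size here is in characters, so choosing a size involves a rough conversion. The sketch below is one way to estimate it in Python. The function name, the figure of about 4 characters per token (a loose rule of thumb for English text that varies by model and language), and the 0.8 safety margin are all assumptions, not values taken from the tool.

```python
def max_chunk_chars(context_tokens, reply_tokens, chars_per_token=4.0, safety=0.8):
    """Estimate a character budget per chunk from a token-based context window.

    chars_per_token and safety are illustrative assumptions; adjust them for
    your model and for the kind of text you are splitting.
    """
    if reply_tokens >= context_tokens:
        raise ValueError("reply_tokens must be smaller than context_tokens")
    usable_tokens = context_tokens - reply_tokens  # leave room for the reply
    return int(usable_tokens * chars_per_token * safety)

# Example: a 4,096-token window with 1,024 tokens reserved for the reply.
print(max_chunk_chars(4096, 1024))  # 9830
```

Under these assumptions the result falls inside the 8,000 to 12,000 character range suggested above. Code and non-English text often use more tokens per character, so a lower `chars_per_token` is the safer choice there.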