How to use TOON with LLMs
If you have ever pasted a large JSON array into ChatGPT or Claude, you have likely felt the pain of the context window closing in. JSON is fantastic for web APIs, but for Large Language Models (LLMs), it is incredibly wasteful. Repeating field names like "id":, "name":, and "timestamp": for every single record isn't just redundant; it burns through tokens that cost real money and valuable context space.
This is where TOON (Token-Oriented Object Notation) shines. It isn't just a data format; it is a strategy for optimizing LLM interactions. By stripping away the syntax tax of JSON and adding explicit structure headers, TOON allows you to pass more data to your models and get more reliable structured outputs in return.
The Token Economics of TOON
Why bother switching formats? The math is simple. In a standard JSON array of objects, the schema is repeated for every row. If you have a list of 50 users, you are paying for the field names 50 times.
TOON eliminates this redundancy by declaring the schema once in the header. The data follows in a dense, streamlined format. In practice, this typically results in a 30-60% reduction in token usage for uniform arrays compared to formatted JSON. When you are dealing with massive context windows or high-volume API calls, that efficiency translates directly to lower bills and lower latency.
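To make the saving concrete, here is a dependency-free sketch comparing the character counts of pretty-printed JSON and an equivalent hand-rolled TOON block. Character count is only a proxy for tokens, and the exact ratio depends on the tokenizer, but the repeated-key overhead is easy to see:

```typescript
const users = [
  { id: 1, name: 'Alice', role: 'admin' },
  { id: 2, name: 'Bob', role: 'user' },
  { id: 3, name: 'Charlie', role: 'user' },
];

// JSON repeats every field name for every record.
const json = JSON.stringify(users, null, 2);

// TOON declares the schema once in the header; rows carry only values.
const toon = [
  `users[${users.length}]{id,name,role}:`,
  ...users.map((u) => `  ${u.id},${u.name},${u.role}`),
].join('\n');

console.log(json.length, toon.length); // the TOON block is substantially shorter
```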
Sending Data: The "Show, Don't Tell" Rule
When you need an LLM to analyze data, your prompt strategy is crucial. Beginners often write long paragraphs explaining the data format. With TOON, you don't need to do that.
LLMs are pattern-matching engines. They intuitively understand TOON because it looks like a hybrid of YAML and CSV—formats they have seen billions of times during training.
To send data, simply wrap it in a fenced code block. You can label it toon, but even if the model’s syntax highlighter doesn’t officially support it, the model understands the structure immediately.
Input Example
Instead of describing the schema, just provide the block:

Here is the user activity log. The data is in TOON format (2-space indent, explicit headers).

```toon
users[3]{id,name,role,lastLogin}:
  1,Alice,admin,2025-01-15T10:30:00Z
  2,Bob,user,2025-01-14T15:22:00Z
  3,Charlie,user,2025-01-13T09:45:00Z
```

Task: Analyze the logs and identify which users haven't logged in within the last 24 hours.

The header users[3]{id,name,role,lastLogin} tells the model everything it needs to know: the entity type, the count (3 rows), and the order of fields. The indentation handles the hierarchy. This "self-documenting" nature frees up your prompt to focus on the actual logic task rather than syntax-parsing instructions.
Generating Reliable Output
Getting an LLM to read data is easy; getting it to generate valid structured data is the hard part. Models love to hallucinate, truncate JSON, or forget closing braces.
TOON adds a layer of safety through its header syntax, specifically the [N] count. When you ask a model to output TOON, you are asking it to commit to a structure before it generates the data.
Prompting for Generation
To get the best results, provide the header format you expect and instruct the model to fill the rows.
Task: Return a list of active users with the role "user".

Format: Use TOON. Set the [N] value in the header to match the exact number of rows you generate.

Expected format:

```toon
users[N]{id,name,role,lastLogin}:
```

By asking the model to calculate [N], you force a "chain of thought" process where the model must plan the output size. This seemingly small constraint significantly reduces the likelihood of the model cutting off halfway through a list.
Validating with Strict Mode
When you receive the response from the LLM, you shouldn't just trust it. This is where the TOON library’s strict mode becomes a superpower for production applications.
If you are using the TypeScript library, decoding with strict mode validates that the generated rows match the header count:

```typescript
import { decode } from '@toon-format/toon';

try {
  // If the model says [5] but provides only 4 rows, this throws an error.
  const data = decode(modelOutput, { strict: true });
  console.log('Valid data received:', data);
} catch (error) {
  console.error('Model hallucination or truncation detected:', error.message);
}
```

This allows you to programmatically catch "lazy" model outputs or network truncations immediately, rather than discovering bad data downstream in your application.
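If you cannot take a dependency on the library, the core of this check is simple enough to sketch by hand. Note that checkRowCount is a hypothetical helper, and it only handles a flat block with a single tabular header, not the full TOON spec:

```typescript
// Hypothetical helper: verify that the declared [N] matches the row count.
// Handles only a flat, single-table block (not nesting or the full spec).
function checkRowCount(toon: string): boolean {
  const lines = toon.trim().split('\n');
  const match = lines[0].match(/\[(\d+)/); // pull N out of users[N]{...}:
  if (!match) return false;
  return Number(match[1]) === lines.length - 1; // data rows follow the header
}

console.log(checkRowCount('users[2]{id,name}:\n  1,Alice\n  2,Bob')); // true
console.log(checkRowCount('users[3]{id,name}:\n  1,Alice'));          // false
```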
Advanced Optimization: The Tab Trick
If you are obsessed with optimization (and in the world of LLMs, you probably should be), you can squeeze out even more efficiency by choosing your delimiters wisely.
Commas are standard, but tabs (\t) are often represented as a single token in many tokenizer vocabularies. Furthermore, tabs rarely appear inside natural text fields, which reduces the need for escape characters (like wrapping strings in quotes).
You can encode your data using tabs before sending it to the model:

```typescript
import { encode } from '@toon-format/toon';

const toonPrompt = encode(data, { delimiter: '\t' });
```

Just remember to inform the model in the prompt: "Data is tab-separated TOON." This creates a hyper-compact representation that is incredibly easy for the model to parse and generate.
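The escaping benefit is easy to demonstrate with a dependency-free sketch: a value containing a comma must be quoted under the default delimiter, but passes through untouched under tabs (illustrative strings, not library output):

```typescript
const name = 'Smith, Jane';

// Under the comma delimiter, the embedded comma forces quoting.
const commaRow = `1,"${name}",admin`;

// Under tabs, no escaping is needed; field boundaries stay unambiguous.
const tabRow = ['1', name, 'admin'].join('\t');

console.log(tabRow.split('\t').length); // 3 fields, parsed cleanly
```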
A Complete Workflow Example
Let’s look at a real-world scenario: filtering system logs. You want to send raw logs to the model and get a structured list of errors back.
The Prompt:

System logs in TOON format (tab-separated):

```toon
events[4	]{id	level	message	timestamp}:
  1	error	Connection timeout	2025-01-15T10:00:00Z
  2	warn	Slow query	2025-01-15T10:05:00Z
  3	info	User login	2025-01-15T10:10:00Z
  4	error	Database error	2025-01-15T10:15:00Z
```

Task: Extract all events with level 'error'. Return the result as valid TOON with an updated header count.

The Model Output:

```toon
events[2	]{id	level	message	timestamp}:
  1	error	Connection timeout	2025-01-15T10:00:00Z
  4	error	Database error	2025-01-15T10:15:00Z
```
The model correctly filtered the list and, crucially, updated the header count to 2. By decoding this response, you get a clean, type-safe array ready for your application logic.
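To see what that decode step recovers, here is a minimal, dependency-free parse of the filtered block above. It assumes a flat, tab-delimited table and skips the rest of the spec, which the real library handles:

```typescript
const result =
  'events[2\t]{id\tlevel\tmessage\ttimestamp}:\n' +
  '  1\terror\tConnection timeout\t2025-01-15T10:00:00Z\n' +
  '  4\terror\tDatabase error\t2025-01-15T10:15:00Z';

const [header, ...rows] = result.split('\n');

// Field names live between the braces in the header.
const fields = header
  .slice(header.indexOf('{') + 1, header.indexOf('}'))
  .split('\t');

// Each row becomes an object keyed by the header fields.
const events = rows.map((row) => {
  const values = row.trim().split('\t');
  return Object.fromEntries(fields.map((f, i) => [f, values[i]]));
});

console.log(events.length, events[0].level); // 2 error
```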
Summary
TOON bridges the gap between human readability and machine efficiency. It respects the cost constraints of LLMs while providing the structure required for robust software development.
- Keep it small: Use 2-5 rows in your examples; the model will generalize.
- Be explicit: Define headers clearly so the model knows the schema.
- Validate strictly: Use the format’s metadata to catch generation errors.
By moving away from JSON for your prompt payloads, you aren't just saving tokens—you are building a more reliable AI pipeline.