TOON (Token-Oriented Object Notation) is a data serialization format designed for token-efficient LLM prompts. It typically reduces token usage by 30-60% compared to JSON while remaining structured and human-readable. By using a tabular format for arrays and minimal syntax for objects, TOON makes your data cheaper and faster to process with AI models.

What's the difference between TOON and JSON?

The key difference is token efficiency. JSON is verbose, with brackets, quotes, and commas that consume tokens. TOON is a more compact syntax designed for LLMs, representing arrays as tables with headers and using minimal punctuation. This efficiency directly translates to significant cost savings on your LLM API bills, especially for large or repeated datasets.

How much can I save with TOON?

You can typically expect to save 30-60% on LLM tokens compared to using JSON. For large datasets or frequent API calls, this translates directly into significant cost savings. Data with repeated structures, like API responses or database results, often sees savings at the higher end of this range (40-60%).

Is TOON compatible with all LLMs?

Yes. TOON is a plain-text format that works with all major large language models, including those from OpenAI (GPT-4), Anthropic (Claude), Google (Gemini), and Meta (LLaMA). Because every LLM processes plain text, a short instruction in your prompt is enough for a model to read and produce TOON.

Can I convert TOON back to JSON?

Absolutely. TOON is fully and losslessly reversible. Our converter tool supports bidirectional conversion, meaning you can convert TOON back to the exact original JSON structure without any data loss. This allows you to use TOON for efficiency and then convert back to JSON for compatibility with other tools.

What types of data work best with TOON?

TOON can represent any valid JSON data, but it delivers the highest token savings (40-60%) on uniform tabular data. This includes database query results, API responses with lists of objects, analytics data, or product catalogs. While TOON fully supports nested objects and arrays, the token reduction is most dramatic with flatter, more repetitive data structures.

Is my data safe when using this converter?

100% safe. All conversion from JSON to TOON (and back) happens locally in your browser. Your data is never sent to any server, never stored, and never seen by us. The converter even works offline once the page has loaded, guaranteeing your information remains private.

Is this converter free to use?

Yes, completely free. Both this TOON converter and the underlying TOON format specification are open and free to use, with no usage limits, file size restrictions, or premium features. It's an open-source effort to make working with LLMs more efficient for everyone.

# How to Use TOON with LLMs

LLM

Prompt Engineering

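If you have ever pasted a large JSON array into ChatGPT or Claude, you have probably felt the pain of the context window closing in. JSON is great for web APIs, but for large language models (LLMs) it is very wasteful. Repeating field names such as `"id":`, `"name":`, and `"timestamp":` for every record isn't just redundant; it burns through tokens that cost real money and valuable context space.

This is where TOON (Token-Oriented Object Notation) shines. It isn't just a data format; it is a strategy for optimizing LLM interactions. By stripping away JSON's syntax overhead and adding explicit structure headers, TOON lets you pass more data to the model and get more reliable structured output in return.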

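## The Token Economics of TOON

Why bother switching formats? The math is simple. In a standard JSON array of objects, the schema is repeated in every row. If you have a list of 50 users, you pay for the field names 50 times.

TOON eliminates this redundancy by declaring the schema once, in the header. The data then follows in a dense, streamlined format. In practice, this typically cuts token usage for uniform arrays by 30-60% compared with formatted JSON. When you are working with large context windows or high-volume API calls, that efficiency translates directly into lower bills and lower latency.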
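The arithmetic above can be sketched in a few lines of TypeScript. This is a rough illustration only: `toToonTable` is a hypothetical helper written for this example (it is not part of any TOON library and skips quoting, nesting, and non-uniform arrays), and character counts are used as a crude stand-in for token counts.

```typescript
// Hypothetical helper for illustration only; a real TOON encoder also
// handles quoting, nested values, and non-uniform arrays.
type Row = Record<string, string | number>;

function toToonTable(key: string, rows: Row[]): string {
  const first = rows[0];
  if (first === undefined) return `${key}[0]:`;
  const fields = Object.keys(first);
  // The schema is declared once, in the header.
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  // Each row then carries values only, in header order.
  const lines = rows.map(
    (row) => "  " + fields.map((f) => String(row[f])).join(","),
  );
  return [header, ...lines].join("\n");
}

const users: Row[] = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
  { id: 3, name: "Charlie", role: "user" },
];

const asJson = JSON.stringify({ users }, null, 2);
const asToon = toToonTable("users", users);

// Character counts are only a rough proxy for tokens; measure with your
// model's tokenizer before relying on any savings figure.
console.log(asToon);
console.log(`JSON: ${asJson.length} chars, TOON: ${asToon.length} chars`);
```

The gap widens as the number of rows grows, because the JSON version repeats every field name per row while the TOON header stays the same size.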

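## Sending Data: The "Show, Don't Tell" Rule

When you need an LLM to analyze data, your prompting strategy matters. Beginners often write long paragraphs explaining the data format. With TOON, you don't need to.

LLMs are pattern-matching engines. They understand TOON intuitively because it looks like a hybrid of YAML and CSV, formats they have seen billions of times during training.

To send data, simply wrap it in a fenced code block. You can label the block `toon`; even where no syntax highlighter officially supports that label, the model will pick up the structure immediately.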

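### Input Example

Instead of describing the schema, just provide the block:

````md
Here is the user activity log. The data is in TOON format (2-space indent, explicit headers).

```toon
users[3]{id,name,role,lastLogin}:
  1,Alice,admin,2025-01-15T10:30:00Z
  2,Bob,user,2025-01-14T15:22:00Z
  3,Charlie,user,2025-01-13T09:45:00Z
```

Task: Analyze the logs and identify which users have not logged in within the last 24 hours.
````

The header `users[3]{id,name,role,lastLogin}` tells the model everything it needs to know: the entity type, the count (3 rows), and the field order. Indentation handles the hierarchy. This "self-documenting" quality lets your prompt focus on the actual task rather than on parsing instructions.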
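If you assemble prompts like this in code, a small helper keeps the layout consistent. The sketch below is one possible approach; `buildToonPrompt` and its parameter names are hypothetical and exist only for this example.

```typescript
// Hypothetical helper: wraps a TOON payload in a fenced block, placed
// between a one-line description and the task.
const FENCE = "\u0060\u0060\u0060"; // three backticks

function buildToonPrompt(intro: string, toon: string, task: string): string {
  return [intro, "", `${FENCE}toon`, toon, FENCE, "", `Task: ${task}`].join("\n");
}

const toonData = [
  "users[3]{id,name,role,lastLogin}:",
  "  1,Alice,admin,2025-01-15T10:30:00Z",
  "  2,Bob,user,2025-01-14T15:22:00Z",
  "  3,Charlie,user,2025-01-13T09:45:00Z",
].join("\n");

const promptText = buildToonPrompt(
  "Here is the user activity log. The data is in TOON format (2-space indent, explicit headers).",
  toonData,
  "Analyze the logs and identify which users have not logged in within the last 24 hours.",
);

console.log(promptText);
```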

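## Generating Reliable Output

Getting an LLM to read data is easy; getting it to _generate_ valid structured data is the hard part. Models like to hallucinate, truncate JSON, or forget closing braces.

TOON adds a layer of safety through its header syntax, specifically the `[N]` count. When you ask a model to output TOON, you are asking it to commit to a structure before it generates the data.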

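### Prompting for Generation

For best results, provide the header format you expect and instruct the model to fill in the rows.

```md
Task: Return a list of active users with the role "user".
Format: Use TOON. Set the [N] value in the header to match the exact number of rows you generate.

Expected format:
users[N]{id,name,role,lastLogin}:
```

By asking the model to compute `[N]`, you force a "chain of thought" step in which the model must plan the size of its output. This seemingly small constraint significantly reduces the chance that the model cuts off partway through the list.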

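### Validating with Strict Mode

When you receive a response from the LLM, you shouldn't simply trust it. This is where the TOON library's strict mode becomes a superpower for production applications.

If you are using the TypeScript library, decoding in strict mode verifies that the generated rows match the header count:

```typescript
import { decode } from '@toon-format/toon';

// modelOutput holds the raw TOON text returned by the model.
try {
  // If the model declares [5] but provides 4 rows, this throws an error.
  const data = decode(modelOutput, { strict: true });
  console.log('Valid data received:', data);
} catch (error) {
  // A caught value is not guaranteed to be an Error instance.
  const message = error instanceof Error ? error.message : String(error);
  console.error('Model hallucination or truncation detected:', message);
}
```

This lets you catch "lazy" model output or network truncation programmatically and immediately, instead of discovering bad data further downstream in your application.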
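To make the idea behind the count check concrete, here is a minimal hand-rolled sketch. It is not the library's implementation: `checkRowCount` is a hypothetical function that assumes the text contains a single top-level tabular array with one row per line, and it ignores everything else strict decoding may validate.

```typescript
// Simplified stand-in for the row-count check; NOT the library's code.
// Assumes one tabular array: a header line followed by its rows.
function checkRowCount(toon: string): { declared: number; actual: number; ok: boolean } {
  const lines = toon.split("\n").filter((line) => line.trim() !== "");
  const header = lines[0] ?? "";
  const match = /\[(\d+)\]/.exec(header);
  if (match === null) throw new Error("No [N] count found in header");
  const declared = Number(match[1]);
  const actual = lines.length - 1; // every non-empty line after the header
  return { declared, actual, ok: declared === actual };
}

// The model promised 5 rows but the response stopped after 4.
const truncated = [
  "users[5]{id,name}:",
  "  1,Alice",
  "  2,Bob",
  "  3,Charlie",
  "  4,Dana",
].join("\n");

const result = checkRowCount(truncated);
console.log(result.ok ? "Row count matches" : "Truncated or miscounted output");
```

In production code, prefer the library's strict decoding; a sketch like this is mainly useful for understanding what the `[N]` count buys you.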

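## Advanced Optimization: The Tab Trick

If you are obsessed with optimization (and in the world of LLMs, you probably should be), you can squeeze out more efficiency by choosing your delimiter wisely.

Commas are the default, but a tab (`\t`) is often represented as a single token in many tokenizer vocabularies. Tabs also rarely appear inside natural text fields, which reduces the need for escaping, such as wrapping strings in quotes.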

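You can encode your data with tabs before sending it to the model:

```typescript
import { encode } from '@toon-format/toon';

// data is the value you want to send, for example an array of uniform objects.
const toonPrompt = encode(data, { delimiter: '\t' });
```

Just remember to tell the model in your prompt: "The data is tab-separated TOON." The result is a very compact representation that is easy for the model to parse and generate.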
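The escaping benefit is easy to see with values that contain commas. The sketch below uses a deliberately simplified quoting rule (quote a value only if it contains the active delimiter); `encodeRow` is a hypothetical function, and the actual TOON quoting rules cover more cases than this.

```typescript
// Simplified rule for illustration: quote a value only when it contains
// the active delimiter. The real format has additional quoting rules.
function encodeRow(values: string[], delimiter: string): string {
  return values
    .map((v) => (v.indexOf(delimiter) !== -1 ? JSON.stringify(v) : v))
    .join(delimiter);
}

const cells = ["1", "Smith, John", "Sales, EMEA"];

const withCommas = encodeRow(cells, ","); // both text fields must be quoted
const withTabs = encodeRow(cells, "\t"); // no field needs quoting

console.log(JSON.stringify(withCommas));
console.log(JSON.stringify(withTabs));
console.log(`comma: ${withCommas.length} chars, tab: ${withTabs.length} chars`);
```

Whether a tab really costs one token depends on the tokenizer, so it is worth measuring with the model you actually use.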

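## A Complete Workflow Example

Let's look at a realistic scenario: filtering system logs. You want to send raw logs to the model and get back a structured list of errors.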

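The prompt:

````md
System logs in TOON format (comma-separated):

```toon
events[4]{id,level,message,timestamp}:
  1,error,Connection timeout,2025-01-15T10:00:00Z
  2,warn,Slow query,2025-01-15T10:05:00Z
  3,info,User login,2025-01-15T10:10:00Z
  4,error,Database error,2025-01-15T10:15:00Z
```

Task: Extract all events with level "error". Return the result as valid TOON with an updated header count.
````

The model output:

```toon
events[2]{id,level,message,timestamp}:
  1,error,Connection timeout,2025-01-15T10:00:00Z
  4,error,Database error,2025-01-15T10:15:00Z
```

The result:

The model filtered the list correctly and, most importantly, updated the header to `events[2]`. By decoding this response, you get a clean, type-safe array that is ready for your application logic.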
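As a rough picture of what "a clean, type-safe array" means here, the following sketch parses the model output into typed objects and adds one semantic check on top of the count check. `LogEvent` and `parseEvents` are hypothetical names, and the parser handles only this exact comma-delimited header with no quoting; a real application would use the library's decoder instead.

```typescript
// Hypothetical minimal parser for this one tabular array. It ignores
// quoting, nesting, and type inference, which a real decoder handles.
interface LogEvent {
  id: number;
  level: string;
  message: string;
  timestamp: string;
}

function parseEvents(toon: string): LogEvent[] {
  const lines = toon.split("\n").map((l) => l.trim()).filter((l) => l !== "");
  const header = /^events\[(\d+)\]\{id,level,message,timestamp\}:$/.exec(lines[0] ?? "");
  if (header === null) throw new Error("Unexpected header");
  const rows = lines.slice(1);
  if (rows.length !== Number(header[1])) {
    throw new Error("Row count does not match header");
  }
  return rows.map((line) => {
    const parts = line.split(",");
    if (parts.length !== 4) throw new Error(`Malformed row: ${line}`);
    return {
      id: Number(parts[0]),
      level: String(parts[1]),
      message: String(parts[2]),
      timestamp: String(parts[3]),
    };
  });
}

const modelOutput = [
  "events[2]{id,level,message,timestamp}:",
  "  1,error,Connection timeout,2025-01-15T10:00:00Z",
  "  4,error,Database error,2025-01-15T10:15:00Z",
].join("\n");

const errors = parseEvents(modelOutput);

// Structural validity does not prove the filter was applied correctly,
// so also check that every returned row has the requested level.
const allErrors = errors.every((e) => e.level === "error");
console.log(errors, allErrors);
```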

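## Summary

TOON bridges the gap between human readability and machine efficiency. It respects the cost constraints of LLMs while providing the structure that robust software development requires.

- Keep it small: Use 2-5 rows in your examples; the model will generalize.
- Be explicit: Define the header clearly so the model understands the schema.
- Validate strictly: Use the format's metadata to catch generation errors.

By moving away from JSON for prompt payloads, you don't just save tokens; you build a more reliable AI pipeline.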