7 KiB
Token Compression Protocol (TCP)
This document defines an easy-to-implement protocol for compressing LLM prompts and responses while preserving the original encoding and a persistent context. It is designed to run as a local TCP service and integrate cleanly with browser extensions via a lightweight native-host bridge.
Goals
- Encode user prompts in base64.
- Accept model responses in base54.
- Decode responses back into the original encoding (utf-8, ascii, etc).
- Maintain a persistent, base64-rendered context across conversations.
- Provide token savings diagnostics via
**show savings**and**show total**. - Keep the wire format simple: newline-delimited JSON over TCP.
Server Overview
The TCP server lives at:
- Module:
python/helpers/token_compression_protocol.py - Default host:
127.0.0.1 - Default port:
7543 - Context dataset:
memory/token_compression_context.json
Run it locally:
python3 /workspace/python/helpers/token_compression_protocol.py
The server accepts one JSON object per line and returns one JSON object per line.
Base54 Alphabet
The response payload uses base54 with this alphabet:
123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstyz
This avoids ambiguous characters (0, O, I, l) and a few lower-case letters to hit an even base of 54.
Transport Protocol (TCP)
Each request is a single JSON line terminated by \n (LF). Each response is
also a single JSON line terminated by \n.
Common Envelope
All responses include:
{
"ok": true,
"result": { ... }
}
Errors are returned as:
{
"ok": false,
"error": "error_code",
"detail": "optional detail"
}
Actions
prompt
Encode a user prompt to base64, update context, and return the new context.
Request:
{
"action": "prompt",
"conversation_id": "optional",
"text": "user prompt text",
"encoding": "utf-8",
"language": "en"
}
Notes:
- If
conversation_idis omitted, the server generates one. encodingandlanguageare stored and reused for the conversation.- If
**show savings**or**show total**is present intext, it is stripped before encoding and applied to the nextresponse.
Response:
{
"ok": true,
"result": {
"conversation_id": "uuid",
"encoding": "utf-8",
"language": "en",
"encoded_prompt_b64": "SGVsbG8=",
"context_b64": "dXNlcjogSGVsbG8=",
"context_tokens": { "raw": 2, "encoded": 2, "saved": 0 },
"prompt_tokens": { "raw": 2, "encoded": 2, "saved": 0 },
"savings_request": { "show_savings": true, "show_total": false }
}
}
response
Submit a base54 response payload from the model, decode it, and update context.
Request:
{
"action": "response",
"conversation_id": "uuid",
"payload_b54": "base54response"
}
Response (base form):
{
"ok": true,
"result": {
"conversation_id": "uuid",
"encoding": "utf-8",
"language": "en",
"response_b54": "base54response",
"decoded_text": "model response",
"response_tokens": { "raw": 4, "encoded": 3, "saved": 1 },
"context_b64": "dXNlcjogSGVsbG8K...==",
"context_tokens": { "raw": 6, "encoded": 5, "saved": 1 }
}
}
If **show savings** or **show total** was set in the last prompt, the
response includes a savings object, tagline, and decoded_text_with_tagline:
{
"ok": true,
"result": {
"...": "...",
"tagline": "Token savings (prompt/response/context/combined): 0/1/1/2.",
"decoded_text_with_tagline": "model response\nToken savings ...",
"savings": {
"prompt": { "raw": 2, "encoded": 2, "saved": 0 },
"response": { "raw": 4, "encoded": 3, "saved": 1 },
"context": { "raw": 6, "encoded": 5, "saved": 1 },
"combined_saved": 2,
"totals": {
"prompt": { "raw": 10, "encoded": 10, "saved": 0 },
"response": { "raw": 20, "encoded": 18, "saved": 2 },
"context": { "raw": 30, "encoded": 25, "saved": 5 },
"combined_saved": 7
}
}
}
}
context_get
Fetch the current base64 context for a conversation (or all conversations).
Request:
{
"action": "context_get",
"conversation_id": "uuid"
}
Response:
{
"ok": true,
"result": {
"conversation_id": "uuid",
"context_b64": "dXNlcjogSGVsbG8=",
"encoding": "utf-8",
"language": "en",
"context_tokens": { "raw": 2, "encoded": 2, "saved": 0 }
}
}
context_reset
Delete a conversation from the dataset.
Request:
{
"action": "context_reset",
"conversation_id": "uuid"
}
ping
Request:
{ "action": "ping" }
Response:
{ "ok": true, "result": { "message": "pong" } }
Browser Extension Integration
Browsers cannot open raw TCP sockets directly. The easiest integration pattern is a lightweight local bridge that the extension can message.
Option A: Native Messaging Host (Recommended)
Use the browser's native messaging API to launch a small helper process that connects to the TCP server and forwards JSON lines.
Flow:
- Extension sends a JSON message to the native host.
- Native host writes the JSON line to
127.0.0.1:7543. - Native host reads the JSON response line and returns it to the extension.
Advantages:
- Works in Chrome and Firefox.
- No CORS or HTTP server needed.
- Minimal bridging logic (just pass-through JSON lines).
Native host pseudo-code:
import json, socket, sys
def tcp_exchange(payload):
data = json.dumps(payload).encode("utf-8") + b"\n"
with socket.create_connection(("127.0.0.1", 7543)) as sock:
sock.sendall(data)
response = sock.recv(1024 * 1024).split(b"\n", 1)[0]
return json.loads(response.decode("utf-8"))
Option B: Local HTTP/WS Bridge
If you prefer fetch or WebSocket from the extension, run a local bridge that
translates HTTP/WS into the TCP line protocol:
POST /tcp-> send JSON line over TCP, return JSON responseGET /context/:conversation_id-> map tocontext_get
This is a thin shim and keeps the TCP protocol unchanged.
Example End-to-End Session
- Encode prompt:
{"action":"prompt","text":"Summarize this. **show savings**","encoding":"utf-8","language":"en"}
-
Send
encoded_prompt_b64to the model (outside TCP server). -
Encode model output to base54 (client-side), then send:
{"action":"response","conversation_id":"...","payload_b54":"..."}
- Receive decoded text plus savings tagline.
Data Persistence
Context is stored in memory/token_compression_context.json. The server keeps a
background thread that refreshes and persists the context every few seconds.
Security Notes
- Run the TCP server on
127.0.0.1only. - Treat
context_b64as sensitive; it contains full conversation history. - Use the native messaging approach if you need strict extension isolation.
Troubleshooting
missing_payload_b54: Ensure you send base54 for responses, not base64.invalid_base54: Check the alphabet and strip any non-base54 characters.unknown_conversation_id: Use theconversation_idreturned byprompt.