feat: add persistent tiktoken cache to reduce re-downloads (#171)

Configure tiktoken to cache tokenizer encodings in ./data/tiktoken-cache
instead of the system temp directory. This avoids re-downloading
encoding files on every container restart and improves startup time.

Changes:
- Add TIKTOKEN_CACHE_DIR configuration in config.py
- Set TIKTOKEN_CACHE_DIR environment variable in token_utils.py
- Bump version to 1.0.7
Luis Novo 2025-10-19 14:50:52 -03:00 committed by GitHub
parent dd79d7a511
commit aa593c60bd
4 changed files with 23 additions and 12 deletions

@@ -11,3 +11,7 @@ LANGGRAPH_CHECKPOINT_FILE = f"{sqlite_folder}/checkpoints.sqlite"
# UPLOADS FOLDER
UPLOADS_FOLDER = f"{DATA_FOLDER}/uploads"
os.makedirs(UPLOADS_FOLDER, exist_ok=True)
# TIKTOKEN CACHE FOLDER
TIKTOKEN_CACHE_DIR = f"{DATA_FOLDER}/tiktoken-cache"
os.makedirs(TIKTOKEN_CACHE_DIR, exist_ok=True)
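
The token_utils.py half of this change is not shown in the hunk above. As a rough sketch only, the environment variable could be exported along these lines. The helper names `configure_tiktoken_cache` and `token_count`, the `./data` value for `DATA_FOLDER`, and the `cl100k_base` encoding are assumptions made for illustration, not taken from the commit. tiktoken looks up `TIKTOKEN_CACHE_DIR` when it loads an encoding file, so the variable has to be set before the first `get_encoding` call.

```python
import os

# Assumption: DATA_FOLDER points at the "./data" root named in the commit message.
DATA_FOLDER = "./data"
TIKTOKEN_CACHE_DIR = f"{DATA_FOLDER}/tiktoken-cache"


def configure_tiktoken_cache(cache_dir: str = TIKTOKEN_CACHE_DIR) -> str:
    """Hypothetical helper: create the cache folder and export its path."""
    os.makedirs(cache_dir, exist_ok=True)
    # setdefault keeps any value the operator has already exported.
    os.environ.setdefault("TIKTOKEN_CACHE_DIR", cache_dir)
    return os.environ["TIKTOKEN_CACHE_DIR"]


def token_count(text: str, encoding_name: str = "cl100k_base") -> int:
    """Hypothetical helper: count tokens using the configured cache."""
    # Imported lazily so the module loads even when tiktoken is not installed.
    import tiktoken

    configure_tiktoken_cache()
    return len(tiktoken.get_encoding(encoding_name).encode(text))
```

The cache only survives a container restart if the data folder lives on a mounted volume. Using `setdefault` rather than plain assignment is a design choice in this sketch: it lets a deployment override the location without a code change.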