# ChromaDB
Talemate uses ChromaDB to maintain long-term memory. The default embeddings are fast but not particularly accurate. If you want more accurate retrieval, you can switch to the instructor embeddings or the OpenAI embeddings; see below for instructions on how to enable each.

In my testing so far, `hkunlp/instructor-xl` has proved the most accurate, even more so than OpenAI's.
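For context on what "more accurate embeddings" buys you: long-term memory retrieval works by embedding text into vectors and ranking stored memories by vector similarity (cosine similarity is one common metric that ChromaDB supports). This is not talemate code, just a minimal stdlib sketch of the idea with toy 3-dimensional vectors:

```python
import math


def cosine_similarity(a, b):
    # Cosine similarity of two embedding vectors: 1.0 means identical
    # direction, values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (real models emit hundreds of dimensions).
memory = [0.2, 0.9, 0.1]       # a stored memory
query_close = [0.25, 0.8, 0.15]  # semantically similar query
query_far = [0.9, 0.1, 0.4]      # unrelated query

# The semantically closer query ranks higher against the stored memory.
print(cosine_similarity(memory, query_close) > cosine_similarity(memory, query_far))
```

A better embedding model places related text closer together in this vector space, which is why switching models changes retrieval quality.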
## Local instructor embeddings
If you want ChromaDB to use the more accurate (but much slower) instructor embeddings, add the following to `config.yaml`:

```yaml
chromadb:
  embeddings: instructor
  instructor_device: cpu
  instructor_model: hkunlp/instructor-xl
```

Note: the `xl` model takes a while to load even with CUDA. Expect about a minute of loading time on the first scene you load.
### Instructor embedding models

- `hkunlp/instructor-base` (smallest / fastest)
- `hkunlp/instructor-large`
- `hkunlp/instructor-xl` (largest / slowest) - requires about 5 GB of memory
You will need to restart the backend for this change to take effect.

NOTE: the first time you do this, talemate will need to download the instructor model you selected. This may take a while, and the backend will be unresponsive during the download. Once the download has finished, if talemate is still unresponsive, try reloading the front end to reconnect; if all else fails, restart the backend. I'll try to make this more robust in the future.
## GPU support

If you want to use the instructor embeddings with GPU support, you will need to install pytorch with CUDA support.

On Windows, run `install-pytorch-cuda.bat` from the project directory. Then change your device in the config to `cuda`:

```yaml
chromadb:
  embeddings: instructor
  instructor_device: cuda
  instructor_model: hkunlp/instructor-xl
```
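If you're unsure which value to put under `instructor_device`, you can check whether your pytorch install actually has CUDA available. The helper below is hypothetical (not part of talemate), just a sketch of the check:

```python
def pick_instructor_device():
    """Suggest a value for `instructor_device` in config.yaml.

    Hypothetical helper, not part of talemate: returns "cuda" only if
    pytorch is installed *and* reports a usable CUDA device.
    """
    try:
        import torch  # the instructor embeddings run on top of pytorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_instructor_device())
```

If this prints `cpu` even after running `install-pytorch-cuda.bat`, your pytorch build was likely installed without CUDA support.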
## OpenAI embeddings

First, make sure your OpenAI API key is specified in the `config.yaml` file:

```yaml
openai:
  api_key: <your-key-here>
```

Then add the following to `config.yaml` for ChromaDB:

```yaml
chromadb:
  embeddings: openai
  openai_model: text-embedding-3-small
```
Note: as with everything OpenAI, using this isn't free, though embeddings are far cheaper than their text completions. Also, if you send very explicit content they may flag or ban your key (reportedly they usually send warnings first), so keep that in mind and always monitor your usage on their dashboard.