ChromaDB
If you want ChromaDB to use the more accurate (but much slower) instructor embeddings, add the following to config.yaml:

chromadb:
  embeddings: instructor
  instructor_device: cpu
  instructor_model: hkunlp/instructor-xl
You will need to restart the backend for this change to take effect.
Note that the first time you do this, the backend will need to download the instructor model you selected. This may take a while, and the talemate backend will be unresponsive during that time.
Once the download is finished, if talemate is still unresponsive, try reloading the frontend to reconnect. If that fails, restart the backend as well.
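If you would rather not have the backend block while the model downloads, you can warm the local model cache ahead of time from a Python shell. This is a minimal sketch, assuming the InstructorEmbedding package (which the instructor models are published for) is available in your environment; it is not a talemate API:

```python
# Sketch: pre-download an instructor model so the backend does not have to
# fetch it on first use. Assumes `pip install InstructorEmbedding` has been run.
from InstructorEmbedding import INSTRUCTOR

# Loading the model once downloads and caches the weights locally.
model = INSTRUCTOR("hkunlp/instructor-xl", device="cpu")

# Quick smoke test: instructor models embed (instruction, text) pairs.
embedding = model.encode([["Represent the document for retrieval:", "Hello talemate"]])
print(embedding.shape)
```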
GPU support
If you want to use the instructor embeddings with GPU support, you will need to install PyTorch with CUDA support.
To do this on Windows, run install-pytorch-cuda.bat from the project root. Then change the device in your config to cuda:

chromadb:
  embeddings: instructor
  instructor_device: cuda
  instructor_model: hkunlp/instructor-xl
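To confirm that a CUDA-enabled PyTorch build is actually being picked up (rather than a CPU-only build), you can run a quick check in the same Python environment the backend uses. This uses only standard PyTorch calls and is not specific to talemate:

```python
# Quick check that PyTorch was installed with working CUDA support.
import torch

print(torch.__version__)            # CUDA wheels typically carry a +cuXXX suffix
print(torch.cuda.is_available())    # True if the CUDA runtime and driver are usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the GPU that will be used
```

If this prints False, leave instructor_device set to cpu until the CUDA install is sorted out.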
Instructor embedding models:

- hkunlp/instructor-base (smallest / fastest)
- hkunlp/instructor-large
- hkunlp/instructor-xl (largest / slowest) - requires about 5GB of GPU memory
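If you are unsure which model to pick, a rough way to gauge the size difference before committing to instructor-xl is to load a smaller candidate and count its parameters. This is a sketch using the same assumed InstructorEmbedding package as above, not a talemate feature:

```python
# Sketch: compare instructor model sizes before choosing one in config.yaml.
# Loading a model downloads it, so start with instructor-base if bandwidth is limited.
from InstructorEmbedding import INSTRUCTOR

for name in ["hkunlp/instructor-base", "hkunlp/instructor-large"]:
    model = INSTRUCTOR(name, device="cpu")
    params = sum(p.numel() for p in model.parameters())
    dim = model.get_sentence_embedding_dimension()
    print(f"{name}: {params / 1e6:.0f}M parameters, {dim}-dimensional embeddings")
```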