talemate/docs/tts.md
veguAI add4893939
Prep 0.19.0 (#67)
2024-02-06 00:40:55 +02:00


Talemate Text-to-Speech (TTS) Configuration

Talemate supports Text-to-Speech (TTS), which converts text into spoken audio. This document explains how to configure TTS for Talemate with two providers: ElevenLabs and a local TTS API that runs Coqui TTS models.

Configuring ElevenLabs TTS

To use ElevenLabs TTS with Talemate, follow these steps:

  1. Visit ElevenLabs and create an account if you don't already have one.
  2. Click on your profile in the upper right corner of the ElevenLabs website to access your API key.
  3. In the config.yaml file, under the elevenlabs section, set the api_key field to your ElevenLabs API key.

Example configuration snippet:

elevenlabs:
  api_key: <YOUR_ELEVENLABS_API_KEY>
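
A missing key, or a placeholder left in place, is an easy mistake at this step. The following is a minimal sketch of a check you could run against the parsed configuration; it is not part of Talemate. The function names `load_config` and `elevenlabs_key_problem` are hypothetical, and loading the file assumes the third-party PyYAML package is installed.

```python
from typing import Optional


def load_config(path: str = "config.yaml") -> dict:
    """Parse the YAML config file into a dict (assumes PyYAML is installed)."""
    import yaml  # third-party: pip install pyyaml

    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh) or {}


def elevenlabs_key_problem(config: dict) -> Optional[str]:
    """Return a description of what is wrong with the key, or None if it looks set."""
    section = config.get("elevenlabs")
    if not isinstance(section, dict):
        return "elevenlabs.api_key is missing or empty"
    key = str(section.get("api_key") or "").strip()
    if not key:
        return "elevenlabs.api_key is missing or empty"
    # The example snippet uses an angle-bracket placeholder; flag it if unchanged.
    if key.startswith("<") and key.endswith(">"):
        return "elevenlabs.api_key still contains the placeholder"
    return None


if __name__ == "__main__":
    # Hand-written dict standing in for a parsed config.yaml.
    sample = {"elevenlabs": {"api_key": "<YOUR_ELEVENLABS_API_KEY>"}}
    print(elevenlabs_key_problem(sample))
```

The check only looks at the shape of the value. It does not contact ElevenLabs, so a key that passes here can still be rejected by the service.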

Configuring Local TTS API

To run a local TTS API, install the required dependencies first.

Windows Installation

Run install-local-tts.bat to install the necessary requirements.

Linux Installation

Execute the following command:

pip install TTS
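
To confirm the install succeeded in the Python environment Talemate runs in, you can check whether the package is discoverable. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical. `TTS` is the import name of the package installed by the command above.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("TTS"):
        print("TTS package found")
    else:
        print("TTS package not found; run the install step first")
```

Run it with the same interpreter (or virtual environment) that starts the Talemate backend, since a package installed elsewhere will not be visible to it.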

Model and Device Configuration

  1. Choose a TTS model from the Coqui TTS model list.
  2. Decide whether to use cuda or cpu for the device setting.
  3. The first time you run TTS through the local API, it downloads the specified model. This may take some time; download progress appears in the Talemate backend output.

Example configuration snippet:

tts:
  device: cuda # or 'cpu'
  model: tts_models/multilingual/multi-dataset/xtts_v2
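
Because a wrong device or model value only surfaces when the model is first loaded, a quick check of the tts section can save a failed start. The sketch below is illustrative and not part of Talemate. The function name `tts_section_problems` is hypothetical, and the check assumes model names have four slash-separated parts (type/language/dataset/model), as in the example in this document.

```python
from typing import List

VALID_DEVICES = ("cuda", "cpu")


def tts_section_problems(config: dict) -> List[str]:
    """Collect human-readable problems with the tts device and model settings."""
    problems = []
    tts = config.get("tts") or {}

    device = tts.get("device")
    if device not in VALID_DEVICES:
        problems.append(f"tts.device should be 'cuda' or 'cpu', got {device!r}")

    model = str(tts.get("model") or "")
    # Assumption: names look like type/language/dataset/model.
    if len([part for part in model.split("/") if part]) != 4:
        problems.append(f"tts.model does not look like a Coqui model name: {model!r}")

    return problems


if __name__ == "__main__":
    # Hand-written dict standing in for a parsed config.yaml.
    sample = {
        "tts": {
            "device": "cuda",
            "model": "tts_models/multilingual/multi-dataset/xtts_v2",
        }
    }
    print(tts_section_problems(sample))
```

An empty list means both settings look plausible. It does not confirm that the model exists in the Coqui model list or that CUDA is available on the machine.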

Voice Samples Configuration

Each entry under voices pairs a label with a value. Set the value field to the path of a .wav voice sample. Official samples can be downloaded from the Coqui XTTS-v2 samples.

Example configuration snippet:

tts:
  voices:
    - label: English Male
      value: path/to/english_male.wav
    - label: English Female
      value: path/to/english_female.wav
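
A mistyped sample path is easy to miss until a voice is selected. The following sketch checks each configured voice before starting Talemate. It is illustrative only: `voice_sample_problems` is a hypothetical name, and relative paths are resolved against the directory the script runs from, which may differ from how Talemate resolves them.

```python
from pathlib import Path
from typing import List


def voice_sample_problems(config: dict) -> List[str]:
    """Return one message per voice entry whose sample path looks wrong."""
    problems = []
    voices = (config.get("tts") or {}).get("voices") or []
    for voice in voices:
        label = voice.get("label", "<unlabeled>")
        value = voice.get("value")
        if not value:
            problems.append(f"{label}: no value set")
            continue
        path = Path(value)
        if path.suffix.lower() != ".wav":
            problems.append(f"{label}: {value} is not a .wav file")
        elif not path.is_file():
            problems.append(f"{label}: {value} does not exist")
    return problems


if __name__ == "__main__":
    # Hand-written dict standing in for a parsed config.yaml.
    sample = {
        "tts": {
            "voices": [
                {"label": "English Male", "value": "path/to/english_male.wav"},
                {"label": "English Female", "value": "path/to/english_female.wav"},
            ]
        }
    }
    for problem in voice_sample_problems(sample):
        print(problem)
```

The check covers the file extension and existence only; it does not verify that the audio content is usable as a voice sample.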

Saving the Configuration

After configuring the config.yaml file, save your changes. Talemate will use the updated settings the next time it starts.

For more detail on configuring Talemate, refer to the config.py file in the Talemate source code. The config.example.yaml file provides a bare-bones configuration example.