Commit graph

932 commits

Author SHA1 Message Date
Concedo
cca4a934dd fix for chat templates and drafting 2025-01-23 11:49:40 +08:00
Concedo
0e74db7fd4 fixed another tts bug, clblast selection and quiet mode 2025-01-22 21:36:13 +08:00
Concedo
d109d6d8eb do another patch release for the new deepseek models 2025-01-21 08:24:48 +08:00
Concedo
5329df2bdf Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	CMakeLists.txt
#	cmake/build-info.cmake
#	examples/run/CMakeLists.txt
#	examples/run/run.cpp
#	examples/simple-chat/simple-chat.cpp
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-sampling.cpp
2025-01-21 00:25:07 +08:00
Concedo
02d5bb5b05 allow smaller gguf 2025-01-20 16:20:52 +08:00
Concedo
80965bbdd7 rewritten gguf metadata reader from scratch, analyze works now 2025-01-20 15:57:03 +08:00
Concedo
5c9714cf40 improve whisper to work on 8 bit and 32bit wav too, also support form data for language 2025-01-19 16:57:41 +08:00
Concedo
fa7e661133 various fixes 2025-01-18 23:52:39 +08:00
Concedo
96407502cd Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	README.md
#	examples/llama-bench/llama-bench.cpp
#	examples/llama.android/llama/src/main/cpp/llama-android.cpp
#	examples/llama.android/llama/src/main/java/android/llama/cpp/LLamaAndroid.kt
#	src/llama-vocab.cpp
#	tests/test-backend-ops.cpp
2025-01-17 23:13:50 +08:00
Concedo
e8570de0e6 improved tts default voices quality and sample rate 2025-01-17 18:45:16 +08:00
Concedo
8e3cad1aa2 added audio caching, as a hacky fix for ST TTS bug 2025-01-16 12:04:58 +08:00
Concedo
f8a9634aa2 better xtts and oai speech (+1 squashed commits)
Squashed commits:

[34b9c15f] better xtts and oai speech
2025-01-16 00:26:21 +08:00
Concedo
70ba616ecc browser launch 2025-01-15 17:41:14 +08:00
Concedo
e07de2ea92 try fix webbrowser again 2025-01-15 00:53:24 +08:00
Concedo
fec3246ca9 make mmap no longer default, archive class.py 2025-01-15 00:38:03 +08:00
Concedo
ed9f7a38ae add some built in voices 2025-01-15 00:17:17 +08:00
Concedo
0a6ccda203 better fallback browser support 2025-01-14 18:59:17 +08:00
Concedo
44720fb34c capabilities printout 2025-01-14 14:03:22 +08:00
Concedo
636beac6d2 added a nicer built in voice 2025-01-13 23:26:54 +08:00
Concedo
62e33d0bf7 added support for seeded tts voices 2025-01-13 19:11:34 +08:00
Concedo
b3de1598e7 Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS
tts is functional (+6 squashed commit)

Squashed commit:

[22396311] wip tts

[3a883027] tts not yet working

[0dcfab0e] fix silly bug

[a378d9ef] some long overdue cleanup

[fc5a6fb5] Wip tts

[39f50497] wip TTS integration
2025-01-13 14:23:25 +08:00
Concedo
12cdcf0abe improved browser opening 2025-01-11 22:53:43 +08:00
Concedo
93b2bebc2f add more options for context size 2025-01-10 19:08:42 +08:00
Concedo
0305841dd5 added a gguf file analyzer 2025-01-10 16:27:48 +08:00
Concedo
91b6e29af3 added multilingual support for whisper 2025-01-09 23:28:52 +08:00
Concedo
0cb599546e increase max supported llava images to 8 2025-01-09 22:12:06 +08:00
Concedo
c73d99ccac updated lite 2025-01-08 13:35:59 +08:00
Concedo
568e476997 added toggle for vae tiling, use custom memory buffer 2025-01-08 13:12:03 +08:00
Concedo
d752846116 fixed ask save file 2025-01-07 22:11:15 +08:00
Concedo
58791612d2 sse3 mode for noavx2 clblast, fixed metadata, added version command 2025-01-06 21:59:05 +08:00
Concedo
9b32482089 fixed bug in aesthetic ui 2025-01-05 18:04:02 +08:00
Concedo
1559d4d2fb fixed defective websearch 2025-01-04 16:47:38 +08:00
Concedo
e07e73aeb4 updated lite 2025-01-04 10:47:48 +08:00
Concedo
8de44d1e41 refactored some outputs 2024-12-30 22:30:27 +08:00
Concedo
5eb314a04b websearch length limits and caching 2024-12-30 18:30:54 +08:00
Concedo
3fea11675d websearch integrated into lite, changed to POST 2024-12-30 17:30:41 +08:00
Concedo
6026501ed2 websearch functional 2024-12-30 12:01:51 +08:00
Concedo
709dab6289 improved websearch endpoint 2024-12-29 19:39:16 +08:00
Concedo
5451a8e8a9 updated lite 2024-12-29 17:04:29 +08:00
Concedo
2de1975ca2 improve websearch api 2024-12-28 23:36:40 +08:00
Concedo
baaecd1c65 added a basic websearch proxy 2024-12-28 19:07:00 +08:00
Concedo
29afdb7c90 minor linting 2024-12-28 12:21:35 +08:00
kallewoof
23ec550835
PoC: add chat template heuristics (#1283)
* PoC: add chat template heuristics

The fallback chat template adapter of Vicuna is not ideal in some cases (e.g. a test against a sub-portion of the BBC news classification task on Kaggle gave an 82% accuracy with Vicuna and 88% with the official ChatML format for a q4_k_m Qwen 2.5 3B-Instruct gguf).

This PR adds a proof of concept simple heuristic which looks at the chat template and upgrades the adapter when it is able to.

* gemma 2 heuristic

* Phi 4, Llama 3.x heuristics

* better qwen vs generic heuristic

* cleanup

* mistral (generic) heuristic

* fix sys msg for mistral

* phi 3.5

* mistral v3

* cohere (aya expanse 32b based)

* only derive from chat template if AutoGuess

* add notes about alpaca fallbacks

* added AutoGuess.json dummy

* add mistral v7

* switch to using a json list with search strings
2024-12-28 12:15:23 +08:00
Concedo
5f8f483fae fixed typo (+1 squashed commits)
Squashed commits:

[b586d187] fixed typo
2024-12-23 21:57:34 +08:00
Concedo
13abf591d2 patch release for drafting fix 2024-12-23 11:40:02 +08:00
Concedo
4c56b7cada Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	README.md
#	examples/gbnf-validator/gbnf-validator.cpp
#	examples/llava/clip.cpp
#	examples/run/README.md
#	examples/run/run.cpp
#	examples/server/README.md
#	ggml/src/ggml-cpu/CMakeLists.txt
#	src/llama.cpp
#	tests/test-grammar-integration.cpp
#	tests/test-llama-grammar.cpp
2024-12-21 09:41:49 +08:00
Concedo
fc52a38a25 handle urls as config download in model param 2024-12-20 10:56:07 +08:00
Concedo
6089421423 always follow pci bus id 2024-12-18 00:46:48 +08:00
Concedo
60cd68a39d draft model sets gpu split instead of id, made mmq default for cli 2024-12-14 23:58:45 +08:00
Concedo
595cc6975f added new flags --moeexperts --failsafe --draftgpulayers and --draftgpuid 2024-12-13 17:11:59 +08:00