Concedo
5c9714cf40
improve whisper to work on 8 bit and 32bit wav too, also support form data for language
2025-01-19 16:57:41 +08:00
Concedo
fa7e661133
various fixes
2025-01-18 23:52:39 +08:00
Concedo
96407502cd
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# examples/llama-bench/llama-bench.cpp
# examples/llama.android/llama/src/main/cpp/llama-android.cpp
# examples/llama.android/llama/src/main/java/android/llama/cpp/LLamaAndroid.kt
# src/llama-vocab.cpp
# tests/test-backend-ops.cpp
2025-01-17 23:13:50 +08:00
Concedo
e8570de0e6
improved tts default voices quality and sample rate
2025-01-17 18:45:16 +08:00
Concedo
8e3cad1aa2
added audio caching, as a hacky fix for ST TTS bug
2025-01-16 12:04:58 +08:00
Concedo
f8a9634aa2
better xtts and oai speech (+1 squashed commits)
...
Squashed commits:
[34b9c15f] better xtts and oai speech
2025-01-16 00:26:21 +08:00
Concedo
70ba616ecc
browser launch
2025-01-15 17:41:14 +08:00
Concedo
e07de2ea92
try fix webbrowser again
2025-01-15 00:53:24 +08:00
Concedo
fec3246ca9
make mmap no longer default, archive class.py
2025-01-15 00:38:03 +08:00
Concedo
ed9f7a38ae
add some built in voices
2025-01-15 00:17:17 +08:00
Concedo
0a6ccda203
better fallback browser support
2025-01-14 18:59:17 +08:00
Concedo
44720fb34c
capabilities printout
2025-01-14 14:03:22 +08:00
Concedo
636beac6d2
added a nicer built in voice
2025-01-13 23:26:54 +08:00
Concedo
62e33d0bf7
added support for seeded tts voices
2025-01-13 19:11:34 +08:00
Concedo
b3de1598e7
Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS
...
tts is functional (+6 squashed commit)
Squashed commit:
[22396311] wip tts
[3a883027] tts not yet working
[0dcfab0e] fix silly bug
[a378d9ef] some long overdue cleanup
[fc5a6fb5] Wip tts
[39f50497] wip TTS integration
2025-01-13 14:23:25 +08:00
Concedo
12cdcf0abe
improved browser opening
2025-01-11 22:53:43 +08:00
Concedo
93b2bebc2f
add more options for context size
2025-01-10 19:08:42 +08:00
Concedo
0305841dd5
added a gguf file analyzer
2025-01-10 16:27:48 +08:00
Concedo
91b6e29af3
added multilingual support for whisper
2025-01-09 23:28:52 +08:00
Concedo
0cb599546e
increase max supported llava images to 8
2025-01-09 22:12:06 +08:00
Concedo
c73d99ccac
updated lite
2025-01-08 13:35:59 +08:00
Concedo
568e476997
added toggle for vae tiling, use custom memory buffer
2025-01-08 13:12:03 +08:00
Concedo
d752846116
fixed ask save file
2025-01-07 22:11:15 +08:00
Concedo
58791612d2
sse3 mode for noavx2 clblast, fixed metadata, added version command
2025-01-06 21:59:05 +08:00
Concedo
9b32482089
fixed bug in aesthetic ui
2025-01-05 18:04:02 +08:00
Concedo
1559d4d2fb
fixed defective websearch
2025-01-04 16:47:38 +08:00
Concedo
e07e73aeb4
updated lite
2025-01-04 10:47:48 +08:00
Concedo
8de44d1e41
refactored some outputs
2024-12-30 22:30:27 +08:00
Concedo
5eb314a04b
websearch length limits and caching
2024-12-30 18:30:54 +08:00
Concedo
3fea11675d
websearch integrated into lite, changed to POST
2024-12-30 17:30:41 +08:00
Concedo
6026501ed2
websearch functional
2024-12-30 12:01:51 +08:00
Concedo
709dab6289
improved websearch endpoint
2024-12-29 19:39:16 +08:00
Concedo
5451a8e8a9
updated lite
2024-12-29 17:04:29 +08:00
Concedo
2de1975ca2
improve websearch api
2024-12-28 23:36:40 +08:00
Concedo
baaecd1c65
added a basic websearch proxy
2024-12-28 19:07:00 +08:00
Concedo
29afdb7c90
minor linting
2024-12-28 12:21:35 +08:00
kallewoof
23ec550835
PoC: add chat template heuristics ( #1283 )
...
* PoC: add chat template heuristics
The fallback chat template adapter of Vicuna is not ideal in some cases (e.g. a test against a sub-portion of the BBC news classification task on Kaggle gave an 82% accuracy with Vicuna and 88% with the official ChatML format for a q4_k_m Qwen 2.5 3B-Instruct gguf).
This PR adds a proof of concept simple heuristic which looks at the chat template and upgrades the adapter when it is able to.
* gemma 2 heuristic
* Phi 4, Llama 3.x heuristics
* better qwen vs generic heuristic
* cleanup
* mistral (generic) heuristic
* fix sys msg for mistral
* phi 3.5
* mistral v3
* cohere (aya expanse 32b based)
* only derive from chat template if AutoGuess
* add notes about alpaca fallbacks
* added AutoGuess.json dummy
* add mistral v7
* switch to using a json list with search strings
2024-12-28 12:15:23 +08:00
Concedo
5f8f483fae
fixed typo (+1 squashed commits)
...
Squashed commits:
[b586d187] fixed typo
2024-12-23 21:57:34 +08:00
Concedo
13abf591d2
patch release for drafting fix
2024-12-23 11:40:02 +08:00
Concedo
4c56b7cada
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# README.md
# examples/gbnf-validator/gbnf-validator.cpp
# examples/llava/clip.cpp
# examples/run/README.md
# examples/run/run.cpp
# examples/server/README.md
# ggml/src/ggml-cpu/CMakeLists.txt
# src/llama.cpp
# tests/test-grammar-integration.cpp
# tests/test-llama-grammar.cpp
2024-12-21 09:41:49 +08:00
Concedo
fc52a38a25
handle urls as config download in model param
2024-12-20 10:56:07 +08:00
Concedo
6089421423
always follow pci bus id
2024-12-18 00:46:48 +08:00
Concedo
60cd68a39d
draft model sets gpu split instead of id, made mmq default for cli
2024-12-14 23:58:45 +08:00
Concedo
595cc6975f
added new flags --moeexperts --failsafe --draftgpulayers and --draftgpuid
2024-12-13 17:11:59 +08:00
Concedo
a11bba5893
cleanup, fix native build for arm (+28 squashed commit)
...
Squashed commit:
[d1f6a4154] bundle library
[947ab84b7] undo
[0f9aba8d8] test
[e9ac93873] test
[920438202] test
[1c6d98804
] Revert "quick test"
This reverts commit acf8ec8940
.
[acf8ec894
] quick test
[6a9937233
] undo
[5a263a5bd
] test
[ddfd82bca
] test
[0b30e45da
] test
[c3bfece55
] messed up
[2a4b37fe0
] Revert "test"
This reverts commit 80a1fcaeaf
.
[80a1fcaea
] test
[e2aa7d944
] test
[264d80200
] test
[f5b123173
] undo
[1ffacc484
] test
[63c0be926
] undo
[510e0377e
] ofast try fix
[4ac199b20
] try fix sigill
[1bc987ba2
] try fix illegal instruction
[7697252b1
] edit
[f87087b28
] check gcc ver
[e9dfe2cef
] try using qemu to do the pyinstaller
[b411192db
] revert
[25b5301e5
] try using qemu to do the pyinstaller
[58038cddc
] try using qemu to do the pyinstaller
2024-12-10 19:42:23 +08:00
Concedo
e9d2332dd8
improved tool calls and whisper
2024-12-06 14:34:31 +08:00
Concedo
836c06d91a
minor edit
2024-12-06 00:37:38 +08:00
Concedo
d0d1d922de
handle and fix temp paths to chat completions adapter
2024-12-05 17:22:35 +08:00
Concedo
2787fca6b4
refactored library selection, fixed ollama params
2024-12-05 16:47:52 +08:00
Concedo
52cc908f7f
default trim_stop to true, which trims any tokens after a stop sequence and the stop sequence itself. This is potentially a breaking change.
2024-12-03 22:44:10 +08:00