koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 17:14:36 +00:00

Author	SHA1	Message	Date
Concedo	5c9714cf40	improve whisper to work on 8 bit and 32bit wav too, also support form data for language	2025-01-19 16:57:41 +08:00
Concedo	fa7e661133	various fixes	2025-01-18 23:52:39 +08:00
Concedo	96407502cd	Merge branch 'upstream' into concedo_experimental # Conflicts: # README.md # examples/llama-bench/llama-bench.cpp # examples/llama.android/llama/src/main/cpp/llama-android.cpp # examples/llama.android/llama/src/main/java/android/llama/cpp/LLamaAndroid.kt # src/llama-vocab.cpp # tests/test-backend-ops.cpp	2025-01-17 23:13:50 +08:00
Concedo	e8570de0e6	improved tts default voices quality and sample rate	2025-01-17 18:45:16 +08:00
Concedo	8e3cad1aa2	added audio caching, as a hacky fix for ST TTS bug	2025-01-16 12:04:58 +08:00
Concedo	f8a9634aa2	better xtts and oai speech (+1 squashed commits) Squashed commits: [34b9c15f] better xtts and oai speech	2025-01-16 00:26:21 +08:00
Concedo	70ba616ecc	browser launch	2025-01-15 17:41:14 +08:00
Concedo	e07de2ea92	try fix webbrowser again	2025-01-15 00:53:24 +08:00
Concedo	fec3246ca9	make mmap no longer default, archive class.py	2025-01-15 00:38:03 +08:00
Concedo	ed9f7a38ae	add some built in voices	2025-01-15 00:17:17 +08:00
Concedo	0a6ccda203	better fallback browser support	2025-01-14 18:59:17 +08:00
Concedo	44720fb34c	capabilities printout	2025-01-14 14:03:22 +08:00
Concedo	636beac6d2	added a nicer built in voice	2025-01-13 23:26:54 +08:00
Concedo	62e33d0bf7	added support for seeded tts voices	2025-01-13 19:11:34 +08:00
Concedo	b3de1598e7	Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS tts is functional (+6 squashed commit) Squashed commit: [22396311] wip tts [3a883027] tts not yet working [0dcfab0e] fix silly bug [a378d9ef] some long overdue cleanup [fc5a6fb5] Wip tts [39f50497] wip TTS integration	2025-01-13 14:23:25 +08:00
Concedo	12cdcf0abe	improved browser opening	2025-01-11 22:53:43 +08:00
Concedo	93b2bebc2f	add more options for context size	2025-01-10 19:08:42 +08:00
Concedo	0305841dd5	added a gguf file analyzer	2025-01-10 16:27:48 +08:00
Concedo	91b6e29af3	added multilingual support for whisper	2025-01-09 23:28:52 +08:00
Concedo	0cb599546e	increase max supported llava images to 8	2025-01-09 22:12:06 +08:00
Concedo	c73d99ccac	updated lite	2025-01-08 13:35:59 +08:00
Concedo	568e476997	added toggle for vae tiling, use custom memory buffer	2025-01-08 13:12:03 +08:00
Concedo	d752846116	fixed ask save file	2025-01-07 22:11:15 +08:00
Concedo	58791612d2	sse3 mode for noavx2 clblast, fixed metadata, added version command	2025-01-06 21:59:05 +08:00
Concedo	9b32482089	fixed bug in aesthetic ui	2025-01-05 18:04:02 +08:00
Concedo	1559d4d2fb	fixed defective websearch	2025-01-04 16:47:38 +08:00
Concedo	e07e73aeb4	updated lite	2025-01-04 10:47:48 +08:00
Concedo	8de44d1e41	refactored some outputs	2024-12-30 22:30:27 +08:00
Concedo	5eb314a04b	websearch length limits and caching	2024-12-30 18:30:54 +08:00
Concedo	3fea11675d	websearch integrated into lite, changed to POST	2024-12-30 17:30:41 +08:00
Concedo	6026501ed2	websearch functional	2024-12-30 12:01:51 +08:00
Concedo	709dab6289	improved websearch endpoint	2024-12-29 19:39:16 +08:00
Concedo	5451a8e8a9	updated lite	2024-12-29 17:04:29 +08:00
Concedo	2de1975ca2	improve websearch api	2024-12-28 23:36:40 +08:00
Concedo	baaecd1c65	added a basic websearch proxy	2024-12-28 19:07:00 +08:00
Concedo	29afdb7c90	minor linting	2024-12-28 12:21:35 +08:00
kallewoof	23ec550835	PoC: add chat template heuristics (#1283) * PoC: add chat template heuristics The fallback chat template adapter of Vicuna is not ideal in some cases (e.g. a test against a sub-portion of the BBC news classification task on Kaggle gave an 82% accuracy with Vicuna and 88% with the official ChatML format for a q4_k_m Qwen 2.5 3B-Instruct gguf). This PR adds a proof of concept simple heuristic which looks at the chat template and upgrades the adapter when it is able to. * gemma 2 heuristic * Phi 4, Llama 3.x heuristics * better qwen vs generic heuristic * cleanup * mistral (generic) heuristic * fix sys msg for mistral * phi 3.5 * mistral v3 * cohere (aya expanse 32b based) * only derive from chat template if AutoGuess * add notes about alpaca fallbacks * added AutoGuess.json dummy * add mistral v7 * switch to using a json list with search strings	2024-12-28 12:15:23 +08:00
Concedo	5f8f483fae	fixed typo (+1 squashed commits) Squashed commits: [b586d187] fixed typo	2024-12-23 21:57:34 +08:00
Concedo	13abf591d2	patch release for drafting fix	2024-12-23 11:40:02 +08:00
Concedo	4c56b7cada	Merge branch 'upstream' into concedo_experimental # Conflicts: # README.md # examples/gbnf-validator/gbnf-validator.cpp # examples/llava/clip.cpp # examples/run/README.md # examples/run/run.cpp # examples/server/README.md # ggml/src/ggml-cpu/CMakeLists.txt # src/llama.cpp # tests/test-grammar-integration.cpp # tests/test-llama-grammar.cpp	2024-12-21 09:41:49 +08:00
Concedo	fc52a38a25	handle urls as config download in model param	2024-12-20 10:56:07 +08:00
Concedo	6089421423	always follow pci bus id	2024-12-18 00:46:48 +08:00
Concedo	60cd68a39d	draft model sets gpu split instead of id, made mmq default for cli	2024-12-14 23:58:45 +08:00
Concedo	595cc6975f	added new flags --moeexperts --failsafe --draftgpulayers and --draftgpuid	2024-12-13 17:11:59 +08:00
Concedo	a11bba5893	cleanup, fix native build for arm (+28 squashed commit) Squashed commit: [d1f6a4154] bundle library [947ab84b7] undo [0f9aba8d8] test [e9ac93873] test [920438202] test [1c6d98804] Revert "quick test" This reverts commit acf8ec8940. [acf8ec894] quick test [6a9937233] undo [5a263a5bd] test [ddfd82bca] test [0b30e45da] test [c3bfece55] messed up [2a4b37fe0] Revert "test" This reverts commit 80a1fcaeaf. [80a1fcaea] test [e2aa7d944] test [264d80200] test [f5b123173] undo [1ffacc484] test [63c0be926] undo [510e0377e] ofast try fix [4ac199b20] try fix sigill [1bc987ba2] try fix illegal instruction [7697252b1] edit [f87087b28] check gcc ver [e9dfe2cef] try using qemu to do the pyinstaller [b411192db] revert [25b5301e5] try using qemu to do the pyinstaller [58038cddc] try using qemu to do the pyinstaller	2024-12-10 19:42:23 +08:00
Concedo	e9d2332dd8	improved tool calls and whisper	2024-12-06 14:34:31 +08:00
Concedo	836c06d91a	minor edit	2024-12-06 00:37:38 +08:00
Concedo	d0d1d922de	handle and fix temp paths to chat completions adapter	2024-12-05 17:22:35 +08:00
Concedo	2787fca6b4	refactored library selection, fixed ollama params	2024-12-05 16:47:52 +08:00
Concedo	52cc908f7f	default trim_stop to true, which trims any tokens after a stop sequence and the stop sequence itself. This is potentially a breaking change.	2024-12-03 22:44:10 +08:00


1026 commits