Commit graph

1020 commits

Author SHA1 Message Date
Concedo
07a96d63fa try to ensure correct file extension 2025-04-03 20:13:53 +08:00
Concedo
6e086bd309 fixed savedatafile bug, try remove unneeded old clblast code path 2025-04-03 19:11:27 +08:00
Concedo
7f1003be44 warning for max tokens being too high 2025-04-02 18:58:38 +08:00
Concedo
fbf5c04c3c silly me 2025-04-02 00:51:05 +08:00
Concedo
30e3d24ead embd include name 2025-04-02 00:40:38 +08:00
Concedo
e37f27632f clear cpu flag manually for templates, added truncation for embeddings 2025-04-02 00:18:30 +08:00
Concedo
0fd94e19f3 made tool calls more robust and allowed tool call template customization 2025-04-01 19:16:45 +08:00
henk717
4291e1575b Fix tool spec, this spec is kinda.... (#1458) 2025-04-01 10:39:02 +08:00
Concedo
c0adaabfa4 Revert "try fix owui"
This reverts commit 12e5b8abdb.
2025-04-01 00:27:31 +08:00
Concedo
12e5b8abdb try fix owui 2025-04-01 00:23:45 +08:00
Concedo
0ed95fcccc fixed l3 template, add index 2025-03-31 23:59:06 +08:00
Concedo
1ebadc515e add streaming support for oai tools (+2 squashed commit)
Squashed commit:

[4d080b37] qwen2.5vl surgery script

[4bebe7e5] add streaming support for oai tools
2025-03-31 16:49:15 +08:00
henk717
091eb367fc More robust tool calling prompt (#1455)
* More robust tool checking prompt

* Inform UI we want a tool
2025-03-31 14:43:03 +08:00
Concedo
b4a8a5a278 Added CLI chat mode
minor cli fixes (+1 squashed commits)

Squashed commits:

[60af39a9] Added CLI chat mode
2025-03-26 21:01:58 +08:00
Concedo
2bdf1dacff embeddings done 2025-03-25 22:41:46 +08:00
Concedo
82f2654049 wip embeddings model 2025-03-25 00:18:02 +08:00
Concedo
3992fb79cc wip adding embeddings support 2025-03-24 18:01:23 +08:00
Concedo
b1641ee4a2 allow quant K without quant V but with a warning (+1 squashed commits)
Squashed commits:

[45408dd9] allow quant K without quant V but with a warning
2025-03-23 22:56:02 +08:00
Concedo
a20a29ddeb tool calling improved, auto now works 2025-03-22 17:44:55 +08:00
Concedo
350427dc3a adjust subprocess timeouts 2025-03-22 11:10:01 +08:00
InconsolableCellist
e31da5861a 1435: add timeout for vulkaninfo (#1436)
* Fix the Colab PR

* 1435: add timeout for vulkaninfo

There's a bug in vulkaninfo where it can hang, and this will prevent
koboldcpp from starting. This adds a 5 second timeout

* restoring colab.ipynb

* Formatting

---------

Co-authored-by: henk717 <henk@henk.tech>
2025-03-22 11:01:22 +08:00
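Note on e31da5861a: the commit message above describes guarding against a hanging vulkaninfo with a 5 second timeout. The general technique can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the commit: the helper name `run_with_timeout` is hypothetical, and the demo uses the Python interpreter itself as a stand-in command so it runs without Vulkan installed.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=5.0):
    """Run cmd and return its stdout, or None if it hangs, fails, or cannot start.

    Hypothetical helper for illustration; not taken from koboldcpp.
    """
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
            check=True,         # a non-zero exit raises CalledProcessError
        )
        return result.stdout
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # A hung, failing, or missing probe tool should not block startup.
        return None


if __name__ == "__main__":
    # Stand-in for a device probe such as vulkaninfo.
    out = run_with_timeout([sys.executable, "-c", "print('device probe ok')"])
    print(out if out is not None else "probe unavailable, continuing without it")
```

Returning `None` instead of raising lets the caller treat a missing or stuck probe the same way and carry on with startup.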
Concedo
c1e58419c7 support for voice cloning is done (+2 squashed commit)
Squashed commit:

[e7301628] support for voice cloning is done

[1653c576] wip adding voice cloning
2025-03-21 22:28:59 +08:00
Concedo
a66d0f7743 safeguard for bad vision input 2025-03-20 22:20:38 +08:00
Concedo
9d5efd68b6 dont print full base64 for images 2025-03-20 21:20:04 +08:00
Concedo
0c90d2ebcf Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	cmake/common.cmake
#	docs/backend/SYCL.md
#	examples/main/README.md
#	examples/speculative/speculative.cpp
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	ggml/src/ggml-musa/CMakeLists.txt
#	ggml/src/ggml-sycl/CMakeLists.txt
#	ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
#	tests/test-backend-ops.cpp
2025-03-19 19:27:11 +08:00
Concedo
ddaa8d5a38 fixed saving path for savedata 2025-03-17 22:19:52 +08:00
Concedo
0cfd8d23cb handle symlinks (+1 squashed commits)
Squashed commits:

[fb8477b9] fixed makefile (+4 squashed commit)

Squashed commit:

[4a245bba] fixed a makefile issue

[d68eba69] alias usehipblas to usecublas

[a9ab0a7c] dynamic rocwmma selection

[fefe17c7] revert rocwmma
2025-03-17 21:03:30 +08:00
Concedo
6888f5495d allow quantkv with contextshift 2025-03-16 21:48:42 +08:00
Concedo
5ef1722d5f fix for sd 2025-03-16 17:02:42 +08:00
Concedo
0954e9e476 improve model estimation 2025-03-16 16:14:13 +08:00
Concedo
5d7c5e9e33 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	examples/tts/tts.cpp
2025-03-16 15:42:39 +08:00
Concedo
2401502cbd improvement to tool calling, allowing specific tools to be used 2025-03-16 15:20:08 +08:00
Concedo
9f7fd63160 revert unwanted change to tool calling 2025-03-16 01:35:48 +08:00
Concedo
e84596ec1a add config for default gen tokens and bos toggle 2025-03-15 19:53:06 +08:00
Concedo
7272165e0e verbosity 2025-03-15 12:13:04 +08:00
Concedo
4212f0b8e8 wip on multiple fixes 2025-03-15 10:50:36 +08:00
Concedo
d7498e7e8a added model switching to gguf in admin mode (auto guess layers) 2025-03-14 19:45:55 +08:00
Concedo
30cb77a900 rename replace_instruct_placeholders field 2025-03-14 18:37:12 +08:00
Concedo
782e1e193a replaced winclinfo.exe with a simplified simpleclinfo.exe that only provides device names and nothing else (+1 squashed commits)
Squashed commits:

[4a73c8d3] replaced winclinfo.exe with a simplified simpleclinfo.exe that only provides device names and nothing else
2025-03-14 18:18:32 +08:00
Concedo
6a1dd57435 gemma3 template, updated lite, fixed tool calling, reenable ctx shift for gemma3 2025-03-14 17:47:01 +08:00
Concedo
57c9523405 sd lora from url 2025-03-13 10:55:01 +08:00
Concedo
eb1809c105 add more perf stats 2025-03-12 18:58:27 +08:00
Concedo
c55bb9a63d use actual null instead of string "null" for finish_reason in openai responses 2025-03-07 15:18:33 +08:00
Concedo
ec43d2b147 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	common/common.cpp
#	examples/embedding/embedding.cpp
#	examples/json_schema_to_grammar.py
#	examples/llama.android/llama/src/main/cpp/llama-android.cpp
#	examples/llama.swiftui/README.md
#	examples/llama.swiftui/llama.swiftui.xcodeproj/project.pbxproj
#	examples/lookahead/lookahead.cpp
#	examples/parallel/parallel.cpp
#	examples/passkey/passkey.cpp
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	requirements.txt
#	requirements/requirements-all.txt
#	scripts/fetch_server_test_models.py
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-03-06 18:54:58 +08:00
Concedo
6b7d2349a7 Rewrite history to fix bad vulkan shader commits without increasing repo size
added dpe colab (+8 squashed commit)

Squashed commit:

[b8362da4] updated lite

[ed6c037d] move nsigma into the regular sampler stack

[ac5f61c6] relative filepath fixed

[05fe96ab] export template

[ed0a5a3e] nix_example.md: refactor (#1401)

* nix_example.md: add override example

* nix_example.md: drop graphics example, already basic nixos knowledge

* nix_example.md: format

* nix_example.md: Vulkan is disabled on macOS

Disabled in: 1ccd253acc

* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}

Fixes: https://github.com/LostRuins/koboldcpp/issues/1367

[675c62f7] AutoGuess: Phi 4 (mini) (#1402)

[4bf56982] phrasing

[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)

- place after nsigma and before xtc (+3 squashed commit)

Squashed commit:

[87c52b97] disable VMM from HIP

[ee8906f3] edit description

[e85c0e69] Remove Unnecessary Rep Counting (#1394)

* stop counting reps

* fix range-based initializer

* strike that - reverse it
2025-03-05 00:02:20 +08:00
Concedo
50eae1ffeb added trycatch for ipv4 2025-02-26 00:45:06 +08:00
Concedo
12c501f723 fixed wrong file open mode 2025-02-24 15:14:02 +08:00
Concedo
ccd2dbe020 added support for server side save slots 2025-02-24 00:20:16 +08:00
Concedo
f2ac10c014 added nsigma to lite 2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660 add top n sigma sampler from llama.cpp (#1384)
* Add N Sigma Sampler

* update nsigma sampler chain

* xtc position fix

* remove stray newline

---------

Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00