Commit graph

828 commits

Author SHA1 Message Date
Concedo
c55bb9a63d use actual null instead of string "null" for finish_reason in openai responses 2025-03-07 15:18:33 +08:00
Concedo
ec43d2b147 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	common/common.cpp
#	examples/embedding/embedding.cpp
#	examples/json_schema_to_grammar.py
#	examples/llama.android/llama/src/main/cpp/llama-android.cpp
#	examples/llama.swiftui/README.md
#	examples/llama.swiftui/llama.swiftui.xcodeproj/project.pbxproj
#	examples/lookahead/lookahead.cpp
#	examples/parallel/parallel.cpp
#	examples/passkey/passkey.cpp
#	ggml/CMakeLists.txt
#	ggml/src/CMakeLists.txt
#	ggml/src/ggml-cpu/CMakeLists.txt
#	requirements.txt
#	requirements/requirements-all.txt
#	scripts/fetch_server_test_models.py
#	tests/test-chat.cpp
#	tests/test-json-schema-to-grammar.cpp
2025-03-06 18:54:58 +08:00
Concedo
6b7d2349a7 Rewrite history to fix bad vulkan shader commits without increasing repo size
added dpe colab (+8 squashed commit)

Squashed commit:

[b8362da4] updated lite

[ed6c037d] move nsigma into the regular sampler stack

[ac5f61c6] relative filepath fixed

[05fe96ab] export template

[ed0a5a3e] nix_example.md: refactor (#1401)

* nix_example.md: add override example

* nix_example.md: drop graphics example, already basic nixos knowledge

* nix_example.md: format

* nix_example.md: Vulkan is disabled on macOS

Disabled in: 1ccd253acc

* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}

Fixes: https://github.com/LostRuins/koboldcpp/issues/1367

[675c62f7] AutoGuess: Phi 4 (mini) (#1402)

[4bf56982] phrasing

[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)

- place after nsigma and before xtc (+3 squashed commit)

Squashed commit:

[87c52b97] disable VMM from HIP

[ee8906f3] edit description

[e85c0e69] Remove Unnecessary Rep Counting (#1394)

* stop counting reps

* fix range-based initializer

* strike that - reverse it
2025-03-05 00:02:20 +08:00
Concedo
50eae1ffeb added trycatch for ipv4 2025-02-26 00:45:06 +08:00
Concedo
12c501f723 fixed wrong file open mode 2025-02-24 15:14:02 +08:00
Concedo
ccd2dbe020 added support for server side save slots 2025-02-24 00:20:16 +08:00
Concedo
f2ac10c014 added nsigma to lite 2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660 add top n sigma sampler from llama.cpp (#1384)
* Add N Sigma Sampler

* update nsigma sampler chain

* xtc position fix

* remove stray newline

---------

Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00
Concedo
41350df81f updated lite, added ability to export kcpps via CLI 2025-02-20 22:58:12 +08:00
Concedo
6fa50f78bf allow kcppt for config switching 2025-02-17 00:48:34 +08:00
Concedo
15ae98c9cd better error handling for downloads 2025-02-16 23:13:09 +08:00
Concedo
58380153b2 safer autoguess fix
verbose outputs (+3 squashed commit)

Squashed commit:

[7bbbfc10] fixed a retry history bug

[824b9bf7] another autoguess fix
2025-02-16 21:13:45 +08:00
Concedo
5a79dd57b9 add short delay before launching browser 2025-02-16 12:45:14 +08:00
Concedo
299d6ce0ed horde advertised max ctx 2025-02-16 11:59:08 +08:00
Concedo
f144b1f345 Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.devops/llama-cpp-cuda.srpm.spec
#	.devops/llama-cpp.srpm.spec
#	.devops/nix/package.nix
#	.devops/rocm.Dockerfile
#	.github/ISSUE_TEMPLATE/020-enhancement.yml
#	.github/ISSUE_TEMPLATE/030-research.yml
#	.github/ISSUE_TEMPLATE/040-refactor.yml
#	.github/ISSUE_TEMPLATE/config.yml
#	.github/pull_request_template.md
#	.github/workflows/bench.yml.disabled
#	.github/workflows/build.yml
#	.github/workflows/labeler.yml
#	CONTRIBUTING.md
#	Makefile
#	README.md
#	SECURITY.md
#	ci/README.md
#	common/CMakeLists.txt
#	docs/android.md
#	docs/backend/SYCL.md
#	docs/build.md
#	docs/cuda-fedora.md
#	docs/development/HOWTO-add-model.md
#	docs/docker.md
#	docs/install.md
#	docs/llguidance.md
#	examples/cvector-generator/README.md
#	examples/imatrix/README.md
#	examples/imatrix/imatrix.cpp
#	examples/llama.android/llama/src/main/cpp/CMakeLists.txt
#	examples/llama.swiftui/README.md
#	examples/llama.vim
#	examples/lookahead/README.md
#	examples/lookup/README.md
#	examples/main/README.md
#	examples/passkey/README.md
#	examples/pydantic_models_to_grammar_examples.py
#	examples/retrieval/README.md
#	examples/server/CMakeLists.txt
#	examples/server/README.md
#	examples/simple-cmake-pkg/README.md
#	examples/speculative/README.md
#	flake.nix
#	grammars/README.md
#	pyproject.toml
#	scripts/check-requirements.sh
2025-02-16 02:08:39 +08:00
Concedo
673e33ca03 correction 2025-02-16 00:55:14 +08:00
Concedo
f48bd3f919 added automatic recovery if bad config is loaded, will restore to known good config 2025-02-15 17:16:21 +08:00
Concedo
f723b08347 fixed adapter bug 2025-02-15 12:06:45 +08:00
Concedo
979088320d downloading fallbacks for aria2, added minimum size (+1 squashed commits)
Squashed commits:

[86b49095] downloading fallbacks for aria2, added minimum size
2025-02-15 00:18:28 +08:00
henk717
53486b6713 Download overhaul (#1369)
* Download overhaul

* Restore deblobbifier

* Cleanup

* Fix incorrect return
2025-02-14 11:40:18 +08:00
Concedo
6e6043fffe fixed autoguess breaking img gen 2025-02-14 11:34:43 +08:00
Concedo
71016db617 remove tts audio caching 2025-02-12 11:37:43 +08:00
Concedo
076e61effc fixed missing param 2025-02-09 16:02:59 +08:00
Concedo
ed8b881c68 rc 1.83.1 2025-02-09 13:20:17 +08:00
Concedo
fc50a29426 Merge branch 'concedo_experimental' of https://github.com/LostRuins/koboldcpp into concedo_experimental 2025-02-09 13:17:29 +08:00
Concedo
1cb42bf260 support running in single process mode without admin flag 2025-02-09 13:17:14 +08:00
Roman Garanin
c0a16b5d4f Sort model configs in admin menu (#1357) 2025-02-09 12:53:52 +08:00
Concedo
e68a3cf1dc fixed some functions when no model is loaded 2025-02-08 11:15:26 +08:00
Concedo
b100bcb9e6 allow ssl with remote tunnel 2025-02-08 02:11:10 +08:00
Concedo
58e2b19d56 check against platform.machine() 2025-02-08 01:20:31 +08:00
FlippFuzz
5a0ed19c96 Remote Tunnel for ARM64 Linux (#1353)
* Update koboldcpp.py

* Fix style. Changed to double quotes to match.
2025-02-08 01:16:30 +08:00
Concedo
cf4d0085f6 more bugfixes for admin mode 2025-02-08 01:00:52 +08:00
Concedo
b246d83dca fixed some global reference 2025-02-07 14:44:47 +08:00
Concedo
8fef9f3fb5 reloading is working correctly. 2025-02-06 22:24:18 +08:00
Concedo
080d5e6495 new admin endpoints added 2025-02-06 15:19:55 +08:00
Concedo
2c71b1b428 reworking the admin controls 2025-02-05 23:54:07 +08:00
Concedo
c6cd5943cf removed admin panel 2025-02-05 23:40:59 +08:00
Concedo
95d0ef2173 this will probably be reverted since we are changing approach 2025-02-05 22:37:21 +08:00
Concedo
72f0fdfe87 wip on hypervisor 2025-02-05 00:25:22 +08:00
Concedo
7a5499e77b added one more backend for clblast noavx2 and clblast failsafe 2025-01-30 22:47:22 +08:00
Concedo
646df4b126 default to autoguess for chat completions adapter 2025-01-30 00:25:13 +08:00
Concedo
70f1d8d746 vision can set max res (+1 squashed commits)
Squashed commits:

[938fc655] vision can set max res
2025-01-30 00:19:49 +08:00
Concedo
558bc5c901 tts can now set a length limit 2025-01-28 22:06:59 +08:00
Concedo
6bf0b2d062 try casting the numeric fields read 2025-01-28 17:43:28 +08:00
Concedo
0e45d3bb7a quiet flags now set at load time 2025-01-25 16:46:56 +08:00
Concedo
bec231422a Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	common/CMakeLists.txt
#	docs/backend/SYCL.md
#	docs/build.md
#	docs/docker.md
#	examples/export-lora/export-lora.cpp
#	examples/main/README.md
#	examples/main/main.cpp
#	examples/run/README.md
#	examples/run/run.cpp
#	examples/server/README.md
#	examples/simple-chat/simple-chat.cpp
#	ggml/CMakeLists.txt
#	ggml/src/ggml-hip/CMakeLists.txt
#	src/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-chat-template.cpp
2025-01-25 14:16:50 +08:00
Concedo
cca4a934dd fix for chat templates and drafting 2025-01-23 11:49:40 +08:00
Concedo
0e74db7fd4 fixed another tts bug, clblast selection and quiet mode 2025-01-22 21:36:13 +08:00
Concedo
d109d6d8eb do another patch release for the new deepseek models 2025-01-21 08:24:48 +08:00
Concedo
5329df2bdf Merge branch 'upstream' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/server.yml
#	CMakeLists.txt
#	cmake/build-info.cmake
#	examples/run/CMakeLists.txt
#	examples/run/run.cpp
#	examples/simple-chat/simple-chat.cpp
#	tests/CMakeLists.txt
#	tests/test-backend-ops.cpp
#	tests/test-sampling.cpp
2025-01-21 00:25:07 +08:00