Concedo
b4a8a5a278
Added CLI chat mode
...
minor cli fixes (+1 squashed commits)
Squashed commits:
[60af39a9] Added CLI chat mode
2025-03-26 21:01:58 +08:00
Concedo
2bdf1dacff
embeddings done
2025-03-25 22:41:46 +08:00
Concedo
82f2654049
wip embeddings model
2025-03-25 00:18:02 +08:00
Concedo
3992fb79cc
wip adding embeddings support
2025-03-24 18:01:23 +08:00
Concedo
b1641ee4a2
allow quant K without quant V but with a warning (+1 squashed commits)
...
Squashed commits:
[45408dd9] allow quant K without quant V but with a warning
2025-03-23 22:56:02 +08:00
Concedo
a20a29ddeb
tool calling improved, auto now works
2025-03-22 17:44:55 +08:00
Concedo
350427dc3a
adjust subprocess timeouts
2025-03-22 11:10:01 +08:00
InconsolableCellist
e31da5861a
1435: add timeout for vulkaninfo (#1436)
...
* Fix the Colab PR
* 1435: add timeout for vulkaninfo
There's a bug in vulkaninfo where it can hang, and this will prevent
koboldcpp from starting. This adds a 5 second timeout
* restoring colab.ipynb
* Formatting
---------
Co-authored-by: henk717 <henk@henk.tech>
2025-03-22 11:01:22 +08:00
Concedo
c1e58419c7
support for voice cloning is done (+2 squashed commit)
...
Squashed commit:
[e7301628] support for voice cloning is done
[1653c576] wip adding voice cloning
2025-03-21 22:28:59 +08:00
Concedo
a66d0f7743
safeguard for bad vision input
2025-03-20 22:20:38 +08:00
Concedo
9d5efd68b6
dont print full base64 for images
2025-03-20 21:20:04 +08:00
Concedo
0c90d2ebcf
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# cmake/common.cmake
# docs/backend/SYCL.md
# examples/main/README.md
# examples/speculative/speculative.cpp
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# ggml/src/ggml-musa/CMakeLists.txt
# ggml/src/ggml-sycl/CMakeLists.txt
# ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
# tests/test-backend-ops.cpp
2025-03-19 19:27:11 +08:00
Concedo
ddaa8d5a38
fixed saving path for savedata
2025-03-17 22:19:52 +08:00
Concedo
0cfd8d23cb
handle symlinks (+1 squashed commits)
...
Squashed commits:
[fb8477b9] fixed makefile (+4 squashed commit)
Squashed commit:
[4a245bba] fixed a makefile issue
[d68eba69] alias usehipblas to usecublas
[a9ab0a7c] dynamic rocwmma selection
[fefe17c7] revert rocwmma
2025-03-17 21:03:30 +08:00
Concedo
6888f5495d
allow quantkv with contextshift
2025-03-16 21:48:42 +08:00
Concedo
5ef1722d5f
fix for sd
2025-03-16 17:02:42 +08:00
Concedo
0954e9e476
improve model estimation
2025-03-16 16:14:13 +08:00
Concedo
5d7c5e9e33
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# examples/tts/tts.cpp
2025-03-16 15:42:39 +08:00
Concedo
2401502cbd
improvement to tool calling, allowing specific tools to be used
2025-03-16 15:20:08 +08:00
Concedo
9f7fd63160
revert unwanted change to tool calling
2025-03-16 01:35:48 +08:00
Concedo
e84596ec1a
add config for default gen tokens and bos toggle
2025-03-15 19:53:06 +08:00
Concedo
7272165e0e
verbosity
2025-03-15 12:13:04 +08:00
Concedo
4212f0b8e8
wip on multiple fixes
2025-03-15 10:50:36 +08:00
Concedo
d7498e7e8a
added model switching to gguf in admin mode (auto guess layers)
2025-03-14 19:45:55 +08:00
Concedo
30cb77a900
rename replace_instruct_placeholders field
2025-03-14 18:37:12 +08:00
Concedo
782e1e193a
replaced winclinfo.exe with a simplified simpleclinfo.exe that only provides device names and nothing else (+1 squashed commits)
...
Squashed commits:
[4a73c8d3] replaced winclinfo.exe with a simplified simpleclinfo.exe that only provides device names and nothing else
2025-03-14 18:18:32 +08:00
Concedo
6a1dd57435
gemma3 template, updated lite, fixed tool calling, reenable ctx shift for gemma3
2025-03-14 17:47:01 +08:00
Concedo
57c9523405
sd lora from url
2025-03-13 10:55:01 +08:00
Concedo
eb1809c105
add more perf stats
2025-03-12 18:58:27 +08:00
Concedo
c55bb9a63d
use actual null instead of string "null" for finish_reason in openai responses
2025-03-07 15:18:33 +08:00
Concedo
ec43d2b147
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# README.md
# common/common.cpp
# examples/embedding/embedding.cpp
# examples/json_schema_to_grammar.py
# examples/llama.android/llama/src/main/cpp/llama-android.cpp
# examples/llama.swiftui/README.md
# examples/llama.swiftui/llama.swiftui.xcodeproj/project.pbxproj
# examples/lookahead/lookahead.cpp
# examples/parallel/parallel.cpp
# examples/passkey/passkey.cpp
# ggml/CMakeLists.txt
# ggml/src/CMakeLists.txt
# ggml/src/ggml-cpu/CMakeLists.txt
# requirements.txt
# requirements/requirements-all.txt
# scripts/fetch_server_test_models.py
# tests/test-chat.cpp
# tests/test-json-schema-to-grammar.cpp
2025-03-06 18:54:58 +08:00
Concedo
6b7d2349a7
Rewrite history to fix bad vulkan shader commits without increasing repo size
...
added dpe colab (+8 squashed commit)
Squashed commit:
[b8362da4] updated lite
[ed6c037d] move nsigma into the regular sampler stack
[ac5f61c6] relative filepath fixed
[05fe96ab] export template
[ed0a5a3e] nix_example.md: refactor (#1401)
* nix_example.md: add override example
* nix_example.md: drop graphics example, already basic nixos knowledge
* nix_example.md: format
* nix_example.md: Vulkan is disabled on macOS
Disabled in: 1ccd253acc
* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}
Fixes: https://github.com/LostRuins/koboldcpp/issues/1367
[675c62f7] AutoGuess: Phi 4 (mini) (#1402)
[4bf56982] phrasing
[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)
- place after nsigma and before xtc (+3 squashed commit)
Squashed commit:
[87c52b97] disable VMM from HIP
[ee8906f3] edit description
[e85c0e69] Remove Unnecessary Rep Counting (#1394)
* stop counting reps
* fix range-based initializer
* strike that - reverse it
2025-03-05 00:02:20 +08:00
Concedo
50eae1ffeb
added trycatch for ipv4
2025-02-26 00:45:06 +08:00
Concedo
12c501f723
fixed wrong file open mode
2025-02-24 15:14:02 +08:00
Concedo
ccd2dbe020
added support for server side save slots
2025-02-24 00:20:16 +08:00
Concedo
f2ac10c014
added nsigma to lite
2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660
add top n sigma sampler from llama.cpp (#1384)
...
* Add N Sigma Sampler
* update nsigma sampler chain
* xtc position fix
* remove stray newline
---------
Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00
Concedo
41350df81f
updated lite, added ability to export kcpps via CLI
2025-02-20 22:58:12 +08:00
Concedo
6fa50f78bf
allow kcppt for config switching
2025-02-17 00:48:34 +08:00
Concedo
15ae98c9cd
better error handling for downloads
2025-02-16 23:13:09 +08:00
Concedo
58380153b2
safer autoguess fix
...
verbose outputs (+3 squashed commit)
Squashed commit:
[7bbbfc10] fixed a retry history bug
[824b9bf7] another autoguess fix
2025-02-16 21:13:45 +08:00
Concedo
5a79dd57b9
add short delay before launching browser
2025-02-16 12:45:14 +08:00
Concedo
299d6ce0ed
horde advertised max ctx
2025-02-16 11:59:08 +08:00
Concedo
f144b1f345
Merge branch 'upstream' into concedo_experimental
...
# Conflicts:
# .devops/llama-cpp-cuda.srpm.spec
# .devops/llama-cpp.srpm.spec
# .devops/nix/package.nix
# .devops/rocm.Dockerfile
# .github/ISSUE_TEMPLATE/020-enhancement.yml
# .github/ISSUE_TEMPLATE/030-research.yml
# .github/ISSUE_TEMPLATE/040-refactor.yml
# .github/ISSUE_TEMPLATE/config.yml
# .github/pull_request_template.md
# .github/workflows/bench.yml.disabled
# .github/workflows/build.yml
# .github/workflows/labeler.yml
# CONTRIBUTING.md
# Makefile
# README.md
# SECURITY.md
# ci/README.md
# common/CMakeLists.txt
# docs/android.md
# docs/backend/SYCL.md
# docs/build.md
# docs/cuda-fedora.md
# docs/development/HOWTO-add-model.md
# docs/docker.md
# docs/install.md
# docs/llguidance.md
# examples/cvector-generator/README.md
# examples/imatrix/README.md
# examples/imatrix/imatrix.cpp
# examples/llama.android/llama/src/main/cpp/CMakeLists.txt
# examples/llama.swiftui/README.md
# examples/llama.vim
# examples/lookahead/README.md
# examples/lookup/README.md
# examples/main/README.md
# examples/passkey/README.md
# examples/pydantic_models_to_grammar_examples.py
# examples/retrieval/README.md
# examples/server/CMakeLists.txt
# examples/server/README.md
# examples/simple-cmake-pkg/README.md
# examples/speculative/README.md
# flake.nix
# grammars/README.md
# pyproject.toml
# scripts/check-requirements.sh
2025-02-16 02:08:39 +08:00
Concedo
673e33ca03
correction
2025-02-16 00:55:14 +08:00
Concedo
f48bd3f919
added automatic recovery if bad config is loaded, will restore to known good config
2025-02-15 17:16:21 +08:00
Concedo
f723b08347
fixed adapter bug
2025-02-15 12:06:45 +08:00
Concedo
979088320d
downloading fallbacks for aria2, added minimum size (+1 squashed commits)
...
Squashed commits:
[86b49095] downloading fallbacks for aria2, added minimum size
2025-02-15 00:18:28 +08:00
henk717
53486b6713
Download overhaul (#1369)
...
* Download overhaul
* Restore deblobbifier
* Cleanup
* Fix incorrect return
2025-02-14 11:40:18 +08:00
Concedo
6e6043fffe
fixed autoguess breaking img gen
2025-02-14 11:34:43 +08:00