Concedo
08e0745e7e
added singleinstance flag and local shutdown api
2025-05-31 11:37:32 +08:00
Concedo
7a7bdeab6d
json to gbnf endpoint added
2025-04-12 11:41:11 +08:00
Concedo
30e3d24ead
embd include name
2025-04-02 00:40:38 +08:00
Concedo
396875e1c4
update api docs and lite
2025-03-29 15:39:25 +08:00
Concedo
2bdf1dacff
embeddings done
2025-03-25 22:41:46 +08:00
Concedo
6a1dd57435
gemma3 template, updated lite, fixed tool calling, reenable ctx shift for gemma3
2025-03-14 17:47:01 +08:00
Concedo
6b7d2349a7
Rewrite history to fix bad vulkan shader commits without increasing repo size
added dpe colab (+8 squashed commit)
Squashed commit:
[b8362da4] updated lite
[ed6c037d] move nsigma into the regular sampler stack
[ac5f61c6] relative filepath fixed
[05fe96ab] export template
[ed0a5a3e] nix_example.md: refactor (#1401)
* nix_example.md: add override example
* nix_example.md: drop graphics example, already basic nixos knowledge
* nix_example.md: format
* nix_example.md: Vulkan is disabled on macOS
Disabled in: 1ccd253acc
* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}
Fixes: https://github.com/LostRuins/koboldcpp/issues/1367
[675c62f7] AutoGuess: Phi 4 (mini) (#1402)
[4bf56982] phrasing
[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)
- place after nsigma and before xtc (+3 squashed commit)
Squashed commit:
[87c52b97] disable VMM from HIP
[ee8906f3] edit description
[e85c0e69] Remove Unnecessary Rep Counting (#1394)
* stop counting reps
* fix range-based initializer
* strike that - reverse it
2025-03-05 00:02:20 +08:00
Concedo
ccd2dbe020
added support for server side save slots
2025-02-24 00:20:16 +08:00
Concedo
e8570de0e6
improved tts default voices quality and sample rate
2025-01-17 18:45:16 +08:00
Concedo
f8a9634aa2
better xtts and oai speech (+1 squashed commits)
Squashed commits:
[34b9c15f] better xtts and oai speech
2025-01-16 00:26:21 +08:00
Concedo
e07de2ea92
try fix webbrowser again
2025-01-15 00:53:24 +08:00
Concedo
636beac6d2
added a nicer built in voice
2025-01-13 23:26:54 +08:00
Concedo
91b6e29af3
added multilingual support for whisper
2025-01-09 23:28:52 +08:00
Concedo
c73d99ccac
updated lite
2025-01-08 13:35:59 +08:00
Concedo
3fea11675d
websearch integrated into lite, changed to POST
2024-12-30 17:30:41 +08:00
Concedo
52cc908f7f
default trim_stop to true, which trims any tokens after a stop sequence and the stop sequence itself. This is potentially a breaking change.
2024-12-03 22:44:10 +08:00
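The trim_stop behaviour described in commit 52cc908f7f above can be illustrated with a minimal sketch. This is not the project's actual code: the function name `trim_at_stop` and its parameters are hypothetical, and it assumes trimming operates on the decoded text and cuts at the earliest match of any stop sequence.

```python
def trim_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Cut `text` at the earliest stop sequence, dropping the sequence itself.

    Hypothetical helper for illustration only; it assumes trimming is done
    on the decoded string, not on tokens.
    """
    cut = len(text)  # default: no stop sequence found, keep everything
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop sequence would match at index 0
        idx = text.find(stop)
        if idx != -1 and idx < cut:
            cut = idx  # remember the earliest match across all sequences
    return text[:cut]


if __name__ == "__main__":
    generated = "Sure, here you go.\nUser: thanks"
    print(repr(trim_at_stop(generated, ["\nUser:"])))
```

With trimming off, a caller would instead receive the full text including the stop sequence, which is why the commit flags the new default as potentially breaking.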
Concedo
bf28d956ae
ollama chat api done
2024-11-24 00:10:15 +08:00
Concedo
c0da7e4dcf
multiplayer activity tracking
2024-11-23 19:59:55 +08:00
Concedo
c2ca2ec2bc
updated docs, fixed a few issues with multiplayer
2024-11-21 18:16:13 +08:00
Concedo
e7897f3257
update docs
2024-11-17 11:43:49 +08:00
Concedo
df7c2b9923
renamed some labels
2024-11-11 19:40:47 +08:00
Concedo
f153a14daf
add common identity provider /.well-known/serviceinfo, updated docs
2024-11-04 21:29:26 +08:00
Concedo
6a27003a06
logprobs feature completed
2024-11-01 15:24:07 +08:00
Concedo
2da346a5ff
updated docs
2024-10-11 20:42:05 +08:00
Concedo
4f57862040
update docs
2024-08-31 22:52:09 +08:00
Concedo
efb8be013e
fixed swagger
2024-08-25 23:29:55 +08:00
Concedo
d775a419b2
updated lite with chat inject, added layer detect, added more console logging
2024-07-16 23:10:15 +08:00
Concedo
e69da9c9d8
strings rename kobold lite to koboldai lite
2024-06-13 20:00:28 +08:00
Concedo
a97f7d5f91
Merge branch 'upstream' into concedo_experimental
# Conflicts:
# .devops/full-cuda.Dockerfile
# .devops/full-rocm.Dockerfile
# .devops/full.Dockerfile
# .devops/main-cuda.Dockerfile
# .devops/main-intel.Dockerfile
# .devops/main-rocm.Dockerfile
# .devops/main.Dockerfile
# .devops/server-cuda.Dockerfile
# .devops/server-intel.Dockerfile
# .devops/server-rocm.Dockerfile
# .devops/server.Dockerfile
# .devops/tools.sh
# .github/workflows/docker.yml
# CMakeLists.txt
# Makefile
# README-sycl.md
# README.md
# ci/run.sh
# llama.cpp
# requirements.txt
# requirements/requirements-convert-hf-to-gguf-update.txt
# requirements/requirements-convert-hf-to-gguf.txt
# requirements/requirements-convert-legacy-llama.txt
# requirements/requirements-convert-llama-ggml-to-gguf.txt
# scripts/check-requirements.sh
# scripts/compare-llama-bench.py
# scripts/convert-gg.sh
# scripts/pod-llama.sh
# scripts/sync-ggml-am.sh
# scripts/sync-ggml.last
# scripts/sync-ggml.sh
# tests/CMakeLists.txt
# tests/test-backend-ops.cpp
# tests/test-tokenizer-0.sh
# tests/test-tokenizer-random.py
2024-06-02 12:28:38 +08:00
Concedo
a65e0800ab
update docs, added gui for whisper
2024-06-01 02:01:49 +08:00
Concedo
868446bd1a
replace sdconfig and hordeconfig
2024-05-09 22:43:50 +08:00
Concedo
0d1cd0171a
update docs
2024-05-06 21:17:11 +08:00
Concedo
6c3fd5b685
updated lite (+2 squashed commit)
Squashed commit:
[d10a731e] update lite
[2554b8e6] update docs
2024-04-28 10:40:57 +08:00
Concedo
c230b78906
refactored a lot of code, remove bantokens, move it to api
2024-04-27 17:57:13 +08:00
Concedo
b4d2031215
merged, added ability to render special tokens
2024-04-22 18:19:58 +08:00
Concedo
0061299cce
fixed quant tools not compiling, updated docs
2024-04-06 23:11:05 +08:00
Concedo
6c6ad93f01
added basic support for password protection (+2 squashed commit)
Squashed commit:
[ff91ca72] added basic support for password protection
[91b0b208] updated docs
2024-03-12 19:47:12 +08:00
Concedo
d59ec68753
added interrogate endpoint (+1 squashed commits)
Squashed commits:
[7bf96261] added interrogate endpoint
2024-03-11 18:50:18 +08:00
Concedo
d4a12133e7
added SD samplers endpoint
2024-03-04 14:26:49 +08:00
Concedo
0c59c1ed90
allow specifying width and height
2024-03-03 15:44:15 +08:00
Concedo
59c5448ac8
fixed colab (+1 squashed commits)
Squashed commits:
[1d1c686f] updated colab and docs
2024-03-02 10:09:07 +08:00
Concedo
f75e479db0
WIP on sdcpp integration
2024-02-29 00:40:07 +08:00
Concedo
488777114a
added json mode
2024-02-12 17:08:34 +08:00
Concedo
332c5e713b
json self format
2024-02-12 16:50:27 +08:00
Concedo
4cd571db89
vulkan multigpu, show uptime
2024-02-08 16:54:38 +08:00
Concedo
504300784f
updated lite
2024-02-03 21:11:06 +08:00
kalomaze
123bff9a0f
Full DynaTemp implementation + UI (#600)
* move Dynatemp changes to new branch
* fix float header
* Properly reintroduce variable expert count
Controllable through experts.txt
* first pass at DynaTemp UI
Checkbox partial implemented, Min and Max Temp implemented
* DynaTemp UI Checkbox
Trigger DynaTemp on checkbox
* DynaTemp UI checkbox edition
Hell Yeah! DynaTemp!
* Remove greedy dynatemp
* Fix race condition caused by debug print
* Fixed broken presets and miro
Fixes broken presets and mirostat
* Remove debug function + HHI temp
Also removed unnecessary softmax double precision
* Fix whitespace (?) for generate function
* epic upstream renaming scheme fix
* fix stupid indents
* Other cleanup
Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp
* Slight indent fix
* revert batch pyinstaller maker to mainline
and also delete experts.txt since adjustable routing is also being removed for the PR
* compact dynatemp into a single value dynatemp_range. This is a float which represents the allowed deviation from the min and max temperature when using dynatemp. Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5, then we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp would operate the same, but it would simplify usage and make it a single easy to adjust value.
---------
Co-authored-by: Alexander Abushady <aabushady214@gmail.com>
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-06 11:13:16 +08:00
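The last bullet of the DynaTemp commit above (#600) describes dynatemp_range as the allowed deviation around the base temperature. A minimal sketch of that arithmetic follows. The function name `dynatemp_bounds` is hypothetical, and clamping the lower bound at zero is an assumption the commit message does not state.

```python
def dynatemp_bounds(temperature: float, dynatemp_range: float) -> tuple[float, float]:
    """Return (min_temp, max_temp) implied by a centre temperature and a range.

    Hypothetical helper for illustration; the clamp at 0.0 is an assumption.
    """
    min_temp = max(0.0, temperature - dynatemp_range)  # assumed: never below zero
    max_temp = temperature + dynatemp_range
    return min_temp, max_temp


if __name__ == "__main__":
    # The commit's own example: temperature=0.4, dynatemp_range=0.1
    low, high = dynatemp_bounds(0.4, 0.1)
    print(f"min={low:.2f} max={high:.2f}")
```

For the commit's example values this gives a minimum of about 0.3 and a maximum of about 0.5, up to floating-point rounding.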
DebuggingLife46
e733a9e425
Add logit_bias to the OpenAI api (#577)
* Add logit_bias to the OpenAI api
* Cleanup and refactor, test in swagger.
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-12-27 00:26:19 +08:00
Concedo
2810151b98
update docs
2023-12-13 22:48:29 +08:00
Concedo
a012342a77
updated docs, shifted kv extra space to be subtracted from user's ctx value instead of added on load.
2023-11-30 14:19:40 +08:00