Commit graph

29 commits

Author SHA1 Message Date
Concedo
9f976e9c65 swa full used unless ctx shift and fast forward disabled 2025-05-21 22:47:45 +08:00
Concedo
6b7d2349a7 Rewrite history to fix bad vulkan shader commits without increasing repo size
added dpe colab (+8 squashed commit)

Squashed commit:

[b8362da4] updated lite

[ed6c037d] move nsigma into the regular sampler stack

[ac5f61c6] relative filepath fixed

[05fe96ab] export template

[ed0a5a3e] nix_example.md: refactor (#1401)

* nix_example.md: add override example

* nix_example.md: drop graphics example, already basic nixos knowledge

* nix_example.md: format

* nix_example.md: Vulkan is disabled on macOS

Disabled in: 1ccd253acc

* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}

Fixes: https://github.com/LostRuins/koboldcpp/issues/1367

[675c62f7] AutoGuess: Phi 4 (mini) (#1402)

[4bf56982] phrasing

[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)

- place after nsigma and before xtc (+3 squashed commit)

Squashed commit:

[87c52b97] disable VMM from HIP

[ee8906f3] edit description

[e85c0e69] Remove Unnecessary Rep Counting (#1394)

* stop counting reps

* fix range-based initializer

* strike that - reverse it
2025-03-05 00:02:20 +08:00
Concedo
f2ac10c014 added nsigma to lite 2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660
add top n sigma sampler from llama.cpp (#1384)
* Add N Sigma Sampler

* update nsigma sampler chain

* xtc position fix

* remove stray newline

---------

Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00
Concedo
f75bbb945f speculative decoding initial impl completed (+6 squashed commit)
Squashed commit:

[0a6306ca0] draft wip dont use (will be squashed)

[a758a1c9c] wip dont use (will be squashed)

[e1994d3ce] wip dont use

[f59690d68] wip

[77228147d] wip on spec decoding. dont use yet

[2445bca54] wip adding speculative decoding (+1 squashed commits)

Squashed commits:

[50e341bb7] wip adding speculative decoding
2024-11-30 10:41:10 +08:00
Concedo
3813f6c517 added new flag nofastforward allowing users to disable fast forwarding 2024-11-13 10:59:01 +08:00
Concedo
eee67281be move kcpp params out 2024-09-10 16:30:12 +08:00
Concedo
c08d7e5042 wip integration of llava 2024-03-10 11:18:47 +08:00
Concedo
693f3f0b00 try to use allocator for cuda ggml v3 2024-01-20 12:53:31 +08:00
Concedo
db14de5c32 fossilize ggml library ver 3, to support ggjtv3 2024-01-20 10:49:25 +08:00
Concedo
a6eb9b8010 Fix GPT2 not loading due to graph too small 2023-11-26 23:06:42 +08:00
Concedo
b8372d4466 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.gitignore
#	README.md
#	tests/CMakeLists.txt
2023-08-24 15:21:24 +08:00
Concedo
374fffb9c6 Reworking rope WIP 2023-07-19 00:54:41 +08:00
Concedo
1b71752a9f Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX 2023-06-22 00:43:25 +08:00
Concedo
6f82e17b7a added MPT support 2023-06-03 16:14:08 +08:00
Concedo
c048bcfec4 remove old filever checks (+7 squashed commit)
Squashed commit:

[b72627a] new format not working

[e568870] old ver works

[7053b77] compile errors fixed, fixing linkers

[4ae8889] add new ver

[ff82dfd] file format checks

[25b8aa8] refactoring type names

[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30 still merging in process 2023-05-20 15:58:33 +08:00
Concedo
00da2a5f4e neox is updated 2023-05-17 14:56:54 +08:00
Concedo
90fe9096b4 clean and refactoring pass before supporting newer models for different arch 2023-05-17 11:23:29 +08:00
Concedo
105f818d45 integrated new version of rwkv from upstream 2023-05-03 23:26:39 +08:00
Concedo
032a171867 integrated q5 formats 2023-04-28 12:58:39 +08:00
Concedo
68898046c2 accidentally added the binaries onto repo again. 2023-04-22 00:41:19 +08:00
Concedo
763ad172c0 arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation 2023-04-17 17:31:45 +08:00
Concedo
d8e37bfe75 new gpt2 format supported 2023-04-08 17:35:36 +08:00
Concedo
14273fea7a integrated gpt2 support 2023-04-04 23:15:47 +08:00
Concedo
8dd8ab1659 Various enhancement and integration pygmalion.cpp 2023-04-03 00:04:43 +08:00
Concedo
9aabb0d9db massive refactor completed, GPT-J integrated 2023-04-02 17:03:30 +08:00
Concedo
b1f08813e3 added support for gpt4all original format 2023-04-02 00:53:46 +08:00
Concedo
6b86f5ea22 halfway refactoring, wip adding other model types 2023-04-01 01:13:05 +08:00