Concedo
9f976e9c65
swa full used unless ctx shift and fast forward disabled
2025-05-21 22:47:45 +08:00
Concedo
6b7d2349a7
Rewrite history to fix bad vulkan shader commits without increasing repo size
added dpe colab (+8 squashed commit)
Squashed commit:
[b8362da4] updated lite
[ed6c037d] move nsigma into the regular sampler stack
[ac5f61c6] relative filepath fixed
[05fe96ab] export template
[ed0a5a3e] nix_example.md: refactor (#1401)
* nix_example.md: add override example
* nix_example.md: drop graphics example, already basic nixos knowledge
* nix_example.md: format
* nix_example.md: Vulkan is disabled on macOS
Disabled in: 1ccd253acc
* nix_examples.md: nixpkgs.config.cuda{Arches -> Capabilities}
Fixes: https://github.com/LostRuins/koboldcpp/issues/1367
[675c62f7] AutoGuess: Phi 4 (mini) (#1402)
[4bf56982] phrasing
[b8c0df04] Add Rep Pen to Top N Sigma sampler chain (#1397)
- place after nsigma and before xtc (+3 squashed commit)
Squashed commit:
[87c52b97] disable VMM from HIP
[ee8906f3] edit description
[e85c0e69] Remove Unnecessary Rep Counting (#1394)
* stop counting reps
* fix range-based initializer
* strike that - reverse it
2025-03-05 00:02:20 +08:00
Concedo
f2ac10c014
added nsigma to lite
2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660
add top n sigma sampler from llama.cpp (#1384)
* Add N Sigma Sampler
* update nsigma sampler chain
* xtc position fix
* remove stray newline
---------
Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00
Concedo
f75bbb945f
speculative decoding initial impl completed (+6 squashed commit)
Squashed commit:
[0a6306ca0] draft wip dont use (will be squashed)
[a758a1c9c] wip dont use (will be squashed)
[e1994d3ce] wip dont use
[f59690d68] wip
[77228147d] wip on spec decoding. dont use yet
[2445bca54] wip adding speculative decoding (+1 squashed commits)
Squashed commits:
[50e341bb7] wip adding speculative decoding
2024-11-30 10:41:10 +08:00
Concedo
3813f6c517
added new flag nofastforward allowing users to disable fast forwarding
2024-11-13 10:59:01 +08:00
Concedo
eee67281be
move kcpp params out
2024-09-10 16:30:12 +08:00
Concedo
c08d7e5042
wip integration of llava
2024-03-10 11:18:47 +08:00
Concedo
693f3f0b00
try to use allocator for cuda ggml v3
2024-01-20 12:53:31 +08:00
Concedo
db14de5c32
fossilize ggml library ver 3, to support ggjtv3
2024-01-20 10:49:25 +08:00
Concedo
a6eb9b8010
Fix GPT2 not loading due to graph too small
2023-11-26 23:06:42 +08:00
Concedo
b8372d4466
Merge branch 'master' into concedo_experimental
# Conflicts:
# .gitignore
# README.md
# tests/CMakeLists.txt
2023-08-24 15:21:24 +08:00
Concedo
374fffb9c6
Reworking rope WIP
2023-07-19 00:54:41 +08:00
Concedo
1b71752a9f
Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX
2023-06-22 00:43:25 +08:00
Concedo
6f82e17b7a
added MPT support
2023-06-03 16:14:08 +08:00
Concedo
c048bcfec4
remove old filever checks (+7 squashed commit)
Squashed commit:
[b72627a] new format not working
[e568870] old ver works
[7053b77] compile errors fixed, fixing linkers
[4ae8889] add new ver
[ff82dfd] file format checks
[25b8aa8] refactoring type names
[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
a0cfed1e30
still merging in process
2023-05-20 15:58:33 +08:00
Concedo
00da2a5f4e
neox is updated
2023-05-17 14:56:54 +08:00
Concedo
90fe9096b4
clean and refactoring pass before supporting newer models for different arch
2023-05-17 11:23:29 +08:00
Concedo
105f818d45
integrated new version of rwkv from upstream
2023-05-03 23:26:39 +08:00
Concedo
032a171867
integrated q5 formats
2023-04-28 12:58:39 +08:00
Concedo
68898046c2
accidentally added the binaries onto repo again.
2023-04-22 00:41:19 +08:00
Concedo
763ad172c0
arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation
2023-04-17 17:31:45 +08:00
Concedo
d8e37bfe75
new gpt2 format supported
2023-04-08 17:35:36 +08:00
Concedo
14273fea7a
integrated gpt2 support
2023-04-04 23:15:47 +08:00
Concedo
8dd8ab1659
Various enhancement and integration pygmalion.cpp
2023-04-03 00:04:43 +08:00
Concedo
9aabb0d9db
massive refactor completed, GPT-J integrated
2023-04-02 17:03:30 +08:00
Concedo
b1f08813e3
added support for gpt4all original format
2023-04-02 00:53:46 +08:00
Concedo
6b86f5ea22
halfway refactoring, wip adding other model types
2023-04-01 01:13:05 +08:00