koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-09 08:34:37 +00:00

Author	SHA1	Message	Date
askmyteapot	1e72b65c38	GradientAI Auto ROPE Base calculation (#910) * GradientAI Auto ROPE Base calculation https://gradient.ai/blog/scaling-rotational-embeddings-for-long-context-language-models has a formula that better fits the ideal rope scaling. Tested with Llama3, checked calculation is correct for llama2. Retains logic for not scaling rope if under trained CTX. * add in solar scaling logic Solar based models require the context values to be multiplied by 8. This is (I'm guessing) because the positions are based on a 32k context, but sliding window of 4k. * Update model_adapter.h adding in tensor count to identify solar models based on tensor count of 435. * Update model_adapter.cpp add in n_tensor count for solar identification * refactor and cleanup GradientAI rope scaling --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2024-06-13 18:12:00 +08:00
Concedo	47c42fd45c	fix for mamba processing	2024-03-13 13:27:46 +08:00
Concedo	f75e479db0	WIP on sdcpp integration	2024-02-29 00:40:07 +08:00
Concedo	762eeb6204	triage for opencl	2024-01-27 11:09:43 +08:00
Concedo	d9a7bd577a	gpu layer offloading disabled for phi models in clblast	2024-01-25 17:40:05 +08:00
Concedo	375003b458	always show reported arch	2023-12-22 11:15:07 +08:00
Concedo	8b919b5b57	allow customized rope to use model set values	2023-11-15 16:21:52 +08:00
Concedo	5db89b90b7	Merge branch 'master' into concedo_experimental # Conflicts: # .gitignore # CMakeLists.txt # Makefile # README.md # build.zig # ggml-opencl.cpp # tests/CMakeLists.txt # tests/test-double-float.cpp # tests/test-sampling.cpp	2023-10-25 23:58:15 +08:00
Concedo	839fc6dac8	handle freq_base_train	2023-10-24 23:44:22 +08:00
Concedo	c1ca1de2ac	fixed support for old falcon models	2023-10-18 17:20:44 +08:00
Concedo	7fb809b94b	fixed auto rope scaling (+1 squashed commits) Squashed commits: [b1767874] wip	2023-09-07 14:45:08 +08:00
Concedo	d4c22a8b02	updated lite, added autorope config based on trained ctxlen, hotfix for falcon gpu broken	2023-08-30 16:50:55 +08:00
Concedo	4b00916ac7	Merge branch 'master' into concedo_experimental # Conflicts: # .dockerignore # .github/workflows/build.yml # CMakeLists.txt # Makefile # README.md # flake.lock # flake.nix # tests/CMakeLists.txt	2023-08-28 14:19:05 +08:00
Concedo	bfdc596d58	gguf reader in file format detection	2023-08-23 19:19:52 +08:00
Concedo	39cc83e8c9	incomplete merge, compiles but generates rubbish	2023-08-22 23:12:47 +08:00
Concedo	3a7853d259	handle stablecode-completion-alpha-3b	2023-08-09 21:07:57 +08:00
Concedo	df9135e3a9	fixing memory bugs	2023-06-23 18:41:23 +08:00
Concedo	9b6c35b651	rwkv speed enhancements (batch processing), fixed a rwkv token processing bug	2023-06-13 16:02:12 +08:00
Concedo	6f82e17b7a	added MPT support	2023-06-03 16:14:08 +08:00
Concedo	5d9f5b28a6	rwkv integration completed	2023-05-28 00:48:56 +08:00
Concedo	01a0f206df	added support for starcoder, which is basically gpt2	2023-05-27 13:35:40 +08:00
Concedo	c048bcfec4	remove old filever checks (+7 squashed commit) Squashed commit: [b72627a] new format not working [e568870] old ver works [7053b77] compile errors fixed, fixing linkers [4ae8889] add new ver [ff82dfd] file format checks [25b8aa8] refactoring type names [931063b] still merging	2023-05-21 00:15:39 +08:00
Concedo	1225fab2ec	fix f16 format detection in neox	2023-05-20 11:26:50 +08:00
Concedo	f65bae760a	Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental # Conflicts: # ggml-opencl.cpp	2023-05-18 15:52:35 +08:00
Concedo	00da2a5f4e	neox is updated	2023-05-17 14:56:54 +08:00
Concedo	b692e4d2a4	wip	2023-05-14 17:21:07 +08:00
Concedo	8a5fe628df	recognize q8_0 as an older format as the new clblast doesn't work correctly with it	2023-05-14 11:06:23 +08:00
Concedo	05cf5f7d6e	partially working, but the blas matmul is broken	2023-05-13 11:35:38 +08:00
Concedo	5eec5d6ed9	Added backwards compatibility to an earlier version of NeoX.	2023-04-25 20:34:18 +08:00
Concedo	c454f8b848	Gpt NeoX / Pythia integration completed	2023-04-22 11:23:25 +08:00
Concedo	ef13443047	wip pythia integration	2023-04-22 01:08:23 +08:00
Concedo	45ec09d31b	fast forwarding for rwkv for unmodified contexts	2023-04-19 15:09:35 +08:00
Concedo	763ad172c0	arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation	2023-04-17 17:31:45 +08:00
Concedo	525184930d	added a kobold API compatible implementation of stopping sequences	2023-04-16 18:37:49 +08:00
Concedo	8dc06c7ab3	Fixed compile error in OSX	2023-04-15 01:13:56 +08:00
Concedo	c3b810868d	fixed an offset bug?	2023-04-15 00:30:00 +08:00
Concedo	a819f22cac	Merge branch 'master' into concedo # Conflicts: # CMakeLists.txt # Makefile # README.md # flake.nix	2023-04-14 21:40:33 +08:00
Concedo	adb4df78d6	Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation.	2023-04-14 21:24:16 +08:00
Concedo	18a154715e	added version label, improved file type checks	2023-04-10 01:03:09 +08:00
Concedo	d8e37bfe75	new gpt2 format supported	2023-04-08 17:35:36 +08:00
Concedo	14273fea7a	integrated gpt2 support	2023-04-04 23:15:47 +08:00
Concedo	52de932842	removed main.exe to reduce clutter, added support for rep pen in gptj	2023-04-04 20:43:13 +08:00
Concedo	8dd8ab1659	Various enhancements and integration of pygmalion.cpp	2023-04-03 00:04:43 +08:00
Concedo	9aabb0d9db	massive refactor completed, GPT-J integrated	2023-04-02 17:03:30 +08:00

44 commits