koboldcpp

mirror of https://github.com/LostRuins/koboldcpp.git synced 2025-09-10 17:14:36 +00:00

Author	SHA1	Message	Date
Concedo	4b96c3bba8	try new batch api (not actually batching)	2024-11-14 13:47:26 +08:00
Concedo	3813f6c517	added new flag nofastforward allowing users to disable fast forwarding	2024-11-13 10:59:01 +08:00
Concedo	48e9372337	prevent outputting infinity to logprobs	2024-11-13 00:09:53 +08:00
kallewoof	3c36bbdcd7	debug: display tokens that were dropped by XTC sampler when debugmode is enabled (#1201)	2024-11-06 23:09:28 +08:00
Concedo	223c5f0844	clblast survived	2024-11-02 21:51:38 +08:00
Concedo	bbebc76817	fix top picks bug, lower input anti abuse thresholds	2024-11-01 16:42:13 +08:00
Concedo	aa26a58085	added logprobs api and logprobs viewer	2024-11-01 00:22:15 +08:00
Concedo	90f5cd0f67	wip logprobs data	2024-10-30 00:59:34 +08:00
Concedo	94a5a27b85	Alone in the darkness They're coming for you I know they will try to catch me too Alone in the darkness They're calling for you There's nowhere to run for cover	2024-10-24 22:29:20 +08:00
Concedo	becd737e0f	slightly increase padding to handle longer gen amts	2024-10-23 22:58:41 +08:00
Maya	8bb220329c	Dynamic sizes for sequences (#1157) * Dynamic sizes for sequences * cleanup PR - move all dynamic fields to end of payload, ensure correct null handling to match existing behavior, add anti abuse limit of max 512 for dynamic fields * adjust anti abuse limits --------- Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>	2024-10-16 23:55:11 +08:00
Concedo	7f76425450	lower topk prefilter token amount to 3k	2024-10-16 20:39:41 +08:00
Concedo	cff72c5d26	remove unwanted print	2024-10-11 18:56:32 +08:00
Concedo	e692a79aab	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/docker.yml # CMakeLists.txt # CONTRIBUTING.md # docs/android.md # docs/docker.md # examples/embedding/embedding.cpp # examples/imatrix/imatrix.cpp # examples/infill/infill.cpp # examples/llama-bench/llama-bench.cpp # examples/main/README.md # examples/parallel/parallel.cpp # examples/perplexity/perplexity.cpp # examples/quantize-stats/quantize-stats.cpp # examples/save-load-state/save-load-state.cpp # examples/server/README.md # examples/simple/CMakeLists.txt # examples/speculative/speculative.cpp # flake.lock # ggml/src/CMakeLists.txt # ggml/src/ggml-blas.cpp # pocs/vdot/q8dot.cpp # pocs/vdot/vdot.cpp # scripts/debug-test.sh # scripts/sync-ggml.last # src/llama.cpp # tests/test-backend-ops.cpp # tests/test-chat-template.cpp # tests/test-quantize-fns.cpp # tests/test-quantize-perf.cpp # tests/test-tokenizer-0.cpp # tests/test-tokenizer-1-bpe.cpp # tests/test-tokenizer-1-spm.cpp	2024-10-11 11:59:59 +08:00
Maya	5c9650d68e	Fix access violation when using banned_phrases (#1154)	2024-10-10 21:46:39 +08:00
Concedo	fe5479f286	unify antislop and token bans	2024-10-10 18:21:07 +08:00
Concedo	9b614d46bd	antislop sampler working	2024-10-09 16:33:04 +08:00
Concedo	36e9bac98f	wip anti slop sampler	2024-10-09 13:34:47 +08:00
Concedo	f78f8d3d45	wip anti slop	2024-10-07 23:18:13 +08:00
Concedo	65f3c68399	wip antislop	2024-10-07 20:19:22 +08:00
Concedo	740c5e01cb	added token delay feature	2024-10-07 19:45:51 +08:00
Concedo	3e8bb10e2d	wip on rewind function	2024-10-06 16:21:03 +08:00
Concedo	c38d1ecc8d	update templates, fix rwkv	2024-09-22 01:32:12 +08:00
Concedo	53bf0fb32d	removed openblas backend, merged into CPU (with llamafile for BLAS). GPU backend is now automatically selected when running from CLI unless noblas is specified.	2024-09-15 19:21:52 +08:00
Concedo	e44ddf26ef	Merge branch 'upstream' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .github/workflows/server.yml # CMakeLists.txt # Makefile # examples/embedding/embedding.cpp # examples/imatrix/imatrix.cpp # examples/llama-bench/llama-bench.cpp # examples/llava/MobileVLM-README.md # examples/parallel/parallel.cpp # examples/perplexity/perplexity.cpp # examples/quantize/CMakeLists.txt # examples/server/README.md # examples/speculative/speculative.cpp # tests/test-backend-ops.cpp	2024-09-13 16:17:24 +08:00
Concedo	7bdac9bc44	prevent shifting on rwkv	2024-09-11 20:22:45 +08:00
Concedo	eee67281be	move kcpp params out	2024-09-10 16:30:12 +08:00
Concedo	fc7fe2e7a0	allow rwkv6 to run although its broken	2024-09-09 20:50:58 +08:00
Concedo	b63158005f	All samplers moved to kcpp side	2024-09-09 18:14:11 +08:00
Concedo	12fd16bfd4	Merge commit 'df270ef745' into concedo_experimental # Conflicts: # Makefile # common/CMakeLists.txt # common/common.h # common/sampling.cpp # common/sampling.h # examples/infill/infill.cpp # examples/llama-bench/llama-bench.cpp # examples/quantize-stats/quantize-stats.cpp # examples/server/server.cpp # include/llama.h # src/llama-sampling.cpp # src/llama-sampling.h # src/llama.cpp # tests/test-grammar-integration.cpp # tests/test-grammar-parser.cpp # tests/test-json-schema-to-grammar.cpp # tests/test-llama-grammar.cpp # tests/test-sampling.cpp	2024-09-09 17:10:08 +08:00
Concedo	c78690737c	fix for DRY segfault on unicode character substring tokenization	2024-09-08 18:25:00 +08:00
Concedo	d220495dd4	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/full-cuda.Dockerfile # .devops/llama-cli-cuda.Dockerfile # .devops/llama-server-cuda.Dockerfile # .devops/llama-server-intel.Dockerfile # .devops/llama-server-rocm.Dockerfile # .devops/llama-server-vulkan.Dockerfile # .devops/llama-server.Dockerfile # .github/workflows/docker.yml # docs/docker.md # examples/llama-bench/llama-bench.cpp # flake.lock # ggml/include/ggml.h # ggml/src/CMakeLists.txt # scripts/sync-ggml.last # src/llama.cpp # tests/test-backend-ops.cpp # tests/test-grad0.cpp # tests/test-rope.cpp	2024-08-30 10:37:39 +08:00
Concedo	b78a637da5	try to optimize context shifting	2024-08-26 23:07:31 +08:00
Concedo	cca3c4c78b	xtc fixes	2024-08-22 23:18:46 +08:00
Concedo	fc2545dc83	fixed a typo	2024-08-22 00:25:56 +08:00
Concedo	5bf527a6ae	added xtc sampler	2024-08-21 23:57:15 +08:00
Concedo	1a7ecd55e6	timing for init step, clip for vulkan	2024-08-21 18:14:53 +08:00
Concedo	cd69ab218e	fixed DRY	2024-08-21 17:01:28 +08:00
Concedo	6a4becb731	dry is still buggy because token indexes are wrong	2024-08-21 00:59:26 +08:00
Concedo	db6ef8d1e1	revert dry state reset	2024-08-20 22:22:21 +08:00
Concedo	c1ae350e5b	fixed race condition when generating	2024-08-20 20:17:55 +08:00
Concedo	e12ab53488	force clear some DRY state vars on new generation - not sure if this helps	2024-08-14 21:35:39 +08:00
Concedo	689a17d756	always prefilter to 5k logits	2024-08-12 22:27:06 +08:00
Concedo	729eb1e552	no fast forward for empty prompt	2024-07-27 16:29:35 +08:00
Concedo	eb5b4d0186	Merge branch 'upstream' into concedo_experimental # Conflicts: # Makefile # Package.swift # src/CMakeLists.txt # src/llama.cpp # tests/test-grammar-integration.cpp # tests/test-llama-grammar.cpp	2024-07-23 23:20:32 +08:00
Concedo	e2b36aa6cf	fixed dry loading seq when not in use, set kcppt to -1 layers by default	2024-07-22 15:44:34 +08:00
Concedo	0ecf13fc13	updated lite, extra error logging	2024-07-21 17:55:47 +08:00
Concedo	24b9616344	Merge branch 'upstream' into concedo_experimental # Conflicts: # .devops/full-cuda.Dockerfile # .devops/full-rocm.Dockerfile # .devops/full.Dockerfile # .devops/llama-cli-cuda.Dockerfile # .devops/llama-cli-intel.Dockerfile # .devops/llama-cli-rocm.Dockerfile # .devops/llama-cli-vulkan.Dockerfile # .devops/llama-cli.Dockerfile # .devops/llama-server-cuda.Dockerfile # .devops/llama-server-intel.Dockerfile # .devops/llama-server-rocm.Dockerfile # .devops/llama-server-vulkan.Dockerfile # .devops/llama-server.Dockerfile # CMakeLists.txt # CONTRIBUTING.md # Makefile # ggml/CMakeLists.txt # ggml/src/CMakeLists.txt # requirements.txt # src/llama.cpp # tests/test-backend-ops.cpp	2024-07-19 14:23:33 +08:00
Concedo	5988243aee	fix wrong order, fix llava debug mode failure	2024-07-17 15:30:19 +08:00
Concedo	d775a419b2	updated lite with chat inject, added layer detect, added more console logging	2024-07-16 23:10:15 +08:00

304 commits