Commit graph

203 commits

Author SHA1 Message Date
Concedo
2b02cd75c7 reformat debug logging 2024-02-01 23:20:51 +08:00
Concedo
340fbbbb04 show warning if genamt >= ctxsize, show t/s values 2024-01-31 18:51:42 +08:00
Concedo
13dcf4b556 print seed 2024-01-31 14:42:47 +08:00
Concedo
21ab727e83 change split mode to rows 2024-01-30 22:30:08 +08:00
Concedo
ed09a854f0 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README.md
#	ci/run.sh
#	ggml-opencl.cpp
#	tests/CMakeLists.txt
2024-01-27 11:45:07 +08:00
Concedo
762eeb6204 triage for opencl 2024-01-27 11:09:43 +08:00
Concedo
d9a7bd577a gpu layer offloading disabled for phi models in clblast 2024-01-25 17:40:05 +08:00
Concedo
08236ccc97 better abort handling, added support for dynatemp exponent 2024-01-23 16:56:12 +08:00
Concedo
5ff53507c4 fixed compile issues for cublas 2024-01-21 14:23:48 +08:00
Concedo
5639c1a520 units (+2 squashed commits)
Squashed commit:

[166979d9] units conversion

[038dd5d4] get rid of all warnings (+1 squashed commits)

Squashed commits:

[6efd1e1b] get rid of all warnings
2024-01-20 23:53:21 +08:00
Concedo
db14de5c32 fossilize ggml library ver 3, to support ggjtv3 2024-01-20 10:49:25 +08:00
kalomaze
123bff9a0f
Full DynaTemp implementation + UI (#600)
* move Dynatemp changes to new branch

* fix float header

* Properly reintroduce variable expert count

Controllable through experts.txt

* first pass at DynaTemp UI

Checkbox partially implemented; Min and Max Temp implemented

* DynaTemp UI Checkbox

Trigger DynaTemp on checkbox

* DynaTemp UI checkbox edition

Hell Yeah! DynaTemp!

* Remove greedy dynatemp

* Fix race condition caused by debug print

* Fixed broken presets and mirostat

Fixes broken presets and mirostat

* Remove debug function + HHI temp

Also removed unnecessary softmax double precision

* Fix whitespace (?) for generate function

* epic upstream renaming scheme fix

* fix stupid indents

* Other cleanup

Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp

* Slight indent fix

* revert batch pyinstaller maker to mainline

and also delete experts.txt since adjustable routing is also being removed for the PR

* compact dynatemp into a single value, dynatemp_range. This is a float representing the allowed deviation above and below the base temperature when using dynatemp. Thus, to get dynatemp_min=0.3 and dynatemp_max=0.5, we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp operates the same, but this simplifies usage by making it a single, easy-to-adjust value (see the sketch after this entry).

---------

Co-authored-by: Alexander Abushady <aabushady214@gmail.com>
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-06 11:13:16 +08:00
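
A minimal sketch of the dynatemp_range scheme described in the entry above (not the actual koboldcpp code; the variable names and printed output are only for illustration): the single range value expands back into the min/max temperature pair the commit message mentions.

```cpp
#include <cstdio>

int main() {
    // Example values from the commit message: temperature = 0.4 with
    // dynatemp_range = 0.1 is equivalent to dynatemp_min = 0.3, dynatemp_max = 0.5.
    float temperature    = 0.4f;
    float dynatemp_range = 0.1f;

    // The range is the allowed deviation above and below the base temperature.
    float dynatemp_min = temperature - dynatemp_range;
    float dynatemp_max = temperature + dynatemp_range;

    printf("dynatemp_min = %.2f, dynatemp_max = %.2f\n", dynatemp_min, dynatemp_max);
    return 0;
}
```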
Concedo
e49d398f73 use same struct size for cuda and non cuda (+1 squashed commits)
Squashed commits:

[6eee8e2f] use same struct size for cuda and non cuda
2024-01-03 16:05:54 +08:00
Concedo
94e68fe474 added field to show recent seed 2024-01-02 15:35:04 +08:00
Concedo
5e59112de8 prevent other calls when uninitialized 2023-12-28 12:04:53 +08:00
Concedo
2d5d82e915 allocate gpt_params on heap instead to avoid rare segfault 2023-12-28 11:48:21 +08:00
DebuggingLife46
e733a9e425
Add logit_bias to the OpenAI api (#577)
* Add logit_bias to the OpenAI api

* Cleanup and refactor, test in swagger.

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-12-27 00:26:19 +08:00
Concedo
8823e8b06d added presence penalty into lite ui 2023-12-23 10:39:40 +08:00
Concedo
77463e0e9c batch size improvements 2023-12-22 15:27:40 +08:00
Concedo
3f863eed72 add presence penalty 2023-12-19 23:18:56 +08:00
Concedo
7469f202ea use lowvram flag for offload qkv 2023-12-08 18:16:14 +08:00
Concedo
ec21fa7712 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	Package.swift
#	README.md
#	ggml-cuda.cu
#	llama.cpp
#	llama.h
#	scripts/sync-ggml.sh
#	tests/CMakeLists.txt
2023-12-08 17:42:26 +08:00
Concedo
c7511526a2 noscript mode is done 2023-12-07 00:52:25 +08:00
Concedo
6570a2005b token count includes ids 2023-12-03 15:44:53 +08:00
Concedo
c142c5634a fixed segfault with clblast by reversing commit in issue https://github.com/ggerganov/llama.cpp/issues/4296 2023-12-03 00:56:00 +08:00
Concedo
12f66eaa1d adjust fragmentation fix 2023-12-02 15:59:08 +08:00
Concedo
a012342a77 updated docs, shifted kv extra space to be subtracted from user's ctx value instead of added on load. 2023-11-30 14:19:40 +08:00
Concedo
ba5c33319b Allocate a small amount of extra context for GGUF to deal with KV fragmentation causing issues in some scenarios. 2023-11-28 20:55:14 +08:00
Concedo
bffa78116d explore quiet mode 2023-11-26 23:57:27 +08:00
Concedo
a6eb9b8010 Fix GPT2 not loading due to graph too small 2023-11-26 23:06:42 +08:00
Concedo
eb42c73953 revert auto rope scaling for already-ropetuned models - just use their values 2023-11-24 14:20:36 +08:00
Concedo
4d7c14be73 fix stop seq escaping newline 2023-11-20 22:35:45 +08:00
Concedo
cf646fa809 try to scale custom roped models 2023-11-19 16:24:13 +08:00
Concedo
8b919b5b57 allow customized rope to use model set values 2023-11-15 16:21:52 +08:00
Concedo
be92cfa125 added preloadstory 2023-11-10 13:05:22 +08:00
Concedo
fb3bcac368 handle memory separately for kcpp 2023-11-07 17:15:14 +08:00
Concedo
1e7088a80b autopick cublas in gui if possible, better layer picking logic 2023-11-05 01:35:27 +08:00
Concedo
ae2cd56de8 kobold integration of min_p sampler (+1 squashed commits)
Squashed commits:

[8ad2e349] kobold integration for min_p sampler
2023-11-01 19:08:45 +08:00
Concedo
cc5b282350 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	build.zig
#	flake.lock
#	flake.nix
#	ggml.c
2023-10-31 20:44:04 +08:00
Concedo
9eba77c6a0 finally got something workable 2023-10-30 23:30:21 +08:00
Concedo
7f050b5d16 tweak numbers 2023-10-29 22:46:19 +08:00
Concedo
7924592a83 context shift feature done 2023-10-29 18:21:39 +08:00
Concedo
338d6c265d fixes to smartcontextpro 2023-10-29 10:42:37 +08:00
Concedo
20ef442c2a fixed for smartcontext 2023-10-28 19:09:22 +08:00
Concedo
15f525c580 revamped smart context for llama models 2023-10-28 12:59:08 +08:00
Concedo
0f46534866 wip 2023-10-26 21:58:51 +08:00
Concedo
5db89b90b7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	README.md
#	build.zig
#	ggml-opencl.cpp
#	tests/CMakeLists.txt
#	tests/test-double-float.cpp
#	tests/test-sampling.cpp
2023-10-25 23:58:15 +08:00
Concedo
839fc6dac8 handle freq_base_train 2023-10-24 23:44:22 +08:00
Concedo
cff75061fe fixed some old models failing due to tokenizer changes, update lite (+1 squashed commits)
Squashed commits:

[9dee81ec] fixed some old models failing due to tokenizer changes, update lite tooltip (+3 squashed commit)

Squashed commit:

[5ab95a79] fixes

[a561d5e2] fixed some old models failing due to tokenizer changes

[95e65daf] lite updates
2023-10-22 11:04:59 +08:00
kalomaze
ddce116ec9
Fix for Top K disabling (#480)
* Update gpttype_adapter.cpp

* use n_vocab instead of 32000 when top k is off (see the sketch after this entry)
2023-10-19 23:20:44 +08:00
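
A minimal sketch of the Top-K fix described above, assuming a hypothetical helper name (this is not the actual gpttype_adapter.cpp code): when Top-K is disabled, the candidate count falls back to the model's full vocabulary size rather than a hard-coded 32000.

```cpp
// Hypothetical helper illustrating the fix: treat a disabled or oversized
// top_k as "keep every token" by clamping it to n_vocab.
int effective_top_k(int top_k, int n_vocab) {
    if (top_k <= 0 || top_k > n_vocab) {
        return n_vocab; // Top-K off: consider the whole vocabulary
    }
    return top_k;
}
```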