Commit graph

90 commits

Author SHA1 Message Date
Concedo
10a1d628ad added new binding fields for quant k and quant v 2024-06-03 14:35:59 +08:00
Concedo
039cc392d1 pragma once should be on top 2024-06-02 21:36:50 +08:00
askmyteapot
47606f5511
Update expose.h - Compile error (#880)
Fixes compile issues on Windows with w64devkit
2024-06-02 21:35:02 +08:00
Concedo
f24aef8792 initial whisper integration 2024-05-29 23:13:11 +08:00
Concedo
44443edfda rep pen slope works (+1 squashed commits)
Squashed commits:

[535ad566] experiment with rep pen range
2024-05-15 17:20:57 +08:00
Concedo
5d15f8f76a vae test 2024-05-14 19:17:01 +08:00
Concedo
4807b66907 wip sd 2024-05-13 23:23:16 +08:00
Concedo
eff01660e4 re-added smart context due to people complaining 2024-05-11 17:25:03 +08:00
Concedo
dbe72b959e tidy up and refactor code to support old flags 2024-05-10 16:50:53 +08:00
Concedo
173c7272d5 EOS bypass mode added 2024-05-06 18:01:49 +08:00
Concedo
c65448d17a add flash attention toggle 2024-04-30 21:29:11 +08:00
Concedo
c230b78906 refactored a lot of code, remove bantokens, move it to api 2024-04-27 17:57:13 +08:00
Concedo
4ec8a9c57b expose stop reason in generation 2024-04-27 01:12:12 +08:00
Concedo
b4d2031215 merged, added ability to render special tokens 2024-04-22 18:19:58 +08:00
Concedo
743687020d fixed img2img 2024-04-06 17:29:44 +08:00
Concedo
7968bdebbb added more stats in perf 2024-03-16 16:53:48 +08:00
Concedo
d943c739a8 wip submitting of llava image to backend 2024-03-10 17:14:27 +08:00
Concedo
c08d7e5042 wip integration of llava 2024-03-10 11:18:47 +08:00
Concedo
3f475970fa quiet mode for sd 2024-03-08 19:35:29 +08:00
Concedo
0c59c1ed90 allow specifying width and height 2024-03-03 15:44:15 +08:00
Concedo
80011ed8aa KCPP SD: add warn and step restriction., updated lite, handle quant mode 2024-03-01 16:41:19 +08:00
Concedo
3463688a0e image generation is fully working over api (+1 squashed commits)
Squashed commits:

[c98ab0b4] single image generation is working now
2024-03-01 14:43:44 +08:00
Concedo
5a44d4de2b refactor and clean identifiers for sd, fix cmake 2024-02-29 18:28:45 +08:00
Concedo
66134bb36e ui for loading SD models done 2024-02-29 17:08:22 +08:00
Concedo
524ba12abd refactor - do not use a copy buffer to store generation outputs, instead return a cpp allocated ptr 2024-02-29 14:02:20 +08:00
Concedo
f75e479db0 WIP on sdcpp integration 2024-02-29 00:40:07 +08:00
Concedo
35111ce01a row split mode is now a toggle 2024-02-09 18:35:58 +08:00
Concedo
4cd571db89 vulkan multigpu, show uptime 2024-02-08 16:54:38 +08:00
Alexander Abushady
4cb956c7db
Quadratic Sampling UI (#652)
* Quadratic Sampling UI

Kalomaze's Quadratic Sampling, now has a UI within KCPP.

* remove debug prints

* cleanup, add smooth sampler to dynatemp

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-02-04 16:26:27 +08:00
Concedo
2a4a7241e6 Merge branch 'vulkan_test' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	llama.cpp
2024-01-25 23:01:44 +08:00
Concedo
08236ccc97 better abort handling, added support for dynatemp exponent 2024-01-23 16:56:12 +08:00
kalomaze
123bff9a0f
Full DynaTemp implementation + UI (#600)
* move Dynatemp changes to new branch

* fix float header

* Properly reintroduce variable expert count

Controllable through experts.txt

* first pass at DynaTemp UI

Checkbox partial implemented, Min and Max Temp implemented

* DynaTemp UI Checkbox

Trigger DynaTemp on checkbox

* DynaTemp UI checkbox edition

Hell Yeah! DynaTemp!

* Remove greedy dynatemp

* Fix race condition caused by debug print

* Fixed broken presets and miro

Fixes broken presets and mirostat

* Remove debug function + HHI temp

Also removed unnecessary softmax double precision

* Fix whitespace (?) for generate function

* epic upstream renaming scheme fix

* fix stupid indents

* Other cleanup

Reintroduce unused rep pen function, move temp functions first before entropy dynamic temp

* Slight indent fix

* revert batch pyinstaller maker to mainline

and also delete experts.txt since adjustable routing is also being removed for the PR

* compact dynatemp into a single value dynatemp_range. This is a float which represents the allowed deviation from the min and max temperature when using dynatemp. Thus, if we want a value of dynatemp_min=0.3, dynatemp_max=0.5, then we would simply set temperature=0.4 and dynatemp_range=0.1. Functionally dynatemp would operate the same, but it would simplify usage and make it a single easy to adjust value.

---------

Co-authored-by: Alexander Abushady <aabushady214@gmail.com>
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-01-06 11:13:16 +08:00
Concedo
94e68fe474 added field to show recent seed 2024-01-02 15:35:04 +08:00
DebuggingLife46
e733a9e425
Add logit_bias to the OpenAI api (#577)
* Add logit_bias to the OpenAI api

* Cleanup and refactor, test in swagger.

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2023-12-27 00:26:19 +08:00
Concedo
77463e0e9c batch size improvements 2023-12-22 15:27:40 +08:00
Concedo
3f863eed72 add presence penalty 2023-12-19 23:18:56 +08:00
Concedo
ec21fa7712 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	.gitignore
#	CMakeLists.txt
#	Makefile
#	Package.swift
#	README.md
#	ggml-cuda.cu
#	llama.cpp
#	llama.h
#	scripts/sync-ggml.sh
#	tests/CMakeLists.txt
2023-12-08 17:42:26 +08:00
Concedo
6570a2005b token count includes ids 2023-12-03 15:44:53 +08:00
Concedo
bffa78116d explore quiet mode 2023-11-26 23:57:27 +08:00
Concedo
be92cfa125 added preloadstory 2023-11-10 13:05:22 +08:00
Concedo
fb3bcac368 handle memory separately for kcpp 2023-11-07 17:15:14 +08:00
Concedo
ae2cd56de8 kobold integration of min_p sampler (+1 squashed commits)
Squashed commits:

[8ad2e349] kobold integration for min_p sampler
2023-11-01 19:08:45 +08:00
Concedo
7924592a83 context shift feature done 2023-10-29 18:21:39 +08:00
Concedo
d10470a1e3 Breaking Change: Remove deprecated commands 2023-10-03 17:16:09 +08:00
Concedo
bc841ec302 flag to retain grammar, fix makefile (+2 squashed commit)
Squashed commit:

[d5cd3f28] flag to retain grammar, fix makefile

[b3352963] updated lite to v73
2023-10-01 14:39:56 +08:00
Concedo
eb86cd4027 bump token limits 2023-09-27 01:26:00 +08:00
Concedo
8c453d1e4e added grammar sampling 2023-09-18 23:02:00 +08:00
Concedo
89495c0716 handle token unbanning over api 2023-08-30 10:51:49 +08:00
Concedo
18bb0ab127 up ver, support 16k ctx 2023-08-04 21:47:17 +08:00
Concedo
46682e5cb3 added mmq launch flag 2023-08-01 17:57:13 +08:00