Commit graph

143 commits

Author SHA1 Message Date
Concedo
6ce85c54d6 not working correctly 2025-06-02 22:12:10 +08:00
Concedo
8e1ebc55b5 dropped support for lora base as upstream no longer uses it. If provided it will be silently ignored 2025-06-02 12:49:53 +08:00
Concedo
51dc1cf920 added scale for text lora 2025-06-02 00:13:42 +08:00
Concedo
c4df151298 experimental swa flag 2025-05-23 21:33:26 +08:00
Concedo
38a8778f24 wip cfg scale 2025-05-06 23:06:25 +08:00
Concedo
13cee48740 embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits)
Squashed commits:

[b9b695217] embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits)

Squashed commits:

[90b5d389d] embed aria2c for windows, add slowness check with highpriority recommendation (+1 squashed commits)

Squashed commits:

[fbbaa989f] embed aria2c for windows
2025-05-06 18:56:02 +08:00
Concedo
f59b5eb561 added toggle for guidance 2025-05-05 22:21:46 +08:00
Concedo
9cd6a1add2 allow mmproj to be run on cpu 2025-04-21 21:03:10 +08:00
Concedo
2ed6850c0b added override tensor 2025-04-20 20:56:17 +08:00
Concedo
c67510718e kv override option (+1 squashed commits)
Squashed commits:

[e615fc01] kv override option
2025-04-17 14:22:30 +08:00
Concedo
27f575dc83 inpaining support completed, invert mask added 2025-04-09 23:50:17 +08:00
Concedo
23339ace9b inpainting works in kcpp! 2025-04-09 23:01:05 +08:00
Concedo
e37f27632f clear cpu flag manually for templates, added truncation for embeddings 2025-04-02 00:18:30 +08:00
Concedo
2bdf1dacff embeddings done 2025-03-25 22:41:46 +08:00
Concedo
3992fb79cc wip adding embeddings support 2025-03-24 18:01:23 +08:00
Concedo
c1e58419c7 support for voice cloning is done (+2 squashed commit)
Squashed commit:

[e7301628] support for voice cloning is done

[1653c576] wip adding voice cloning
2025-03-21 22:28:59 +08:00
Concedo
e84596ec1a add config for default gen tokens and bos toggle 2025-03-15 19:53:06 +08:00
Concedo
eb1809c105 add more perf stats 2025-03-12 18:58:27 +08:00
Concedo
f2ac10c014 added nsigma to lite 2025-02-21 15:11:24 +08:00
EquinoxPsychosis
2740af3660
add top n sigma sampler from llama.cpp (#1384)
* Add N Sigma Sampler

* update nsigma sampler chain

* xtc position fix

* remove stray newline

---------

Co-authored-by: CasualAutopsy <casual_autopsy@outlook.com>
2025-02-21 14:31:42 +08:00
Concedo
71016db617 remove tts audio caching 2025-02-12 11:37:43 +08:00
Concedo
70f1d8d746 vision can set max res (+1 squashed commits)
Squashed commits:

[938fc655] vision can set max res
2025-01-30 00:19:49 +08:00
Concedo
558bc5c901 tts can now set a length limit 2025-01-28 22:06:59 +08:00
Concedo
0e45d3bb7a quiet flags now set at load time 2025-01-25 16:46:56 +08:00
Concedo
fa7e661133 various fixes 2025-01-18 23:52:39 +08:00
Concedo
e8570de0e6 improved tts default voices quality and sample rate 2025-01-17 18:45:16 +08:00
Concedo
8e3cad1aa2 added audio caching, as a hacky fix for ST TTS bug 2025-01-16 12:04:58 +08:00
Concedo
b3de1598e7 Fixed some GGUFv1 loading bugs, long overdue cleanup for compiling, integrated TTS
tts is functional (+6 squashed commit)

Squashed commit:

[22396311] wip tts

[3a883027] tts not yet working

[0dcfab0e] fix silly bug

[a378d9ef] some long overdue cleanup

[fc5a6fb5] Wip tts

[39f50497] wip TTS integration
2025-01-13 14:23:25 +08:00
Concedo
91b6e29af3 added multilingual support for whisper 2025-01-09 23:28:52 +08:00
Concedo
0cb599546e increase max supported llava images to 8 2025-01-09 22:12:06 +08:00
Concedo
568e476997 added toggle for vae tiling, use custom memory buffer 2025-01-08 13:12:03 +08:00
Concedo
60cd68a39d draft model sets gpu split instead of id, made mmq default for cli 2024-12-14 23:58:45 +08:00
Concedo
595cc6975f added new flags --moeexperts --failsafe --draftgpulayers and --draftgpuid 2024-12-13 17:11:59 +08:00
Concedo
e9d2332dd8 improved tool calls and whisper 2024-12-06 14:34:31 +08:00
Concedo
32ac3153e4 default speculative set to 8. added more adapter fields 2024-11-30 16:18:27 +08:00
Concedo
e0c59486ee default to 12 tokens drafted 2024-11-30 11:52:07 +08:00
Concedo
b21d0fe3ac customizable speculative size 2024-11-30 11:28:19 +08:00
Concedo
f75bbb945f speculative decoding initial impl completed (+6 squashed commit)
Squashed commit:

[0a6306ca0] draft wip dont use (will be squashed)

[a758a1c9c] wip dont use (will be squashed)

[e1994d3ce] wip dont use

[f59690d68] wip

[77228147d] wip on spec decoding. dont use yet

[2445bca54] wip adding speculative decoding (+1 squashed commits)

Squashed commits:

[50e341bb7] wip adding speculative decoding
2024-11-30 10:41:10 +08:00
Concedo
3813f6c517 added new flag nofastforward allowing users to disable fast forwarding 2024-11-13 10:59:01 +08:00
Concedo
ccbd630a42 allow custom t5, clipl and clipg 2024-11-06 19:05:48 +08:00
Concedo
aa26a58085 added logprobs api and logprobs viewer 2024-11-01 00:22:15 +08:00
Concedo
90f5cd0f67 wip logprobs data 2024-10-30 00:59:34 +08:00
Maya
8bb220329c
Dynamic sizes for sequences (#1157)
* Dynamic sizes for sequences

* cleanup PR - move all dynamic fields to end of payload, ensure correct null handling to match existing behavior, add anti abuse limit of max 512 for dynamic fields

* adjust anti abuse limits

---------

Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
2024-10-16 23:55:11 +08:00
Concedo
1d40303050 increase again 2024-10-14 22:09:26 +08:00
Concedo
5ad826b82a updated lite (+2 squashed commit)
Squashed commit:

[31a99e1f] bump baned phrase a bit more again

[c999736b] small fix
2024-10-11 11:05:04 +08:00
Concedo
a3b104a422 further increase some limits 2024-10-10 22:27:28 +08:00
Concedo
d75cbd671d alias banned_tokens with banned_strings from ST
increase max bans to 32 for now
2024-10-10 21:52:46 +08:00
Concedo
fe5479f286 unify antislop and token bans 2024-10-10 18:21:07 +08:00
Concedo
65f3c68399 wip antislop 2024-10-07 20:19:22 +08:00
Concedo
5bf527a6ae added xtc sampler 2024-08-21 23:57:15 +08:00