Commit graph

31 commits

Author SHA1 Message Date
qiyuxinlin
b17ab8653c update speed test 2025-04-22 07:38:05 +00:00
qiyuxinlin
03a65d6bea roll back ktransformers backend; add max_tokens and max_completion_tokens params 2025-04-21 12:55:37 +00:00
wang jiahao
a1162eea01
Merge pull request #1158 from Creeper-MZ/function_call
Update Function call
2025-04-19 16:31:37 +08:00
Creeper-MZ
133ba746e9 Optimize the prompt; resolve compatibility issues with some DeepSeek R1 models

fix non-streaming mode
2025-04-19 01:20:27 -04:00
Creeper-MZ
4fb19bfcae Update chat.py 2025-04-17 09:19:14 -04:00
Yuhao Tsui
8ce34b3b5c
Modify the performance calculation module
Switch the performance data calculation from estimation to reading values from `raw_usage`.
2025-04-17 16:57:53 +08:00
wang jiahao
6e4da83d4b
Merge pull request #978 from cyhasuka/main
Feat: Support Non-streaming chat in Ollama backend
2025-04-17 14:34:35 +08:00
Creeper-MZ
cb266c98d4 Fix a bug 2025-04-16 23:31:33 -04:00
Creeper-MZ
6bc2e85343 Update chat.py 2025-04-16 15:54:23 -04:00
Creeper-MZ
88f688e2c8 Change the token-injection logic, injecting fewer tokens to prevent forgetting

Update chat.py
2025-04-16 15:52:24 -04:00
Creeper-MZ
a7e8d7c1af update function_call 2025-04-13 23:48:51 -04:00
Yuhao Tsui
84164f584c
Update completions.py 2025-03-26 15:39:46 +08:00
Yuhao Tsui
e5694f91c0
Merge branch 'kvcache-ai:main' into main 2025-03-10 09:10:28 +08:00
BITcyman
299c4dca64 [update] support openai chat completion api 2025-03-07 08:51:09 +00:00
Yuhao Tsui
d050d8655f
Update completions.py 2025-03-06 11:16:33 +08:00
chenmz00
b2ba795cfd
fix: list models API
Fix the list models API to match the corresponding OpenAI API format.
2025-03-05 21:49:27 +08:00
wang jiahao
26f7b4af11
Merge branch 'main' into temperature_top_p_from_request 2025-02-27 18:08:55 +08:00
Atream
f403cde6d4
Merge pull request #650 from ceerRep/main
feat: basic api key support
2025-02-27 12:16:53 +08:00
swu-hyk
ec7e912fee modify 2025-02-26 19:21:30 +08:00
swu-hyk
68e7df3a25 implementation of chat routing for Ollama 2025-02-26 17:05:00 +08:00
ceerrep
f639fbc19e feat: basic api key support 2025-02-25 14:11:39 +08:00
lazymio
76487c4dcb
Revert repetition_penalty as it is not in API spec 2025-02-24 21:30:03 +08:00
lazymio
05ad288453
Also /chat/completions 2025-02-24 21:08:36 +08:00
lazymio
bf36547f98
Also allow repetition_penalty 2025-02-24 21:07:35 +08:00
lazymio
8704c09192
Allow temperature and top_p from requests 2025-02-24 21:01:33 +08:00
ceerrep
584c7d5639 fix: object type for non-streaming response 2025-02-18 23:50:28 +08:00
ceerrep
6d45871de8 fix: workaround return dummy usage 2025-02-18 22:39:49 +08:00
ceerrep
ca2090d89b feat: use model name in openai endpoint 2025-02-17 00:27:32 +08:00
RodriMora
b1bff2a405 Added a simple /models endpoint to work with frontends, such as Open WebUI, that don't allow bypassing the model check 2025-02-07 10:30:39 +01:00
liam
dd1d8667f3 refactor local_chat and fix message-slice bug in server 2024-11-04 14:02:19 +08:00
chenxl
18c42e67df Initial commit 2024-07-27 16:06:58 +08:00