Commit graph

31 commits

Author SHA1 Message Date
qiyuxinlin
b17ab8653c update speed test 2025-04-22 07:38:05 +00:00
qiyuxinlin
03a65d6bea roll back ktransformers backend; add max_tokens and max_completion_tokens params 2025-04-21 12:55:37 +00:00
wang jiahao
a1162eea01
Merge pull request #1158 from Creeper-MZ/function_call
Update Function call
2025-04-19 16:31:37 +08:00
Creeper-MZ
133ba746e9 Optimize the prompt; resolve compatibility issues with some DeepSeek R1 models

fix non-streaming mode
2025-04-19 01:20:27 -04:00
Creeper-MZ
4fb19bfcae Update chat.py 2025-04-17 09:19:14 -04:00
Yuhao Tsui
8ce34b3b5c
Modify the performance calculation module
Switch the performance data calculation from estimation to reading values from `raw_usage`.
2025-04-17 16:57:53 +08:00
wang jiahao
6e4da83d4b
Merge pull request #978 from cyhasuka/main
Feat: Support Non-streaming chat in Ollama backend
2025-04-17 14:34:35 +08:00
Creeper-MZ
cb266c98d4 Fix a bug 2025-04-16 23:31:33 -04:00
Creeper-MZ
6bc2e85343 Update chat.py 2025-04-16 15:54:23 -04:00
Creeper-MZ
88f688e2c8 Change the token-injection logic, injecting fewer tokens to prevent forgetting

Update chat.py
2025-04-16 15:52:24 -04:00
Creeper-MZ
a7e8d7c1af update function_call 2025-04-13 23:48:51 -04:00
Yuhao Tsui
84164f584c
Update completions.py 2025-03-26 15:39:46 +08:00
Yuhao Tsui
e5694f91c0
Merge branch 'kvcache-ai:main' into main 2025-03-10 09:10:28 +08:00
BITcyman
299c4dca64 [update] support openai chat completion api 2025-03-07 08:51:09 +00:00
Yuhao Tsui
d050d8655f
Update completions.py 2025-03-06 11:16:33 +08:00
chenmz00
b2ba795cfd
fix: list models API
Fix the list models API to match the corresponding OpenAI API format.
2025-03-05 21:49:27 +08:00
wang jiahao
26f7b4af11
Merge branch 'main' into temperature_top_p_from_request 2025-02-27 18:08:55 +08:00
Atream
f403cde6d4
Merge pull request #650 from ceerRep/main
feat: basic api key support
2025-02-27 12:16:53 +08:00
swu-hyk
ec7e912fee modify 2025-02-26 19:21:30 +08:00
swu-hyk
68e7df3a25 implementation of chat routing for Ollama 2025-02-26 17:05:00 +08:00
ceerrep
f639fbc19e feat: basic api key support 2025-02-25 14:11:39 +08:00
lazymio
76487c4dcb
Revert repetition_penalty as it is not in API spec 2025-02-24 21:30:03 +08:00
lazymio
05ad288453
Also /chat/completions 2025-02-24 21:08:36 +08:00
lazymio
bf36547f98
Also allow repetition_penalty 2025-02-24 21:07:35 +08:00
lazymio
8704c09192
Allow temperature and top_p from requests 2025-02-24 21:01:33 +08:00
ceerrep
584c7d5639 fix: object type for non-streaming response 2025-02-18 23:50:28 +08:00
ceerrep
6d45871de8 fix: workaround return dummy usage 2025-02-18 22:39:49 +08:00
ceerrep
ca2090d89b feat: use model name in openai endpoint 2025-02-17 00:27:32 +08:00
RodriMora
b1bff2a405 Added a simple /models endpoint to work with frontends, such as Open WebUI, that don't allow bypassing the model check 2025-02-07 10:30:39 +01:00
liam
dd1d8667f3 refactor local_chat and fix message-slice bug in server 2024-11-04 14:02:19 +08:00
chenxl
18c42e67df Initial commit 2024-07-27 16:06:58 +08:00