# API

- [OpenAI ChatCompletion](#openai-chatcompletion)
- [Ollama ChatCompletion](#ollama-chatcompletion)
- [OpenAI Assistant](#openai-assistant)

## OpenAI ChatCompletion

```bash
POST /v1/chat/completions
```

Generate responses based on the selected model.

### Parameters

- `messages`: An array of `message` objects representing the conversation history. A `message` can come from the user or the model (assistant) and includes:
  - `role`: Either `user` or `assistant`, indicating the creator of this message.
  - `content`: The text of the message.
- `model`: The name of the selected model.
- `stream`: Either `true` or `false`. Indicates whether to use a streaming response. If `true`, model inference results are returned as an HTTP event stream.

### Response

- Streaming response: an event stream in which each event carries a `chat.completion.chunk`. `chunk.choices[0].delta.content` is the incremental output produced by the model at each step.
- Non-streaming response: not supported yet.

### Example

```bash
curl -X 'POST' \
  'http://localhost:9112/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "messages": [
    {
      "content": "tell a joke",
      "role": "user"
    }
  ],
  "model": "Meta-Llama-3-8B-Instruct",
  "stream": true
}'
```

```bash
data:{"id":"c30445e8-1061-4149-a101-39b8222e79e1","object":"chat.completion.chunk","created":1720511671,"model":"not implmented","system_fingerprint":"not implmented","usage":null,"choices":[{"index":0,"delta":{"content":"Why ","role":"assistant","name":null},"logprobs":null,"finish_reason":null}]}

data:{"id":"c30445e8-1061-4149-a101-39b8222e79e1","object":"chat.completion.chunk","created":1720511671,"model":"not implmented","system_fingerprint":"not implmented","usage":null,"choices":[{"index":0,"delta":{"content":"","role":"assistant","name":null},"logprobs":null,"finish_reason":null}]}

data:{"id":"c30445e8-1061-4149-a101-39b8222e79e1","object":"chat.completion.chunk","created":1720511671,"model":"not implmented","system_fingerprint":"not implmented","usage":null,"choices":[{"index":0,"delta":{"content":"couldn't ","role":"assistant","name":null},"logprobs":null,"finish_reason":null}]}

...

data:{"id":"c30445e8-1061-4149-a101-39b8222e79e1","object":"chat.completion.chunk","created":1720511671,"model":"not implmented","system_fingerprint":"not implmented","usage":null,"choices":[{"index":0,"delta":{"content":"two-tired!","role":"assistant","name":null},"logprobs":null,"finish_reason":null}]}

event: done
data: [DONE]
```
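The event stream above can be reassembled with a few lines of client-side parsing. A minimal sketch (the function name `delta_from_sse_line` is illustrative, not part of the API; it assumes each `data:` line arrives whole, as shown in the output above):

```python
import json

def delta_from_sse_line(line):
    """Extract the incremental text from one line of the event stream.

    Returns None for non-data lines and for the terminal `[DONE]` sentinel.
    """
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    # chunk.choices[0].delta.content holds the model's incremental output
    return chunk["choices"][0]["delta"]["content"]

# Sample lines in the shape shown above (fields trimmed for brevity)
stream = [
    'data:{"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Why ","role":"assistant"}}]}',
    'data:{"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"not?","role":"assistant"}}]}',
    "event: done",
    "data: [DONE]",
]
text = "".join(d for line in stream if (d := delta_from_sse_line(line)) is not None)
print(text)  # Why not?
```

In a real client the same function would be applied to each line read from the HTTP response body.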
## Ollama ChatCompletion

```bash
POST /api/generate
```

Generate responses using the selected model.

### Parameters

- `prompt`: A string containing the input prompt.
- `model`: The name of the selected model.
- `stream`: Either `true` or `false`. Indicates whether to use a streaming response. If `true`, model inference results are returned as an HTTP event stream.

### Response

- Streaming response: a stream of JSON objects, one per line.
  - `response`: The incremental result of the model completion.
  - `done`: Whether inference has finished.
- Non-streaming response: not supported yet.

### Example

```bash
curl -X 'POST' \
  'http://localhost:9112/api/generate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "Meta-Llama-3-8B-Instruct",
  "prompt": "tell me a joke",
  "stream": true
}'
```

```bash
{"model":"Meta-Llama-3-8B-Instruct","created_at":"2024-07-09 08:13:11.686513","response":"I'll ","done":false}
{"model":"Meta-Llama-3-8B-Instruct","created_at":"2024-07-09 08:13:11.729214","response":"give ","done":false}

...

{"model":"Meta-Llama-3-8B-Instruct","created_at":"2024-07-09 08:13:33.955475","response":"for","done":false}
{"model":"Meta-Llama-3-8B-Instruct","created_at":"2024-07-09 08:13:33.956795","response":"","done":true}
```
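Because each line of this stream is a standalone JSON object, consuming it is straightforward. A minimal sketch (the function name `read_ollama_stream` is illustrative; it assumes complete lines in the shape shown above):

```python
import json

def read_ollama_stream(lines):
    """Accumulate `response` fragments from a line-delimited JSON stream,
    stopping once a chunk reports done=true."""
    parts = []
    for line in lines:
        chunk = json.loads(line)
        parts.append(chunk["response"])
        if chunk["done"]:
            break
    return "".join(parts)

# Sample lines in the shape shown above (created_at trimmed for brevity)
sample = [
    '{"model":"Meta-Llama-3-8B-Instruct","response":"I\'ll ","done":false}',
    '{"model":"Meta-Llama-3-8B-Instruct","response":"give","done":false}',
    '{"model":"Meta-Llama-3-8B-Instruct","response":"","done":true}',
]
print(read_ollama_stream(sample))  # I'll give
```

In a real client the lines would come from iterating over the HTTP response body instead of a list.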