Concedo | 62bea5ef4f | allow overriding the devices directly | 2026-01-17 19:08:06 +08:00
Concedo | d2b2224b0d | vulkan env var always take priority | 2026-01-17 10:34:45 +08:00
Concedo | 7e35954695 | Merge branch 'upstream' into concedo_experimental | 2025-08-31 23:33:36 +08:00
    # Conflicts:
    # docs/build.md
    # docs/function-calling.md
    # examples/eval-callback/eval-callback.cpp
    # ggml/CMakeLists.txt
    # ggml/src/ggml-cann/ggml-cann.cpp
    # ggml/src/ggml-cpu/CMakeLists.txt
    # ggml/src/ggml-cpu/kleidiai/kernels.cpp
    # ggml/src/ggml-cpu/kleidiai/kernels.h
    # ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
    # scripts/compare-llama-bench.py
    # scripts/server-bench.py
    # scripts/tool_bench.py
    # tests/test-chat.cpp
    # tools/batched-bench/batched-bench.cpp
    # tools/llama-bench/llama-bench.cpp
    # tools/server/README.md
Concedo | 4b2ca1169c | more consistency fixes | 2025-08-13 19:28:53 +08:00
Concedo | 6d50def409 | default kv_unified to true, handle LLAMA_SET_ROWS. | 2025-07-21 16:13:20 +08:00
Concedo | c494525b33 | update deprecated apis | 2025-06-13 22:21:15 +08:00
Concedo | 7d8aa31f1f | fixed embeddings, added new parameter to limit max embeddings context | 2025-06-10 01:11:55 +08:00
Concedo | cfcdfd69bd | allow embeddings models to use mmap | 2025-06-07 10:14:00 +08:00
Concedo | fe401ca4c2 | fixed a typo | 2025-05-30 13:35:42 +08:00
Concedo | e14aec58bc | embeds no offload qkv | 2025-05-29 00:28:02 +08:00
Concedo | fcc1b43c06 | embeddings change to encode | 2025-05-28 23:24:33 +08:00
Concedo | da7fd4aa57 | Merge branch 'upstream' into concedo_experimental | 2025-05-21 23:12:22 +08:00
    # Conflicts:
    # .devops/musa.Dockerfile
    # .github/workflows/build.yml
    # README.md
    # ci/README.md
    # docs/docker.md
    # examples/lookahead/lookahead.cpp
    # examples/lookup/lookup.cpp
    # examples/parallel/parallel.cpp
    # ggml/src/ggml-musa/CMakeLists.txt
    # ggml/src/ggml-sycl/ggml-sycl.cpp
    # tests/test-arg-parser.cpp
Concedo | 2439014a03 | Merge branch 'upstream' into concedo_experimental | 2025-05-08 23:41:02 +08:00
    # Conflicts:
    # .github/workflows/build.yml
    # examples/embedding/embedding.cpp
    # tools/imatrix/imatrix.cpp
    # tools/perplexity/perplexity.cpp
Concedo | e37f27632f | clear cpu flag manually for templates, added truncation for embeddings | 2025-04-02 00:18:30 +08:00
Concedo | 6a709be50a | replace deprecated | 2025-03-27 10:27:20 +08:00
Concedo | 2bdf1dacff | embeddings done | 2025-03-25 22:41:46 +08:00
Concedo | 3992fb79cc | wip adding embeddings support | 2025-03-24 18:01:23 +08:00