Zonghang Li | 45e8b0420c | fix compute buffer estimate: tested on cuda | 2025-06-22 08:10:57 +00:00 |
Li, Zonghang | 80e5b71b48 | fix compute buffer estimate: tested on metal | 2025-06-20 13:43:55 +04:00 |
Zonghang Li | dd589561b4 | improve the computing buffer estimate | 2025-06-19 08:02:43 +00:00 |
DeEMO | d6c8d322cd | fix try_connect | 2025-06-12 12:26:10 +00:00 |
DeEMO | d1b97f798e | support reconnection | 2025-06-12 12:26:09 +00:00 |
Li, Zonghang | a01fafd126 | Merge branch 'main' into dev | 2025-06-03 17:56:47 +04:00 |
Li, Zonghang | 1b3b6a506f | fix: add warm-up in profiling to prevent init delay | 2025-06-03 17:10:09 +04:00 |
Li, Zonghang | b30f749e5e | fix n_embd cannot be divided by quantized block size | 2025-06-03 14:06:31 +04:00 |
Li, Zonghang | 7b0ededd24 | Merge branch 'dev' into feat/auto-exit | 2025-05-20 02:04:14 +08:00 |
DeEMO | 0ad009a2f4 | fix: update serialization and deserialization for next_ip in device_info (Signed-off-by: DeEMO <yzzxrx@gmail.com>) | 2025-05-19 09:22:16 +00:00 |
Lizonghang | 07c4966a80 | reduce fio data size to 1gb to speed up profiling | 2025-05-14 21:26:01 +04:00 |
Lizonghang | 2fbc0c8da3 | fix: reset -ngl to 0 when GPU is not used and reformat code | 2025-05-14 13:27:20 +04:00 |
leeetao | 45ec52c2cb | Added support for IQ1_M and IQ2_XXS quantization type | 2025-03-07 16:56:16 +00:00 |
leeetao | 230c68b80c | fixed the alignment display | 2025-03-07 07:55:23 +00:00 |
leeetao | 6a416534c8 | Fixed the alignment display of device performance | 2025-03-07 07:46:30 +00:00 |
leeetao | 54c4c1c26e | Fixed the flops test for iq1s and q2k quantization types | 2025-03-07 02:47:00 +00:00 |
leeetao | 2f049b8428 | Added support for Q2K, IQ1s, IQ4NL quantization types | 2025-03-04 15:22:55 +00:00 |
Lizonghang | 9cbdf01645 | fix support for Q5_0 | 2025-02-27 22:25:03 +04:00 |
Lizonghang | 550fdcbc4f | add support for Q5_0 | 2025-02-27 21:47:14 +04:00 |
Lizonghang | 24974a488c | assume 10% of active pages can be compressed on macOS UMA | 2025-02-11 11:06:33 +04:00 |
Lizonghang | d2bc5cd502 | add pid as suffix to avoid conflicts with other processes | 2025-02-07 10:29:22 +04:00 |
Lizonghang | ec73e239c9 | use 80% available mem as a conservative estimate | 2025-02-03 18:10:05 +04:00 |
Lizonghang | 64089236eb | fix latency estimation in set m1 | 2025-02-03 07:56:02 +04:00 |
Lizonghang | dd632ee6df | ignore the first 5 evals due to preheat | 2025-01-31 08:53:51 +04:00 |
Lizonghang | fdecd4b54c | more active pages can be compressed | 2025-01-30 23:17:07 +04:00 |
Lizonghang | 2bc7a56790 | fix available mem estimation in termux | 2025-01-30 21:23:05 +04:00 |
Lizonghang | cd758247e6 | consider active pages compression in macos available memory estimation | 2025-01-29 20:33:13 +04:00 |
Lizonghang | 27c996835d | fix undeclared identifier get_page_size | 2025-01-29 19:59:02 +04:00 |
Lizonghang | 4b616baed4 | fix macos x86_64 available mem estimation | 2025-01-29 19:57:06 +04:00 |
Zonghang Li | 36f353e374 | check env path before calling fio to ensure we can find it | 2025-01-28 13:06:08 +04:00 |
Lizonghang | 1ca9e7974b | device_os returns Linux if in Termux | 2025-01-27 11:14:21 +04:00 |
Lizonghang | 4948b1004c | fix device_name() to get device name from host, not from termux | 2025-01-24 16:56:23 +04:00 |
Lizonghang | 38e4a3eaa0 | add device_macos_swappable_memory | 2025-01-23 16:09:04 +04:00 |
Lizonghang | 0b748060ad | ignore variable unused warning | 2025-01-21 21:04:32 +04:00 |
Lizonghang | c19891f7db | fix device os detect | 2025-01-18 19:56:43 +04:00 |
Lizonghang | 6761ca5358 | show device props | 2025-01-18 17:25:27 +04:00 |
Lizonghang | 5d9aadf3d5 | use highs to solve the allocation program | 2025-01-15 10:04:04 +04:00 |
Zonghang Li | 0a5450487c | fix serialize error | 2024-12-31 08:24:31 +04:00 |
Zonghang Li | e7c75d6b4a | fix get device os error on linux | 2024-12-30 21:03:51 +04:00 |
Lizonghang | fa31ca8e35 | add os detect | 2024-12-30 09:13:12 +04:00 |
Lizonghang | d9beb030ee | add EPS in device_compute_delay | 2024-12-29 22:31:45 +04:00 |
Lizonghang | fa210d2034 | remove duplicate calls | 2024-12-29 22:00:17 +04:00 |
Lizonghang | a7ec685eda | add memcpy speed test | 2024-12-29 16:19:08 +04:00 |
Lizonghang | 5b46c4e848 | correct GB to GiB | 2024-12-28 10:09:43 +04:00 |
Lizonghang | b2bc836423 | reformat | 2024-12-28 08:36:24 +04:00 |
Lizonghang | 70811d85b3 | remove OOM warning | 2024-12-22 11:01:45 +04:00 |
Lizonghang | b6fd762fa8 | fix log | 2024-12-21 19:01:00 +04:00 |
Lizonghang | aa6c352aa2 | reverse 128MiB free mem in termux | 2024-12-12 21:27:36 +04:00 |
Lizonghang | 1b169f8f37 | remove anonpages for repeat count | 2024-12-12 20:28:59 +04:00 |
Zonghang Li | 6ea692d974 | use small tensors for test on cpu | 2024-12-12 15:52:27 +04:00 |