Commit graph

447 commits

Author  SHA1  Message  Date
Lizonghang  7cd4936d58  rename to CPU RAM Read BW  2024-12-04 00:04:22 +04:00
Lizonghang  dc03b6216f  set default page size to 4KB if not available from system  2024-12-03 23:25:47 +04:00
Lizonghang  74dbec5086  set default readahead size to 128KB if permission is needed  2024-12-03 22:55:16 +04:00
Zonghang Li  1a7237636e  add cgroup version detect  2024-11-30 09:48:19 +04:00
Zonghang Li  3074763ed4  fix memory detect in docker container  2024-11-30 09:36:01 +04:00
Zonghang Li  eb25858e87  remove cpus_allowed_policy in fio  2024-11-29 22:16:32 +04:00
Zonghang Li  7e4bb65eab  fix bugs on cpu  2024-11-29 22:06:02 +04:00
Zonghang Li  81fd77033e  add gpu support in device_memory_access_delay  2024-11-29 21:56:01 +04:00
Lizonghang  6f54a12c7d  add gpu support in llama_model_kvcache_size and llama_model_compute_buf_size  2024-11-29 21:06:32 +04:00
Lizonghang  f8e9dc2713  add GPU support in device_compute_delay and device_disk_access_delay  2024-11-29 20:21:22 +04:00
Lizonghang  68ecabc8c3  add cpu_read_ram_bw, metal_read_vram_bw, cuda_read_vram_bw  2024-11-29 19:04:53 +04:00
Lizonghang  ce45587ea9  correct GB to GiB  2024-11-29 11:20:19 +04:00
Lizonghang  0f73d12247  decrease compute buf from available memory  2024-11-29 11:15:54 +04:00
Lizonghang  45a1e55eec  reduce kv cache from available memory  2024-11-28 20:21:21 +04:00
Lizonghang  9858d90ce4  get system readahead size automatically  2024-11-28 16:18:41 +04:00
Lizonghang  9a7bbce7ad  fix t_load_us  2024-11-28 15:55:21 +04:00
Lizonghang  740f7f0b95  use multithread disk r/w test  2024-11-27 22:14:17 +04:00
Lizonghang  f7507ec20b  fix disk r/w test, add disk access latency, and correct units (GB, GiB)  2024-11-27 21:36:12 +04:00
Lizonghang  9cd22177d0  remove arg test_file  2024-11-27 21:34:45 +04:00
Lizonghang  0a91ad3edc  fix cuda compatibility errors  2024-11-26 22:35:58 +04:00
Zonghang Li  f78c437172  add device_inp_embd_delay test, device_memory_bw test, device_cuda_memory_bw test,  2024-11-26 22:28:02 +04:00
Lizonghang  a7a95b53fe  add q80xf32 and count_n_params  2024-11-24 23:11:12 +04:00
Lizonghang  3fe00a16a0  count model flops for f32xf32, f16xf32, q4kxf32, q6kxf32  2024-11-24 13:13:32 +04:00
Lizonghang  a5ba34169a  add f32, f16, q4k_f32, q6k_f32 flops test and fix duplicate inp_embd in subgraphs  2024-11-23 21:36:34 +04:00
Zonghang Li  7ee1423006  add model_flops  2024-11-21 20:06:16 +04:00
Zonghang Li  80f6b72e71  remove device_flops from profiler api  2024-11-21 08:37:57 +04:00
Lizonghang  477ecf2084  add llama_model_n_flops  2024-11-20 19:40:27 +04:00
Lizonghang  10f6f92c7e  add f32, f16, q8, q4k speed test for cuda  2024-11-10 23:41:13 +04:00
Lizonghang  f4260bb346  add device_flops() for cpu, metal, and cuda  2024-11-10 23:11:05 +04:00
Lizonghang  5fae6ac36f  add cpu flops test  2024-11-09 20:53:42 +04:00
Lizonghang  2bd4d03aa8  add automatic layer window size assignment workflow  2024-11-08 18:21:03 +04:00
Lizonghang  53cb3a6069  synchronize device info  2024-11-07 22:02:01 +04:00
Lizonghang  ef7fdf70cc  add LLAMA_API llama_profile_device  2024-11-07 09:30:39 +04:00
Zonghang Li  b922418cca  convert MB to GB  2024-11-06 20:47:17 +04:00
Lizonghang  407c71ae52  add cpu and gpu profile  2024-11-06 20:42:28 +04:00
Lizonghang  4e1be1065d  add memory speed test  2024-11-06 10:57:30 +04:00
Zonghang Li  9a03b52785  fix device get name on linux  2024-11-05 22:07:09 +04:00
Lizonghang  a7f3d917a1  add device get name  2024-11-05 22:04:14 +04:00
Zonghang Li  6657885816  fix swap detect on linux  2024-11-05 21:57:09 +04:00
Lizonghang  2d447266e9  add swap capacity test  2024-11-05 21:42:45 +04:00
Lizonghang  9eed6b14bf  add disk read speed test  2024-11-05 21:12:02 +04:00
Lizonghang  9cd66f2145  add profiler  2024-11-05 20:29:09 +04:00
Lizonghang  766ec7862b  test  2024-11-05 17:22:24 +04:00
Lizonghang  76a7fc7527  support different window sizes  2024-10-26 12:34:14 +04:00
Lizonghang  c97ea10617  add mmap prefetch and unloading  2024-10-25 16:33:56 +04:00
Lizonghang  2a01ff5fb1  init  2024-10-23 09:42:32 +04:00
Georgi Gerganov  8c475b97b8  rerank : use [SEP] token instead of [BOS] (#9737)  2024-10-05 15:55:04 +03:00
    * rerank : use [SEP] token instead of [BOS] (ggml-ci)
    * common : sanity check for non-NULL tokens (ggml-ci)
    * ci : adjust rank score interval (ggml-ci)
    * ci : add shebang to run.sh (ggml-ci)
Daniel Kleine  133c7b46b3  Fixed RNG seed docs (#9723)  2024-10-04 10:54:44 +02:00
    * Update README.md: fixed RNG seed info
    * changed print format to unsigned
Ruchira Hasaranga  8277a817f1  console : utf-8 fix for windows stdin (#9690)  2024-09-30 11:23:42 +03:00
    * utf-8 fix for windows stdin
    * Update common/console.cpp
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
matiaslin  faac0bae26  common : ensure llama_batch size does not exceed max size (#9668)  2024-09-29 15:25:00 +03:00
    A crash was observed when the number of tokens added to a batch exceeds
    llama_batch size. An assertion in llama_batch_add was added to protect
    against llama_batch size overflow.
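
The last entry above describes guarding the batch against overflowing its preallocated size. Below is a minimal standalone sketch of that kind of bounds check; the `token_batch` struct and `batch_add` helper are illustrative stand-ins, not the real llama.cpp `llama_batch`/`llama_batch_add` API.

```cpp
// Illustrative sketch only: a simplified stand-in showing the kind of capacity
// assertion described in commit faac0bae26. The real llama.cpp types differ.
#include <cassert>
#include <cstdint>
#include <vector>

struct token_batch {
    size_t               capacity;  // maximum number of tokens the batch may hold
    std::vector<int32_t> tokens;    // tokens accumulated so far
};

// Append a token, asserting that the batch was created large enough.
// Without such a check, writing past the preallocated buffers of a real
// batch structure leads to the crash the commit message describes.
void batch_add(token_batch & batch, int32_t token) {
    assert(batch.tokens.size() < batch.capacity && "token_batch size exceeded");
    batch.tokens.push_back(token);
}

int main() {
    token_batch batch{/*capacity=*/512, {}};
    for (int32_t i = 0; i < 512; ++i) {
        batch_add(batch, i);   // ok: stays within capacity
    }
    // batch_add(batch, 512);  // one token too many: would trip the assertion
    return 0;
}
```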