prima.cpp

mirror of https://github.com/Lizonghang/prima.cpp.git synced 2025-09-09 09:04:34 +00:00

Author	SHA1	Message	Date
Li, Zonghang	dfb1feb54e	update README	2025-06-16 12:09:07 +04:00
Li, Zonghang	fbf853341b	add endpoint /v1/cancel	2025-06-07 11:34:38 +04:00
Lizonghang	ef1e10101e	add test for IQ1 and doc for device selection	2025-06-04 15:12:00 +04:00
Lizonghang	c54a6a0132	fix context shifting	2025-05-19 16:58:35 +04:00
Lizonghang	2cc01483fd	support server mode	2025-05-14 18:28:46 +04:00
Lizonghang	ebd09fc83c	Merge branch 'dev'	2025-05-14 14:19:53 +04:00
Lizonghang	258fb2d06b	add QA: How to manually profile a device	2025-05-14 14:19:20 +04:00
Li, Zonghang	e2de4511c5	Update README.md	2025-05-11 18:15:39 +08:00
leeetao	b212d74dc3	update Readme.md	2025-04-17 09:17:11 +00:00
Zonghang Li	f9702ec4c0	update README.md	2025-04-16 15:55:43 +04:00
Li, Zonghang	b59d6d9cdf	Update README.md	2025-04-15 09:59:08 +08:00
Li, Zonghang	6d13836c44	Update README.md	2025-04-11 01:41:43 +08:00
Li, Zonghang	4845abf25e	Update README.md	2025-04-11 01:20:36 +08:00
Lizonghang	e48b804730	update README.md	2025-04-09 13:55:30 +04:00
Li, Zonghang	55f8dc588f	Update README.md	2025-04-09 10:56:25 +08:00
Lizonghang	e421d788d3	update README	2025-04-08 23:15:43 +04:00
Lizonghang	03ff9a7654	update README	2025-04-07 23:28:01 +04:00
Li, Zonghang	a3a1f4499b	Update README.md	2025-04-07 22:14:44 +08:00
Li, Zonghang	98d73778a6	Update README.md	2025-04-07 22:13:31 +08:00
Li, Zonghang	5984b1b75f	Update README.md	2025-04-07 22:13:12 +08:00
Li, Zonghang	ebd15b4112	Update README.md	2025-04-07 22:12:11 +08:00
Li, Zonghang	35adc76337	Update README.md	2025-04-07 22:08:14 +08:00
Lizonghang	87eb1aa7ec	update README	2025-04-07 18:06:57 +04:00
Lizonghang	fffefb9259	update README	2025-04-07 17:57:57 +04:00
Lizonghang	3b264352e7	update README	2025-03-30 23:39:36 +04:00
Lizonghang	2a01ff5fb1	init	2024-10-23 09:42:32 +04:00
Viet-Anh NGUYEN (Andrew)	71967c2a6d	Add Llama Assistant (#9744)	2024-10-04 20:29:35 +02:00
Paweł Wodnicki	3f1ae2e32c	Update README.md (#9591) Add Bielik model.	2024-10-01 19:18:46 +02:00
Georgi Gerganov	589b48d41e	contrib : add Resources section (#9675)	2024-09-29 14:38:18 +03:00
Aarni Koskela	43bcdd9703	readme : add tool (#9655)	2024-09-28 15:07:14 +03:00
Georgi Gerganov	b5de3b74a5	readme : update hot topics	2024-09-27 20:57:51 +03:00
Riceball LEE	1d48e98e4f	readme : add programmable prompt engine language CLI (#9599)	2024-09-23 18:58:17 +03:00
Shane A	0aadac10c7	llama : support OLMoE (#9462)	2024-09-16 09:47:37 +03:00
OSecret	d6b37c881f	readme : update tools list (#9475) * Added link to proprietary wrapper for Unity3d into README.md Wrapper has prebuild library and was tested on iOS, Android, WebGL, PC, Mac platforms, has online demos like [this](https://d23myu0xfn2ttc.cloudfront.net/rich/index.html) and [that](https://d23myu0xfn2ttc.cloudfront.net/). * Update README.md Fixes upon review	2024-09-15 10:36:53 +03:00
Faisal Zaghloul	449ccfb6f5	Add Jais to list of supported models (#9439) Co-authored-by: fmz <quic_fzaghlou@quic.com>	2024-09-12 02:29:53 +02:00
Georgi Gerganov	38ca6f644b	readme : update hot topics	2024-09-09 15:51:37 +03:00
Antonis Makropoulos	5ed087573e	readme : add LLMUnity to UI projects (#9381) * add LLMUnity to UI projects * add newline to examples/rpc/README.md to fix editorconfig-checker unit test	2024-09-09 14:21:38 +03:00
Georgi Gerganov	b69a480af4	readme : refactor API section + remove old hot topics	2024-09-03 10:00:36 +03:00
Younes Belkada	b40eb84895	llama : support for `falcon-mamba` architecture (#9074 ) * feat: initial support for llama.cpp * fix: lint * refactor: better refactor * Update src/llama.cpp Co-authored-by: compilade <git@compilade.net> * Update src/llama.cpp Co-authored-by: compilade <git@compilade.net> * fix: address comments * Update convert_hf_to_gguf.py Co-authored-by: compilade <git@compilade.net> * fix: add more cleanup and harmonization * fix: lint * Update gguf-py/gguf/gguf_writer.py Co-authored-by: compilade <git@compilade.net> * fix: change name * Apply suggestions from code review Co-authored-by: compilade <git@compilade.net> * add in operator * fix: add `dt_b_c_rms` in `llm_load_print_meta` * fix: correct printf format for bool * fix: correct print format * Update src/llama.cpp Co-authored-by: compilade <git@compilade.net> * llama : quantize more Mamba tensors * llama : use f16 as the fallback of fallback quant types --------- Co-authored-by: compilade <git@compilade.net>	2024-08-21 11:06:36 +03:00
wangshuai09	cfac111e2b	cann: add doc for cann backend (#8867) Co-authored-by: xuedinge233 <damow890@gmail.com> Co-authored-by: hipudding <huafengchun@gmail.com>	2024-08-19 16:46:38 +08:00
Minsoo Cheong	c679e0cb5c	llama : add EXAONE model support (#9025 ) * add exaone model support * add chat template * fix whitespace Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add ftype * add exaone pre-tokenizer in `llama-vocab.cpp` Co-Authored-By: compilade <113953597+compilade@users.noreply.github.com> * fix lint Co-Authored-By: compilade <113953597+compilade@users.noreply.github.com> * add `EXAONE` to supported models in `README.md` * fix space Co-authored-by: compilade <git@compilade.net> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: compilade <113953597+compilade@users.noreply.github.com> Co-authored-by: compilade <git@compilade.net>	2024-08-16 09:35:18 +03:00
Frank Mai	84eb2f4fad	docs: introduce gpustack and gguf-parser (#8873 ) * readme: introduce gpustack GPUStack is an open-source GPU cluster manager for running large language models, which uses llama.cpp as the backend. Signed-off-by: thxCode <thxcode0824@gmail.com> * readme: introduce gguf-parser GGUF Parser is a tool to review/check the GGUF file and estimate the memory usage without downloading the whole model. Signed-off-by: thxCode <thxcode0824@gmail.com> --------- Signed-off-by: thxCode <thxcode0824@gmail.com>	2024-08-12 14:45:50 +02:00
Eric Curtin	b42978e7e4	readme : add ramalama to the availables UI (#8811) ramalama is a repo agnostic boring CLI tool that supports pulling from ollama, huggingface and oci registries. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2024-08-05 15:45:01 +03:00
BarfingLemurs	400ae6f65f	readme : update model list (#8851)	2024-08-05 08:54:10 +03:00
R0CKSTAR	e54c35e4fb	feat: Support Moore Threads GPU (#8383 ) * Update doc for MUSA Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Add GGML_MUSA in Makefile Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Add GGML_MUSA in CMake Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * CUDA => MUSA Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * MUSA adds support for __vsubss4 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> * Fix CI build failure Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com> --------- Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2024-07-28 01:41:25 +02:00
MorganRO8	68504f0970	readme : update games list (#8673) Added link to game I made that depends on llama	2024-07-24 19:48:00 +03:00
Thorsten Sommer	3a7ac5300a	readme : update UI list [no ci] (#8505)	2024-07-24 15:52:30 +03:00
Georgi Gerganov	be0cfb4175	readme : fix server badge	2024-07-19 14:34:55 +03:00
Andy Salerno	fd560fe680	Update README.md to fix broken link to docs (#8399) Update the "Performance troubleshooting" doc link to be correct - the file was moved into a dir called 'development'	2024-07-09 14:58:44 -04:00
b4b4o	c4dd11d1d3	readme : fix web link error [no ci] (#8347)	2024-07-08 17:19:24 +03:00

417 commits