Commit graph

519 commits

Author SHA1 Message Date
Gustavo Rocha Dias
3f21bd81f3
doc - Better explanation of how to build the libraries on Windows. (#107) 2023-04-23 13:40:09 +08:00
Concedo
c454f8b848 Gpt NeoX / Pythia integration completed 2023-04-22 11:23:25 +08:00
Concedo
07bb31b034 wip, don't use 2023-04-21 00:35:54 +08:00
Concedo
93761e7baf slightly clarified the library replacement steps - replacing the DLL is necessary in addition to replacing the import libraries 2023-04-20 12:23:54 +08:00
Gustavo Rocha Dias
5ca2d774cc
doc - explanation of how to use a custom version of the Windows libraries in the lib folder. (#92)
the dynamic libraries also need to be updated if you replace the import libraries
2023-04-20 12:20:11 +08:00
CRD716
834695fe3a
Minor: Readme fixed grammar, spelling, and misc updates (#1071) 2023-04-19 19:52:14 +00:00
Georgi Gerganov
7cd5c4a3e9
readme : add warning about Q4_2 and Q4_3 2023-04-19 19:07:54 +03:00
Georgi Gerganov
7faa7460f0
readme : update hot topics about new LoRA functionality 2023-04-18 20:10:26 +03:00
Concedo
f39def81d4 Update readme with more info 2023-04-18 21:44:26 +08:00
Concedo
3614956bc7 update readme 2023-04-18 21:39:05 +08:00
Gustavo Rocha Dias
ed5b5c45a9
doc - enhanced readme explaining how to compile on Windows. (#80) 2023-04-18 17:40:04 +08:00
Atsushi Tatsuma
e9298af389
readme : add Ruby bindings (#1029) 2023-04-17 22:34:35 +03:00
AlpinDale
624dc8809e
Added openblas and clblas package names for Debian (#63) 2023-04-15 01:08:56 +08:00
comex
723dac55fa
py : new conversion script (#545)
Current status: Working, except for the latest GPTQ-for-LLaMa format
  that includes `g_idx`.  This turns out to require changes to GGML, so
  for now it only works if you use the `--outtype` option to dequantize it
  back to f16 (which is pointless except for debugging).

  I also included some cleanup for the C++ code.

  This script is meant to replace all the existing conversion scripts
  (including the ones that convert from older GGML formats), while also
  adding support for some new formats.  Specifically, I've tested with:

  - [x] `LLaMA` (original)
  - [x] `llama-65b-4bit`
  - [x] `alpaca-native`
  - [x] `alpaca-native-4bit`
  - [x] LLaMA converted to 'transformers' format using
        `convert_llama_weights_to_hf.py`
  - [x] `alpaca-native` quantized with `--true-sequential --act-order
        --groupsize 128` (dequantized only)
  - [x] same as above plus `--save_safetensors`
  - [x] GPT4All
  - [x] stock unversioned ggml
  - [x] ggmh

  There's enough overlap in the logic needed to handle these different
  cases that it seemed best to move to a single script.

  I haven't tried this with Alpaca-LoRA because I don't know where to find
  it.

  Useful features:

  - Uses multiple threads for a speedup in some cases (though the Python
    GIL limits the gain, and sometimes it's disk-bound anyway).

  - Combines split models into a single file (both the intra-tensor split
    of the original and the inter-tensor split of 'transformers' format
    files).  Single files are more convenient to work with and more
    friendly to future changes to use memory mapping on the C++ side.  To
    accomplish this without increasing memory requirements, it has some
    custom loading code which avoids loading whole input files into memory
    at once.

  - Because of the custom loading code, it no longer depends on PyTorch,
    which might make installing dependencies slightly easier or faster...
    although it still depends on NumPy and sentencepiece, so I don't know
    if there's any meaningful difference.  In any case, I also added a
    requirements.txt file to lock the dependency versions in case of any
    future breaking changes.

  - Type annotations checked with mypy.

  - Some attempts to be extra user-friendly:

      - The script tries to be forgiving with arguments, e.g. you can
        specify either the model file itself or the directory containing
        it.

      - The script doesn't depend on config.json / params.json, just in
        case the user downloaded files individually and doesn't have those
        handy.  But you still need tokenizer.model and, for Alpaca,
        added_tokens.json.

      - The script tries to give a helpful error message if
        added_tokens.json is missing.
2023-04-14 10:03:03 +03:00
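The commit above describes custom loading code that combines split model files into one output without ever holding a whole input file in memory. The actual script's file format is not shown here, so this is only a minimal sketch of that streaming-merge idea, using `np.memmap` (NumPy is one of the script's stated dependencies); the file names, row-split layout, and chunk size are illustrative assumptions, not the script's real conventions.

```python
# Hedged sketch (not the real conversion script): merge a row-split
# tensor stored across several files into one output file, streaming
# fixed-size chunks so no input is fully loaded into memory.
import numpy as np

def merge_split_tensor(part_paths, out_path, dtype=np.float32):
    """Concatenate split tensor parts into one file, chunk by chunk."""
    with open(out_path, "wb") as out:
        for path in part_paths:
            # np.memmap maps the file lazily; only the pages touched by
            # the slice being copied are resident at any moment.
            part = np.memmap(path, dtype=dtype, mode="r")
            chunk = 1 << 20  # copy ~1M elements at a time (assumption)
            for start in range(0, part.shape[0], chunk):
                out.write(part[start:start + chunk].tobytes())

# Demo: write two halves of a tensor, merge them, and read the result.
a = np.arange(0, 8, dtype=np.float32)
b = np.arange(8, 16, dtype=np.float32)
a.tofile("part0.bin")
b.tofile("part1.bin")
merge_split_tensor(["part0.bin", "part1.bin"], "merged.bin")
merged = np.fromfile("merged.bin", dtype=np.float32)
print(merged.tolist())
```

The same pattern extends to the inter-tensor split of 'transformers'-format checkpoints: iterate over shards, copying each tensor's bytes through a bounded buffer rather than materializing the full model.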
CRD716
ec29272175
readme : remove python 3.10 warning (#929) 2023-04-13 16:59:53 +03:00
Genkagaku.GPT
7e941b95eb
readme : llama node binding (#911)
* chore: add nodejs binding
2023-04-13 16:54:27 +03:00
Judd
4579af95e8
zig : update build.zig (#872)
* update

* update readme

* minimize the changes.

---------

Co-authored-by: zjli2019 <zhengji.li@ingchips.com>
2023-04-13 16:43:22 +03:00
Georgi Gerganov
f76cb3a34d
readme : change "GPU support" link to discussion 2023-04-12 14:48:57 +03:00
Georgi Gerganov
782438070f
readme : update hot topics with link to "GPU support" issue 2023-04-12 14:31:12 +03:00
Nicolai Weitkemper
4dbbd40750
readme: link to sha256sums file (#902)
This is to emphasize that these do not need to be obtained from elsewhere.
2023-04-12 08:46:20 +02:00
Pavol Rusnak
8b679987cd
Fix whitespace, add .editorconfig, add GitHub workflow (#883) 2023-04-11 19:45:44 +00:00
Concedo
ca69e05d1f update readme and fixed typos 2023-04-11 23:53:21 +08:00
qouoq
a0caa34b16
Add BAIR's Koala to supported models (#877) 2023-04-10 22:41:53 +02:00
ariez-xyz
b48255db19
add more precise instructions for arch 2023-04-08 10:41:57 +02:00
Concedo
1369b46bb7 notice about false positives 2023-04-08 12:20:48 +08:00
Pavol Rusnak
d2beca95dc
Make docker instructions more explicit (#785) 2023-04-06 08:56:58 +02:00
Georgi Gerganov
3416298929
Update README.md 2023-04-05 19:54:30 +03:00
Georgi Gerganov
8d10406d6e
readme : change logo + add bindings + add uis + add wiki 2023-04-05 18:56:20 +03:00
Adithya Balaji
594cc95fab
readme : update with CMake and windows example (#748)
* README: Update with CMake and windows example

* README: update with code-review for cmake build
2023-04-05 17:36:12 +03:00
Concedo
eb5b22dda2 rebrand to koboldcpp 2023-04-03 10:35:18 +08:00
Thatcher Chamberlin
d8d4e865cd
Add a missing step to the gpt4all instructions (#690)
`migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.
2023-04-02 12:48:57 +02:00
Concedo
bb965cc120 Merge branch 'master' into concedo
# Conflicts:
#	README.md
2023-04-02 17:13:28 +08:00
rimoliga
d0a7f742e7
readme: replace termux links with homepage, play store is deprecated (#680) 2023-04-01 16:57:30 +02:00
Concedo
801b178f2a still refactoring, but need a checkpoint to prepare build for 1.0.7 2023-04-01 08:55:14 +08:00
Concedo
559a1967f7 Backwards compatibility formats all done
Merge branch 'master' into concedo

# Conflicts:
#	CMakeLists.txt
#	README.md
#	llama.cpp
2023-03-31 19:01:33 +08:00
Concedo
9eab39fe6d prepare legacy functions (+1 squashed commits)
Squashed commits:

[8bc8d0d] prepare for big merge
2023-03-31 17:45:49 +08:00
Concedo
79f9743347 improved console info, fixed utf encoding bugs 2023-03-31 15:38:38 +08:00
Pavol Rusnak
9733104be5 drop quantize.py (now that models are using a single file) 2023-03-31 01:07:32 +02:00
Georgi Gerganov
3df890aef4
readme : update supported models 2023-03-30 22:31:54 +03:00
Concedo
d8febc8653 renamed main python script 2023-03-30 00:48:44 +08:00
Concedo
664b277c27 integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-BLAS if libopenblas is removed. 2023-03-30 00:43:52 +08:00
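The commit above says the build falls back to non-BLAS operation when libopenblas is absent. How koboldcpp actually detects this is not shown here; the sketch below is only one hedged way to probe for an optional shared library at runtime, using Python's `ctypes` for illustration (the real fallback lives in the C/C++ build, and the library name resolution is platform-dependent).

```python
# Hedged sketch (not koboldcpp's actual mechanism): try to locate and
# load libopenblas at runtime; if it is missing, fall back gracefully.
import ctypes
import ctypes.util

def load_openblas():
    """Return a handle to libopenblas if present, else None."""
    name = ctypes.util.find_library("openblas")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None  # found a name but could not load it

blas = load_openblas()
use_blas = blas is not None
print("BLAS accelerated" if use_blas else "falling back to non-BLAS")
```

Either branch leaves the program functional, which matches the commit's intent: removing the library disables the speedup without breaking prompt processing.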
Georgi Gerganov
b467702b87
readme : fix typos 2023-03-29 19:38:31 +03:00
Georgi Gerganov
516d88e75c
readme : add GPT4All instructions (close #588) 2023-03-29 19:37:20 +03:00
Stephan Walter
b391579db9
Update README and comments for standalone perplexity tool (#525) 2023-03-26 16:14:01 +03:00
Georgi Gerganov
348d6926ee
Add logo to README.md 2023-03-26 10:20:49 +03:00
Georgi Gerganov
55ad42af84
Move chat scripts into "./examples" 2023-03-25 20:37:09 +02:00
Georgi Gerganov
4a7129acd2
Remove obsolete information from README 2023-03-25 16:30:32 +02:00
Gary Mulder
f4f5362edb
Update README.md (#444)
Added explicit **bolded** instructions clarifying that people need to request access to models from Facebook and never through this repo.
2023-03-24 15:23:09 +00:00
LostRuins
1c78ffb964
Update README.md 2023-03-24 22:45:54 +08:00
Georgi Gerganov
b6b268d441
Add link to Roadmap discussion 2023-03-24 09:13:35 +02:00