Commit graph

519 commits

Author SHA1 Message Date
Gustavo Rocha Dias
3f21bd81f3
doc - Better explanation of how to build the libraries on Windows. (#107) 2023-04-23 13:40:09 +08:00
Concedo
c454f8b848 Gpt NeoX / Pythia integration completed 2023-04-22 11:23:25 +08:00
Concedo
07bb31b034 wip, don't use 2023-04-21 00:35:54 +08:00
Concedo
93761e7baf slightly clarified the library replacement steps - replacing the DLL is necessary in addition to replacing the import libraries 2023-04-20 12:23:54 +08:00
Gustavo Rocha Dias
5ca2d774cc
doc - explanation of how to use a custom version of the Windows libraries in the lib folder. (#92)
the dynamic libraries also need to be updated if you replace the import libraries
2023-04-20 12:20:11 +08:00
CRD716
834695fe3a
Minor: Readme fixed grammar, spelling, and misc updates (#1071) 2023-04-19 19:52:14 +00:00
Georgi Gerganov
7cd5c4a3e9
readme : add warning about Q4_2 and Q4_3 2023-04-19 19:07:54 +03:00
Georgi Gerganov
7faa7460f0
readme : update hot topics about new LoRA functionality 2023-04-18 20:10:26 +03:00
Concedo
f39def81d4 Update readme with more info 2023-04-18 21:44:26 +08:00
Concedo
3614956bc7 update readme 2023-04-18 21:39:05 +08:00
Gustavo Rocha Dias
ed5b5c45a9
doc - enhanced readme explaining how to compile on Windows. (#80) 2023-04-18 17:40:04 +08:00
Atsushi Tatsuma
e9298af389
readme : add Ruby bindings (#1029) 2023-04-17 22:34:35 +03:00
AlpinDale
624dc8809e
Added openblas and clblas package names for Debian (#63) 2023-04-15 01:08:56 +08:00
comex
723dac55fa
py : new conversion script (#545)
Current status: Working, except for the latest GPTQ-for-LLaMa format
  that includes `g_idx`.  This turns out to require changes to GGML, so
  for now it only works if you use the `--outtype` option to dequantize it
  back to f16 (which is pointless except for debugging).

  I also included some cleanup for the C++ code.

  This script is meant to replace all the existing conversion scripts
  (including the ones that convert from older GGML formats), while also
  adding support for some new formats.  Specifically, I've tested with:

  - [x] `LLaMA` (original)
  - [x] `llama-65b-4bit`
  - [x] `alpaca-native`
  - [x] `alpaca-native-4bit`
  - [x] LLaMA converted to 'transformers' format using
        `convert_llama_weights_to_hf.py`
  - [x] `alpaca-native` quantized with `--true-sequential --act-order
        --groupsize 128` (dequantized only)
  - [x] same as above plus `--save_safetensors`
  - [x] GPT4All
  - [x] stock unversioned ggml
  - [x] ggmh

  There's enough overlap in the logic needed to handle these different
  cases that it seemed best to move to a single script.

  I haven't tried this with Alpaca-LoRA because I don't know where to find
  it.

  Useful features:

  - Uses multiple threads for a speedup in some cases (though the Python
    GIL limits the gain, and sometimes it's disk-bound anyway).

  - Combines split models into a single file (both the intra-tensor split
    of the original and the inter-tensor split of 'transformers' format
    files).  Single files are more convenient to work with and more
    friendly to future changes to use memory mapping on the C++ side.  To
    accomplish this without increasing memory requirements, it has some
    custom loading code which avoids loading whole input files into memory
    at once.

  - Because of the custom loading code, it no longer depends on PyTorch,
    which might make installing dependencies slightly easier or faster...
    although it still depends on NumPy and sentencepiece, so I don't know
    if there's any meaningful difference.  In any case, I also added a
    requirements.txt file to lock the dependency versions in case of any
    future breaking changes.

  - Type annotations checked with mypy.

  - Some attempts to be extra user-friendly:

      - The script tries to be forgiving with arguments, e.g. you can
        specify either the model file itself or the directory containing
        it.

      - The script doesn't depend on config.json / params.json, just in
        case the user downloaded files individually and doesn't have those
        handy.  But you still need tokenizer.model and, for Alpaca,
        added_tokens.json.

      - The script tries to give a helpful error message if
        added_tokens.json is missing.
2023-04-14 10:03:03 +03:00
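The commit above describes custom loading code that combines split model files into one output without ever holding a whole input file in memory. The actual script's file format is not shown here, so this is only a minimal sketch of that streaming-merge idea, using `np.memmap` (NumPy is one of the script's stated dependencies); the file names, row-split layout, and chunk size are illustrative assumptions, not the script's real conventions.

```python
# Hedged sketch (not the real conversion script): merge a row-split
# tensor stored across several files into one output file, streaming
# fixed-size chunks so no input is fully loaded into memory.
import numpy as np

def merge_split_tensor(part_paths, out_path, dtype=np.float32):
    """Concatenate split tensor parts into one file, chunk by chunk."""
    with open(out_path, "wb") as out:
        for path in part_paths:
            # np.memmap maps the file lazily; only the pages touched by
            # the slice being copied are resident at any moment.
            part = np.memmap(path, dtype=dtype, mode="r")
            chunk = 1 << 20  # copy ~1M elements at a time (assumption)
            for start in range(0, part.shape[0], chunk):
                out.write(part[start:start + chunk].tobytes())

# Demo: write two halves of a tensor, merge them, and read the result.
a = np.arange(0, 8, dtype=np.float32)
b = np.arange(8, 16, dtype=np.float32)
a.tofile("part0.bin")
b.tofile("part1.bin")
merge_split_tensor(["part0.bin", "part1.bin"], "merged.bin")
merged = np.fromfile("merged.bin", dtype=np.float32)
print(merged.tolist())
```

The same pattern extends to the inter-tensor split of 'transformers'-format checkpoints: iterate over shards, copying each tensor's bytes through a bounded buffer rather than materializing the full model.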
CRD716
ec29272175
readme : remove python 3.10 warning (#929) 2023-04-13 16:59:53 +03:00
Genkagaku.GPT
7e941b95eb
readme : llama node binding (#911)
* chore: add nodejs binding
2023-04-13 16:54:27 +03:00
Judd
4579af95e8
zig : update build.zig (#872)
* update

* update readme

* minimize the changes.

---------

Co-authored-by: zjli2019 <zhengji.li@ingchips.com>
2023-04-13 16:43:22 +03:00
Georgi Gerganov
f76cb3a34d
readme : change "GPU support" link to discussion 2023-04-12 14:48:57 +03:00
Georgi Gerganov
782438070f
readme : update hot topics with link to "GPU support" issue 2023-04-12 14:31:12 +03:00
Nicolai Weitkemper
4dbbd40750
readme: link to sha256sums file (#902)
This is to emphasize that these do not need to be obtained from elsewhere.
2023-04-12 08:46:20 +02:00
Pavol Rusnak
8b679987cd
Fix whitespace, add .editorconfig, add GitHub workflow (#883) 2023-04-11 19:45:44 +00:00
Concedo
ca69e05d1f update readme and fixed typos 2023-04-11 23:53:21 +08:00
qouoq
a0caa34b16
Add BAIR's Koala to supported models (#877) 2023-04-10 22:41:53 +02:00
ariez-xyz
b48255db19
add more precise instructions for arch 2023-04-08 10:41:57 +02:00
Concedo
1369b46bb7 notice about false positives 2023-04-08 12:20:48 +08:00
Pavol Rusnak
d2beca95dc
Make docker instructions more explicit (#785) 2023-04-06 08:56:58 +02:00
Georgi Gerganov
3416298929
Update README.md 2023-04-05 19:54:30 +03:00
Georgi Gerganov
8d10406d6e
readme : change logo + add bindings + add uis + add wiki 2023-04-05 18:56:20 +03:00
Adithya Balaji
594cc95fab
readme : update with CMake and windows example (#748)
* README: Update with CMake and windows example

* README: update with code-review for cmake build
2023-04-05 17:36:12 +03:00
Concedo
eb5b22dda2 rebrand to koboldcpp 2023-04-03 10:35:18 +08:00
Thatcher Chamberlin
d8d4e865cd
Add a missing step to the gpt4all instructions (#690)
`migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.
2023-04-02 12:48:57 +02:00
Concedo
bb965cc120 Merge branch 'master' into concedo
# Conflicts:
#	README.md
2023-04-02 17:13:28 +08:00
rimoliga
d0a7f742e7
readme: replace termux links with homepage, play store is deprecated (#680) 2023-04-01 16:57:30 +02:00
Concedo
801b178f2a still refactoring, but need a checkpoint to prepare build for 1.0.7 2023-04-01 08:55:14 +08:00
Concedo
559a1967f7 Backwards compatibility formats all done
Merge branch 'master' into concedo

# Conflicts:
#	CMakeLists.txt
#	README.md
#	llama.cpp
2023-03-31 19:01:33 +08:00
Concedo
9eab39fe6d prepare legacy functions (+1 squashed commits)
Squashed commits:

[8bc8d0d] prepare for big merge
2023-03-31 17:45:49 +08:00
Concedo
79f9743347 improved console info, fixed utf encoding bugs 2023-03-31 15:38:38 +08:00
Pavol Rusnak
9733104be5 drop quantize.py (now that models are using a single file) 2023-03-31 01:07:32 +02:00
Georgi Gerganov
3df890aef4
readme : update supported models 2023-03-30 22:31:54 +03:00
Concedo
d8febc8653 renamed main python script 2023-03-30 00:48:44 +08:00
Concedo
664b277c27 integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-BLAS if libopenblas is removed. 2023-03-30 00:43:52 +08:00
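The commit above says the build falls back to non-BLAS operation when libopenblas is absent. How koboldcpp actually detects this is not shown here; the sketch below is only one hedged way to probe for an optional shared library at runtime, using Python's `ctypes` for illustration (the real fallback lives in the C/C++ build, and the library name resolution is platform-dependent).

```python
# Hedged sketch (not koboldcpp's actual mechanism): try to locate and
# load libopenblas at runtime; if it is missing, fall back gracefully.
import ctypes
import ctypes.util

def load_openblas():
    """Return a handle to libopenblas if present, else None."""
    name = ctypes.util.find_library("openblas")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None  # found a name but could not load it

blas = load_openblas()
use_blas = blas is not None
print("BLAS accelerated" if use_blas else "falling back to non-BLAS")
```

Either branch leaves the program functional, which matches the commit's intent: removing the library disables the speedup without breaking prompt processing.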
Georgi Gerganov
b467702b87
readme : fix typos 2023-03-29 19:38:31 +03:00
Georgi Gerganov
516d88e75c
readme : add GPT4All instructions (close #588) 2023-03-29 19:37:20 +03:00
Stephan Walter
b391579db9
Update README and comments for standalone perplexity tool (#525) 2023-03-26 16:14:01 +03:00
Georgi Gerganov
348d6926ee
Add logo to README.md 2023-03-26 10:20:49 +03:00
Georgi Gerganov
55ad42af84
Move chat scripts into "./examples" 2023-03-25 20:37:09 +02:00
Georgi Gerganov
4a7129acd2
Remove obsolete information from README 2023-03-25 16:30:32 +02:00
Gary Mulder
f4f5362edb
Update README.md (#444)
Added explicit **bolded** instructions clarifying that people need to request access to models from Facebook and never through this repo.
2023-03-24 15:23:09 +00:00
LostRuins
1c78ffb964
Update README.md 2023-03-24 22:45:54 +08:00
Georgi Gerganov
b6b268d441
Add link to Roadmap discussion 2023-03-24 09:13:35 +02:00