Concedo
1301bd7e29
Fix to skip GPU offloading so falcon models work correctly
2023-08-30 18:26:41 +08:00
Concedo
b95a4ccb22
added a token counting endpoint, set mmq as default
2023-08-24 20:41:49 +08:00
Concedo
280abaf029
added stop reason in the perf endpoint
2023-07-24 11:55:35 +08:00
Concedo
39dc1a46c4
added token count, updated lite
2023-07-20 14:41:06 +08:00
Concedo
1d1111e10f
expose timing info in web api
2023-07-11 18:56:06 +08:00
callMeMakerRen
4e46673f80
Merge branch 'LostRuins:concedo' into concedo
2023-07-08 09:33:26 +08:00
shutup
1727e652f1
expose some useful info that can be used for performance statistics
2023-07-07 11:52:58 +08:00
Concedo
27a0907cfa
backport MM256_SET_M128I to ggml_v2, updated lite, added support for selecting the GPU for cublas
2023-07-06 22:33:46 +08:00
Concedo
66a3f4e421
added support for lora base
2023-06-10 19:29:45 +08:00
Concedo
43f7e40470
added extra endpoints for abort gen and polled streaming
2023-06-10 18:13:26 +08:00
SammCheese
e6231c3055
back to http.server, improved implementation
2023-06-09 12:17:55 +02:00
SammCheese
9a8da35ec4
working streaming. TODO: fix lite
2023-06-08 18:34:23 +02:00
SammCheese
97971291e9
draft: token streaming
2023-06-08 18:34:08 +02:00
Concedo
a6a0fa338a
cleanup indentation, fixing cublas build
2023-06-08 22:40:53 +08:00
Concedo
6f82e17b7a
added MPT support
2023-06-03 16:14:08 +08:00
Concedo
5d9f5b28a6
rwkv integration completed
2023-05-28 00:48:56 +08:00
Concedo
981d5ba866
Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental
...
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# Makefile
# README.md
# ggml-opencl.cpp
# llama.cpp
# otherarch/ggml_v2-opencl-legacy.c
2023-05-22 16:16:48 +08:00
Concedo
75e4548821
missed out gpt2
2023-05-21 01:44:47 +08:00
Concedo
c048bcfec4
remove old filever checks (+7 squashed commit)
...
Squashed commit:
[b72627a] new format not working
[e568870] old ver works
[7053b77] compile errors fixed, fixing linkers
[4ae8889] add new ver
[ff82dfd] file format checks
[25b8aa8] refactoring type names
[931063b] still merging
2023-05-21 00:15:39 +08:00
Concedo
b692e4d2a4
wip
2023-05-14 17:21:07 +08:00
Concedo
2f2eff6e13
the dark gods have been sated, and redpajama is integrated... but at what cost?
2023-05-08 20:58:00 +08:00
Concedo
ff93b394da
fixed a typo
2023-05-06 12:37:34 +08:00
Concedo
2edbcebe27
added optional force versioning flag
2023-05-05 22:02:00 +08:00
Concedo
0fc1772a8f
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# README.md
# ggml.c
2023-04-29 11:14:05 +08:00
Concedo
5eec5d6ed9
Added backwards compatibility to an earlier version of NeoX.
2023-04-25 20:34:18 +08:00
Concedo
6e908c1792
added lora support
2023-04-22 12:29:38 +08:00
Concedo
ef13443047
wip pythia integration
2023-04-22 01:08:23 +08:00
Concedo
5160053e51
merged llama adapter into the rest of the gpt adapters
2023-04-21 17:47:48 +08:00
Concedo
c200b674f4
updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter
2023-04-18 17:36:44 +08:00
Concedo
763ad172c0
arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation
2023-04-17 17:31:45 +08:00
0cc4m
8fbfc80e03
Fix clblast device selection on Linux
2023-04-15 12:02:36 +02:00
Concedo
1bd5992da4
clean and refactor handling of flags
2023-04-12 23:25:31 +08:00
rabidcopy
2444a99db5
Fix make compile error in expose.cpp(?) (#44)
...
* fix compile error?
* Update expose.cpp
2023-04-12 16:19:38 +08:00
Concedo
ca69e05d1f
update readme and fixed typos
2023-04-11 23:53:21 +08:00
Concedo
23c675b2e6
integrated optional (experimental) CLBlast support
2023-04-11 23:33:44 +08:00
Concedo
d8e37bfe75
new gpt2 format supported
2023-04-08 17:35:36 +08:00
Concedo
14273fea7a
integrated gpt2 support
2023-04-04 23:15:47 +08:00
Concedo
8dd8ab1659
Various enhancements and integration of pygmalion.cpp
2023-04-03 00:04:43 +08:00
Concedo
9aabb0d9db
massive refactor completed, GPT-J integrated
2023-04-02 17:03:30 +08:00
Concedo
085a9f90a7
still refactoring
2023-04-01 11:56:34 +08:00
Concedo
9ab6e87b58
Merge branch 'master' into concedo
...
# Conflicts:
# CMakeLists.txt
2023-04-01 09:05:45 +08:00
Concedo
801b178f2a
still refactoring, but need a checkpoint to prepare build for 1.0.7
2023-04-01 08:55:14 +08:00
Concedo
6b86f5ea22
halfway refactoring, wip adding other model types
2023-04-01 01:13:05 +08:00
Concedo
559a1967f7
Backwards compatibility formats all done
...
Merge branch 'master' into concedo
# Conflicts:
# CMakeLists.txt
# README.md
# llama.cpp
2023-03-31 19:01:33 +08:00
Concedo
79f9743347
improved console info, fixed utf encoding bugs
2023-03-31 15:38:38 +08:00
Concedo
664b277c27
integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-blas if libopenblas is removed.
2023-03-30 00:43:52 +08:00
Concedo
57474944d6
Merge branch 'master' into concedo
...
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# Makefile
# README.md
2023-03-26 14:52:08 +08:00
Concedo
3c78124aac
Merge branch 'master' into concedo
...
# Conflicts:
# README.md
2023-03-25 11:20:04 +08:00
Concedo
119392f6f2
defaulting to f32 kv and 4 threads, which seems to produce better results
2023-03-25 11:11:40 +08:00
Concedo
c6c60332a4
Optimizations
2023-03-24 21:33:53 +08:00