data | Allow ArrowIterator to read from json (#45) | 2025-02-06 09:57:22 -08:00 |
model | black | 2025-02-14 19:20:17 +00:00 |
plotting | Add plotting code from paper (#17) | 2025-01-09 12:11:50 -08:00 |
preprocess | Allow ArrowIterator to read from json (#45) | 2025-02-06 09:57:22 -08:00 |
tokenizers | Initial commit | 2024-12-12 15:32:30 -08:00 |
.DS_Store | Initial commit | 2024-12-12 15:32:30 -08:00 |
__init__.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
args.py | Allow ArrowIterator to read from json (#45) | 2025-02-06 09:57:22 -08:00 |
base_transformer.py | black | 2025-02-14 19:20:17 +00:00 |
checkpoint.py | Update checkpointing to use fsspec (#39) | 2025-02-06 09:41:58 -08:00 |
constants.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
float8.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
logger.py | Update checkpointing to use fsspec (#39) | 2025-02-06 09:41:58 -08:00 |
metrics.py | Add bpb and n_bytes to metric logging (#41) | 2025-02-07 13:14:30 -08:00 |
norms.py | Fix distributed all reduce grad norm (#40) | 2025-02-04 16:53:50 -08:00 |
optim.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
probe.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
profiling.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
stool.py | Allow ArrowIterator to read from json (#45) | 2025-02-06 09:57:22 -08:00 |
train.py | make sure max_encoder_seq_length matches (#55) | 2025-02-12 18:27:22 -08:00 |
transformer.py | missed a print | 2025-02-14 19:21:28 +00:00 |