Commit graph

4 commits

Author SHA1 Message Date
Pedro Rodriguez c6ef4285e2 Several changes to enable entropy model training/eval
Summary:

- Make arrow iterator able to read from jsonl files, the entropies are omitted in this case
- Make the data/checkpoint code fsspec compatible
- Fix issues with all reduce with non-bf16 in dist_sum and norm computation.
- Minimal fixes to get eval to run, it is slow currently
- Add bpb numbers during training


Test Plan:
2025-02-04 18:03:19 +00:00
Pedro Rodriguez 6ffeb66b53
Changes for training entropy model and correcting attention in local models (#25)
Some checks failed
Lint with Black / lint (push) Has been cancelled
Lint with isort / lint (push) Has been cancelled
Summary:

- Refactor local model configs to be separate and clearer
- Add attention arguments and correct which attention is used in local models
- Preparation for being able to have an entropy train script
- Fix failing unit tests

Test Plan:
2025-01-17 14:23:01 -08:00
Ink caec8d2621
allow flex-attention to be disabled (#19)
Some checks failed
Lint with Black / lint (push) Has been cancelled
Lint with isort / lint (push) Has been cancelled
* allow flex-attention to silently fail

* allow flex-attn to be disabled via an env var
2025-01-14 09:32:07 -08:00
Pedro Rodriguez bcc039bb75 Initial commit 2024-12-12 15:32:30 -08:00