blt/bytelatent/data/iterators
Pedro Rodriguez bc39591032 Several changes to enable entropy model training/eval (2025-02-04 18:19:49 +00:00)
Summary:

- Make the arrow iterator able to read from jsonl files; entropies are omitted in this case (see the jsonl sketch after this list)
- Make the data/checkpoint code fsspec-compatible
- Fix issues with all-reduce on non-bf16 tensors in dist_sum and norm computation
- Minimal fixes to get eval to run; it is currently slow
- Report bits-per-byte (bpb) numbers during training (see the bpb sketch after this list)
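
For the jsonl path, here is a minimal sketch of how an iterator can read records through fsspec so the same code handles local files and remote URLs such as s3://; `iter_jsonl` is an illustrative helper, not the repo's actual API:

```python
import json

import fsspec


def iter_jsonl(path: str):
    """Yield one JSON record per line; fsspec resolves local paths and URLs like s3://."""
    # Hypothetical helper: the real arrow_iterator.py may structure this differently.
    with fsspec.open(path, "rt") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Since entropies are not stored in jsonl inputs, an iterator built this way would simply leave them unset.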
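For the bpb numbers, here is a minimal sketch of the standard conversion from cross-entropy loss (in nats) to bits-per-byte; `bits_per_byte` is a hypothetical helper and not necessarily how the training loop reports it:

```python
import math

import torch
import torch.nn.functional as F


def bits_per_byte(logits: torch.Tensor, targets: torch.Tensor, n_bytes: int) -> float:
    """Convert summed cross-entropy (nats) over a batch into bits-per-byte."""
    # Total negative log-likelihood in nats over all target positions.
    nll_nats = F.cross_entropy(
        logits.flatten(0, 1), targets.flatten(), reduction="sum"
    )
    # nats -> bits, normalized by the number of raw bytes in the batch.
    return nll_nats.item() / (math.log(2) * n_bytes)
```

For a byte-level entropy model each target token is one byte, so `n_bytes` equals the number of target positions and bpb reduces to the mean NLL divided by ln 2.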


Test Plan:

Run

```
torchrun --nproc-per-node 8 -m bytelatent.train config=internal/configs/entropy_model.yaml eval=null max_steps=10100
```

```
python -m bytelatent.train config=internal/configs/s3_debug.yaml eval=null
```

```
torchrun --nproc-per-node 8 -m bytelatent.train config=internal/configs/s3_debug.yaml eval=null
```
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
| abstract_iterator.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
| arrow_iterator.py | Several changes to enable entropy model training/eval | 2025-02-04 18:19:49 +00:00 |
| looping_iterator.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
| multiprocess_iterator.py | This includes fixes that make checkpointing and reloading work correctly. (#35) | 2025-01-27 16:56:42 -08:00 |
| packing_iterator.py | Initial codes and scripts for training entropy model (#34) | 2025-01-27 09:46:44 -08:00 |
| preprocess_iterator.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
| sampling_iterator.py | Initial commit | 2024-12-12 15:32:30 -08:00 |
| sequence_iterator.py | Initial codes and scripts for training entropy model (#34) | 2025-01-27 09:46:44 -08:00 |
| test_arrow_iterator.py | Changes for training entropy model and correcting attention in local models (#25) | 2025-01-17 14:23:01 -08:00 |
| test_iters.py | Initial commit | 2024-12-12 15:32:30 -08:00 |