blt/bytelatent/data/iterators
Pedro Rodriguez 2655e4cf82 Remove byte tokenizer and add config args to switch between byte/patch packing
Summary:

Test Plan:

```
python -m bytelatent.train config=../internal-blt/configs/entropy_model.yaml logging.wandb=null checkpoint.dump.every=1000 checkpoint.eval.every=100000 eval=null

pytest bytelatent/
```
2025-02-22 01:13:13 +00:00

| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | Initial commit | 2024-12-12 15:32:30 -08:00 |
| `abstract_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `arrow_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `dev_iterators.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `limit_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `looping_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `multiprocess_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `packing_iterator.py` | Remove byte tokenizer and add config args to switch between byte/patch packing | 2025-02-22 01:13:13 +00:00 |
| `preprocess_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `sampling_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `sequence_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `test_arrow_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `test_iters.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |
| `test_limit_iterator.py` | Update iterator inheritance, pass file format args, limit iterator (#63) | 2025-02-21 16:21:07 -08:00 |