Commit graph

4 commits

Author SHA1 Message Date
Pedro Rodriguez
9bd51df961
Fix rsync to not preserve original permissions, instead use destination (#76)
Summary:

Test Plan:
2025-03-05 11:49:41 -08:00
Pedro Rodriguez
936d9437be
Allow ArrowIterator to read from json (#45)
Summary:

Currently, ArrowIterator can only read Arrow files. However, the pyarrow library can read
other formats, including jsonlines. This change allows the same ArrowIterator to read from jsonlines,
so we can read from the original source data and simply omit the entropy column when doing so.

Test Plan:

Run train script until dataloader starts
2025-02-06 09:57:22 -08:00
Srinivasan Iyer
6fbaf7266f
fix stool (#44)
Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-05 17:18:40 -08:00
Pedro Rodriguez
bcc039bb75
Initial commit
2024-12-12 15:32:30 -08:00