Summary:
Make config sources compose in a well-defined order: base config, then additional configs, then CLI args.
Test Plan:
Test that this interpolates in the right order, config -> configs -> cli args
```
# All three sources
python -m bytelatent.print_config config=bytelatent/configs/debug.yaml configs=[internal/configs/s3_debug.yaml] eval=null
# What worked before
python -m bytelatent.print_config config=internal/configs/s3_debug.yaml eval=null
```
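A minimal sketch of the expected precedence, assuming OmegaConf-style merging where later sources win (the actual loader in bytelatent may differ in details; file names are the ones from the commands above):
```
from omegaconf import OmegaConf

# Lowest to highest precedence: config -> configs -> cli args
base_cfg = OmegaConf.load("bytelatent/configs/debug.yaml")    # config=
extra_cfg = OmegaConf.load("internal/configs/s3_debug.yaml")  # configs=[...]
cli_cfg = OmegaConf.from_dotlist(["eval=null"])               # CLI overrides

merged = OmegaConf.merge(base_cfg, extra_cfg, cli_cfg)
print(OmegaConf.to_yaml(merged, resolve=True))
```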
Summary:
Currently, the arrow iterator can only read Arrow files. However, the pyarrow library can read other formats, including jsonlines. This change allows the same ArrowIterator to read from jsonlines, so we can read from the original source data and simply omit the entropy column when doing so.
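As a rough illustration (this is not the actual ArrowIterator code; the format switch and helper are hypothetical), pyarrow yields the same Table type from either format:
```
import pyarrow as pa
import pyarrow.ipc
import pyarrow.json

def read_table(path: str, file_format: str) -> pa.Table:
    if file_format == "arrow":
        # Arrow IPC file; memory-map so large files are not read eagerly
        with pa.memory_map(path, "r") as source:
            return pa.ipc.open_file(source).read_all()
    elif file_format == "json":
        # pyarrow.json handles newline-delimited JSON (jsonlines);
        # the resulting table simply lacks the entropy column
        return pa.json.read_json(path)
    raise ValueError(f"Unknown file format: {file_format}")
```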
Test Plan:
Run train script until dataloader starts
Summary:
- Make the data/checkpoint code fsspec compatible (see the sketch after this list)
- Saves to S3 still will not work, because `torch.distributed.checkpoint.save` does not support `fsspec` out of the box. Will implement in a follow-up PR
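A rough sketch of what fsspec-compatible path handling looks like (assumes s3fs is installed; the helper name is illustrative, not the actual code):
```
import fsspec

def get_fs(path: str) -> fsspec.AbstractFileSystem:
    # fsspec dispatches on the URL scheme: plain paths resolve to the
    # local filesystem, s3:// paths to s3fs, etc.
    fs, _ = fsspec.core.url_to_fs(path)
    return fs

# The same code path then works for local and S3 dump dirs
fs = get_fs("s3://blt/scratch/checkpoint-test/")
fs.makedirs("s3://blt/scratch/checkpoint-test/checkpoints", exist_ok=True)
```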
Test Plan:
Run unit tests and the commands below
```
python -m bytelatent.train config=internal/configs/s3_debug.yaml eval=null checkpoint.dump.every=100
```
```
torchrun --nproc-per-node 8 -m bytelatent.train config=internal/configs/s3_debug.yaml eval=null checkpoint.dump.every=100
```
These currently won't work due to the torch distributed save, but they should be tested at a later date:
```
python -m bytelatent.train config=internal/configs/s3_debug.yaml eval=null checkpoint.dump.every=100 dump_dir=s3://blt/scratch/checkpoint-test/
```
```
torchrun --nproc-per-node 8 -m bytelatent.train config=internal/configs/s3_debug.yaml eval=null checkpoint.dump.every=100 dump_dir=s3://blt/scratch/checkpoint-test/
```
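One possible shape for the follow-up, as a hedged sketch only: since `torch.distributed.checkpoint.save` writes to local paths, save locally and then mirror the checkpoint directory to S3 with fsspec. This assumes torch >= 2.2's `checkpoint_id` API and s3fs; `save_checkpoint` and its directory arguments are hypothetical.
```
import fsspec
import torch.distributed.checkpoint as dcp

def save_checkpoint(state_dict: dict, local_dir: str, dump_dir: str) -> None:
    # Write the distributed checkpoint to local disk first
    # (called collectively by all ranks)
    dcp.save(state_dict, checkpoint_id=local_dir)
    # Then mirror it to S3, where dcp cannot write directly yet;
    # in a real setup, upload from a single rank after a barrier
    if dump_dir.startswith("s3://"):
        fs = fsspec.filesystem("s3")
        fs.put(local_dir, dump_dir, recursive=True)
```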