mirror of https://github.com/facebookresearch/blt.git
synced 2025-01-18 16:37:46 +00:00
parent 898671b66b
commit 9065bb1cce
@@ -53,7 +53,7 @@ This command will download the `fineweb_edu` and prepare it for training in the
 python setup/download_prepare_hf_data.py fineweb_edu <MEMORY> --data_dir ./data --seed 42 --nchunks <NCHUNKS>
 ```
 
-to download tokenizer (here llama3), use the folowing script:
+to download tokenizer (here llama3), use the following script:
 
 ```bash
 python setup/download_tokenizer.py llama3 <SAVE_PATH> --api_key <HUGGINGFACE_TOKEN>