Default branch

1b67cbe022 · Improve HF integration () · Updated 2025-04-18 21:27:53 +00:00

Branches

f74aa7bd1a · Correctly reset batch iterator at each arrow create_iter call. · Updated 2025-03-03 23:32:30 +00:00 · 20 behind, 1 ahead

2cae41fe1f · Get evals working again. · Updated 2025-02-28 00:41:01 +00:00 · 20 behind, 1 ahead

e668ac0280 · Initialize rope embeddings properly for the entropy model · Updated 2025-02-25 20:38:52 +00:00 · 22 behind, 1 ahead

c7b40706f0 · Pass mask in packing_iterator, correctly handle last batch, fix masking · Updated 2025-02-25 19:11:23 +00:00 · 22 behind, 1 ahead

edccc0873d · Remove byte tokenizer and add config args to switch between byte/patch packing · Updated 2025-02-24 23:56:43 +00:00 · 23 behind, 1 ahead

4c6ee1aef0 · Add vocab and seq len abstract fields · Updated 2025-02-24 22:41:01 +00:00 · 25 behind, 1 ahead

0ffe2ab685 · Update iterator inheritance, pass file format args, limit iterator · Updated 2025-02-20 00:57:17 +00:00 · 26 behind, 1 ahead

2f247263b9 · Make apex logs less noisy · Updated 2025-02-18 18:43:06 +00:00 · 27 behind, 1 ahead

3117ac1f1f · Make it possible to specify multiple config files · Updated 2025-02-18 18:41:02 +00:00 · 28 behind, 1 ahead

1a14267b30 · Update README.md with arxiv citation · Updated 2025-02-15 19:50:41 +00:00 · 29 behind, 1 ahead

89deebc8f3 · missed a print · Updated 2025-02-14 19:21:28 +00:00 · 33 behind, 4 ahead

67b6c3b3da · Update README.md · Updated 2025-02-13 19:57:49 +00:00 · 32 behind, 1 ahead

53529dcc78 · Fix multiprocessing dataloader checkpointing and use it in the train script · Updated 2025-02-13 19:01:49 +00:00 · 32 behind, 1 ahead

ab8f8a4412 · Test first batch matches · Updated 2025-02-13 18:04:30 +00:00 · 33 behind, 1 ahead

67624845d0 · disable reshard after forward · Updated 2025-02-13 02:33:20 +00:00 · 34 behind, 1 ahead

00c7a6f194 · black and assert comment · Updated 2025-02-13 02:26:46 +00:00 · 35 behind, 2 ahead

3075d7bf83 · fix save and reload model state · Updated 2025-02-07 21:46:34 +00:00 · 36 behind, 1 ahead

8d7338308e · Add bpb and n_bytes to metric logging · Updated 2025-02-07 21:13:36 +00:00 · 38 behind, 1 ahead

ba922695b3 · comment + black · Updated 2025-02-06 22:14:20 +00:00 · 38 behind, 2 ahead

9c3c997cae · Allow ArrowIterator to read from json · Updated 2025-02-06 17:44:36 +00:00 · 39 behind, 1 ahead