Commit graph

  • 18ae0ba444 Merge 8d26140970 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-06 09:42:51 -0800
  • 8d26140970 Allow ArrowIterator to read from json Pedro Rodriguez 2025-02-06 17:42:32 +0000
  • afedb16598 Update checkpointing to use fsspec (#39) Pedro Rodriguez 2025-02-06 09:41:58 -0800
  • d44902da97 Merge f058373889 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-06 09:37:27 -0800
  • f058373889 Update checkpointing to use fsspec pr39 Pedro Rodriguez 2025-02-06 17:37:20 +0000
  • e13495c351 Merge 341264685a into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-06 09:34:50 -0800
  • 341264685a Update checkpointing to use fsspec Pedro Rodriguez 2025-02-06 17:33:38 +0000
  • 2d1c766050 Merge 45bfe94c1e into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-05 17:27:35 -0800
  • 45bfe94c1e Broken train reproducing bf16 error pr47 Pedro Rodriguez 2025-02-06 01:27:15 +0000
  • 739dc71a0a Add rope fp32 (#43) Srinivasan Iyer 2025-02-05 17:19:37 -0800
  • 6fbaf7266f fix stool (#44) Srinivasan Iyer 2025-02-05 17:18:40 -0800
  • 8212e9b6f2 fix stool stool_fix Srini Iyer 2025-02-06 00:55:25 +0000
  • b28ceb624d Add flag for rope outer in fp32 add_rope_fp32 Srini Iyer 2025-02-06 00:40:51 +0000
  • 162b99b4a3 Log model Srini Iyer 2025-02-06 00:26:37 +0000
  • 7cf8fab49b Fix wandb logging (#42) Srinivasan Iyer 2025-02-05 16:24:39 -0800
  • a27ab3de8e Fix wandb logging fix_wandb Srini Iyer 2025-02-06 00:07:59 +0000
  • 8f1a9a858e Minimal working eval Pedro Rodriguez 2025-02-05 22:47:01 +0000
  • 48cf4dfee1 Allow ArrowIterator to read from json Pedro Rodriguez 2025-02-05 22:47:00 +0000
  • 1377fcb010 Merge 2f42633b07 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-05 14:32:03 -0800
  • 2f42633b07 Add bpb and n_bytes to metric logging Pedro Rodriguez 2025-02-05 22:26:31 +0000
  • b2f2a6a76e merge commit for archive created by Sapling Pedro Rodriguez 2025-02-05 22:10:37 +0000
  • 1450464031 Update checkpointing to use fsspec Pedro Rodriguez 2025-02-05 19:09:13 +0000
  • c3d7f720f0 Merge b6e53f1d4c into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-04 16:55:37 -0800
  • b6e53f1d4c Update checkpointing to use fsspec Pedro Rodriguez 2025-02-05 00:55:18 +0000
  • c79b1fdbd0 Fix distributed all reduce grad norm (#40) Pedro Rodriguez 2025-02-04 16:53:50 -0800
  • e1cd15ec30 merge commit for archive created by Sapling Pedro Rodriguez 2025-02-05 00:53:00 +0000
  • ac257bac19 Fix distributed all reduce grad norm pr40 Pedro Rodriguez 2025-02-05 00:50:20 +0000
  • 2d68e5126d merge commit for archive created by Sapling Pedro Rodriguez 2025-02-05 00:51:52 +0000
  • 9cf7847e26 Fix distributed all reduce grad norm Pedro Rodriguez 2025-02-05 00:50:20 +0000
  • 8db01ac392 Merge 4ad4889405 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-04 16:30:42 -0800
  • 4ad4889405 Update checkpointing to use fsspec Pedro Rodriguez 2025-02-05 00:30:25 +0000
  • 97e3bc0427 Merge b2058fb0f6 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-04 16:30:13 -0800
  • b2058fb0f6 Update checkpointing to use fsspec Pedro Rodriguez 2025-02-05 00:27:00 +0000
  • 740d76cd69 Merge e742218d65 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-04 16:21:14 -0800
  • e742218d65 Update checkpointing to use fsspec Pedro Rodriguez 2025-02-05 00:20:57 +0000
  • 9c4cca558b Merge bc39591032 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-04 10:19:56 -0800
  • bc39591032 Several changes to enable entropy model training/eval Pedro Rodriguez 2025-02-04 18:19:41 +0000
  • f73b9e1a41 merge commit for archive created by Sapling Pedro Rodriguez 2025-02-04 18:05:21 +0000
  • ab399e981d Several changes to enable entropy model training/eval Pedro Rodriguez 2025-02-04 18:04:54 +0000
  • 48df9ce785 Merge c6ef4285e2 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-04 10:03:26 -0800
  • c6ef4285e2 Several changes to enable entropy model training/eval Pedro Rodriguez 2025-02-04 18:03:18 +0000
  • 4ff8341738 Merge 11cad6c84d into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-02-03 18:29:37 -0800
  • 11cad6c84d WIP parallel copy script pr38 Pedro Rodriguez 2025-01-28 00:57:06 +0000
  • 7044771a12 This includes fixes that make checkpointing and reloading work correctly. (#35) Pedro Rodriguez 2025-01-27 16:56:42 -0800
  • 4db801a532 Merge caf82b924e into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-27 16:54:52 -0800
  • caf82b924e This includes fixes that make checkpointing and reloading work correctly. pr35 Pedro Rodriguez 2025-01-28 00:38:16 +0000
  • c2f1e4845e Merge e02ba763b0 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-27 16:38:54 -0800
  • e02ba763b0 This includes fixes that make checkpointing and reloading work correctly. Pedro Rodriguez 2025-01-28 00:38:16 +0000
  • 7622d28b74 Initial codes and scripts for training entropy model (#34) Pedro Rodriguez 2025-01-27 09:46:44 -0800
  • b1c12dd275 Merge 34ca1f7d4b into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-24 13:59:47 -0800
  • 34ca1f7d4b Initial codes and scripts for training entropy model pr34 Pedro Rodriguez 2025-01-24 21:55:24 +0000
  • f1a2589266 Merge fb09022e5e into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-24 13:55:48 -0800
  • fb09022e5e Initial codes and scripts for training entropy model Pedro Rodriguez 2025-01-24 21:55:24 +0000
  • a809259e71 Use load_async flag to not start MP iterator (#33) Pedro Rodriguez 2025-01-24 10:57:20 -0800
  • 17b727465f Merge bd461af91a into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-24 10:56:34 -0800
  • bd461af91a Use load_async flag to not start MP iterator pr33 Pedro Rodriguez 2025-01-24 18:56:18 +0000
  • bc42cebd7d Update file check script to check sizes (#32) Pedro Rodriguez 2025-01-22 13:06:46 -0800
  • cbcc3e1868 merge commit for archive created by Sapling Pedro Rodriguez 2025-01-22 19:58:13 +0000
  • 8a3084c346 Update file check script to check sizes pr32 Pedro Rodriguez 2025-01-22 19:57:54 +0000
  • 392117bff2 Fix realtime entropy patching (#26) Ink 2025-01-21 18:34:23 -0600
  • b4f15da655 remove unused import Luciferian Ink 2025-01-21 17:07:24 -0600
  • feb780a7b1 revert patcher to original Luciferian Ink 2025-01-21 17:01:52 -0600
  • 6f8f711a49 ensure patcher is serializable Luciferian Ink 2025-01-21 12:50:27 -0600
  • e381743f6e fix an error with monotonicity and small sequence lengths Luciferian Ink 2025-01-19 14:49:05 -0600
  • 1ed34abbbb remove legacy arg Luciferian Ink 2025-01-18 08:16:47 -0600
  • 916a6622d5 return preds from calculate_entropies() Luciferian Ink 2025-01-18 08:12:05 -0600
  • 1ca0e04004 fix grad flow Luciferian Ink 2025-01-18 02:02:56 -0600
  • 5adf1c7133 allow grads when calculating entropies Luciferian Ink 2025-01-18 01:42:00 -0600
  • 9e42f5dd1d revert to original patcher style, fix warning Luciferian Ink 2025-01-17 20:36:07 -0600
  • cff0dcb7ab allow patch_batch_size to be adjusted in the forward() method Luciferian Ink 2025-01-17 19:41:17 -0600
  • 175fce61df remove spammy warning Luciferian Ink 2025-01-17 18:04:37 -0600
  • 6129756e10 remove unused argument Luciferian Ink 2025-01-17 18:04:25 -0600
  • 420326184a allow loading of the entropy model directly Luciferian Ink 2025-01-17 18:03:18 -0600
  • 6ffeb66b53 Changes for training entropy model and correcting attention in local models (#25) Pedro Rodriguez 2025-01-17 14:23:01 -0800
  • ee0d68395a merge commit for archive created by Sapling Pedro Rodriguez 2025-01-17 22:21:56 +0000
  • 7f305b3871 [WIP] Changes for training entropy model and correcting attention in local models pr25 Pedro Rodriguez 2025-01-17 22:21:50 +0000
  • 3e3a9fc314 merge commit for archive created by Sapling Pedro Rodriguez 2025-01-17 01:02:33 +0000
  • 374409fa3b [WIP] Changes for training entropy model and correcting attention in local models Pedro Rodriguez 2025-01-17 01:01:29 +0000
  • 020cf16c1b Merge 38022ac06e into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-16 13:51:17 -0800
  • 38022ac06e [WIP] Changes for training entropy model and correcting attention in local models Pedro Rodriguez 2025-01-16 21:51:04 +0000
  • caec8d2621 allow flex-attention to be disabled (#19) Ink 2025-01-14 11:32:07 -0600
  • 1fa58d2bd8 allow flex-attn to be disabled via an env var Luciferian Ink 2025-01-14 00:39:32 -0600
  • 1da3dd9315 Update preprocess_entropies script to blt inference + add fsspec support (#23) Pedro Rodriguez 2025-01-13 15:28:14 -0800
  • 4974f9512d merge commit for archive created by Sapling Pedro Rodriguez 2025-01-13 23:26:35 +0000
  • d718cfa9a1 Update preprocess_entropies script to blt inference + add fsspec support pr23 Pedro Rodriguez 2025-01-13 23:26:26 +0000
  • 95e7e89f42 merge commit for archive created by Sapling Pedro Rodriguez 2025-01-13 23:14:27 +0000
  • 3f045f1123 Update preprocess_entropies script to blt inference + add fsspec support Pedro Rodriguez 2025-01-13 23:13:49 +0000
  • 159f56d4a9 allow flex-attention to silently fail Luciferian Ink 2025-01-12 01:16:39 -0600
  • b0120da72f Replace regular filesystem calls with fsspec + add s3 support (#18) Pedro Rodriguez 2025-01-10 11:04:41 -0800
  • 5657903dea merge commit for archive created by Sapling Pedro Rodriguez 2025-01-10 01:04:26 +0000
  • a1d05403b4 Replace regular filesystem calls with fsspec + add s3 support pr18 Pedro Rodriguez 2025-01-10 01:02:25 +0000
  • 3104e8e317 merge commit for archive created by Sapling Pedro Rodriguez 2025-01-10 01:03:28 +0000
  • 1bf6d15e5a Replace regular filesystem calls with fsspec + add s3 support Pedro Rodriguez 2025-01-10 01:02:25 +0000
  • 8e0732e2a6 Merge c137f4e636 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-09 17:02:30 -0800
  • c137f4e636 Replace regular filesystem calls with fsspec + add s3 support Pedro Rodriguez 2025-01-10 01:02:25 +0000
  • 3dadbfea4b Merge 84854423c4 into sapling-pr-archive-EntilZha Pedro Rodriguez 2025-01-09 17:02:04 -0800
  • 84854423c4 Replace regular filesystem calls with fsspec + add s3 support Pedro Rodriguez 2025-01-10 01:00:54 +0000
  • d4ddb95322 Add plotting code from paper (#17) Pedro Rodriguez 2025-01-09 12:11:50 -0800
  • 28016f144d Add plotting code from paper pr17 Pedro Rodriguez 2025-01-09 20:06:18 +0000
  • 2fdc6f3cc9 Package bytelatent as a module (#7) Ink 2025-01-06 18:44:50 -0600