Pedro Rodriguez
ff36aa8642
Add vocab and seq len abstract fields (#66)
2025-02-24 14:41:58 -08:00
Pedro Rodriguez
b0956bde99
Make apex logs less noisy (#60)
Summary:
Test Plan:
2025-02-18 10:45:56 -08:00
Srinivasan Iyer
f3e8125f74
using apex rmsnorm (#57)
* using apex rmsnorm
* added message for missing apex
* black
* missed a print
---------
Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-14 11:22:03 -08:00
Srinivasan Iyer
22c7fe1d1c
fix save and reload model state (#49)
Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-07 14:27:47 -08:00
Srinivasan Iyer
aebdc481a8
Fix init and repro (#48)
* Fix init and repro
* comment + black
---------
Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-06 14:18:02 -08:00
Srinivasan Iyer
739dc71a0a
Add rope fp32 (#43)
* Log model
* Add flag for rope outer in fp32
---------
Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-05 17:19:37 -08:00
Pedro Rodriguez
7622d28b74
Initial codes and scripts for training entropy model (#34)
Summary:
Test Plan:
2025-01-27 09:46:44 -08:00
Ink
392117bff2
Fix realtime entropy patching (#26)
* allow loading of the entropy model directly
* remove unused argument
* remove spammy warning
* allow patch_batch_size to be adjusted in the forward() method
* revert to original patcher style, fix warning
* allow grads when calculating entropies
* fix grad flow
* return preds from calculate_entropies()
* remove legacy arg
* fix an error with monotonicity and small sequence lengths
* ensure patcher is serializable
* revert patcher to original
* remove unused import
2025-01-21 16:34:23 -08:00
Pedro Rodriguez
6ffeb66b53
Changes for training entropy model and correcting attention in local models (#25)
Summary:
- Refactor local model configs to be separate and clearer
- Add attention arguments and correct which attention is used in local models
- Preparation for being able to have an entropy train script
- Fix failing unit tests
Test Plan:
2025-01-17 14:23:01 -08:00
Pedro Rodriguez
bcc039bb75
Initial commit
2024-12-12 15:32:30 -08:00