Commit graph

8 commits

Author SHA1 Message Date
Pedro Rodriguez a3e0647d03 Make apex logs less noisy
2025-02-14 23:45:28 +00:00
Srinivasan Iyer f3e8125f74
using apex rmsnorm (#57)
* using apex rmsnorm

* added message for missing apex

* black

* missed a print

---------

Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-14 11:22:03 -08:00
Srinivasan Iyer 22c7fe1d1c
fix save and reload model state (#49)
Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-07 14:27:47 -08:00
Srinivasan Iyer aebdc481a8
Fix init and repro (#48)
* Fix init and repro

* comment + black

---------

Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-06 14:18:02 -08:00
Srinivasan Iyer 739dc71a0a
Add rope fp32 (#43)
* Log model

* Add flag for rope outer in fp32

---------

Co-authored-by: Srini Iyer <sviyer@meta.com>
2025-02-05 17:19:37 -08:00
Ink 392117bff2
Fix realtime entropy patching (#26)
* allow loading of the entropy model directly

* remove unused argument

* remove spammy warning

* allow patch_batch_size to be adjusted in the forward() method

* revert to original patcher style, fix warning

* allow grads when calculating entropies

* fix grad flow

* return preds from calculate_entropies()

* remove legacy arg

* fix an error with monotonicity and small sequence lengths

* ensure patcher is serializable

* revert patcher to original

* remove unused import
2025-01-21 16:34:23 -08:00
Pedro Rodriguez 6ffeb66b53
Changes for training entropy model and correcting attention in local models (#25)
Summary:

- Refactor local model configs to be separate and clearer
- Add attention arguments and correct which attention is used in local models
- Preparation for being able to have an entropy train script
- Fix failing unit tests

2025-01-17 14:23:01 -08:00
Pedro Rodriguez bcc039bb75 Initial commit 2024-12-12 15:32:30 -08:00