Summary:
Test Plan:
Test that this iterpolates in the right order, config -> configs -> cli args
```
# All three sources
python -m bytelatent.print_config config=bytelatent/configs/debug.yaml configs=[internal/configs/s3_debug.yaml] eval=null
# What worked before
python -m bytelatent.print_config config=internal/configs/s3_debug.yaml eval=null
```
Summary:
- Refactor local model configs to be separate and clearer
- Add attention arguments and correct which attention is used in local models
- Preparation for being able to have an entropy train script
- Fix failing unit tests
Test Plan: