Update cross-modal-bootstrap.md

PSBigBig 2025-09-05 11:29:34 +08:00 committed by GitHub
parent 531475c760
commit c5da31df9c

@@ -1,5 +1,22 @@
# Cross-Modal Bootstrap — Multimodal Long Context
<details>
<summary><strong>🧭 Quick Return to Map</strong></summary>
<br>
> You are in a sub-page of **Multimodal_LongContext**.
> To reorient, go back here:
>
> - [**Multimodal_LongContext** — long-context reasoning across text, vision, and audio](./README.md)
> - [**WFGY Global Fix Map** — main Emergency Room, 300+ structured fixes](../README.md)
> - [**WFGY Problem Map 1.0** — 16 reproducible failure modes](../../README.md)
>
> Think of this page as a desk within a ward.
> If you need the full triage and all prescriptions, return to the Emergency Room lobby.
</details>
When modalities (video frames, audio tracks, OCR text) start at different offsets or initialize in the wrong order, cross-modal alignment breaks down for the entire context.
This page provides guardrails for synchronizing the bootstrap order across multimodal inputs.
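
As a rough illustration of the two failure conditions named above (mismatched start offsets and wrong initialization order), here is a minimal Python sketch of a pre-flight check. Everything in it is an assumption made for illustration and not part of WFGY: the `StreamStart` record, the `EXPECTED_ORDER` policy, the `MAX_SKEW_S` tolerance, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamStart:
    """Hypothetical record describing how one modality was bootstrapped."""
    modality: str          # e.g. "video", "audio", "ocr"
    start_offset_s: float  # first timestamp carried by the stream, in seconds
    init_index: int        # position in which the stream was initialized


# Assumed policy values for this sketch; tune them for a real pipeline.
EXPECTED_ORDER = ("audio", "video", "ocr")
MAX_SKEW_S = 0.05


def check_bootstrap(starts, expected_order=EXPECTED_ORDER, max_skew_s=MAX_SKEW_S):
    """Return a list of problems; an empty list means the bootstrap looks aligned."""
    problems = []

    # Guardrail 1: streams must have been initialized in the expected order.
    by_init = sorted(starts, key=lambda s: s.init_index)
    actual_order = tuple(s.modality for s in by_init)
    if actual_order != tuple(expected_order):
        problems.append(
            f"init order {actual_order} != expected {tuple(expected_order)}"
        )

    # Guardrail 2: start offsets must agree within the tolerated skew.
    offsets = [s.start_offset_s for s in starts]
    skew = max(offsets) - min(offsets)
    if skew > max_skew_s:
        problems.append(f"start skew {skew:.3f}s exceeds {max_skew_s:.3f}s")

    return problems


def shared_start(starts):
    """Latest start offset: the earliest time at which every stream has data."""
    return max(s.start_offset_s for s in starts)
```

If the check reports problems, one option is to trim every stream to `shared_start(...)` before building the context, so that all modalities begin on the same timeline.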