OmniRoute/docs/I18N.md
2026-04-02 14:40:01 -03:00

409 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# i18n — Internationalization Guide
OmniRoute supports **30 languages** with full dashboard UI translation, translated documentation, and RTL support for Arabic and Hebrew.
🌐 **Languages:** 🇺🇸 [English](../I18N.md) | 🇧🇷 [Português (Brasil)](./pt-BR/I18N.md) | 🇪🇸 [Español](./es/I18N.md) | 🇫🇷 [Français](./fr/I18N.md) | 🇩🇪 [Deutsch](./de/I18N.md) | 🇮🇹 [Italiano](./it/I18N.md) | 🇷🇺 [Русский](./ru/I18N.md) | 🇨🇳 [中文 (简体)](./zh-CN/I18N.md) | 🇯🇵 [日本語](./ja/I18N.md) | 🇰🇷 [한국어](./ko/I18N.md) | 🇸🇦 [العربية](./ar/I18N.md) | 🇮🇳 [हिन्दी](./hi/I18N.md) | 🇹🇭 [ไทย](./th/I18N.md) | 🇹🇷 [Türkçe](./tr/I18N.md) | 🇺🇦 [Українська](./uk-UA/I18N.md) | 🇻🇳 [Tiếng Việt](./vi/I18N.md) | 🇧🇬 [Български](./bg/I18N.md) | 🇩🇰 [Dansk](./da/I18N.md) | 🇫🇮 [Suomi](./fi/I18N.md) | 🇮🇱 [עברית](./he/I18N.md) | 🇭🇺 [Magyar](./hu/I18N.md) | 🇮🇩 [Bahasa Indonesia](./id/I18N.md) | 🇲🇾 [Bahasa Melayu](./ms/I18N.md) | 🇳🇱 [Nederlands](./nl/I18N.md) | 🇳🇴 [Norsk](./no/I18N.md) | 🇵🇹 [Português (Portugal)](./pt/I18N.md) | 🇷🇴 [Română](./ro/I18N.md) | 🇵🇱 [Polski](./pl/I18N.md) | 🇸🇰 [Slovenčina](./sk/I18N.md) | 🇸🇪 [Svenska](./sv/I18N.md) | 🇵🇭 [Filipino](./phi/I18N.md) | 🇨🇿 [Čeština](./cs/I18N.md)
## Quick Reference
| Task | Command |
|------|---------|
| Generate translations | `node scripts/i18n/generate-multilang.mjs messages` |
| Translate docs (LLM) | `python3 scripts/i18n_autotranslate.py --api-url <url> --api-key <key> --model <model>` |
| Validate a locale | `python3 scripts/validate_translation.py quick -l cs` |
| Check code keys | `python3 scripts/check_translations.py` |
| Generate QA report | `node scripts/i18n/generate-qa-checklist.mjs` |
| Visual QA (Playwright) | `node scripts/i18n/run-visual-qa.mjs` |
## Architecture
### Source of Truth
- **UI strings**: `src/i18n/messages/en.json` (English source, ~2800 keys)
- **Locale files**: `src/i18n/messages/{locale}.json` (30 translations)
- **Framework**: `next-intl` with cookie-based locale resolution
- **Config**: `src/i18n/config.ts` — defines all 30 locales, language names, flags
### Runtime Flow
1. User selects language → `NEXT_LOCALE` cookie set
2. `src/i18n/request.ts` resolves locale: cookie → `Accept-Language` header → fallback `en`
3. Dynamic import loads `messages/{locale}.json`
4. Components use `useTranslations("namespace")` and `t("key")`
### Supported Locales
| Code | Language | RTL | Google Translate Code |
|------|----------|-----|----------------------|
| `ar` | العربية | Yes | `ar` |
| `bg` | Български | No | `bg` |
| `cs` | Čeština | No | `cs` |
| `da` | Dansk | No | `da` |
| `de` | Deutsch | No | `de` |
| `es` | Español | No | `es` |
| `fi` | Suomi | No | `fi` |
| `fr` | Français | No | `fr` |
| `he` | עברית | Yes | `iw` |
| `hi` | हिन्दी | No | `hi` |
| `hu` | Magyar | No | `hu` |
| `id` | Bahasa Indonesia | No | `id` |
| `it` | Italiano | No | `it` |
| `ja` | 日本語 | No | `ja` |
| `ko` | 한국어 | No | `ko` |
| `ms` | Bahasa Melayu | No | `ms` |
| `nl` | Nederlands | No | `nl` |
| `no` | Norsk | No | `no` |
| `phi` | Filipino | No | `tl` |
| `pl` | Polski | No | `pl` |
| `pt` | Português (Portugal) | No | `pt` |
| `pt-BR` | Português (Brasil) | No | `pt` |
| `ro` | Română | No | `ro` |
| `ru` | Русский | No | `ru` |
| `sk` | Slovenčina | No | `sk` |
| `sv` | Svenska | No | `sv` |
| `th` | ไทย | No | `th` |
| `tr` | Türkçe | No | `tr` |
| `uk-UA` | Українська | No | `uk` |
| `vi` | Tiếng Việt | No | `vi` |
| `zh-CN` | 中文 (简体) | No | `zh-CN` |
## Adding a New Language
### 1. Register the Locale
Edit `src/i18n/config.ts`:
```ts
// Add to LOCALES array
"xx",
// Add to LANGUAGES array
{ code: "xx", label: "XX", name: "Language Name", flag: "🏳️" },
```
### 2. Add to Generator
Edit `scripts/i18n/generate-multilang.mjs` — add entry to `LOCALE_SPECS`:
```js
{
code: "xx",
googleTl: "xx",
label: "XX",
flag: "🏳️",
languageName: "Language Name",
readmeName: "Language Name",
docsName: "Language Name",
},
```
### 3. Generate Initial Translation
```bash
node scripts/i18n/generate-multilang.mjs messages
```
This creates `src/i18n/messages/xx.json` auto-translated from `en.json` via Google Translate.
### 4. Review & Fix Auto-Translations
Auto-translations are a starting point. Review manually for:
- Technical accuracy
- Context-appropriate terminology
- Proper handling of placeholders (`{count}`, `{value}`, etc.)
### 5. Validate
```bash
python3 scripts/validate_translation.py quick -l xx
python3 scripts/validate_translation.py diff common -l xx
```
### 6. Generate Translated Documentation
```bash
node scripts/i18n/generate-multilang.mjs docs
```
## Auto-Translation Pipeline
### generate-multilang.mjs (Google Translate)
**Primary auto-translation engine** — uses Google Translate free API to generate translations for UI strings, READMEs, and documentation.
```bash
node scripts/i18n/generate-multilang.mjs [messages|readme|docs|all]
```
| Mode | What it does |
|------|-------------|
| `messages` | Translates missing keys in `src/i18n/messages/{locale}.json` from `en.json` |
| `readme` | Translates `README.md` into all locales as `README.{code}.md` in project root |
| `docs` | Translates `DOC_SOURCE_FILES` into `docs/i18n/{locale}/{docName}` |
| `all` | Runs all three modes |
**Features:**
- **Text protection**: Masks code blocks (```` ``` ````), inline code (`` ` ``), markdown links/images (`[text](url)`), HTML tags, tables, and ICU placeholders (`{count}`, `{value}`, `{total}`, etc.) before translation, then restores them
- **Chunked batching**: Joins multiple strings with `__OMNIROUTE_I18N_SEPARATOR__` delimiters to minimize API calls (max 1800 chars per request)
- **In-memory cache**: Avoids redundant API calls for repeated strings within a session
- **Retry logic**: Exponential backoff (up to 5 attempts with 300ms × attempt delay) for 429/5xx errors
- **Timeout**: 20 seconds per request
- **Skip existing**: If target file already exists, it is NOT overwritten
**Important behaviors:**
- `docs/i18n/README.md` is **regenerated** each run — it's an auto-generated index of all docs
- Root `README.{code}.md` files are only created if they don't exist (skips locales in `EXISTING_README_CODES`)
- Language bars (`🌐 **Languages:** ...`) are automatically inserted/updated in all translated docs
### i18n_autotranslate.py (LLM-based)
**Secondary translator** — uses any OpenAI-compatible LLM API (including OmniRoute itself) to translate existing `docs/i18n/` markdown files. Best for polishing or re-translating docs with better quality than Google Translate.
```bash
python3 scripts/i18n_autotranslate.py \
--api-url http://localhost:20128/v1 \
--api-key sk-your-key \
--model gpt-4o
```
**Features:**
- Scans `docs/i18n/` markdown files for English paragraphs
- Skips code blocks, tables, and already-translated content
- Sends paragraphs to LLM with technical translation system prompt
- Supports all 30 languages
## Validation & QA
### validate_translation.py
**Translation validator** — compares any locale JSON against `en.json` and reports issues.
```bash
# Quick check (counts only)
python3 scripts/validate_translation.py quick -l cs
# Output:
# Missing: 0
# Untranslated: 0
# Ignored (UNTRANSLATABLE_KEYS): 236
# Detailed diff by category
python3 scripts/validate_translation.py diff common -l cs
python3 scripts/validate_translation.py diff settings -l cs
# Export to CSV
python3 scripts/validate_translation.py csv -l cs > report.csv
# Export to Markdown
python3 scripts/validate_translation.py md -l cs > report.md
# Full report (default)
python3 scripts/validate_translation.py -l cs
```
**Detects:**
- **Missing keys** — keys in `en.json` but not in locale file
- **Extra keys** — keys in locale file but not in `en.json`
- **Untranslated keys** — keys where locale value equals English source (excluding allowlist)
- **Placeholder mismatches** — ICU placeholders that don't match between source and translation
**Exit codes:**
| Code | Meaning |
|------|---------|
| 0 | OK |
| 1 | Generic error |
| 2 | Missing strings (hard error) |
| 3 | Untranslated warning (soft) |
**Environment:** Set `TRANSLATION_LANG=cs` or use `-l cs` flag.
### check_translations.py
**Code-to-JSON key checker** — scans `src/**/*.tsx` and `src/**/*.ts` for `useTranslations()` calls and verifies all referenced keys exist in `en.json`.
```bash
# Basic check
python3 scripts/check_translations.py
# Verbose output
python3 scripts/check_translations.py --verbose
# Auto-fix (adds missing keys to en.json)
python3 scripts/check_translations.py --fix
```
### generate-qa-checklist.mjs
**Static analysis QA** — scans Next.js page files for i18n risk metrics and generates a Markdown report.
```bash
node scripts/i18n/generate-qa-checklist.mjs
```
**Checks:**
- Fixed-width class usage (overflow risk)
- Directional left/right classes (RTL risk)
- Clipping-prone patterns
- Locale parity (missing/extra keys vs `en.json`)
- README language selector bars in priority locales (`es`, `fr`, `de`, `ja`, `ar`)
**Output:** `docs/reports/i18n-qa-checklist-{date}.md`
### run-visual-qa.mjs
**Visual QA via Playwright** — takes screenshots of all dashboard routes in multiple locales and viewports, then evaluates page health.
```bash
# Default: es, fr, de, ja, ar on localhost:20128
node scripts/i18n/run-visual-qa.mjs
# Custom base URL and locales
QA_BASE_URL=http://staging.example.com QA_LOCALES=de,fr node scripts/i18n/run-visual-qa.mjs
# Custom routes
QA_ROUTES=/dashboard/settings,/dashboard/providers node scripts/i18n/run-visual-qa.mjs
```
**Detects:**
- Text overflow
- Element clipping
- RTL layout mismatches
**Output:** `docs/reports/i18n-visual-qa-{date}.md` + JSON report
## Managing Untranslatable Keys
### untranslatable-keys.json
**File:** `scripts/i18n/untranslatable-keys.json`
Allowlist of keys that should remain identical to English source. Used by `validate_translation.py` to avoid false-positive "untranslated" warnings.
```json
{
"description": "Keys that should remain untranslated...",
"keys": [
"common.model",
"common.oauth",
"health.cpu",
...
]
}
```
**What belongs here:**
- Brand/product names: `landing.brandName`, `common.social-github`
- Technical terms/acronyms: `health.cpu`, `mcpDashboard.pid`, `settings.ai`
- ICU/format strings: `apiManager.modelsCount`, `health.millisecondsShort`
- Placeholder values: `providers.openaiBaseUrlPlaceholder`, `cliTools.baseUrlPlaceholder`
- Protocol names: `common.http`, `common.oauth`, `providers.oauth2Label`
- Navigation sections: `sidebar.primarySection`, `sidebar.cliSection`
**To add a key:** Edit the `keys` array in `scripts/i18n/untranslatable-keys.json` and re-run validation.
## CI Integration
### GitHub Actions (`.github/workflows/ci.yml`)
The CI pipeline validates all locales on every push and PR:
1. **`i18n-matrix` job** — dynamically discovers all locale files (excluding `en.json`)
2. **`i18n` job** — runs `validate_translation.py quick -l '<lang>'` for each locale in parallel
3. **`ci-summary` job** — aggregates results into a dashboard summary
```yaml
# i18n-matrix: discovers languages
LANGS=$(ls src/i18n/messages/*.json | xargs -n1 basename | sed 's/.json$//' | grep -v '^en$')
# i18n: validates each language
python3 scripts/validate_translation.py quick -l '${{ matrix.lang }}'
```
**Dashboard output:**
```
## 🌍 Translations
| Metric | Value |
|--------|------|
| Languages checked | 30 |
| Total untranslated | 0 |
✅ All translations complete
```
## File Structure
```
src/i18n/
├── config.ts # Locale definitions (30 locales, RTL config)
├── request.ts # Runtime locale resolution
└── messages/
├── en.json # Source of truth (~2800 keys)
├── cs.json # Czech translation
├── de.json # German translation
└── ... # 30 locale files total
scripts/
├── i18n/
│ ├── generate-multilang.mjs # Auto-translation engine (Google Translate, 888 lines)
│ ├── generate-qa-checklist.mjs # Static analysis QA
│ ├── run-visual-qa.mjs # Playwright visual QA
│ └── untranslatable-keys.json # Allowlist for validation (236 keys)
├── validate_translation.py # Translation validator
├── check_translations.py # Code-to-JSON key checker
└── i18n_autotranslate.py # LLM-based doc translator
.github/workflows/
└── ci.yml # i18n validation in CI matrix
docs/
├── I18N.md # This file — i18n toolchain documentation
├── i18n/
│ ├── README.md # Auto-generated language index
│ ├── cs/ # Czech docs
│ │ └── docs/
│ │ ├── I18N.md # Czech translation of this file
│ │ └── ...
│ ├── de/ # German docs
│ └── ... # 30 locale directories
└── reports/
├── i18n-qa-checklist-*.md # Static analysis reports
└── i18n-visual-qa-*.md # Visual QA reports
```
## Best Practices
### When Editing Translations
1. **Always edit `en.json` first** — it's the source of truth
2. **Run `generate-multilang.mjs messages`** to propagate new keys to all locales
3. **Review auto-translations** — Google Translate is a starting point, not final
4. **Validate before committing**`python3 scripts/validate_translation.py quick -l <lang>`
5. **Update `untranslatable-keys.json`** if a key should remain in English
### Placeholder Safety
- ICU placeholders (`{count}`, `{value}`, `{total}`, `{seconds}`) must be preserved exactly
- Plural formats (`{count, plural, one {# model} other {# models}}`) must maintain structure
- The validator detects placeholder mismatches automatically
### Adding New Translation Keys in Code
```tsx
// Use namespaced keys
const t = useTranslations("settings");
t("cacheSettings"); // maps to settings.cacheSettings in JSON
// Run check_translations.py to verify keys exist
python3 scripts/check_translations.py --verbose
```
### RTL Considerations
- Arabic (`ar`) and Hebrew (`he`) are RTL locales
- Avoid hardcoded `left`/`right` CSS — use `start`/`end` logical properties
- Visual QA catches RTL layout mismatches via `run-visual-qa.mjs`
## Known Issues & History
### `in.json` → `hi.json` Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This created an orphaned `in.json` duplicate of `hi.json`. Fixed by changing `code: "in"` to `code: "hi"` in `generate-multilang.mjs` and removing the orphaned file.
### `docs/i18n/README.md` Is Auto-Generated
The `docs/i18n/README.md` file is completely regenerated by `generate-multilang.mjs docs`. Any manual edits will be lost. Use `docs/I18N.md` (this file) for hand-written documentation that should persist.
### External Untranslatable Keys List
The `untranslatable-keys.json` allowlist was moved from an inline Python set in `validate_translation.py` to an external JSON file for easier maintenance. The validator loads it at runtime.
### `generate-multilang.mjs` Hindi Code Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This was introduced in upstream commit `952b0b22c` by `diegosouzapw`. Fixed by changing `code: "in"` to `code: "hi"` in the `LOCALE_SPECS` array and removing the orphaned `in.json` file.
### `validate_translation.py` Ignored Count Output
The `quick` check now displays the count of ignored keys from `untranslatable-keys.json`:
```
Missing: 0
Untranslated: 0
Ignored (UNTRANSLATABLE_KEYS): 236
```