openclaw/src/media
Omar Shahine da3d17e1ca
fix(tts): pre-transcode synthesized audio to opus-in-CAF for native iMessage voice-memo bubbles via BlueBubbles (#72586)
End-to-end testing on macOS + BlueBubbles + ElevenLabs walked through three CAF flavors before landing on the format Apple's Messages.app actually emits when a user records a native iMessage voice memo:

- PCM int16 @ 44.1 kHz CAF: BlueBubbles' internal `afconvert -f m4af -d aac` conversion fails; the original CAF reaches iMessage but renders with 0 s duration.
- AAC @ 22.05 kHz mono CAF: BlueBubbles' conversion succeeds, but the server then silently downgrades the delivery, sending the converted MP3 as a generic audio attachment.
- **Opus @ 24 kHz mono CAF**: byte-identical to the descriptor block Apple's Messages.app produces; BlueBubbles passes it through unchanged and iMessage renders a native voice-memo bubble with proper duration and waveform UI.
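
To tell these three flavors apart when debugging, it can help to read the audio description chunk at the start of a CAF buffer. The sketch below assumes the standard CAF layout (an 8-byte `caff` file header followed by a `desc` chunk holding big-endian fields); `readCafDescription` and `CafDescription` are hypothetical names for illustration and are not taken from this commit.

```typescript
// Hypothetical debugging helper; not part of the commit.
interface CafDescription {
  sampleRate: number;       // e.g. 24000 for the Opus flavor
  formatId: string;         // four-character code, e.g. "opus", "aac ", "lpcm"
  channelsPerFrame: number; // 1 for mono
}

function readCafDescription(bytes: Uint8Array): CafDescription | undefined {
  // 8-byte file header + 12-byte chunk header + 32-byte audio description.
  if (bytes.length < 52) return undefined;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fourCC = (o: number): string =>
    String.fromCharCode(
      view.getUint8(o),
      view.getUint8(o + 1),
      view.getUint8(o + 2),
      view.getUint8(o + 3),
    );
  // The file must start with "caff" and the first chunk must be "desc".
  if (fourCC(0) !== "caff" || fourCC(8) !== "desc") return undefined;
  return {
    sampleRate: view.getFloat64(20), // DataView reads big-endian by default
    formatId: fourCC(28),
    channelsPerFrame: view.getUint32(44),
  };
}
```

For the flavor that rendered correctly, this would be expected to report a 24 kHz sample rate, one channel, and an Opus format ID.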

Adds an opt-in `tts.voice.preferAudioFileFormat` channel capability and a macOS `afconvert`-backed pre-transcode in the speech-core pipeline. BlueBubbles declares `preferAudioFileFormat: "caf"`. Other channels are unaffected. Falls back to the original buffer when the host platform, the source/target pair, or the transcoder process can't produce the preferred container — so non-Darwin hosts and unsupported provider combinations are unchanged.
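
The fallback behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the speech-core pipeline: only `preferAudioFileFormat` and the `"caf"` value come from the commit message, while `maybePreTranscode`, `PreTranscodeOptions`, and the injected `transcode` callback are hypothetical. The transcoder is kept synchronous and injected for brevity; a real one would presumably wrap an `afconvert` process.

```typescript
// Hypothetical sketch of "try the preferred container, else keep the original".
type Transcoder = (input: Uint8Array, target: string) => Uint8Array;

interface PreTranscodeOptions {
  platform: string;                      // e.g. the value of process.platform
  preferAudioFileFormat?: string;        // channel capability, "caf" for BlueBubbles
  sourceFormat: string;                  // container returned by the TTS provider
  supportedSources: ReadonlySet<string>; // sources the transcoder can handle (assumed)
  transcode: Transcoder;                 // assumed wrapper around the real transcoder
}

function maybePreTranscode(audio: Uint8Array, opts: PreTranscodeOptions): Uint8Array {
  const target = opts.preferAudioFileFormat;
  // Channel did not opt in: nothing to do.
  if (!target) return audio;
  // afconvert only exists on macOS, so other hosts keep the original buffer.
  if (opts.platform !== "darwin") return audio;
  // Unsupported source/target pair: keep the original buffer.
  if (!opts.supportedSources.has(opts.sourceFormat)) return audio;
  try {
    const out = opts.transcode(audio, target);
    // Treat an empty result as a failed transcode.
    return out.length > 0 ? out : audio;
  } catch {
    // Transcoder process failed: fall back rather than drop the message.
    return audio;
  }
}
```

Every failure path returns the input buffer untouched, which is what keeps channels that do not declare the capability, and hosts that cannot transcode, on their previous behavior.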

Also adds a `caff` magic-byte sniff in `src/media/mime.ts` so the auto-reply host-local-media validator (which uses `file-type` and didn't recognize CAF natively) accepts the buffer instead of dropping it as "⚠️ Media failed."
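
A magic-byte check of this kind can be quite small. The sketch below is a minimal illustration, not the actual code in `src/media/mime.ts`: `sniffCafMime` is a hypothetical name, and returning `audio/x-caf` is an assumption about which MIME string the validator expects.

```typescript
// Hypothetical helper; the real sniff in src/media/mime.ts may differ.
const CAF_MAGIC = [0x63, 0x61, 0x66, 0x66]; // ASCII "caff"

function sniffCafMime(buffer: Uint8Array): string | undefined {
  if (buffer.length < CAF_MAGIC.length) return undefined;
  for (let i = 0; i < CAF_MAGIC.length; i++) {
    if (buffer[i] !== CAF_MAGIC[i]) return undefined;
  }
  // Assumed MIME string for Core Audio Format files.
  return "audio/x-caf";
}
```

A caller could try this check only after the general-purpose detector returns nothing, so existing detection results are not affected.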

Fixes #72506.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:15:16 -07:00
audio-tags.ts
audio-transcode.test.ts fix(google): emit opus voice-note tts 2026-04-25 21:33:33 +01:00
audio-transcode.ts fix(google): emit opus voice-note tts 2026-04-25 21:33:33 +01:00
audio.test.ts refactor: generalize voice audio compatibility 2026-04-22 06:58:45 +01:00
audio.ts refactor: generalize voice audio compatibility 2026-04-22 06:58:45 +01:00
base64.test.ts
base64.ts
channel-inbound-roots.fast-path.test.ts
channel-inbound-roots.ts
configured-max-bytes.ts fix(media): preserve oversized video generation delivery 2026-04-25 12:41:43 +01:00
constants.ts
document-extractors.runtime.test.ts refactor(pdf): move document extraction to plugin 2026-04-24 17:15:05 -07:00
document-extractors.runtime.ts refactor(pdf): move document extraction to plugin 2026-04-24 17:15:05 -07:00
fetch.test.ts test: generalize media fetch token fixtures 2026-04-22 06:45:09 +01:00
fetch.ts fix(zalo): add SSRF guard on outbound photo URLs [AI-assisted] (#69593) 2026-04-21 19:20:26 +05:30
ffmpeg-exec.test.ts
ffmpeg-exec.ts
ffmpeg-limits.ts
file-context.test.ts
file-context.ts
host.test.ts
host.ts
image-ops.helpers.test.ts
image-ops.input-guard.test.ts
image-ops.tempdir.test.ts
image-ops.ts refactor(media): move sharp image ops into media runtime (#71519) 2026-04-25 04:31:10 -07:00
inbound-path-policy.test.ts
inbound-path-policy.ts
input-files.fetch-guard.test.ts
input-files.ts refactor(pdf): move document extraction to plugin 2026-04-24 17:15:05 -07:00
load-options.test.ts
load-options.ts fix(zalo): add SSRF guard on outbound photo URLs [AI-assisted] (#69593) 2026-04-21 19:20:26 +05:30
local-media-access.test.ts fix(media): centralize inbound media reference resolution 2026-04-25 00:57:07 +01:00
local-media-access.ts fix(media): centralize inbound media reference resolution 2026-04-25 00:57:07 +01:00
local-roots.test.ts
local-roots.ts
media-reference.test.ts fix(media): centralize inbound media reference resolution 2026-04-25 00:57:07 +01:00
media-reference.ts fix(media): centralize inbound media reference resolution 2026-04-25 00:57:07 +01:00
media-source-url.ts
mime.test.ts fix(tts): pre-transcode synthesized audio to opus-in-CAF for native iMessage voice-memo bubbles via BlueBubbles (#72586) 2026-04-27 14:15:16 -07:00
mime.ts fix(tts): pre-transcode synthesized audio to opus-in-CAF for native iMessage voice-memo bubbles via BlueBubbles (#72586) 2026-04-27 14:15:16 -07:00
outbound-attachment.test.ts fix(media): preserve outbound attachment filenames 2026-04-21 14:19:27 +05:30
outbound-attachment.ts fix(media): preserve outbound attachment filenames 2026-04-21 14:19:27 +05:30
parse.test.ts fix(media): gate markdown image extraction by channel (#72718) 2026-04-27 11:27:35 +01:00
parse.ts fix(media): gate markdown image extraction by channel (#72718) 2026-04-27 11:27:35 +01:00
pdf-extract.test.ts refactor(pdf): move document extraction to plugin 2026-04-24 17:15:05 -07:00
pdf-extract.ts refactor(pdf): move document extraction to plugin 2026-04-24 17:15:05 -07:00
png-encode.ts
prompt-image-order.ts
qr-image.test.ts refactor(qr): share PNG data URL helpers (#70784) 2026-04-23 15:41:45 -07:00
qr-image.ts refactor(qr): share PNG data URL helpers (#70784) 2026-04-23 15:41:45 -07:00
qr-runtime.ts fix(qr): replace qrcode-terminal with qrcode-tui 2026-04-23 13:06:14 -07:00
qr-terminal.ts fix(qr): replace qrcode-terminal with qrcode-tui 2026-04-23 13:06:14 -07:00
read-capability.test.ts
read-capability.ts
read-response-with-limit.test.ts
read-response-with-limit.ts
server.outside-workspace.test.ts
server.runtime.ts
server.test-support.ts
server.test.ts fix(media): remove express from media host (#71436) 2026-04-25 01:39:42 -07:00
server.ts fix(media): remove express from media host (#71436) 2026-04-25 01:39:42 -07:00
sniff-mime-from-base64.ts
store.outside-workspace.test.ts
store.redirect.test.ts
store.runtime.ts
store.test.ts fix(webchat): support non-image file attachments 2026-04-26 10:58:24 -07:00
store.ts fix(webchat): support non-image file attachments 2026-04-26 10:58:24 -07:00
temp-files.ts
test-helpers.ts
web-media.test.ts fix(media): centralize inbound media reference resolution 2026-04-25 00:57:07 +01:00
web-media.ts fix(media): centralize inbound media reference resolution 2026-04-25 00:57:07 +01:00