# Round 16 — Tower Floors J23-J26 · Sound × 3 + Visual

**Date**: 2026-04-30 · **Status**: ✅ COMPLETE

---

## 🏛️ Multi-Modal Output Wing

The tower does not stop at text. These **4 floors** produce multi-modal content: voice, podcast intro, sound design, and visual.

```
J17.5 Editor done
        │
        ├──→ J23 TTS (Text-to-Speech)
        ├──→ J24 Podcast Intro (audio collage)
        ├──→ J25 Sound Design (sfx + ambient)
        └──→ J26 Visual Generator (cover image, infographic)
```

Suited to tenants with voice/podcast products (Maariv-Audio, BBC Sounds, NYT Audio).

---

## 🗣️ J23 · TTS (Text-to-Speech)

**Provider options**:
- ElevenLabs Pro (Hebrew voice cloning, $0.30/1000 chars)
- Google Cloud TTS WaveNet ($0.016/1000 chars)
- Azure Neural Voice ($0.016/1000 chars)
- OpenAI TTS-1-HD ($0.030/1000 chars)

**Recommendation**: Google Cloud TTS for Hebrew (lowest listed price, tied with Azure, and good quality); ElevenLabs for English (natural-sounding).

**Per-tenant voice**:
- Maariv: ElevenLabs voice "Itay-male-news"
- BBC: ElevenLabs voice "James-male-news"
- NYT: OpenAI voice "echo"

**Output**: MP3 64kbps (~0.5MB per minute), uploaded to R2 `master-jason-audio/{tenant}/{flow_id}.mp3`.
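The object path and the expected file size both follow mechanically from the output spec. A rough sketch, assuming constant-bitrate encoding and ignoring ID3 tags and frame headers; the function names and the example tenant slug are illustrative only.

```python
def audio_path(tenant: str, flow_id: str) -> str:
    """Build the R2 path for a rendered article, following the template above."""
    return f"master-jason-audio/{tenant}/{flow_id}.mp3"


def mp3_size_bytes(duration_sec: float, bitrate_kbps: int = 64) -> int:
    """Approximate size of a constant-bitrate MP3 (payload only, no tags)."""
    # kilobits per second -> bits per second -> bytes per second
    bytes_per_sec = bitrate_kbps * 1000 / 8
    return int(duration_sec * bytes_per_sec)


# Hypothetical tenant slug and flow id, for illustration.
example_path = audio_path("maariv", "f-123")
one_minute = mp3_size_bytes(60)
```

At 64 kbps, one minute of audio comes to 480,000 bytes, which is a useful sanity check on storage and bandwidth estimates.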

**Streaming**: supported. TTS output is emitted as audio chunks that the client can start playing as they arrive (for example through the Web Audio API).

**Cost**: 0.0048$ per 300 characters (Hebrew via Google, at 0.016$/1000 chars); a 300-word article contains more characters and costs proportionally more · 4-8s
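The per-character rates in the provider list turn into a per-request cost with one multiplication. A minimal sketch, assuming billing is strictly per input character; the provider keys and both function names are illustrative and not part of any vendor SDK.

```python
# USD per 1,000 characters, copied from the provider list above.
PRICE_PER_1K_CHARS = {
    "elevenlabs_pro": 0.30,
    "google_wavenet": 0.016,
    "azure_neural": 0.016,
    "openai_tts_1_hd": 0.030,
}


def tts_cost_usd(text: str, provider: str) -> float:
    """Estimate synthesis cost for `text`, assuming per-character billing."""
    return len(text) * PRICE_PER_1K_CHARS[provider] / 1000


def pick_provider(language: str) -> str:
    """Apply the recommendation above: Google for Hebrew, ElevenLabs otherwise.

    Treating every non-Hebrew language like English is an assumption.
    """
    return "google_wavenet" if language == "he" else "elevenlabs_pro"
```

With these rates, 300 characters through `google_wavenet` cost about 0.0048$, and the same input through `elevenlabs_pro` costs about 0.09$.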

**A2UI**:
```json
{
  "id": "audio-player",
  "component": "AudioPlayer",
  "src": "{path: '/audio/url'}",
  "duration": "{path: '/audio/duration_sec'}"
}
```
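The `{path: '...'}` strings in the component read as bindings into a data model. The sketch below shows one way such bindings could be resolved; it assumes the path is a slash-separated pointer into a nested dict, which is an interpretation and not a documented A2UI contract. `resolve`, `render`, and the example URL are hypothetical.

```python
import re
from typing import Any

# Matches binding strings of the form {path: '/a/b'} as used in the component above.
_BINDING = re.compile(r"^\{path:\s*'(/[^']*)'\}$")


def resolve(value: Any, model: dict) -> Any:
    """Return the model value a binding string points at, or `value` unchanged."""
    match = _BINDING.match(value) if isinstance(value, str) else None
    if match is None:
        return value
    node: Any = model
    for part in match.group(1).strip("/").split("/"):
        node = node[part]  # raises KeyError if the model lacks the path
    return node


def render(component: dict, model: dict) -> dict:
    """Resolve every top-level property of a component against the model."""
    return {key: resolve(val, model) for key, val in component.items()}


component = {
    "id": "audio-player",
    "component": "AudioPlayer",
    "src": "{path: '/audio/url'}",
    "duration": "{path: '/audio/duration_sec'}",
}
model = {"audio": {"url": "https://example.invalid/a.mp3", "duration_sec": 95}}
rendered = render(component, model)
```

Plain values such as `id` and `component` pass through untouched, so the same function can be applied to any component definition.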

---

## 🎙️ J24 · Podcast Intro (Audio Collage)

**Role**: Produce a 30-60 second podcast intro from a press release.

**Components**:
1. TTS narrator (J23 output, intro paragraph)
2. Background music (low gain, royalty-free pool)
3. SFX accent (whoosh on transition)
4. Brand jingle (~3 sec, tenant-specific)

**Algorithm**:
```
audio = mix([
  brand_jingle.fade_in(0.5s),
  narrator.delay(1.5s).fade_in(0.3s),
  bg_music.gain(-18dB).loop_for(narrator.duration),
  sfx_whoosh.at(narrator.duration - 0.5s)
])
```

**Tools**: pydub or ffmpeg (kalfantop ffmpeg server already on EX63 ✅).
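To make the layering in the algorithm concrete, here is a library-free sketch that mixes mono float samples held in plain lists. It is not the pydub or ffmpeg implementation; it assumes every offset is measured from the start of the mix, that dB gain maps to amplitude as `10 ** (dB / 20)`, and that the sum is hard-clipped to [-1, 1]. All function names are illustrative.

```python
from typing import List, Tuple

Samples = List[float]  # mono PCM, nominally in [-1.0, 1.0]


def gain(samples: Samples, db: float) -> Samples:
    """Scale amplitude by a decibel value (-18 dB is roughly 0.126x)."""
    factor = 10 ** (db / 20)
    return [s * factor for s in samples]


def fade_in(samples: Samples, n: int) -> Samples:
    """Linear ramp from silence over the first n samples."""
    n = min(n, len(samples))
    return [s * (i / n) if i < n else s for i, s in enumerate(samples)]


def loop_for(samples: Samples, length: int) -> Samples:
    """Repeat a clip until it covers `length` samples, then trim."""
    if not samples:
        return [0.0] * length
    return (samples * (length // len(samples) + 1))[:length]


def mix(layers: List[Tuple[int, Samples]]) -> Samples:
    """Sum (offset, samples) layers and hard-clip the result."""
    total = max(offset + len(s) for offset, s in layers)
    out = [0.0] * total
    for offset, s in layers:
        for i, v in enumerate(s):
            out[offset + i] += v
    return [max(-1.0, min(1.0, v)) for v in out]


def podcast_intro(jingle: Samples, narrator: Samples, bg_music: Samples,
                  whoosh: Samples, rate: int = 44100) -> Samples:
    """Assemble the four layers with the timings from the algorithm above."""
    def sec(t: float) -> int:
        return int(t * rate)

    return mix([
        (0, fade_in(jingle, sec(0.5))),
        (sec(1.5), fade_in(narrator, sec(0.3))),
        (0, loop_for(gain(bg_music, -18), len(narrator))),
        (max(0, len(narrator) - sec(0.5)), whoosh),
    ])
```

Pure-Python lists are far too slow for production audio; the point is only to pin down what each step in the pseudocode does before handing it to pydub or ffmpeg.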

**Cost**: 0.0001$ + 2-3s rendering

**Output**: 30-60s MP3 in R2.

---

## 🔊 J25 · Sound Design (SFX + Ambient)

**Role**: Select a soundscape for the article.

**Mapping by desk**:
| Desk | Soundscape |
|---|---|
| Breaking | sharp electronic ping + tension |
| Security | drone + low rumble |
| Politics | distant chatter + papers shuffling |
| Sports | crowd murmur + whistle |
| Tech | digital hum + keyboard clicks |
| Entertainment | upbeat synth + chime |

**Library**: Free Sound Pack in R2 `master-jason-audio/sfx/{desk}/{variant}.mp3` (~50 files).
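Because the floor is template-based, selection reduces to a dictionary lookup plus a path build. A minimal sketch, assuming the `{desk}` path segment is the lowercase desk name and that variants are identified by short strings such as `"01"`; both conventions are guesses, as is the function name.

```python
# Desk -> soundscape description, copied from the mapping table above.
SOUNDSCAPES = {
    "breaking": "sharp electronic ping + tension",
    "security": "drone + low rumble",
    "politics": "distant chatter + papers shuffling",
    "sports": "crowd murmur + whistle",
    "tech": "digital hum + keyboard clicks",
    "entertainment": "upbeat synth + chime",
}


def sfx_path(desk: str, variant: str) -> str:
    """Return the R2 path of a soundscape file for a known desk."""
    slug = desk.strip().lower()  # assumed slug convention
    if slug not in SOUNDSCAPES:
        raise KeyError(f"no soundscape mapped for desk {desk!r}")
    return f"master-jason-audio/sfx/{slug}/{variant}.mp3"
```

Raising on an unmapped desk keeps a missing mapping visible instead of silently shipping an article with the wrong ambience.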

**Cost**: 0$ (template-based)

---

## 🎨 J26 · Visual Generator

**Role**: Generate a cover image when the press release did not arrive with one.

**Modes**:

1. **From archive**: pgvector similarity lookup that reuses the cover of a similar past article (royalty-cleared).
2. **AI-generated**: 
   - Imagen 4 (Google) — 0.04$ per image
   - SDXL Turbo (open) — 0.001$ per image (slower)
   - DALL-E 3 — 0.04$ per image
3. **Infographic**: D3-rendered SVG with stats from article (free).

**Per-tenant style**:
- Maariv: documentary photography, Israeli-centric
- BBC: clean editorial photography
- NYT: muted color palette, illustrative

**Cost**: 0$ (archive) ↔ 0.04$ (AI) · 200ms ↔ 8s
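The cost and latency spread suggests trying the cheap path first. The sketch below encodes one possible policy: skip the floor when a cover already exists, reuse an archive match when there is one, and fall back to AI generation otherwise. That ordering is an assumption, and `VisualPlan` and `plan_visual` are hypothetical names; the figures come from the cost line above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VisualPlan:
    mode: str           # "archive" or "ai"
    cost_usd: float     # per image
    latency_sec: float  # approximate


def plan_visual(has_cover: bool, archive_hit: bool) -> Optional[VisualPlan]:
    """Pick a generation mode; None means J26 has nothing to do."""
    if has_cover:
        return None
    if archive_hit:
        return VisualPlan("archive", 0.0, 0.2)
    # Worst case from the cost line: an AI-generated image.
    return VisualPlan("ai", 0.04, 8.0)
```

Infographic mode is left out here because the source gives no latency figure for it.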

**A2UI**:
```json
{ "id": "hero", "component": "Image", "src": "{path: '/visual/url'}", "alt": "..." }
```

---

## ⏱️ Timing

| Floor | Time | Cost |
|---|---|---|
| J23 TTS | 6s | 0.005$ |
| J24 Podcast | 2.5s | 0.0001$ |
| J25 Sound Design | 50ms | 0$ |
| J26 Visual | 200ms-8s | 0$-0.04$ |
| **Parallel max** | **~8s** | **~0.045$** |

Visual (J26) is the bottleneck when Imagen 4 is selected. For real-time use, prefer archive lookup (200ms).
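The "Parallel max" row can be reproduced from the per-floor figures: with all floors running concurrently, wall-clock time is the slowest floor and cost is the sum. A small sketch using the table values, with J26 taken at its worst case; `parallel_budget` is an illustrative name.

```python
# (latency in seconds, cost in USD) per floor; J26 uses the AI-generated worst case.
FLOORS = {
    "J23 TTS": (6.0, 0.005),
    "J24 Podcast": (2.5, 0.0001),
    "J25 Sound Design": (0.05, 0.0),
    "J26 Visual": (8.0, 0.04),
}


def parallel_budget(floors: dict) -> tuple:
    """Wall-clock time is the slowest floor; cost is the sum over all floors."""
    latency = max(t for t, _ in floors.values())
    cost = sum(c for _, c in floors.values())
    return latency, cost


latency_sec, cost_usd = parallel_budget(FLOORS)
```

This yields 8.0s and about 0.0451$. If J24 has to wait for the J23 narrator track, as its component list suggests, the J23 → J24 chain alone takes 6 + 2.5 = 8.5s, slightly above the 8s figure.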

---

## ✅ Closure

✅ Round 16 closed. **Multi-modal = 8s · ~0.045$ (worst case)**.
