News of the day

1. Microsoft's VibeVoice-Realtime offers low-latency TTS for agents and live narration, generating speech in ~300 ms from streaming text.

2. OpenAI and NORAD launch three custom GPTs: Elf Enrollment for portraits, Toy Lab for coloring pages, and Story Creator for tales.

3. UK GP surgeries are adopting AI to improve patient care by modernizing phone answering, automating appointments, and assisting clinicians with guidance.

4. Explore overlooked tech, climate challenges in the Arab region, media mergers, and AI's impact on research and online platforms.

Our take

Hi Dotikers!

Microsoft just dropped VibeVoice Realtime, an open-source, real-time text-to-speech model that can start talking in about 300 milliseconds. In a market dominated by ElevenLabs and OpenAI, which jealously guard their proprietary tech, Redmond shows up with an MIT-licensed model anyone can download and tinker with.

The approach is clever: while an LLM generates its response token by token, VibeVoice Realtime is already speaking. No more waiting for the reasoning to finish before hearing the first word. For conversational agents and voice interfaces, this is exactly what the open-source ecosystem was missing.
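To make the idea concrete, here is a minimal Python sketch of the general pattern: text is handed to the speech engine in small chunks while the LLM is still producing tokens. This is not VibeVoice's actual API. `fake_llm_tokens`, `chunk_for_speech`, `speak_stream`, and the `synthesize` callback are hypothetical names invented for illustration, and the chunking rule (flush after 8 characters) is an arbitrary assumption.

```python
from typing import Callable, Iterable, Iterator


def fake_llm_tokens() -> Iterator[str]:
    """Hypothetical stand-in for an LLM streaming its reply token by token."""
    for token in ["Hello", " there", ",", " how", " can", " I", " help", "?"]:
        yield token


def chunk_for_speech(tokens: Iterable[str], min_chars: int = 8) -> Iterator[str]:
    """Group tokens into small text chunks, yielding each one as soon as it is ready."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Assumed rule: flush once the buffer holds at least min_chars characters.
        if len(buffer) >= min_chars:
            yield buffer
            buffer = ""
    if buffer:
        # Flush whatever is left when the token stream ends.
        yield buffer


def speak_stream(tokens: Iterable[str], synthesize: Callable[[str], str]) -> Iterator[str]:
    """Pass each chunk to a (hypothetical) TTS callback without waiting for the full reply."""
    for chunk in chunk_for_speech(tokens):
        yield synthesize(chunk)
```

Because everything is a generator, the first chunk reaches `synthesize` after only two tokens have been produced. That is the whole point: the time to first audio depends on the first few tokens, not on the length of the full answer.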

On the technical side, Microsoft stacks a 500-million-parameter Qwen model with an acoustic decoder and a diffusion head, totaling around 1 billion parameters. The whole thing represents audio at just 7.5 Hz (7.5 frames per second), a ridiculously low frame rate that enables up to 10 minutes of fluid speech generation. Benchmarks show a word error rate of about 2% on LibriSpeech, playing in the same league as the heavy hitters.
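A quick back-of-the-envelope calculation shows why the low frame rate matters for long outputs. The 7.5 Hz figure and the 10-minute limit come from the numbers above; the 50 Hz comparison rate is an assumption chosen purely for illustration, not a measured figure for any specific competing model.

```python
FRAME_RATE_HZ = 7.5            # frames per second, as quoted above
MAX_SPEECH_SECONDS = 10 * 60   # 10 minutes of speech

# Assumed comparison point, for illustration only.
ASSUMED_HIGHER_RATE_HZ = 50.0


def frames_needed(seconds: float, frame_rate_hz: float) -> int:
    """Number of audio frames the model must generate for a clip of this length."""
    return round(seconds * frame_rate_hz)


low_rate_frames = frames_needed(MAX_SPEECH_SECONDS, FRAME_RATE_HZ)            # 4500
high_rate_frames = frames_needed(MAX_SPEECH_SECONDS, ASSUMED_HIGHER_RATE_HZ)  # 30000
```

At 7.5 Hz, ten minutes of audio is only 4,500 frames for the model to keep track of, versus 30,000 at the assumed 50 Hz. A shorter sequence is much easier to keep coherent over a long stretch of speech.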

That said, VibeVoice Realtime remains a research tool. It is English-only with a single fixed voice, and Microsoft actually pulled the project back in September after discovering it was being misused for deepfake creation. The model returns today with guardrails: built-in audio watermarking and automatic disclaimers.

This is the kind of release that feels good. When APIs charge a dollar per minute, having a free and modifiable alternative changes the game for developers who want to experiment without mortgaging their cloud budget.

G.
