Applications open now

Voice AI 101

A practical crash course on how to build voice AI: from pipeline fundamentals and architecture to your first working voice agent.

4 live sessions · Recordings included · $490 · Applications open now

Who it is for

For people who want to understand voice AI as a system, not a set of APIs.

  • Founders who want a system-level understanding of voice AI.
  • Engineers moving into speech, conversational AI, or voice agents.
  • Product builders who need practical intuition about how voice systems actually work.
  • Technical people who want to go beyond simply calling APIs.

Outcomes

You leave with a working map of the voice stack and a first agent.

01. Understand the core voice pipeline: audio -> ASR -> LLM -> tools -> TTS.

02. Learn why latency, interruptions, and turn-taking are central in voice AI.

03. Know the main building blocks of a voice agent.

04. Understand the tradeoffs between local and API-based components.

05. Build a first working voice agent.

06. Develop the vocabulary to discuss voice AI architecture credibly.

Curriculum

4 live sessions: from fundamentals to build-along.

Session 1

Voice AI Fundamentals

  • What voice AI is and how a voice agent differs from a chatbot.
  • The core pipeline: audio -> ASR -> LLM -> TTS.
  • Batch vs realtime systems.
  • Where the real constraints appear: latency, interruptions, noisy input, user experience.
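
To make the pipeline concrete, here is a minimal sketch of one conversational turn passing through the three stages. It is an illustration only, not course material: `transcribe`, `generate_reply`, `synthesize`, and `run_turn` are hypothetical stubs standing in for a real ASR model, LLM, and TTS engine, and the "audio" is just UTF-8 text so the example runs without any speech libraries.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    transcript: str   # what the ASR stage heard
    reply: str        # what the LLM stage decided to say
    audio_out: bytes  # what the TTS stage produced


def transcribe(audio: bytes) -> str:
    """Hypothetical ASR stub: a real system would run a speech model here."""
    return audio.decode("utf-8")  # assumption: the 'audio' is already text


def generate_reply(transcript: str) -> str:
    """Hypothetical LLM stub: simply echoes the user's words."""
    return f"You said: {transcript}"


def synthesize(text: str) -> bytes:
    """Hypothetical TTS stub: a real system would return audio samples."""
    return text.encode("utf-8")


def run_turn(audio: bytes) -> Turn:
    """Run one turn through the pipeline: audio -> ASR -> LLM -> TTS."""
    transcript = transcribe(audio)
    reply = generate_reply(transcript)
    return Turn(transcript, reply, synthesize(reply))
```

Because each stage here waits for the previous one to finish, this is the batch shape of the pipeline. A realtime system would stream partial results between stages, which is where the latency and interruption constraints start to matter.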

Session 2

ASR, TTS, and Voice Stack Components

  • How ASR and TTS work at a practical level.
  • Whisper, faster-whisper, local models, and API-based approaches.
  • Model/service tradeoffs.
  • How to choose a stack based on product constraints rather than hype.

Session 3

Agents, Tools, and Realtime Architecture

  • The LLM as the reasoning layer of a voice agent.
  • Tool use, function calling, memory, and orchestration.
  • Streaming, turn-taking, and interruptions.
  • How a real voice agent architecture works beyond a neat diagram.
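
As a rough illustration of interruption handling (often called barge-in), the sketch below cancels the agent's playback as soon as the user starts speaking. Everything in it is an assumption made for the example: `speak` and `demo_barge_in` are hypothetical names, the sleeps stand in for audio playback time, and a fixed delay stands in for a real voice-activity detector.

```python
import asyncio


async def speak(chunks, played):
    """Hypothetical playback loop: 'plays' one TTS chunk at a time."""
    for chunk in chunks:
        await asyncio.sleep(0.05)  # stands in for the duration of one audio chunk
        played.append(chunk)


async def demo_barge_in():
    played = []
    playback = asyncio.create_task(speak(["a", "b", "c", "d"], played))
    # Assumption: the user starts talking partway through the reply.
    await asyncio.sleep(0.125)
    playback.cancel()  # barge-in: stop speaking once the user is heard
    try:
        await playback
    except asyncio.CancelledError:
        pass  # playback was cut short on purpose
    return played
```

Running `asyncio.run(demo_barge_in())` returns only the chunks played before the interruption. A real agent would also need to record how much of the reply was actually heard, so the LLM's view of the conversation matches what the user experienced.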

Session 4

Build-Along and First Voice Agent

  • Build a simple end-to-end voice agent.
  • Review common failure points and debugging patterns.
  • Discuss iteration, productization, and next steps after the course.
  • Understand what it takes to move from prototype to product.

Included

Short format, practical depth, founder-led teaching.

  • 4 live sessions
  • Recording access
  • Practical notes and supporting materials
  • Example stack references
  • Implementation guidance
  • Architecture and stack breakdowns
  • Application review and founder follow-up after submission

Application

Tell me what you want to build in voice AI.

The form helps identify applicant fit, technical background, motivation, and the right founder-led follow-up.

$490

FAQ

Short answers before you apply.

Who is this course for?

Founders, engineers, PMs, and technical builders who want a system-level understanding of voice AI.

Do I need prior speech AI experience?

No prior speech AI expertise is required. Technical curiosity and willingness to reason about architecture are expected.

Will there be recordings?

Yes. Recordings are included so you can revisit the material after the live sessions.

Is this technical or no-code?

It is practical and technical. The goal is to understand the stack and its constraints and to build a first agent, not just to click through no-code tools.

What will I be able to build after it?

A simple working voice agent and a stronger architecture base for future prototypes.

How is this different from generic AI courses?

The focus is voice-specific system understanding: latency, turn-taking, interruptions, ASR/TTS, realtime pipelines, and production tradeoffs, not generic LLM prompting.