Convo AI World

A podcast from Agora

Explore voice-first conversational AI through honest conversations with practitioners. Hear from AI builders, infra engineers, product strategists, and more for the latest insights on what it takes to build best-in-class conversational AI experiences.

AI Practitioners
Industry Insights
Technical Deep Dives
Best Practices

Listen on Your Favorite Platform

Available on all major podcast platforms

10 episodes
00:39:14

Real-Time Avatars, Translation, and Visual Storytelling with Akool's Jeff Lu

In this episode of the Convo AI World podcast, Hermes Frangoudis interviews Jeff Lu from Akool, a company revolutionizing video generation technology. They discuss Akool's origin, its innovative approach to visual storytelling, and the various use cases of its technology in marketing, internal communications, and more. Jeff shares insights on the challenges of balancing quality and cost in video generation, the importance of real-time inference, and advancements in video translation. The conversation also touches on Akool's strategy for staying ahead in the rapidly evolving generative AI landscape and the future of creativity in content creation.

#video generation #akool
00:39:39

AI at the Edge: 6G, Arabic LLMs & the Middle East’s AI Leap with Mérouane Debbah

In this episode of the Convo AI World podcast, we dive deep into the future of AI, telecom, and the evolving role of conversational interfaces with Prof. Mérouane Debbah, Founding Director of the Khalifa University 6G Research Center and one of the leading minds behind the Arab world’s first large language models, Noor and Falcon.

#6g #arabic llms
00:51:12

The Voice AI and VR Revolution in Heavy Machinery with Carbon Origins' Amogha

In this episode of the Convo AI World Podcast, Hermes Frangoudis interviews Amogha Srirangarajan, Co-founder and CEO of Carbon Origins. They discuss the evolution of Carbon Origins from last-mile delivery robots to heavy machinery teleoperation, the integration of voice AI and VR in enhancing operator experiences, and the future of robotics in construction and space mining. Amogha shares insights on the challenges of labor shortages in critical industries and how Carbon Origins aims to address these through innovative technology and partnerships. The conversation also touches on ambitious plans for energy solutions and space exploration, highlighting the potential of robotics in shaping the future of human civilization.

#carbon origins #heavy machinery
00:33:38

Open-Source Voice Activity Detection with TEN Framework's Ziyi Lin

Ziyi Lin, speech engineer on the TEN Framework team, joins the Convo AI World podcast to explore the design and impact of a new open-source Voice Activity Detection (VAD) model. The episode covers the challenges of existing VAD solutions, the importance of high-quality training data, and the design choices that led to improved performance metrics. Ziyi explains how VAD functions as a critical component in conversational AI, how it manages real-time processing and latency, and the advantages of deploying it on edge devices.

#voice activity detection #VAD
00:40:42

Building AI Community with Voice AI Space

Thibault Mardinli (T-Bot) from Voice AI Space joins to discuss the evolution of Voice AI communities and ecosystems. Hermes and Thibault explore Thibault's journey from building a Voice AI startup to creating an open resource platform, the challenges of discoverability in the fragmented Voice AI landscape, and the democratization of AI expertise through visual interfaces. The conversation covers the spectrum of Voice AI companies from infrastructure to UX-focused products, adoption in emerging markets, privacy considerations, and the future of voice-first interfaces. Thibault shares insights on building global communities, curating quality resources, and the grassroots movement powering Voice AI innovation.

#voice ai #conversational ai
01:06:02

The Science Behind AI Speech Recognition with Deepgram's Andrew Seagraves

Deepgram's VP of Research Andrew Seagraves joins to explore the science and engineering behind modern speech recognition systems. Hermes and Andrew dive deep into why speech recognition isn't a solved problem, the two-stage training process of speech-to-text models, and the challenges of balancing real-time latency with accuracy. The conversation covers Deepgram's origins from dark matter research, power laws in speech data, buffer-based architectures for real-time transcription, and frontier challenges like multilingual code-switching, emotion detection, and conversational dynamics. Andrew shares insights on model deployment, customer use cases from NASA to food ordering, and the future of self-adapting speech models.

#speech recognition #deepgram
00:37:01

AI Content Moderation with Google's Ninny Wan

Google's Ninny Wan, Product Lead for AI Content Safety, joins to discuss the evolution of AI content moderation in the age of GenAI. The conversation covers Google's approach to semantic understanding, multilingual moderation across 140+ languages, synthetic data generation for training, and the balance between user freedom and safety. Ninny shares insights on transformer models, human-in-the-loop processes, cross-functional safety reviews, and Google's on-device privacy-compliant features like sensitive content warnings.

#ai content moderation #google
00:44:29

Interactive Digital Avatars with Trulience's Richard Bowdler

Trulience's Head of Growth Richard Bowdler joins to discuss the world of interactive digital avatars and conversational AI. Hermes and Richard explore how Trulience creates lifelike avatars, the technology behind real-time client-side rendering, multilingual support, and real-world applications from healthcare to customer service. The conversation covers the evolution from capture cages to modern avatar creation, competitive advantages in scalability, and the democratization of AI expertise through visual interfaces.

#interactive avatars #trulience
00:30:47

Real-Time Translation with Palabra's Artem Kukharenko and Ivan Kuzin

In this episode, Palabra's Artem Kukharenko (Co-Founder) and Ivan Kuzin (Head of Business Development) join to discuss the Palabra real-time speech-to-speech translation technology, the inspiration behind Palabra, common misconceptions about AI translation, the balance between latency and accuracy, and the challenges of voice cloning and intonation. The conversation also covers the applications of their technology, user feedback, differentiation in a competitive market, privacy and data security, benchmarking, developer experience, and future advancements in AI and speech translation.

#real-time translation #palabra
00:31:08

Introduction to Conversational AI with Agora's Ben Weekes

Agora's Ben Weekes joins to discuss the world of voice-first conversational AI. Hermes and Ben delve into the differences between voice and chat-based systems, explore the real-world applications of conversational AI, and break down the technology stack involved in creating effective voice agents. The conversation also touches on virtual avatars, infrastructure challenges, and the various conversational AI frameworks available for developers.

#conversational ai #voice ai

Convo AI Newsletter

Subscribe to stay up to date on what's happening in conversational and voice AI.

✓ Conversational AI news ✓ No spam, ever ✓ Unsubscribe anytime

Convo AI World

Building something innovative in conversational AI? We'd love to hear your story. Reach out to share your insights with our community of builders, developers, and AI practitioners.

© 2025 Convo AI World
Made by Agora
All rights reserved