Social Media Examiner

Your Guide to the Marketing Jungle

    AI Voice Agents: How to Get Started

    by Michael Stelzner / January 20, 2026

    Wondering if AI voice agents could improve and scale your customer service? Want to know what it takes to implement AI voice assistants in your business?

    In this article, you'll discover how to deploy AI voice agents that handle real customer interactions while avoiding common pitfalls.

    This article was co-created by Tommy Chryst and Michael Stelzner. For more about Tommy, scroll to the end of this article.

    How AI Voice Agents Help Businesses

    AI voice agents cost eight to twelve cents per minute. Compare that to human phone staff, and you'll almost always see a clear ROI.

    At approximately ten cents per minute, use cases that were previously economically impossible suddenly become viable, enabling proactive customer service, extensive lead follow-up, and large-scale reactivation campaigns that would never work with human staff.
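
    To make the economics concrete, here is a minimal sketch of the per-call arithmetic. Only the eight-to-twelve-cent per-minute range comes from the figures above; the three-minute call length and the $20 hourly wage are illustrative assumptions, not numbers from Chryst.

```python
# Rough per-call cost comparison. Only the agent's per-minute rate range comes
# from the article; call length and human wage are illustrative assumptions.

AGENT_RATE_LOW = 0.08   # dollars per minute (from the article)
AGENT_RATE_HIGH = 0.12  # dollars per minute (from the article)


def agent_call_cost(minutes: float, rate_per_minute: float) -> float:
    """Cost of one AI-handled call at a flat per-minute rate."""
    return round(minutes * rate_per_minute, 2)


def human_call_cost(minutes: float, hourly_wage: float) -> float:
    """Labor cost of one human-handled call, ignoring overhead and idle time."""
    return round(minutes * hourly_wage / 60, 2)


if __name__ == "__main__":
    call_minutes = 3.0          # assumed average call length
    assumed_hourly_wage = 20.0  # assumed wage, for illustration only
    print("agent (low): ", agent_call_cost(call_minutes, AGENT_RATE_LOW))
    print("agent (high):", agent_call_cost(call_minutes, AGENT_RATE_HIGH))
    print("human:       ", human_call_cost(call_minutes, assumed_hourly_wage))
```

    Swap in your own call length and staffing cost to see where the break-even point falls for your business.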

    Beyond the core conversation loop, you can integrate different business functions. The agent can update your CRM, log calls into Google Sheets, or send emails based on the conversation content. This is where voice agents become true business tools rather than just talking systems.

    The businesses implementing voice agents now gain significant competitive advantages in customer service, lead qualification, and operational efficiency.

    The technology is ready. The question is whether you're ready to implement it. Below, you’ll discover what you’ll need to build each part of your AI voice agent for phone calls.

    How AI Voice Agents Work

    A voice agent consists of three core conversation components that work in unison within an AI voice agent platform. Together, they create an interactive agent that can respond dynamically to customer questions and even handle objections.

    The Ears: The ears are speech-to-text technology. This component transcribes whatever the person on the other line says and converts it to text.

    The Brain: The brain is an LLM (large language model) like GPT. This component is text-to-text. It takes the transcribed response, runs it through whatever instructions you've given the agent, and then outputs what you want it to respond with.

    The Mouth: The mouth is text-to-speech technology. This component gives voice to the text the brain generated and speaks it back to the human. The entire process happens within about a second.
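
    The ears, brain, and mouth loop can be sketched as a single function that passes data through three stages. This is a simplified illustration, not how any particular platform implements it: the `fake_` functions are hypothetical stand-ins that only mimic the shape of each stage, and a real agent would call a speech-to-text, LLM, and text-to-speech provider instead.

```python
from typing import Callable

# Each stage is passed in as a plain function so any provider can be swapped in.
# The stand-in implementations below are placeholders, not real provider APIs.
SpeechToText = Callable[[bytes], str]
Brain = Callable[[str, str], str]
TextToSpeech = Callable[[str], bytes]


def handle_turn(audio_in: bytes, instructions: str,
                ears: SpeechToText, brain: Brain, mouth: TextToSpeech) -> bytes:
    """Run one conversational turn: caller audio in, agent audio out."""
    transcript = ears(audio_in)              # ears: speech-to-text
    reply = brain(instructions, transcript)  # brain: text-to-text
    return mouth(reply)                      # mouth: text-to-speech


def fake_ears(audio: bytes) -> str:
    """Pretend transcription: treat the bytes as already-encoded text."""
    return audio.decode("utf-8")


def fake_brain(instructions: str, transcript: str) -> str:
    """Pretend LLM: echo the caller instead of generating a real reply."""
    return f"You said: {transcript}"


def fake_mouth(text: str) -> bytes:
    """Pretend speech synthesis: encode the reply text as bytes."""
    return text.encode("utf-8")


print(handle_turn(b"What are your hours?", "Be brief.",
                  fake_ears, fake_brain, fake_mouth))
```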

    How Latency Affects Your AI Agent’s Performance

    The ear, brain, and mouth layers of voice agents each contribute to latency:

    • Speech-to-text typically adds one hundred to two hundred milliseconds of latency.
    • LLM latency varies significantly. For example, using ChatGPT 5.2 versus 5.2 Nano can create a difference of three hundred to four hundred milliseconds.
    • Text-to-speech (the voice) adds three hundred to four hundred milliseconds.

    Human response time in conversation typically ranges from eight hundred milliseconds to a second. You want your agent to respond within that range, so balancing the combined latency between the models is key to optimal performance.
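
    A quick way to reason about this is to add up a latency budget for each model combination you are considering. In the sketch below, the speech-to-text and text-to-speech numbers are midpoints of the ranges quoted above, while the two LLM latencies are assumed values chosen only to show a passing and a failing stack.

```python
# Latency budget check. The speech-to-text and text-to-speech figures are
# midpoints of the ranges quoted above; the LLM figures are assumed values.

RESPONSE_CEILING_MS = 1000  # upper end of the 800 ms to 1 s window


def total_latency_ms(stt_ms: int, llm_ms: int, tts_ms: int) -> int:
    """Sum the three stages, ignoring network and telephony overhead."""
    return stt_ms + llm_ms + tts_ms


def fits_budget(total_ms: int, ceiling_ms: int = RESPONSE_CEILING_MS) -> bool:
    """True when the combined latency stays within the response ceiling."""
    return total_ms <= ceiling_ms


fast_stack = total_latency_ms(stt_ms=150, llm_ms=400, tts_ms=350)
slow_stack = total_latency_ms(stt_ms=150, llm_ms=750, tts_ms=350)
print(fast_stack, fits_budget(fast_stack))  # 900 True
print(slow_stack, fits_budget(slow_stack))  # 1250 False
```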

    #1: Choose Your AI Voice Agent Tech Stack: Recommendations

    No-Code Voice Agent Platforms

    A no-code voice agent platform provides the infrastructure to choose and connect your speech-to-text (ears), LLM (brain), and text-to-speech (mouth) models. You simply choose each model from the relevant options. Once you return to the agent screen, you'll see the total cost and total expected latency for your agent. You can experiment with different model combinations to find the optimal balance of performance and cost.

    The three main platforms for building no-code voice agents are Retell AI, Vapi, and ElevenLabs' own agent builder; they bring together the ears, mouth, and brain components you choose so you don't have to write code.

    All three platforms are entry-level tools that let you create a free account to get a demo up and running.

    Tommy Chryst favors Retell AI for almost every project due to its user experience and the company’s track record with uptime. When you're running calls 24/7 for your business, it's critical that the infrastructure doesn't go down.

    The Ears: Speech-to-Text Model

    These platforms offer industry-specific transcription modes, such as medical terminology, so they can recognize and transcribe specialized vocabulary that standard transcription wouldn't normally handle correctly.

    You can also configure custom keywords in your transcription settings. For example, if the AI mispronounces or mistranscribes your company name or product terms, you can add these to a recognition library so they're transcribed correctly every time.
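
    Platforms handle custom keywords inside their own transcription settings. Purely as an illustration of the idea, the sketch below applies a recognition library as a text replacement after transcription, which is a simpler technique than what a speech model does internally. The mis-transcriptions listed are hypothetical examples.

```python
import re

# Hypothetical mis-transcriptions mapped to the intended term.
CORRECTIONS = {
    "retail ai": "Retell AI",
    "deep gram": "Deepgram",
}


def apply_corrections(transcript: str, corrections: dict) -> str:
    """Replace known mis-transcriptions, matching whole phrases case-insensitively."""
    for wrong, right in corrections.items():
        pattern = re.compile(rf"\b{re.escape(wrong)}\b", re.IGNORECASE)
        transcript = pattern.sub(right, transcript)
    return transcript


print(apply_corrections("We use retail AI with deep gram.", CORRECTIONS))
```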

    Be aware that some voice agent platforms may not allow you to switch away from their default speech-to-text provider. When you can choose, Chryst recommends Deepgram.

    The Brain: LLM Model

    Don't assume that the newest model of an LLM will work better for your voice agent. When a brand-new LLM launches, the initial traffic surge from users eager to test it can make its performance inconsistent.

    Chryst's agency primarily uses ChatGPT. He says GPT-4o performs consistently, and notes that GPT-5.1 is getting more consistent now that GPT-5.2 has launched and taken some of the traffic load off 5.1.

    The Mouth: Text-to-Speech Model

    ElevenLabs has long been the leader in text-to-speech. However, other companies are catching up.

    Cartesia represents a strong alternative. Its newest voice model performs exceptionally well for voice AI applications: it is faster and less expensive than ElevenLabs and delivers comparable sound quality.

    #2: Decide How to Use Your AI Voice Agent: Inbound or Outbound Calls

    Voice agents enable use cases in two ways: by replacing existing business functions and by creating entirely new ones that wouldn't be financially viable for a human to handle but are still valuable to customers and the business.

    Consider phone calls. Phone calls fall into two fundamental categories: inbound and outbound. Either a call is coming into your business, or your business is making a call. How can an AI voice agent help?

    Inbound Use Cases

    The main inbound use case covers receptionist duties and customer support. These functions end up being similar across businesses. Voice agents can handle FAQs by accessing a knowledge base with any extra information you want the agent to know.

    Start by analyzing your business. Figure out what types of calls you receive. You might receive several different types: calls to book meetings, FAQs, questions about hours, and pricing inquiries. Build in the responses to each type. Don't create a deterministic system—instead, identify your common call types and enable the agent to answer those questions and take appropriate actions.

    Appointment scheduling is another primary use case for inbound call agents. It works well because it's straightforward: the caller requests a specific time, and the agent books it.

    Outbound Use Cases

    Outbound applications are more interesting because people think about them less, even though they can be equally valuable.

    Follow-up calls represent a common outbound use case. Consider an e-commerce platform or shipping company dealing with porch pirates around the holiday season. The business wants to call customers right before their package is picked up or delivered to make sure it doesn't sit outside unattended. Hiring a call center or team of humans to make these calls wouldn't be viable because the ROI isn't there. But when a short call costs only about ten cents, it becomes worth doing.

    Reactivation campaigns offer another powerful application. One car wash had a huge list of people who had churned or were previous customers. They wanted to offer a summer sale, dropping their price from $37 to $19 a month. Having a human call thousands of leads might take a month or two, but an AI voice agent can make those calls—hundreds or thousands a day—and pitch the new reactivation campaign.

    Pro Tip: Research the Legal Considerations for AI Voice Agents: The Telephone Consumer Protection Act (TCPA) restricts unsolicited telemarketing calls, texts, and faxes by requiring consumer consent and setting rules for how businesses can contact consumers by phone and through messaging systems. Be sure your voice agent use is compliant.

    #3: Plan Your AI Voice Agent Before You Build

    After you identify your use case, complete a discovery process and map out the specifics of how you want the voice agent to operate and what its knowledge base should contain.

    For example, if you want to use a voice agent to take over customer support that currently relies on a form submission system, consider this approach.

    Analyze the past month of support tickets and categorize them. Common questions might include:

    • When will my order ship?
    • When will my order arrive?
    • What are your hours?
    • Is this product suitable for vegetarians?
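
    One simple way to run that analysis is to tally tickets by category. The sketch below uses plain keyword matching, and the keyword rules and category names are hypothetical; a real analysis would start from your own ticket export and likely need more careful rules.

```python
from collections import Counter

# Hypothetical keyword rules; the first category with a matching keyword wins.
CATEGORY_KEYWORDS = {
    "shipping": ["ship"],
    "delivery": ["arrive", "deliver"],
    "hours": ["hours", "open"],
    "product": ["suitable", "ingredient", "vegetarian"],
}


def categorize(ticket: str) -> str:
    """Return the first category whose keyword appears in the ticket text."""
    text = ticket.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


def tally(tickets: list) -> Counter:
    """Count how many tickets fall into each category."""
    return Counter(categorize(ticket) for ticket in tickets)


sample_tickets = [
    "When will my order ship?",
    "When will my order arrive?",
    "What are your hours?",
    "Is this product suitable for vegetarians?",
    "Can I get a refund?",
]
print(tally(sample_tickets).most_common())
```

    The largest categories are the conversation paths worth designing first, and anything that lands in "other" is a candidate for a human handoff.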

    Next, determine what conversation paths you need to design: how the agent should respond in each scenario and what actions it should take. Then identify the resources and systems your agent will need access to, such as clear SOPs to guide responses or your fulfillment system to access order details.

    Keep your prompts simple, conversational, and casual, and remember to configure your agent to ask one question at a time. Otherwise, it'll ask six questions at once, and callers won't know where to start.

    Build in confirmation steps. Have the agent repeat back to the customer what it heard and get confirmation before taking action. The agent might say: “So what I'm hearing is this is your problem. Did I get that correct?” If they say yes, proceed. If they say no, ask them to add more information.
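
    The confirm-before-acting pattern can be sketched as a small gate around whatever action the agent is about to take. In a real agent the LLM interprets the caller's reply; the fixed list of affirmative phrases here is a simplifying assumption, and the function names are hypothetical.

```python
# Simplified stand-in for the LLM's judgment of whether the caller agreed.
AFFIRMATIVE = {"yes", "yeah", "yep", "correct", "right", "that's right"}


def confirmation_prompt(summary: str) -> str:
    """Build the read-back line the agent says before acting."""
    return f"So what I'm hearing is {summary}. Did I get that correct?"


def is_confirmed(caller_reply: str) -> bool:
    """True only for a short, clearly affirmative reply."""
    normalized = caller_reply.strip().lower().rstrip(".!")
    return normalized in AFFIRMATIVE


def act_if_confirmed(caller_reply: str, action):
    """Run the action only after confirmation; otherwise return None."""
    if is_confirmed(caller_reply):
        return action()
    return None


print(confirmation_prompt("your order arrived damaged"))
print(act_if_confirmed("Yes.", lambda: "ticket opened"))
print(act_if_confirmed("No, it never arrived", lambda: "ticket opened"))
```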

    As you build your prompt, consider the three core AI voice agent function stages:

    Pre-Call Functions: These run before the conversation starts. For example, you can look up the caller's phone number in your database and greet them by name if they're a returning customer.
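
    As a rough sketch of a pre-call lookup, the example below normalizes the incoming number and picks a greeting. The in-memory `CUSTOMERS` dictionary and its sample record are hypothetical; in practice the lookup would query your own database or CRM.

```python
import re

# Hypothetical customer records keyed by digits-only phone number.
CUSTOMERS = {"15555550123": "Dana"}


def normalize_phone(raw: str) -> str:
    """Strip everything except digits so formatting differences don't matter."""
    return re.sub(r"\D", "", raw)


def opening_line(caller_number: str, customers: dict) -> str:
    """Greet returning customers by name and everyone else generically."""
    name = customers.get(normalize_phone(caller_number))
    if name:
        return f"Hi {name}, welcome back. How can I help today?"
    return "Hi, thanks for calling. How can I help today?"


print(opening_line("+1 (555) 555-0123", CUSTOMERS))
print(opening_line("+1 (555) 555-0000", CUSTOMERS))
```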

    In-Call Functions: These execute during the conversation. The most common is checking calendar availability and booking appointments. When the caller requests a specific time, the agent checks availability, books it, and confirms the booking.
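
    The check, book, and confirm sequence might look like the sketch below. The `Calendar` class is an in-memory stand-in invented for illustration; a real agent would call whatever calendar integration your platform exposes.

```python
from datetime import datetime


class Calendar:
    """In-memory stand-in for a real calendar integration."""

    def __init__(self):
        self._booked = set()

    def is_available(self, slot: datetime) -> bool:
        return slot not in self._booked

    def book(self, slot: datetime) -> bool:
        """Book the slot if it is free; report whether the booking succeeded."""
        if not self.is_available(slot):
            return False
        self._booked.add(slot)
        return True


def handle_booking_request(calendar: Calendar, slot: datetime) -> str:
    """Check availability, book, and return the line the agent should say."""
    if calendar.book(slot):
        return f"You're booked for {slot:%A, %B %d at %I:%M %p}."
    return "That time is taken. Is there another time that works for you?"


calendar = Calendar()
requested = datetime(2026, 1, 20, 14, 30)
print(handle_booking_request(calendar, requested))  # first caller gets the slot
print(handle_booking_request(calendar, requested))  # second caller is offered another time
```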

    Functions running during the call add complexity. If someone hangs up early, in-call functions might not complete, and the call data might not be captured. It's good practice to think creatively about what can wait until after the call and place as many functions post-call as possible.

    Post-Call Functions: These run after the call ends, such as uploading all the information to a Google Sheet.

    For example, updating a CRM doesn't need to happen while the customer is on the line. Instead, log your call data to a Google Sheet after the call ends. Once it's logged in the Sheet, automatically update your CRM with that information. This keeps conversations flowing naturally while ensuring all your systems stay synchronized.
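
    The log-first, sync-later flow can be sketched as two separate steps. In this example a local CSV file stands in for the Google Sheet and a plain dictionary stands in for the CRM; both are assumptions for illustration, and the field names and sample calls are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["call_id", "caller", "summary"]


def log_call(sheet_path: Path, record: dict) -> None:
    """Append one call record to the CSV, writing the header on first use."""
    is_new = not sheet_path.exists()
    with sheet_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(record)


def sync_to_crm(sheet_path: Path, crm: dict) -> int:
    """Copy every logged call into the CRM stand-in; return rows processed."""
    with sheet_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        crm[row["call_id"]] = {"caller": row["caller"], "summary": row["summary"]}
    return len(rows)


def demo() -> dict:
    """Log two hypothetical calls after they end, then sync them to the CRM."""
    crm = {}
    with tempfile.TemporaryDirectory() as tmp:
        sheet = Path(tmp) / "calls.csv"
        log_call(sheet, {"call_id": "1", "caller": "15555550123", "summary": "Booked a demo"})
        log_call(sheet, {"call_id": "2", "caller": "15555550199", "summary": "Asked about hours"})
        sync_to_crm(sheet, crm)
    return crm


print(demo())
```

    Because logging and syncing are separate steps, a failed CRM update never loses the call record; the sync can simply be run again from the log.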

    Retell and Vapi both store call logs on their own platforms, but you should store your call data in at least two locations; Chryst recommends Google Sheets as one of them. This ensures you have backups and provides easier access for reference after the call.

    #4: Test and Optimize Your AI Voice Agent

    The biggest factor in agent quality is iterative refinement.

    To test your agent, route it straight to your phone. Call it, listen to how it sounds, and check the latency. You're going to iterate on your prompt over time—it's an expected part of the process.

    Chryst typically spends about two weeks deploying an agent, then another six weeks listening to calls and making adjustments.

    As you review calls, look for places where the agent tripped up. Then find the location in the prompt that most likely caused it to hallucinate or stumble. You might need to remove information or add more context. This process is where you'll reap most of the benefits and improvements.

    The issues are often small. Maybe one detail in the opening message or an extra comma that makes the agent pause too long will throw people off. Deleting that one element can make a huge difference in how the AI agent performs.

    Tommy Chryst is the founder of Arose AI, an agency specializing in AI voice solutions for businesses. Follow him on YouTube and LinkedIn.

    Other Notes From This Episode

    • Connect with Michael Stelzner @Stelzner on Facebook and @Mike_Stelzner on X.
    • Watch this interview and other exclusive content from Social Media Examiner on YouTube.

    This article is sourced from the AI Explored podcast.


    Tags: AI Explored podcast

    About the author: Michael Stelzner

    Michael Stelzner is the founder of Social Media Examiner and Social Media Marketing World—the industry's largest conference. He's also the founder of the AI Business Society and the AI Business World conference. Michael hosts the Social Media Marketing Podcast and the AI Explored podcast, and is the author of the books Launch and Writing White Papers.

    Copyright © 2026 Social Media Examiner®
    All Rights Reserved. Terms of Use | Privacy Policy.
