Article

An AI voice agent platform that saves time and money

AI & data engineering

Author

Solvd

As AI voice agents become increasingly complex and find more real-world applications, companies face a challenge: how to thoroughly test a voice agent before letting it out in the real world.

Identifying and fixing defects early through prototypes or proofs of concept can reduce repair costs by up to 30 times compared to addressing issues after release. However, challenges arise even before companies reach the release stage.

Implementing new technology in real life requires trust, and companies around the world halt releases of newly built AI Voice Agents because of concerns about security and reputational risks. The confidence they need is often best gained through thorough testing on real company cases using real company data. That’s why Solvd’s team decided to build its own platform for AI Voice Agent prototyping: to help companies build and test prototypes faster and become more confident in their AI Voice Agents before releasing them. As a result, the platform reduces building a voice agent from a few hours to one hour of a developer’s work and enables non-technical people to experiment with the AI voice agent.

Solvd’s AI Voice Agent platform

To reduce costs and improve flexibility, the platform was built by combining existing frameworks, integrators and vendors. This reduces both the future effort of building a prototype and the team capacity required to get a usable tool that supports day-to-day work for Solvd’s clients. The platform consists of:

Pipecat framework. Pipecat lets the team connect AI models into a single workflow — text, vision and structured data all flow through one channel. For example, end users can upload a product photo, type a short note and ask about return policies in the same message. Low‑millisecond latency keeps the exchange fluid, and the framework’s adapters simplify shipping the agent.
Daily.co chat rooms. To deliver a simple yet powerful UI, the team decided to use the Daily.co room. End users can meet the bot exactly where they would meet a human representative: in a chat interface on a website, in an email exchange, in a pop-up window with a contact form or in a message thread with the contact center.
Twilio phone calls support. Up to 77% of users think a phone call is the fastest way to get an answer from a company. Twilio’s programmable voice API lets the agent use phone calls in any way necessary, including picking up calls, making calls and any other interaction, as easily as operating in any other environment. Voice agents should slot into existing call flows, not force teams to re‑architect their work.
Large Language Models. The platform enables the team to pick the model that appears to be the best fit for the task, including GPT and Claude, among others. LLMs sit at the system’s heart, stitching context across turns and tools. They draft emails, summarize tickets and oversee processes. They also decide what to do next, not just what to say next. Using the Pipecat framework mentioned above allows the team to pick the model of choice without needing to change the code or rewire the app.
Real-time speech-to-speech API. To provide the agent with a voice, the team has chosen OpenAI’s Realtime speech-to-speech API for speech that pauses, emphasizes and emotes as naturally as can be achieved.
API connections. Conversation alone doesn’t close a ticket or ship an order. We wire the agent into CRMs, databases and e‑commerce platforms so it can update a record, trigger a refund or schedule a pickup while it’s still on the call. This feature proves critical when building a PoC or testing a hypothesis, as a standalone agent with no connection to external systems is nearly impossible to validate.
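The API connections item above can be illustrated with a small sketch. It is not Solvd's implementation and does not use Pipecat or any vendor API. It assumes the language model emits a structured tool call (a name plus arguments), which the agent dispatches to a backend. `OrderStore`, `build_tools`, `dispatch` and the tool names are hypothetical, and the in-memory store stands in for a real CRM or e-commerce system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class OrderStore:
    """In-memory stand-in for a CRM or e-commerce backend (hypothetical)."""
    orders: Dict[str, dict] = field(default_factory=dict)

    def refund(self, order_id: str) -> str:
        order = self.orders.get(order_id)
        if order is None:
            return f"Order {order_id} not found."
        if order["status"] == "refunded":
            return f"Order {order_id} was already refunded."
        order["status"] = "refunded"
        return f"Refund issued for order {order_id}."

    def schedule_pickup(self, order_id: str, date: str) -> str:
        order = self.orders.get(order_id)
        if order is None:
            return f"Order {order_id} not found."
        order["pickup_date"] = date
        return f"Pickup for order {order_id} scheduled on {date}."


def build_tools(store: OrderStore) -> Dict[str, Callable[..., str]]:
    """Map the tool names the model may request to backend actions."""
    return {
        "trigger_refund": store.refund,
        "schedule_pickup": store.schedule_pickup,
    }


def dispatch(tools: Dict[str, Callable[..., str]], call: dict) -> str:
    """Run one model-requested tool call and return text the agent can speak."""
    handler = tools.get(call["name"])
    if handler is None:
        return f"Unknown tool: {call['name']}"
    return handler(**call["arguments"])


# Example: the model asks for a refund while the caller is still on the line.
store = OrderStore(orders={"A100": {"status": "paid"}})
tools = build_tools(store)
result = dispatch(tools, {"name": "trigger_refund", "arguments": {"order_id": "A100"}})
print(result)
```

Keeping the tool table separate from the backend means a prototype can start with a stub like this one and later point the same tool names at a real system.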

Why does building a prototyping platform matter?

Using the building blocks listed above makes it easier to launch the agent prototype. This gives the team and client more time and space to test out multiple scenarios and approaches. It is not uncommon for a prototype to uncover new issues, help spot challenges or even redefine the scope of the project. The following examples show how the platform impacts work on AI agents by tackling issues commonly seen in the development process.

Tackling challenges in prototyping — practical examples

The AI Voice Agent prototyping platform was built to tackle the technical challenges, costs and agility of AI voice agent development, making the prototyping process easier and cheaper.

Technical challenges

A prototype, by definition, is a working version of the product or service. As such, it requires solving a particular set of technical problems. These may be AI-specific or more general.

Example: An e-commerce customer service agent we tested had to deal with multiple SKUs and products. The prototype enabled the team to minimize the risk of the agent hallucinating information about the existence or availability of products in the offer.
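One common way to limit this kind of hallucination is to have the agent answer availability questions only from a catalog lookup, never from the model's own recall. The sketch below illustrates that idea under stated assumptions and is not a description of how the tested agent was built. The `CATALOG` data, the SKU format and `availability_answer` are hypothetical, and a real prototype would query the client's product system instead of a dictionary.

```python
from typing import Dict

# Hypothetical catalog snapshot used only for illustration.
CATALOG: Dict[str, dict] = {
    "SKU-1001": {"name": "Trail backpack 30L", "in_stock": 4},
    "SKU-1002": {"name": "Trail backpack 45L", "in_stock": 0},
}


def availability_answer(sku: str, catalog: Dict[str, dict] = CATALOG) -> str:
    """Build the agent's reply strictly from catalog data."""
    item = catalog.get(sku.strip().upper())
    if item is None:
        # Unknown SKU: say so instead of letting the model guess.
        return "I could not find that product in our catalog."
    if item["in_stock"] > 0:
        return f"{item['name']} is in stock ({item['in_stock']} available)."
    return f"{item['name']} is currently out of stock."


print(availability_answer("sku-1001"))
print(availability_answer("SKU-9999"))
```

Because every reply is built from the lookup result, a product that does not exist produces an explicit "not found" answer, which is easy to check in prototype test scenarios.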

Cost of time and materials

When building a prototype, the company needs to invest time and money to actually deliver something, without certainty about the usability or applicability of the delivered solution. Picking only the essential parts to develop, without overcommitting, brings significant savings in both time and materials.

Example: A travel agency AI Voice Agent was tasked with customer support, such as booking, re-booking or handling typical questions. The platform was used to test scenarios and data access to determine which tasks can and cannot be done, and where the biggest gains and savings are to be found.

Testing

The prototype’s goal is to spot potential problems early during development, so the delivered system needs to be flexible and as easy to change as possible.

Example: A staff training agent had to combine the ability to understand the user’s speech with delivering responses in an understandable way. To find the best-fitting model, the team used the prototyping platform during the tests, changing speech recognition models with just a click or two, without the need to rewire the solution.
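Swapping a model without rewiring usually comes down to hiding each model behind the same interface and selecting it by name. The sketch below shows that pattern in plain Python and does not reproduce the platform's or Pipecat's actual mechanism. The model names, `REGISTRY` and `make_agent` are hypothetical, and the transcribers are stubs that return placeholder text instead of calling a real speech recognition service.

```python
from typing import Callable, Dict

# A transcriber takes raw audio bytes and returns text.
Transcriber = Callable[[bytes], str]


def fake_model_a(audio: bytes) -> str:
    """Stub standing in for one speech recognition vendor."""
    return f"[model-a] {len(audio)} bytes transcribed"


def fake_model_b(audio: bytes) -> str:
    """Stub standing in for a second speech recognition vendor."""
    return f"[model-b] {len(audio)} bytes transcribed"


# Selecting a model is a configuration change, not a code change.
REGISTRY: Dict[str, Transcriber] = {
    "model-a": fake_model_a,
    "model-b": fake_model_b,
}


def make_agent(stt_name: str) -> Callable[[bytes], str]:
    """Return an audio handler bound to the chosen transcriber."""
    try:
        transcribe = REGISTRY[stt_name]
    except KeyError:
        raise ValueError(f"Unknown speech recognition model: {stt_name}") from None

    def handle_audio(audio: bytes) -> str:
        # The rest of the pipeline only ever sees the transcribed text.
        return f"Heard: {transcribe(audio)}"

    return handle_audio


for name in ("model-a", "model-b"):
    print(make_agent(name)(b"\x00\x01\x02\x03"))
```

In a test session, the same audio samples can be run through each registered model and the transcripts compared side by side.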

Agility

In search of market fit, the prototype needs to be revised multiple times. The platform, built with reusable and modifiable components, lets the team test multiple scenarios in order to solve challenges encountered and spot the perfect use case.

Example: The staff training agent delivered for the hospitality business was less convenient than expected when delivered in the chat interface. To enable employees to attend courses and improve their skills while having their hands occupied with their tasks, the company decided to test the voice agent. Solvd’s team connected the AI voice agent testing platform with an internal knowledge base via API, so it was possible to run and test the voice agent faster and with more agility.

Summary

AI Voice Agents are a unique type of tech that emulates interaction with human assistants, building trust between the company and the customer. With proper testing and prototyping, this trust will not be ruined by malfunction, hallucination or poor market fit, and the delivered system can support modern business in the best way possible.