From mobile-first to AI-first
I’ve worked with hundreds of organizations over the last 20 years, and in that time the world of e-commerce and technology has evolved exponentially. My experience focuses on working with people who embrace technology and its transformative potential. I’ve seen, guided, and been a part of implementing multiple rising waves of new tech that promised to better engage customers.
When the iPhone was launched, I worked with companies like Panera Bread to completely re-imagine their in-store experience and build one of the first iPad-based kiosk ordering systems. A restructuring of Panera’s digital experience unlocked exceptional growth over the years, and soon the concepts of “mobile-first” and “mobile-centered design” grew rapidly, optimizing experiences where consumers interact most – on their phones.
Leaping ahead, waves of technologies and platforms were launched, including new trends in e-commerce, such as “omni-channel” experiences. With consumers having many places to engage (SMS, web, mobile, in-store, customer support, etc.), businesses must balance a consistent brand experience with personalization for each channel. And with the flood of AI, many companies are struggling to keep up. How do we bring it all together?
Introducing Multimodal Commerce
Multimodal commerce is the next significant evolution in consumer experience. Technically, it is the next generation of online shopping: customers discover, interact with, and buy products seamlessly through text, voice, image, video, and AR, creating personalized, connected journeys wherever they engage.
But the key idea is creating intelligent, frictionless, and highly engaging experiences. While many companies are struggling to deliver on “hyperpersonalization” of experience, multimodal commerce bridges the gap between new interaction modalities (voice, chat, image, video) and adaptive AI models.
In three years, if customers can’t communicate easily and directly with your site, they will shop somewhere else.
Moving beyond clicks and keyboards
If we think about the shift from “mobile-first” to “AI-first” experiences, it’s essentially about understanding consumer intent with the least possible friction.
Most people are familiar with the typical e-commerce experience. It is generally the same from store to store, because large e-commerce platforms seek consistency. Shoppers type keywords to search, filter, and sort. Their experience is commercialized and standardized. Platforms have largely failed on the promise of personalization, which at its height amounts to liking products, building wish lists, or keeping track of purchases for future recommendations. The path from product discovery to purchase hasn’t fundamentally changed in 10 years (hence the perpetual “product detail page”).
Let’s say a shopper sees somebody wearing an outfit they dig. First, they’ll have to describe the outfit to Google to home in on a brand and hopefully find an e-commerce website. Then they filter by gender and by article of clothing, select color and size, and see what is available. The typical website will use high-quality imagery: models trying on the clothing, different angles, trusted reviews, etc. But even with all these different solutions, people still want to see and feel how it looks on them. The consumer then might order three sizes, hoping one will fit, and return the others. Retailers know this struggle – standardized “free returns” are a costly logistical nightmare. For customers, it’s a long journey with many obstacles.
Imagine instead that a shopper could take a picture and have the image automatically analyzed to find similar products, outfits, and looks. We call it “Shop the Look” and have already helped our clients implement it.
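For readers who want a feel for the mechanics, here is a minimal sketch of one common way image-based product matching can work: the photo and every catalog item are represented as numeric vectors (embeddings), and the closest catalog vectors are returned. This is an illustration under stated assumptions, not a description of any client implementation. The function names, the tiny catalog, and the three-number vectors are all hypothetical; in practice the vectors would come from an image model, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def shop_the_look(query_embedding, catalog, top_k=3):
    """Rank catalog items by similarity to the shopper's photo (hypothetical helper)."""
    scored = [
        (cosine_similarity(query_embedding, embedding), product_id)
        for product_id, embedding in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [product_id for _, product_id in scored[:top_k]]


# Hypothetical catalog: product id -> made-up embedding vector.
CATALOG = {
    "denim-jacket": [0.9, 0.1, 0.0],
    "leather-jacket": [0.8, 0.3, 0.1],
    "floral-dress": [0.1, 0.9, 0.2],
    "running-shoes": [0.0, 0.2, 0.9],
}

# Assumed output of an image model for the shopper's photo (made-up values).
photo_embedding = [0.9, 0.15, 0.0]

print(shop_the_look(photo_embedding, CATALOG, top_k=2))
```

With these made-up numbers, the two jackets rank ahead of the dress and the shoes. A production system would typically swap the linear scan for a vector index, but the ranking idea is the same.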
Being able to use multimedia inputs and interactions to help customers find products, experience them, and purchase is truly transformative.
And that’s just product discovery.
We are starting to see these innovations transform entire industries. Take, for example, eyewear. For a long time, a few companies had a stranglehold on eyewear, keeping prices ultra-high. However, with low-cost “Virtual Try-on” experiences, companies like Zenni Optical can now offer an e-commerce experience that completely replaces in-person shopping. I can say, from owning over 12 different pairs of glasses, that this is a game-changer.
It’s important to note, however, that these experiences take know-how to build. Zenni’s virtual try-on is provided by a third-party company called Fittingbox, and it is a uniquely bespoke experience for glasses. Zenni originally launched this experience in 2019, so it reflects over six years of refinement to reach the point where it could be commercialized.
With all of the available ML models, foundational model advancements, AI-enabled products, digital platforms, and the overall pace of innovation, companies can now affordably start designing their north star.
It doesn’t take much to imagine just how disruptive this could be. How long until other retailers don’t need brick and mortar? Will companies with physical stores fail as their online competitors cut unneeded costs? Every retailer needs to take this very seriously.
Building a holistic multimodal experience
There isn’t a silver bullet for every retailer. Many get stuck in the “wait and see” mindset to avoid investing too early in new technologies. But the result is almost always the same – packaged software offerings with mediocre results. Solutions like Constructor.com look great on paper but are incredibly costly and don’t move the needle.
The flip side is over-eager investment in technologies that are premature. This typically causes companies to incur high costs and forgo quick short-term wins, which kills momentum.
Our team focuses on balancing these approaches. The best solutions start with the desired experience or business outcome and then use AI as a tool to reach that goal. The resulting solution is often a combination of models, incubated development, and tight integration with e-commerce systems. Our team keeps a pulse on technology changes to make sure we are always optimizing and building the right solution.
This is what we do. The power of multimodal commerce experiences is incredible, and we have found that you need the smartest people combined with industry expertise to deliver. This is why our team includes so many PhDs and has published so many academic papers. We focus on going beyond the tuning of foundational models.
We build commercial-grade, globally scalable AI solutions.
We understand that foundation model capabilities are constantly changing. No foundation model has “turnkey” consumer experiences available. Most experiences require multimodal orchestration, which can mean coordinating over 10 models as part of a modular pipeline. We know that underlying data is often complex, diverse, and unstructured. And we know that performance and cost in a pilot are significantly different from performance and cost at scale.
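To make “modular pipeline” concrete, here is a minimal sketch of the pattern, assuming each model is wrapped as a stage that reads a shared context and adds its own result. Everything here is hypothetical: the stage names, the keyword rules, and the two-item catalog are simple stand-ins for real models and data, chosen only so the example runs on its own.

```python
from typing import Callable, Dict, List

# A stage takes the shared context and returns an updated copy.
Stage = Callable[[Dict], Dict]


def run_pipeline(stages: List[Stage], request: Dict) -> Dict:
    """Pass the request through each stage in order."""
    context = dict(request)
    for stage in stages:
        context = stage(context)
    return context


def transcribe_voice(context: Dict) -> Dict:
    # Stand-in for a speech-to-text model: the "voice" input is already text here.
    if "voice" in context:
        return {**context, "text": context["voice"].lower()}
    return context


def detect_intent(context: Dict) -> Dict:
    # Stand-in for an intent model: a single keyword rule.
    text = context.get("text", "")
    intent = "search" if "find" in text.split() else "unknown"
    return {**context, "intent": intent}


def retrieve_products(context: Dict) -> Dict:
    # Stand-in for a retrieval model over a hypothetical two-item catalog.
    catalog = {"red-jacket": "red jacket", "blue-jeans": "blue jeans"}
    words = set(context.get("text", "").split())
    matches = sorted(
        product_id
        for product_id, description in catalog.items()
        if set(description.split()) <= words
    )
    return {**context, "products": matches}


STAGES: List[Stage] = [transcribe_voice, detect_intent, retrieve_products]

result = run_pipeline(STAGES, {"voice": "Find me a red jacket"})
print(result["intent"], result["products"])
```

Because each stage only depends on the shared context, a stand-in can be replaced by a real model, or a new stage added, without rewriting the rest of the pipeline. That is the practical appeal of the modular approach when the underlying models keep changing.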
And most importantly, we understand that brands lose consumer trust if AI isn’t implemented correctly. We ensure that model outputs go through ethical review to maintain fairness and inclusion. With 15 years of pioneering work in quality engineering, evals, and governance, risk, and compliance, we deliver trusted solutions.
The e-commerce evolution
The next generation of e-commerce will be defined by retailers who can harness AI for meaningful customer experiences. Multimodal commerce is not a trend, but a structural shift in how people discover and interact with products. As interactions move beyond clicks and keywords, brands that adapt will shape new forms of engagement, whether it’s for eyeglasses or cocktail glasses, while the brands that hesitate will be outpaced by competitors offering seamless, intelligent, and intuitive shopping journeys.
Our team combines deep expertise in AI, multimodal orchestration, and enterprise-grade engineering. We build solutions that are scalable, ethical, and tailored to real business outcomes. The next era of commerce belongs to organizations that embrace these technologies with clarity and purpose. We are here to help them design that future.
When done right, multimodal commerce blends various technologies to create a new philosophy of service, where companies think more like stewards of information and access – a trusted companion to the customer. In a world where attention is fleeting and trust is hard-won, understanding, anticipating, and delivering on a customer’s needs may stand out as the most valuable service a company can bring to the table.