Business news

Discovery Systems For Marketplaces: Designing Surfaces That Stay Fast At Scale

In large marketplaces, discovery surfaces have become the primary revenue infrastructure. Consumers open an app, glance at the homepage or promotion rail, and expect the right options to appear instantly. Behind that simple moment sit complex systems: experimental ranking logic, AI models and promotion engines, latency-sensitive databases, and infrastructure that must keep up with growth in demand. When those systems fall behind, conversion drops, costs rise, and the organization loses trust in its own experiments. When they hold up, every new product or campaign has a reliable surface to stand on.

Ujjwal Gulecha, a Member of Technical Staff at Anthropic and co-inventor of the patent “System and method for logistical assistance with data exchange,” built his career on that distinction. His operating principle is straightforward: treat discovery as core infrastructure, then use architecture and operational discipline to keep it fast, reliable, and ready for the next wave of products and AI-powered experiences.

Discovery Surfaces As Revenue Infrastructure

That shift is sharpest in multi-category delivery and logistics platforms, where discovery surfaces sit directly on top of transaction volume and AI-driven recommendations. The global quick-commerce and delivery services market was valued at $98.5 billion in 2024 and is projected to reach $405.81 billion by 2034, with mobile applications accounting for well over half of current revenue. At the same time, recent performance data shows that 53% of mobile site visits are abandoned if a page takes longer than 3 seconds to load, and slow websites contribute to about $2.6 billion in lost sales each year. In that environment, a homepage or main discovery view is no longer just a layout decision; it is a latency-sensitive surface that either converts visitors or sends them elsewhere.

Inside DoorDash, Gulecha owned that reality. As an engineering leader, he led a team of more than ten engineers across backend, iOS, Android, and web to build the next-generation discovery experience on the consumer app, owning high-traffic surfaces like the homepage end-to-end. That homepage accounted for a double-digit percentage of company revenue, on a revenue base measured in billions, which meant that even small improvements to availability or latency mattered at scale. His team treated operational excellence as a product requirement, instrumenting reliability and performance so that each change improved the surface rather than introducing new fragility. Product, design, analytics and engineering could plan new experiences, including future AI-assisted discovery flows, knowing that the discovery surface would stay responsive as traffic and assortment grew.

“Discovery only creates value if people can reach it quickly and trust it to work every time. When the homepage behaves like infrastructure, teams can focus on better ideas instead of wondering whether the basics will hold,” states Gulecha.

Architectures That Turn Scale Into Efficiency

As catalogues expand and performance expectations harden, and as AI workloads compete for the same capacity, those surfaces depend on back-end systems that can keep throughput high while controlling cost. Cloud spending patterns show why this matters, with recent reports indicating that self-reported wasted cloud spending on infrastructure and platform services still sits at about 27%, even after targeted optimization efforts. When the architecture underneath discovery does not evolve, organizations pay both in higher invoices and in slower user experiences that are harder to debug.

Gulecha, a Senior IEEE member, addressed that head-on with a multi-quarter carousel system re-architecture that became a backbone for discovery at DoorDash. He proposed and led a four-engineer effort that raised availability for this core system from 98% to 99.99% while simultaneously reducing latency by an order of magnitude. The redesign saved millions of dollars in infrastructure costs and unlocked new capabilities for how items and experiences could be arranged on high-traffic surfaces. Around that technical work, he put simple but effective guardrails in place, including pull request review guidelines, RFC review practices, and operational excellence meetings that kept quality high as more engineers and stakeholders contributed to the platform.

“Scale without discipline turns into complexity that nobody wants to touch. The goal was to make the core systems both cheaper and safer to change, so teams could move faster without worrying about hidden performance debt,” notes Gulecha.

Promotion Systems That Keep Customers Engaged

With that backbone in place, marketplaces still depend on promotion systems that turn attention into action without overwhelming users or infrastructure. The digital coupons market was estimated at $117.96 billion in 2025 and is projected to grow to $250.91 billion by 2035, reflecting the role targeted incentives now play in online commerce. As more of these offers move into mobile and app-first channels, the platforms that run them must handle peaks in volume, personalization logic, AI-assisted targeting, and redemption tracking while still presenting a clear experience for customers who simply want the right deal at the right time.

At DoorDash, Gulecha co-developed the core promotions system and led the design and development of the banners platform, which the company uses to run millions of promotions and banners across its surfaces. The system supported a wide range of campaigns for acquiring new customers, retaining existing ones, and encouraging incremental spend, all while staying integrated with search, ordering, subscription programs and partner needs. One of the clearest tests came with the “Win $1 Million in Big Macs” promotion in September 2019, the largest and most successful campaign in the company’s history, which the platform powered end-to-end. That experience reinforced a design pattern he carries forward, building promotion systems that can handle spikes in demand and complex eligibility rules without sacrificing clarity or stability for customers or merchants.

“Promotions should feel like a well-timed nudge, not a stress test for the platform. When the system underneath is calm at peak, customers remember the offer, not the friction,” explains Gulecha.

Operational Excellence For Experiment-Heavy Teams

As promotion volume and product ideas grow, and as more AI-driven features are layered onto core flows, the only way to keep these experiences dependable is to operate them with disciplined experimentation and reliability practices. For many businesses, uptime has become as important as raw feature delivery, with 99.99% availability translating to only 52.6 minutes of downtime per year. Recent analyses suggest that downtime can cost large enterprises more than $1 million per hour. In that context, teams that run hundreds of concurrent experiments on core surfaces have to be as rigorous about stability as they are about new ideas.
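The availability figures in this section are simple arithmetic on the minutes in a year, and a short sketch makes the stakes concrete: moving from 98% to 99.99% availability shrinks the yearly downtime budget from roughly a week to under an hour.

```python
# Back-of-the-envelope availability math: yearly downtime budget
# implied by an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def downtime_minutes_per_year(availability: float) -> float:
    """Return the minutes of downtime per year allowed by a target."""
    return MINUTES_PER_YEAR * (1 - availability)

for target in (0.98, 0.999, 0.9999):
    print(f"{target:.2%} availability -> "
          f"{downtime_minutes_per_year(target):,.1f} min/year of downtime")
```

At 99.99%, the budget works out to about 52.6 minutes per year, the figure cited above; at 98%, it is over 10,000 minutes, which is why the carousel re-architecture's availability jump was meaningful at homepage scale.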

Gulecha’s work on operational excellence at DoorDash was built for precisely that environment. He led three engineers across two teams in a sustained effort to reduce homepage latency by 70% over multiple quarters while the organization continued to launch new products on the same surfaces. At any moment, more than 500 experiments could be running on the homepage, so guardrails around performance and incident response were essential. His focus on foundational efficiency was evident earlier in his tenure, when he cut load on the main monolith database by more than 20% in a short period, work that proved critical for smooth holiday operations when staffing was naturally thinner. Earlier, at Groupon, he had already learned these lessons by building a microservice from scratch to integrate external partners like Mindbody, helping the company move toward voucherless deals on a scalable, reusable foundation.

“Operational excellence is not a separate track from product work; it is the reason ambitious ideas survive contact with real traffic. When teams can experiment safely, they earn the confidence to keep improving the experience,” says Gulecha.

Looking Ahead: Where Discovery And AI Share The Same Baseline

Those habits matter even more as marketplaces and enterprises bring more AI into discovery and decision-making. The recommendation engine market is estimated at $9.15 billion in 2025 and is projected to reach $38.18 billion by 2030, while the personalization software market is expected to grow from $11.98 billion in 2025 to $45.07 billion by 2032. As those systems become embedded in more products and workflows, the winners will be teams that treat AI-powered discovery as part of the same reliability and performance baseline as the rest of their platforms.

Gulecha’s current role at Anthropic, building AI solutions for some of the largest enterprises, extends that mindset into a new context. He brings a record of making discovery surfaces dependable at DoorDash, hardening integrations at Groupon, and turning complex re-architectures into measurable improvements in availability, latency, and cost. In the middle of that journey, he also shared his lessons publicly, including a widely read engineering article on taming content discovery scaling challenges with hexagons and Elasticsearch, which distilled practical patterns for real-world systems. Now he is applying those patterns at the intersection of AI safety and enterprise reliability, where discovery, recommendations, and operations have to work together.
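The hexagon pattern referenced in that article can be sketched in a few lines. The idea is to bucket every store location into a hexagonal grid cell at index time, so a discovery query can filter by a handful of precomputed cell IDs instead of running a radius computation per document. This is an illustrative sketch only, not the published implementation: the `HEX_SIZE` parameter and the axial-coordinate scheme are assumptions here, and production systems typically use a dedicated library such as Uber's H3 for the cell math.

```python
import math

HEX_SIZE = 0.05  # assumed cell radius in degrees; a tuning parameter

def hex_cell(lat: float, lng: float, size: float = HEX_SIZE) -> tuple:
    """Map a point to the axial (q, r) coordinates of a pointy-top hex grid."""
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * lng - 1 / 3 * lat) / size
    r = (2 / 3 * lat) / size
    # Cube-round to the nearest hex center (standard hex-grid rounding).
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return (int(rx), int(rz))

# At index time, each store document stores its cell ID; at query time,
# discovery filters by the consumer's cell (plus neighbors), which a search
# engine like Elasticsearch can serve as a cheap term filter.
print(hex_cell(37.7749, -122.4194))  # San Francisco maps to a stable (q, r) pair
```

The payoff is that nearby points collapse to the same cell ID, turning "what is near this consumer" from geometry into an exact-match lookup that scales with index size rather than with the number of candidate stores.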

“AI will only earn its place in critical discovery flows if the underlying systems stay reliable, observable, and simple to reason about. The baseline is still the same: fast surfaces, stable architectures, and teams that know exactly why the system behaved the way it did,” remarks Gulecha.
