For retailers with catalogs containing tens of millions of SKUs, there is one inevitable realization: most of the catalog is rarely seen. The issue is product discovery, not traffic, assortment depth, or even merchandising effort.
Even when the search function works properly, the infrastructure often struggles to consistently surface relevant products across a catalog of this size. Consequently, a small subset of products receives the majority of engagement while most of the catalog remains invisible, an imbalance that can go unnoticed in smaller catalogs.
What starts as a search optimization problem evolves into a constraint of the discovery infrastructure that limits how much of the catalog can generate revenue. For leaders responsible for ecommerce growth, the question shifts from
"Is search working?"
to something more strategic:
"Is our discovery system capable of operating at the scale our catalog has reached?"
In smaller environments, discovery systems operate under relatively predictable conditions. Product attributes are consistent. Query patterns repeat frequently. Merchandising rules remain manageable. However, once catalogs exceed several million SKUs, these assumptions begin to break down. Three structural changes reshape how discovery systems behave:
Large catalogs often contain thousands of products with nearly identical attributes. Examples include:
Ranking these products requires far more than keyword matching. Product discovery systems must distinguish between highly similar items and interpret ambiguous shopper queries.
Without strong ranking models, search results tend to favor either:
Both patterns make it difficult for new or long-tail products to surface.
In very large catalogs, a significant proportion of searches fall into the "long tail." Because these queries appear rarely, if at all, the system cannot rely on historical engagement data to determine their relevance. Shoppers use a wide variety of language when describing products, including:
Product discovery systems that rely heavily on exact keyword matches struggle in this environment. Interpreting intent becomes the core challenge.
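To get a rough sense of how much search volume falls into the long tail, a team could measure it directly from a query log. The sketch below is illustrative only: the function name `long_tail_share`, the `max_count` threshold, and the sample queries are assumptions, and a real log would need more careful normalization than lowercasing and trimming.

```python
from collections import Counter

def long_tail_share(queries, max_count=1):
    """Fraction of search volume coming from queries seen at most `max_count` times."""
    # Light normalization so "Red Shoes" and "red shoes" count as the same query.
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    tail_volume = sum(n for n in counts.values() if n <= max_count)
    return tail_volume / total

# Hypothetical query log: one head query and two one-off long-tail queries.
log = ["red shoes"] * 3 + ["Red Shoes",
                           "waterproof trail runner size 11",
                           "sofa w/ chaise"]
print(f"long-tail share: {long_tail_share(log):.0%}")
```

A high share here means a large part of search traffic has little or no engagement history behind it, which is exactly where exact keyword matching tends to fall short.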
At scale, product catalogs behave more like live systems than static datasets. Large retailers frequently process:
Discovery platforms must absorb these updates while maintaining stable search performance and relevance. When ingestion pipelines lag or indexing processes cannot keep pace, products may exist in the catalog yet remain undiscoverable.
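One way to spot products that exist in the catalog but are not yet discoverable is to compare update timestamps between the catalog and the search index. The following is a minimal sketch under stated assumptions: both systems are assumed to expose a last-updated time per SKU, and the function name, the 15-minute tolerance, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

def find_stale_products(catalog_updates, index_updates, now,
                        max_lag=timedelta(minutes=15)):
    """Return SKUs whose latest catalog change has not reached the index in time.

    Both mappings go from SKU to the datetime of the last update in that system.
    """
    stale = []
    for sku, changed_at in catalog_updates.items():
        indexed_at = index_updates.get(sku)
        up_to_date = indexed_at is not None and indexed_at >= changed_at
        # Only flag changes that have been waiting longer than the tolerance.
        if not up_to_date and now - changed_at > max_lag:
            stale.append(sku)
    return stale

now = datetime(2024, 1, 1, 12, 0)
catalog = {
    "sku-1": datetime(2024, 1, 1, 11, 0),
    "sku-2": datetime(2024, 1, 1, 11, 0),
    "sku-3": datetime(2024, 1, 1, 11, 55),
    "sku-4": datetime(2024, 1, 1, 11, 0),
}
index = {
    "sku-1": datetime(2024, 1, 1, 11, 5),   # indexed after the change
    "sku-2": datetime(2024, 1, 1, 10, 0),   # index predates the change
    "sku-3": datetime(2024, 1, 1, 10, 0),   # behind, but still within tolerance
}                                            # sku-4 was never indexed
print(find_stale_products(catalog, index, now))
```

Tracking the size of this list over time gives a simple signal for whether ingestion is keeping pace with catalog changes.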
In practice, product discovery limitations rarely manifest as system failures. Instead, they manifest as operational friction across teams. Merchandising teams implement additional rules to prevent key products from disappearing from the results.
Common warning signs include:
Individually, these problems appear manageable. Together, however, they suggest that the discovery system is no longer adapting naturally to the catalog scale. Instead, teams are compensating for structural limitations through manual effort.
Merchandising teams play a critical role in shaping discovery experiences. However, manual intervention becomes increasingly difficult to sustain at very large catalog sizes.
Consider a retailer managing tens of millions of SKUs across hundreds of categories. Even a small set of merchandising adjustments can multiply quickly, such as:
As rule sets grow, the interactions between them become harder to predict. A change intended to improve one category may alter search results in others, and over time overall discovery behavior becomes difficult to reason about.
Rather than enabling experimentation, the system becomes fragile. Teams hesitate to adjust ranking logic because they cannot anticipate the consequences across such a large catalog. This is one of the most common signs that your product discovery architecture needs to evolve beyond rule-driven models.
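A simple audit can show where rule interactions are most likely: list the queries that trigger more than one merchandising rule. The sketch below assumes a deliberately simplified, hypothetical rule format (a name plus a single trigger term); real rule engines are richer, so treat this as an illustration of the idea rather than a working audit tool.

```python
from collections import defaultdict

def rules_per_query(rules, queries):
    """Map each query to the names of the rules whose trigger term appears in it."""
    hits = defaultdict(list)
    for query in queries:
        tokens = set(query.lower().split())
        for rule in rules:
            if rule["trigger"] in tokens:
                hits[query].append(rule["name"])
    return dict(hits)

def overlapping_queries(rules, queries):
    """Queries touched by two or more rules, where side effects are most likely."""
    return {q: names for q, names in rules_per_query(rules, queries).items()
            if len(names) > 1}

# Hypothetical rules and queries.
rules = [
    {"name": "boost-house-brand", "trigger": "sofa"},
    {"name": "bury-clearance", "trigger": "leather"},
    {"name": "pin-new-arrivals", "trigger": "lamp"},
]
queries = ["leather sofa", "floor lamp", "desk"]
print(overlapping_queries(rules, queries))
```

If the share of overlapping queries keeps climbing as rules are added, that is a measurable version of the fragility described above.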
Discovery systems designed for extremely large catalogs behave differently in several important ways. Rather than relying primarily on manual configuration, they combine multiple intelligence layers to interpret shopper intent and product relationships.
Key capabilities typically include, but are not limited to:
Dynamic ranking and segmentation use shopper behavior signals, such as clicks, dwell time, and purchase patterns, to continuously refine product rankings.
Emerging discovery interfaces allow shoppers to interact with the catalog through conversations and guided exploration, helping them navigate large assortments and find the right products faster.
Advanced product discovery platforms interpret meaning rather than exact keywords. This helps connect shoppers' language with product attributes, even when the wording differs from catalog descriptions.
Product discovery systems detect shifts in demand patterns across the catalog and adjust rankings to reflect emerging interest.
Attribute enrichment continuously refines catalog understanding by learning from product attributes, shopper queries, and behavioral data.
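To make the idea of combined intelligence layers more concrete, the sketch below blends three of the signals described above into a single ranking score: semantic similarity, behavioral engagement, and a demand trend. Everything in it is an assumption made for illustration, including the weights, the smoothing prior, the field names, and the hand-made 3-dimensional vectors that stand in for real embeddings. It does not describe how any particular platform works.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def smoothed_ctr(clicks, impressions, prior_ctr=0.05, prior_weight=20):
    """CTR pulled toward a prior so products with little traffic get a neutral score."""
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)

def trend_lift(recent_views, baseline_views, cap=3.0):
    """Ratio of recent to baseline views, capped and scaled to the 0..1 range."""
    lift = recent_views / max(baseline_views, 1)  # avoid dividing by zero
    return min(lift, cap) / cap

def rank(query_vector, products, weights=(0.6, 0.3, 0.1)):
    """Order SKUs by a weighted blend of semantic, behavioral, and trend signals."""
    w_sem, w_ctr, w_trend = weights

    def score(p):
        return (w_sem * cosine(query_vector, p["vector"])
                + w_ctr * smoothed_ctr(p["clicks"], p["impressions"])
                + w_trend * trend_lift(p["recent_views"], p["baseline_views"]))

    return [p["sku"] for p in sorted(products, key=score, reverse=True)]

# Toy catalog: an established sofa, a brand-new loveseat, and an unrelated lamp.
products = [
    {"sku": "sectional-sofa", "vector": [0.9, 0.1, 0.0],
     "clicks": 40, "impressions": 1000, "recent_views": 100, "baseline_views": 100},
    {"sku": "new-loveseat", "vector": [0.8, 0.2, 0.1],
     "clicks": 0, "impressions": 0, "recent_views": 90, "baseline_views": 0},
    {"sku": "floor-lamp", "vector": [0.0, 0.2, 0.9],
     "clicks": 300, "impressions": 1000, "recent_views": 100, "baseline_views": 100},
]
couch_query = [0.85, 0.15, 0.05]  # assumed vector for the query "couch"
print(rank(couch_query, products))
```

In this toy setup the query "couch" shares no keyword with either seating product, yet both rank above the heavily clicked lamp, and the new loveseat is not penalized for having no click history. In a production system the weights would normally be learned from data rather than set by hand.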
Most platforms appear similar in product demonstrations. Differences only emerge when the system operates under real production conditions. Decision-makers should evaluate product discovery platforms across several dimensions.
| Evaluation Area | What to Examine |
|---|---|
| Indexing architecture | Can the system distribute indexing across nodes to maintain performance as catalogs grow? |
| Query performance | Does search maintain low latency during complex filtered queries and peak traffic periods? |
| Catalog ingestion | How quickly do product updates appear in discovery results after ingestion? |
| Ranking adaptability | Can relevance models learn from behavioral data rather than relying solely on manual rules? |
| Operational flexibility | Can teams evolve ranking logic and indexing structures without disrupting live traffic? |
Testing these capabilities with realistic catalog sizes is essential.
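During such a test, query performance is usually easier to judge from tail latency than from averages. As a minimal sketch, the helper below computes a nearest-rank percentile over recorded response times and checks it against a budget. The 200 ms budget, the function names, and the sample latencies are assumptions chosen for illustration.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_latency_budget(latencies_ms, p95_budget_ms=200):
    """True when the 95th-percentile latency stays within the budget."""
    return percentile(latencies_ms, 95) <= p95_budget_ms

# Hypothetical response times (ms) collected from a load test.
samples = list(range(10, 210, 10))
print(percentile(samples, 50), percentile(samples, 95), meets_latency_budget(samples))
```

Running the same check separately for simple queries, heavily filtered queries, and peak-traffic windows makes it easier to see where latency degrades first.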
Retailers managing large and complex catalogs often look for proof that a discovery platform can perform reliably in production environments. Several global retailers have reported measurable improvements after modernizing their product discovery infrastructure with Netcore Unbxd.
City Furniture achieved a 20% increase in conversion rate and 2.5x uplift in add-to-cart rates, highlighting how improved relevance and vector search can directly impact revenue across large product catalogs.
Restaurant Equippers, a B2B equipment retailer with a complex catalog, saw 20% higher add-to-cart rates, 3x growth in search engagement, and 20% revenue growth after implementing an AI Shopping Agent that guides shoppers through product discovery like a digital salesperson.
ResMed, a global health technology brand, recorded a 23% increase in revenue, 21% improvement in conversion rate, and 15% higher customer engagement after improving query intent recognition and search relevance.
Backcountry, a major outdoor gear retailer, experienced an 11% increase in demand from search sessions and 9% higher revenue per session, after optimizing search and personalization performance.
WEX Photo Video, a retailer with a highly attribute-rich catalog, reduced zero-result queries by 60%, while driving 12% revenue growth and an 18% increase in average order value through more accurate AI-driven search.
Camper, a global footwear brand managing high product variation across markets, achieved a 10% increase in search-led sessions and significantly reduced zero-result pages by improving autosuggest and multilingual search experiences.
Across these implementations, the pattern is consistent: when discovery systems are designed to interpret shopper intent, scale across large catalogs, and continuously adapt to behavioral signals, retailers see measurable gains in conversion, engagement, and revenue performance.
Once catalog sizes reach tens of millions of products, businesses often realize that incremental improvements are insufficient. Search performance may still appear acceptable. Filters may still function. However, the operational cost of maintaining relevance continues to rise, and the system becomes more difficult to evolve.
This is typically the moment when discovery evolves from a technical concern into a strategic platform decision. Retailers managing catalogs of this size are increasingly evaluating product discovery platforms that have already proven their ability to operate under the conditions discussed above.
This is not necessarily because their current systems have failed, but because the next stage of growth requires a product discovery infrastructure designed for it.
If your team is evaluating whether your current discovery system can support catalogs at this scale, book a demo with our product discovery experts to see how modern product discovery platforms handle tens of millions of SKUs in real ecommerce environments.