Most ecommerce sites are haemorrhaging organic traffic, not because their products aren’t great, but because Google can’t efficiently crawl, index, or understand their store. Faceted navigation creates thousands of junk URLs. Category filters duplicate content at scale. And poor site architecture buries your best pages from Google’s bots entirely.
Fix these technical foundations and you stop the bleeding. Fast.
Quick answer.
- Crawl budget determines how many pages Google will crawl on your site. Waste it on filter URLs and you starve your real product pages
- Faceted navigation is the single biggest crawl budget killer for ecommerce stores
- Flat site architecture keeps important pages within 3 clicks of your homepage
- Canonical tags, noindex directives, and robots.txt all control what Google indexes
- XML sitemaps and structured data accelerate discovery and rich result eligibility
What is technical SEO for ecommerce and why does it matter?
Technical SEO for ecommerce is the process of optimising the infrastructure of your online store so search engines can crawl, index, and rank your pages efficiently. Unlike content SEO, which focuses on what your pages say, technical SEO focuses on how your site is built.
For ecommerce sites, this matters more than almost any other vertical. A typical online retailer might have thousands, tens of thousands, or even hundreds of thousands of URLs, including product pages, category pages, filter combinations, paginated pages, and more. Without a solid technical foundation, most of those pages will never rank.
The stakes are high. Ecommerce sites that neglect technical SEO often see entire sections of their catalogue invisible to Google, even when those pages have excellent content and strong backlinks. Organic revenue strategies for ecommerce only work when search engines can actually see and index your pages.
How crawl budget works and why ecommerce sites burn through it.
Crawl budget is the number of pages Googlebot will crawl on your site within a given timeframe. It’s determined by two factors: crawl rate limit (how fast Google crawls without overloading your server) and crawl demand (how much Google wants to crawl your site based on its popularity and freshness).
For small sites with a few hundred pages, crawl budget is rarely a concern. For ecommerce sites with thousands of URLs, it becomes a critical resource.
Here’s the problem: most ecommerce platforms generate enormous numbers of low-value URLs automatically. Every filter combination, sort order, and pagination variant creates a new URL. A product catalogue with 500 products, 10 filter categories, and 5 sort options can generate over 25,000 unique URL combinations. Google doesn’t know which ones matter. It crawls many of them, wastes its budget, and then runs out of capacity before reaching your real product and category pages.
The result? Your best pages get crawled less frequently, take longer to appear in search results, and often don’t rank as well as they should.
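The scale of the explosion is easy to underestimate. A quick back-of-the-envelope sketch, using deliberately conservative assumptions (just 2 values per filter attribute), shows how multiplication does the damage:

```python
# Illustrative catalogue matching the example above: 10 filter
# attributes and 5 sort orders. Assume each attribute offers only
# 2 values and can also be left unset (assumption for this sketch).
attributes = 10
values_per_attribute = 2
sort_orders = 5

# Each attribute is either absent or set to one of its 2 values:
filter_combinations = (values_per_attribute + 1) ** attributes  # 3^10 = 59,049
crawlable_urls = filter_combinations * sort_orders

print(crawlable_urls)  # 295245 parameter URLs from a single category page
```

Even with these modest assumptions, one category page spawns hundreds of thousands of crawlable URLs, which is why the 25,000 figure above is, if anything, an understatement.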
Why faceted navigation is the biggest crawl budget killer.
Faceted navigation, the filter systems on category pages that let users narrow by size, colour, price, brand, and more, is the number one source of crawl budget waste on ecommerce sites.
Every time a user applies a filter combination, most ecommerce platforms create a new URL. On a clothing store, filtering by “blue + medium + cotton + under $100” creates a URL like /womens-tops?colour=blue&size=medium&material=cotton&price=0-100. That’s a unique URL. Multiply this across thousands of products and dozens of filter attributes, and you have a URL explosion.
Stopping filters from wasting crawl budget requires a deliberate strategy:
- Use canonical tags to point filter URLs back to the clean category page
- Apply noindex directives to filter combinations that don’t have standalone search value
- Block specific parameters via robots.txt to prevent Googlebot from crawling them
- Note that Google Search Console's URL Parameters tool was retired in 2022, so parameter handling now rests entirely on canonicals, robots.txt, and clean internal linking
Not all filter combinations deserve the same treatment. A filter like /mens-running-shoes/size-12/ might have genuine search demand and deserve to be indexed. A combination like /mens-running-shoes?colour=red&size=12&sort=newest almost certainly doesn’t. The key is making that distinction deliberately, not leaving it to chance.
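One way to express that distinction is to give indexable filters clean URL paths and block the parameter noise in robots.txt. A minimal sketch (the parameter names are illustrative and must match your platform's actual URL structure):

```txt
# Block crawling of sort and low-value filter parameters
User-agent: *
Disallow: /*?*sort=
Disallow: /*?*price=
Disallow: /*?*material=

# Indexable filters such as /mens-running-shoes/size-12/ use clean
# URL paths, so the parameter rules above do not affect them.
```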
How site architecture affects crawl efficiency and rankings.
Site architecture is the way your pages are connected and organised. For ecommerce, the goal is simple: keep every important page within 3 clicks of your homepage and ensure that link equity flows efficiently through your internal linking structure.
Poor site architecture looks like this:
- Homepage → Category → Subcategory → Sub-subcategory → Product (4+ clicks deep)
- Orphaned product pages with no internal links pointing to them
- Category pages with no links to related categories
- Pagination that creates dead-end chains of pages
Good ecommerce site architecture looks like this:
- Homepage → Category → Product (2-3 clicks maximum)
- Clear category and subcategory hierarchy
- Cross-linking between related categories and complementary products
- Breadcrumb navigation that reinforces the hierarchy
Building crawlable internal links is one of the highest-leverage technical improvements you can make. When Google’s crawlers land on your homepage, they follow links. If those links lead efficiently to your most important product and category pages, those pages get crawled more often, indexed faster, and generally rank better.
For large ecommerce sites, an internal linking audit is often where the biggest gains are hidden.
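Click depth is measurable directly from crawl data. A minimal sketch, assuming you have exported each page's outgoing internal links as a dictionary (a real audit would use a crawler's depth report, but the logic is the same breadth-first search):

```python
from collections import deque

def click_depth(links: dict, home: str) -> dict:
    """Breadth-first search from the homepage; pages never reached are orphans."""
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical mini-site: one product is 2 clicks deep, one is orphaned.
links = {
    "/": ["/shoes/"],
    "/shoes/": ["/shoes/trail-runner/"],
    "/shoes/orphan-boot/": [],  # no inbound links, so never reached
}
depths = click_depth(links, "/")
print(depths)   # {'/': 0, '/shoes/': 1, '/shoes/trail-runner/': 2}
orphans = set(links) - set(depths)
print(orphans)  # {'/shoes/orphan-boot/'}
```

Any page missing from the depth map is an orphan; any page with a depth above 3 is a candidate for new internal links.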
Indexation problems: why Google isn’t indexing your pages.
Getting pages crawled is one challenge. Getting them indexed is another. Indexation means Google has determined a page is worth including in its search index and making available for ranking. Many ecommerce pages get crawled but never indexed.
Common reasons Google doesn’t index ecommerce pages:
Duplicate content at scale. When multiple URLs serve essentially the same content, Google selects one to index and often ignores the rest. On ecommerce sites, this is endemic: product pages with multiple colour variants, category pages with and without trailing slashes, HTTP vs HTTPS versions of pages, and filter URLs all create duplicate content problems. Resolving duplicate content at scale requires a combination of canonical tags, consistent URL structures, and parameter handling.
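For colour and size variants, the usual pattern is to pick one primary product URL and have every variant URL declare it as canonical. A sketch (the URLs are illustrative):

```html
<!-- On /womens-top-blue and /womens-top-green alike: -->
<link rel="canonical" href="https://example.com/womens-top" />
```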
Thin content. Product pages with only a product title, price, and a few sentences of manufacturer description offer little value to Google. Pages with thin content are frequently crawled but not indexed. The fix is adding original, detailed product descriptions, user-generated reviews, specifications, and supporting content.
Noindex directives. Sometimes pages that should be indexed have noindex tags applied, either intentionally (for filter pages) or accidentally (a misconfigured template). Diagnosing Google indexing problems should always include checking for accidental noindex tags on important pages.
Soft 404s. A soft 404 is when a page returns a 200 OK status code but contains “no results found” or out-of-stock content. Google treats these like 404 errors and won’t index them.
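Soft 404s can be detected in bulk by fetching each URL and flagging 200 responses whose body looks like an empty page. A minimal heuristic sketch (the marker phrases are assumptions; tune them to your own templates):

```python
SOFT_404_MARKERS = ("no results found", "0 products", "page not available")

def is_soft_404(status_code: int, body: str) -> bool:
    """A 200 response whose body signals an empty page is a likely soft 404."""
    if status_code != 200:
        return False  # genuine 404s and 410s are handled separately
    text = body.lower()
    return any(marker in text for marker in SOFT_404_MARKERS)

print(is_soft_404(200, "<h1>No results found</h1>"))  # True
print(is_soft_404(404, "Page not found"))             # False
```

The proper fix for a true soft 404 is to return a real 404 or 410 status, or to redirect discontinued products to the nearest relevant category.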
If you’re finding pages aren’t appearing in Google despite being live, the process for getting your site indexed starts with checking these four issues.
XML sitemaps: what ecommerce sites get wrong.
An XML sitemap is a file that lists the URLs on your site you want Google to crawl and index. For ecommerce, sitemaps are a critical tool for managing large URL sets, but most ecommerce sites implement them poorly.
Common mistakes:
- Including noindex pages in the sitemap (creates a contradiction Google has to resolve)
- Including out-of-stock product pages that return soft 404 content
- Not segmenting sitemaps by content type (products, categories, blog)
- Failing to update sitemaps when products are added or removed
- Including paginated pages beyond page 1
Best practices for creating effective XML sitemaps:
- Only include URLs you want indexed (no noindex pages, no filter URLs)
- Segment into product sitemaps, category sitemaps, and blog sitemaps
- Submit via Google Search Console and monitor for errors
- Automate sitemap generation so it stays current with your catalogue
- Set accurate <lastmod> dates so Google prioritises recently updated pages
A well-structured sitemap is especially valuable for new product launches. Without it, new pages rely entirely on internal links to be discovered. With it, you can fast-track discovery via Search Console.
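Segmented sitemaps with accurate lastmod dates are straightforward to generate from the catalogue. A sketch using only the standard library (the URLs and dates are illustrative):

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls: list) -> str:
    """urls: list of (absolute URL, lastmod as YYYY-MM-DD). Returns sitemap XML."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# One sitemap per content type, e.g. products only:
products = [
    ("https://example.com/shoes/trail-runner/", "2024-05-01"),
]
xml = build_sitemap(products)
print(xml)
```

Wiring a generator like this into your product catalogue's publish pipeline keeps the sitemap current automatically, instead of relying on manual regeneration.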
Pagination SEO: handling large product catalogues without sacrificing crawl budget.
Pagination is the practice of splitting large lists of products across multiple pages: /category/page/1/, /category/page/2/, and so on. Managed badly, pagination turns into a crawl budget trap and an indexation headache.
The key insight: only page 1 of a category should typically be indexed. Pages 2, 3, and beyond are navigation aids for users, not standalone landing pages.
Best practices for paginated pages include:
- Prefer self-referencing canonical tags on pages 2+; Google advises against canonicalising paginated pages to page 1, which can stop products linked only from deeper pages being crawled
- Do not use noindex on paginated pages: pages left noindexed long-term are eventually treated as nofollow, cutting Googlebot off from deeper products
- Ensure products on paginated pages are also linked from non-paginated paths (categories, internal links)
- Avoid infinite scroll implementations that JavaScript-render products unless you’ve confirmed Googlebot can render them
Pagination is often overlooked in technical SEO audits because it seems minor. On a catalogue with 500+ products spread across 50 paginated pages, it’s not minor at all.
Structured data for ecommerce: product and review schema.
Structured data is code added to your pages that helps Google understand what your content is about. For ecommerce, product and review structured data is particularly valuable because it enables rich results in Google Search: star ratings, price, availability, and more displayed directly in the search results.
Product and review structured data can meaningfully improve click-through rates from organic search. When your listing shows a 4.8-star rating and “In Stock” while competitors show plain blue links, you win more clicks at the same ranking position.
The most important schema types for ecommerce:
- Product schema: Name, description, SKU, brand, price, currency, availability
- Review schema: Aggregate rating, review count, individual reviews
- BreadcrumbList schema: Reinforces your site hierarchy
- Organisation schema: Business details for brand authority
If you’re running a medium to large ecommerce store, implementing product and review structured data across thousands of product pages requires a template-level implementation, not manual page-by-page coding.
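At template level, Product and review schema is simply a JSON-LD block rendered from your product data into a script tag. A sketch (the field values are illustrative; the schema.org property names are the real ones):

```python
import json

def product_jsonld(product: dict) -> str:
    """Render schema.org Product JSON-LD for a <script type="application/ld+json"> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "sku": product["sku"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "offers": {
            "@type": "Offer",
            "price": product["price"],
            "priceCurrency": product["currency"],
            "availability": "https://schema.org/InStock",
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": product["rating"],
            "reviewCount": product["review_count"],
        },
    }
    return json.dumps(data, indent=2)

print(product_jsonld({
    "name": "Trail Runner", "sku": "TR-12", "brand": "Example Co",
    "price": "89.99", "currency": "GBP",
    "rating": "4.8", "review_count": 124,
}))
```

Because the function takes the same product dictionary your templates already render, one template change covers every product page at once.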
The revenue-killing technical issues hiding in your ecommerce site.
Most ecommerce sites have a handful of critical technical issues that directly suppress organic revenue. A thorough audit for these revenue-killing issues typically uncovers:
- Crawl budget waste from faceted navigation (often 50-90% of a large site’s URL count)
- Duplicate content from parameter variations, protocol issues, or trailing slash inconsistencies
- Missing or misconfigured canonical tags on product variant pages
- Thin product pages with manufacturer-copy descriptions
- Broken internal links pointing to 404 or redirected pages
- Slow page load times on category pages due to unoptimised product images
- Missing structured data on core product and category pages
- Accidental noindex tags on indexable pages
Any one of these is damaging. Multiple issues compound each other: a slow site with duplicate content and crawl budget waste is fighting Google at every turn.
If you’re looking to fix these issues without the time or internal bandwidth to work through them systematically, our managed technical SEO services handle the full diagnostic and remediation process so your team can focus on running your business.
Critical technical ranking factors every ecommerce site must address.
Beyond the ecommerce-specific issues above, there are critical technical ranking factors that apply to every site but hit ecommerce stores particularly hard:
Core Web Vitals. Google’s page experience signals measure loading speed (LCP), interactivity (INP), and visual stability (CLS). Large product images, third-party scripts, and complex filter JavaScript all drag these scores down. Ecommerce sites consistently underperform on Core Web Vitals.
HTTPS. Every page of your store should be served over HTTPS. Mixed content warnings, pages with HTTP assets on HTTPS pages, and HTTP redirects all create both security and SEO issues.
Mobile usability. The majority of ecommerce traffic arrives on mobile devices. Google uses mobile-first indexing, which means it crawls and indexes the mobile version of your site. If your mobile experience has usability issues, your rankings suffer even for desktop queries.
Page speed. Category pages loaded with product images are often the slowest pages on ecommerce sites. Lazy loading images, compressing files, and reducing render-blocking scripts are the three highest-impact interventions.
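Native lazy loading is a one-attribute change on category page templates (keep it off above-the-fold images, where deferring the load delays LCP):

```html
<!-- Below-the-fold product tiles: defer loading until near the viewport -->
<img src="/images/trail-runner.webp" alt="Trail Runner shoe"
     width="400" height="400" loading="lazy" />
```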
Crawl errors. 404 errors, server errors, and redirect chains all waste crawl budget and dilute link equity. A clean crawl error profile is table stakes for any serious ecommerce SEO effort.
How to audit your ecommerce site for technical SEO issues.
A systematic audit is the only way to know what’s actually happening on your site. Flying blind on technical issues means you’re optimising content and building links on a broken foundation.
Here’s how to approach an ecommerce technical SEO audit:
- Crawl your site using a tool like Screaming Frog to map all URLs, status codes, and meta directives
- Analyse your crawl budget in Google Search Console’s Crawl Stats report
- Identify URL bloat by comparing crawled URLs to intentionally indexed URLs
- Check your sitemap for accuracy and ensure only indexable pages are listed
- Review canonical tags across product variants, filter pages, and paginated pages
- Audit internal links for broken links, redirect chains, and depth issues
- Test structured data using Google’s Rich Results Test
- Measure Core Web Vitals via PageSpeed Insights and Search Console’s Experience report
- Check indexation in Search Console using the “Pages” report to identify excluded pages
For large ecommerce sites, this process surfaces dozens of issues. Prioritise by impact: crawl budget waste and indexation problems first, then structured data and Core Web Vitals, then refinements to internal linking and page speed.
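The URL bloat check above is, at its core, a set comparison between what Google can reach and what you actually want indexed. A sketch with illustrative URL lists (in practice, the first set comes from your crawler export and the second from your sitemap):

```python
crawled = {
    "https://example.com/shoes/",
    "https://example.com/shoes/?sort=newest",
    "https://example.com/shoes/?colour=red&sort=newest",
}
intended = {"https://example.com/shoes/"}  # from your sitemap

bloat = crawled - intended    # crawlable but unwanted URLs
missing = intended - crawled  # wanted but unreachable URLs

print(len(bloat))  # 2 parameter URLs wasting crawl budget
print(sorted(bloat))
```

Bloat tells you what to block or canonicalise; missing URLs point at orphaned pages that need internal links.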
Building a strong technical foundation for long-term organic growth.
Technical SEO isn’t a one-time project. It’s ongoing. New products are added, old ones are discontinued, site templates change, and Google’s standards evolve. Ecommerce sites that treat technical SEO as a one-off audit inevitably drift back into problems.
The highest-performing ecommerce SEO operations build technical health into their regular workflow:
- Monthly crawl audits to catch new issues before they compound
- Automated sitemap updates tied to the product catalogue
- Structured data validation as part of every template change
- Core Web Vitals monitoring with alerts for regressions
- Quarterly content audits to identify thin pages and consolidation opportunities
Ecommerce SEO management services that handle ongoing technical health let internal teams focus on merchandising and commercial priorities, rather than diagnosing crawl anomalies in Search Console.
The ecommerce stores that consistently dominate organic search aren’t necessarily the ones with the biggest budgets or the most backlinks. They’re the ones with the cleanest technical foundations: efficient crawl paths, strong indexation, and structured data that earns rich results. Fix the foundation, and everything else gets easier.



