
How would you architect SSG for a site with 100,000+ pages?

Architect SSG for 100,000+ pages by using Incremental Static Regeneration (ISR) with on-demand revalidation, strategic pre-rendering of popular content, distributed caching, and optimized build pipelines to manage scale.

Building a statically generated site with 100,000+ pages requires a fundamental shift in architecture. A naive approach that pre-renders all pages at build time would result in build times of hours or days, making rapid deployment impossible. Instead, you must adopt a hybrid strategy that combines pre-rendering of critical paths, on-demand generation for long-tail content, intelligent caching layers, and optimized data fetching. The key is to treat SSG as a continuous process rather than a one-time build, leveraging Next.js features like Incremental Static Regeneration (ISR), on-demand revalidation, and strategic pre-rendering to scale effectively.

Core Scaling Strategies

Pre-render only critical pages at build time (e.g., homepage, popular products, recent posts) and generate the rest on demand using dynamicParams = true in the App Router, or fallback: true / fallback: 'blocking' in the Pages Router.
Implement ISR with appropriate revalidate windows to keep content fresh without rebuilding everything: time-based for predictable updates, on-demand for immediate changes.
Use on-demand revalidation with tags and paths to update only affected pages when content changes, avoiding full-site rebuilds.
Distribute caching across layers: Next.js data cache, full route cache, and a CDN edge cache to reduce origin load and improve global performance.
Optimize data fetching with batching, caching, and incremental data loading to prevent API overload during generation.

Foundation: ISR Configuration for Scale
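
A minimal sketch of the configuration this heading refers to, assuming the Next.js App Router and a hypothetical app/products/[id]/page.tsx route. The revalidate and dynamicParams exports are standard route segment options; the one-hour window is an illustrative assumption, not a recommendation.

```typescript
// Hypothetical file: app/products/[id]/page.tsx
// Only the route segment config is shown; the page component is omitted.

// Time-based ISR: a cached page is regenerated in the background at most
// once per hour. The value is an assumption chosen for illustration.
export const revalidate = 3600;

// Paths that were not pre-rendered at build time are rendered on their
// first request and then cached, instead of returning a 404.
export const dynamicParams = true;
```

Next.js expects these values to be statically analyzable, which is why the window is written as a literal rather than as an expression such as 60 * 60.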

Strategy 1: Selective Pre-rendering with generateStaticParams
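
The selective pre-rendering idea can be sketched as follows. generateStaticParams is the real App Router hook; the PageStat shape, the fetchPageStats stand-in, and the cutoff of 2 are assumptions made so the example runs on its own.

```typescript
// Hypothetical shape of a catalogue entry; the field names are assumptions.
interface PageStat {
  id: string;
  monthlyViews: number;
}

// Pure helper: pick the `limit` most-viewed pages for build-time rendering.
export function selectPrerenderParams(
  pages: PageStat[],
  limit: number
): { id: string }[] {
  return [...pages]
    .sort((a, b) => b.monthlyViews - a.monthlyViews)
    .slice(0, limit)
    .map((page) => ({ id: page.id }));
}

// Stand-in for an analytics or CMS query; replace with a real data source.
async function fetchPageStats(): Promise<PageStat[]> {
  return [
    { id: "a", monthlyViews: 120 },
    { id: "b", monthlyViews: 9500 },
    { id: "c", monthlyViews: 40 },
    { id: "d", monthlyViews: 3100 },
  ];
}

// Next.js calls this at build time. Every id it does not return is
// generated on first request when dynamicParams is true.
export async function generateStaticParams(): Promise<{ id: string }[]> {
  const stats = await fetchPageStats();
  // A real site might pre-render the top few thousand pages instead of 2.
  return selectPrerenderParams(stats, 2);
}
```

Keeping the selection logic in a pure function makes the cutoff easy to unit test and to tune against measured build times.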

For 100,000+ pages, caching becomes critical. Implement a multi-tier caching strategy: First, the Next.js data cache stores fetch responses to prevent repeated API calls during regeneration. Second, the full route cache stores rendered HTML. Third, a CDN edge cache (Cloudflare, Fastly, AWS CloudFront) caches responses globally, dramatically reducing origin load. Set appropriate Cache-Control headers based on content freshness requirements: long TTLs for stable content, shorter ones for frequently updated pages. For self-hosted deployments, implement a custom cache handler using Redis or similar to share cache across multiple server instances.

Strategy 2: Tiered Caching Implementation
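
One piece of the tiered setup, choosing Cache-Control values by content freshness, might look like the sketch below. The two freshness classes and all TTL values are assumptions; how the header is attached depends on the hosting setup, and not every CDN honors stale-while-revalidate.

```typescript
// Two illustrative freshness classes; a real site may need more.
type Freshness = "stable" | "frequent";

interface CachePolicy {
  browserSeconds: number; // max-age: how long browsers may reuse the response
  cdnSeconds: number; // s-maxage: how long shared caches such as a CDN may reuse it
  staleSeconds: number; // stale-while-revalidate: serve stale while refetching
}

// TTL values are assumptions chosen for illustration.
const POLICIES: Record<Freshness, CachePolicy> = {
  stable: { browserSeconds: 0, cdnSeconds: 86400, staleSeconds: 604800 },
  frequent: { browserSeconds: 0, cdnSeconds: 60, staleSeconds: 300 },
};

// Build the Cache-Control header value for a given freshness class.
export function cacheControlFor(freshness: Freshness): string {
  const p = POLICIES[freshness];
  return (
    `public, max-age=${p.browserSeconds}, ` +
    `s-maxage=${p.cdnSeconds}, ` +
    `stale-while-revalidate=${p.staleSeconds}`
  );
}

// cacheControlFor("frequent")
// → "public, max-age=0, s-maxage=60, stale-while-revalidate=300"
```

Setting max-age to 0 while giving s-maxage a long value keeps browsers revalidating against the edge, so a CDN purge takes effect for users immediately.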

Strategy 3: On-Demand Revalidation with Webhooks

Implement CMS webhooks that trigger on-demand revalidation only for changed content, not full-site rebuilds.
Use tags for efficient invalidation: tag each product with its ID and category, so updating a product revalidates its page and any category/list pages.
Queue revalidation requests to prevent overwhelming your server during mass updates (e.g., bulk price changes).
Monitor revalidation frequency and set rate limits to prevent abuse or an accidental DDoS from misconfigured webhooks.

Strategy 3: On-Demand Revalidation with Queue
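
A minimal sketch of such a queue, assuming webhook payloads have already been translated into cache tags. The RevalidationQueue class and the tag names are hypothetical. In a Next.js route handler the callback would typically wrap revalidateTag from next/cache, whose exact signature varies between Next.js versions; it is injected here so the sketch runs without Next.js.

```typescript
// Callback that performs the actual invalidation for one tag.
type Revalidate = (tag: string) => void;

export class RevalidationQueue {
  // A Set collapses duplicate tags, so a bulk update that touches the
  // same category 1,000 times produces a single revalidation.
  private readonly pending = new Set<string>();
  private readonly revalidate: Revalidate;
  private readonly maxPerDrain: number;

  constructor(revalidate: Revalidate, maxPerDrain: number) {
    this.revalidate = revalidate;
    this.maxPerDrain = maxPerDrain;
  }

  // Called from the webhook handler; returns immediately.
  enqueue(tags: string[]): void {
    for (const tag of tags) {
      this.pending.add(tag);
    }
  }

  // Called on a timer or by a worker. Processes at most maxPerDrain tags
  // per call, which acts as a simple rate limit. Returns the tags handled.
  drain(): string[] {
    const batch = Array.from(this.pending).slice(0, this.maxPerDrain);
    for (const tag of batch) {
      this.pending.delete(tag);
      this.revalidate(tag);
    }
    return batch;
  }

  // Number of distinct tags still waiting.
  get size(): number {
    return this.pending.size;
  }
}
```

This in-memory version only works within a single server process. With several instances, the pending set would need to live in shared storage such as Redis.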

With 100,000+ pages, build time optimization becomes critical. Implement parallel data fetching in generateStaticParams to reduce API latency. Use streaming pagination to avoid memory overflow when loading large datasets. Consider splitting builds by category or content type using multiple deployment pipelines. For extremely large sites, explore distributed builds where different parts of the site are built on separate machines and then combined. Monitor build metrics (duration, memory usage, API response times) to identify bottlenecks and optimize incrementally.

Strategy 4: Optimized Build Pipeline
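
Two of the techniques above, paginated loading and bounded parallel fetching, can be sketched as plain helpers. The cursor-based Page shape and both function names are assumptions about a hypothetical content API rather than part of Next.js.

```typescript
// Assumed response shape of a cursor-paginated content API.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

type FetchPage<T> = (cursor: string | null) => Promise<Page<T>>;

// Walk the API one page at a time, keeping only the slugs, so that full
// records for 100,000+ entries are never held in memory together.
export async function collectSlugs(
  fetchPage: FetchPage<{ slug: string }>,
  maxSlugs: number
): Promise<string[]> {
  const slugs: string[] = [];
  let cursor: string | null = null;
  do {
    const page: Page<{ slug: string }> = await fetchPage(cursor);
    for (const item of page.items) {
      if (slugs.length >= maxSlugs) {
        return slugs;
      }
      slugs.push(item.slug);
    }
    cursor = page.nextCursor;
  } while (cursor !== null);
  return slugs;
}

// Run `worker` over `items` in fixed-size parallel batches. This caps the
// number of concurrent API calls instead of firing all of them at once.
export async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}
```

Inside generateStaticParams, collectSlugs could supply the list of paths while mapInBatches limits pressure on the upstream API. The batch size is best tuned against the API's rate limits.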

Strategy 5: Data Layer Optimization

Implement a GraphQL or dedicated API layer with batching to prevent N+1 queries during page generation.
Use database read replicas for build-time data fetching to avoid impacting production traffic.
Cache API responses aggressively during build with Redis or similar to prevent redundant calls across pages.
Consider using a static CDN for product images and assets to reduce build-time processing and bandwidth.
Implement incremental data loading where you fetch only changed content since last build using webhooks or timestamps.

Strategy 5: Data Layer with Redis Cache
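
A minimal cache-aside sketch for the data layer. The KeyValueStore interface is a hypothetical contract that a Redis client could sit behind; MemoryStore is an in-memory stand-in so the example runs without a Redis server, and the key format is an assumption.

```typescript
// Minimal async key-value contract; a Redis client would implement this.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in with TTL support, for local runs and tests only.
export class MemoryStore implements KeyValueStore {
  private readonly entries = new Map<
    string,
    { value: string; expiresAt: number }
  >();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= Date.now()) {
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, {
      value,
      expiresAt: Date.now() + ttlSeconds * 1000,
    });
  }
}

// Cache-aside read: return the cached value when present; otherwise call
// `load`, store the JSON-serialized result with a TTL, and return it.
export async function cached<T>(
  store: KeyValueStore,
  key: string,
  ttlSeconds: number,
  load: () => Promise<T>
): Promise<T> {
  const hit = await store.get(key);
  if (hit !== null) {
    return JSON.parse(hit) as T;
  }
  const fresh = await load();
  await store.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}
```

Because values are round-tripped through JSON, this only suits plain data. Dates, Maps, and class instances would need explicit serialization.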

At scale, you need visibility into your SSG architecture. Implement comprehensive logging and monitoring: track build durations, cache hit rates, revalidation frequencies, and error rates. Use tools like New Relic, Datadog, or open-source alternatives to monitor your Next.js application. Set up alerts for when build times exceed thresholds or when cache hit rates drop. Monitor API response times during builds to identify bottlenecks. Consider implementing distributed tracing to understand the full request path from CDN to origin to database.
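
The threshold alerts described above can be sketched as a pure check over collected metrics. The BuildMetrics and Thresholds shapes and the message wording are assumptions; a real setup would forward the resulting alerts to whichever monitoring tool is in use.

```typescript
// Metrics gathered for one build or one reporting window (assumed shape).
interface BuildMetrics {
  buildSeconds: number;
  cacheHits: number;
  cacheMisses: number;
}

interface Thresholds {
  maxBuildSeconds: number;
  minCacheHitRate: number; // fraction between 0 and 1
}

// Hit rate as a fraction. With no lookups at all it reports 1 so that an
// idle window does not raise a false alert.
export function cacheHitRate(m: BuildMetrics): number {
  const total = m.cacheHits + m.cacheMisses;
  return total === 0 ? 1 : m.cacheHits / total;
}

// Return one human-readable message per violated threshold.
export function checkAlerts(m: BuildMetrics, t: Thresholds): string[] {
  const alerts: string[] = [];
  if (m.buildSeconds > t.maxBuildSeconds) {
    alerts.push(
      `build took ${m.buildSeconds}s (limit ${t.maxBuildSeconds}s)`
    );
  }
  const rate = cacheHitRate(m);
  if (rate < t.minCacheHitRate) {
    alerts.push(
      `cache hit rate ${(rate * 100).toFixed(1)}% is below ` +
        `${(t.minCacheHitRate * 100).toFixed(1)}%`
    );
  }
  return alerts;
}
```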

Strategy 7: Deployment and Infrastructure

Use a platform with built-in ISR support like Vercel or Netlify to handle the complexity of distributed caching and background regeneration.
If self-hosting, deploy multiple Next.js instances behind a load balancer with shared Redis cache to handle regeneration load.
Configure CDN (Cloudflare, Fastly, AWS CloudFront) with appropriate cache behaviors and purge APIs for coordinated invalidation.
Implement blue-green deployments to avoid downtime during full rebuilds when they are necessary.
Use feature flags to gradually roll out changes to a subset of pages before full deployment.
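
For the gradual rollout in the last point, one possible approach is deterministic bucketing by page slug, so that a given page stays in or out of the rollout across requests and servers. This is a sketch under that assumption; the function names are hypothetical, and a hosted feature-flag service would normally replace it.

```typescript
// 32-bit FNV-1a hash computed over the string's UTF-16 code units.
// Deterministic, so every server assigns a slug to the same bucket.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime and keep the result as an unsigned 32-bit value.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// True when the slug falls into the first `percent` of 100 buckets.
// percent = 0 disables the change everywhere; percent = 100 enables it everywhere.
export function isInRollout(slug: string, percent: number): boolean {
  return fnv1a(slug) % 100 < percent;
}
```

Raising percent only ever adds pages to the rollout, since a slug's bucket never changes. Pages that already received the change therefore keep it as the rollout widens.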
