Programmatic SEO is the practice of generating many pages from one template and one data source. You write a single page component, point it at a list of entities or keyword variants, and Next.js produces a route for each one. Done well, a few hundred lines of code can rank for thousands of long-tail queries that you would never write by hand.
The naive version fails in a specific and predictable way. You loop over a CSV, drop the same paragraph structure on every page, swap a city name or a product attribute, and ship 500 routes. Google fetches them, notices they are 90% identical, and merges them. The duplicates either get folded into one canonical page or sit unindexed forever. You shipped 500 pages and rank with maybe three.
This tutorial builds a programmatic SEO setup in Next.js 15 (App Router) and spends most of its time on the part that actually matters: making each generated page genuinely unique so the index treats it as its own page. We will model a data source, generate static routes with generateStaticParams, template the page, emit per-page structured data, and serve a dynamic sitemap. The running example generates pages for SaaS integration guides ("how to connect X to Y"), but the pattern applies to any entity-times-attribute dataset.
Prerequisites
Before you begin, make sure you have:
- Node.js 18.18 or newer.
- A Next.js 15 project using the App Router. If you are starting fresh, run npx create-next-app@latest and choose the App Router and TypeScript.
- A data source. For this tutorial it is a typed array, but the same code works if you swap the array for a database query or a fetch to a headless CMS.
Confirm your version with npx next --version. The code below relies on Next.js 15 behavior where dynamic route params are asynchronous, so 15.0 or newer is required.
Modeling the data source
Every programmatic SEO page is one row of data rendered through one template. Start by defining the shape of a row and a function that returns all rows. Keeping this behind a function (rather than importing a raw array everywhere) means you can later replace it with a database call and change only how the callers await the result, not the template that renders a row.
Here is a typed data module:
// lib/integrations.ts
export type Integration = {
  slug: string;
  source: string;
  target: string;
  // Real, per-row signal. These are NOT placeholders.
  setupMinutes: number;
  authMethod: 'oauth' | 'api-key' | 'webhook';
  monthlyVolume: number; // Search demand for this exact variant.
  steps: string[]; // The actual steps for THIS pair.
  caveat: string; // A specific gotcha for THIS pair.
};

const INTEGRATIONS: Integration[] = [
  {
    slug: 'stripe-to-slack',
    source: 'Stripe',
    target: 'Slack',
    setupMinutes: 10,
    authMethod: 'webhook',
    monthlyVolume: 320,
    steps: [
      'Create a Stripe webhook endpoint pointing at your handler.',
      'Add a Slack incoming webhook URL.',
      'Map the charge.succeeded event to a Slack message payload.',
    ],
    caveat: 'Stripe retries failed webhooks, so make the Slack post idempotent.',
  },
  {
    slug: 'github-to-notion',
    source: 'GitHub',
    target: 'Notion',
    setupMinutes: 15,
    authMethod: 'oauth',
    monthlyVolume: 210,
    steps: [
      'Authorize a Notion integration and share the target database with it.',
      'Subscribe to GitHub issue events via a webhook.',
      'Insert one Notion page per issue using the database ID.',
    ],
    caveat: 'Notion rate-limits to roughly 3 requests per second, so queue bulk syncs.',
  },
  // ...hundreds more rows
];

export function getAllIntegrations(): Integration[] {
  return INTEGRATIONS;
}

export function getIntegration(slug: string): Integration | undefined {
  return INTEGRATIONS.find((integration) => integration.slug === slug);
}
Notice that the row carries real, distinct fields per entry: actual setup steps, a specific caveat, a real auth method, and a monthlyVolume that reflects search demand for that exact variant. That last point matters later. The whole strategy lives or dies on whether each row contains genuine, non-interchangeable information.
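If the rows later move to a database or a headless CMS, the accessors will most likely become asynchronous, and their callers will need to await them. The following is a minimal sketch of that shape, not a definitive implementation. The trimmed IntegrationRow type, the fetchRows loader, and the in-memory data are hypothetical stand-ins for your real query.

```typescript
// Sketch only: `IntegrationRow`, `fetchRows`, and the sample rows are
// hypothetical stand-ins for a real database query or CMS fetch.
type IntegrationRow = {
  slug: string;
  source: string;
  target: string;
};

// Stand-in for an async data source. Replace the body with your own query.
async function fetchRows(): Promise<IntegrationRow[]> {
  return [
    { slug: 'stripe-to-slack', source: 'Stripe', target: 'Slack' },
    { slug: 'github-to-notion', source: 'GitHub', target: 'Notion' },
  ];
}

// Same accessor names as the array-backed module, but returning Promises.
async function getAllIntegrations(): Promise<IntegrationRow[]> {
  return fetchRows();
}

async function getIntegration(
  slug: string,
): Promise<IntegrationRow | undefined> {
  const rows = await fetchRows();
  return rows.find((row) => row.slug === slug);
}
```

With this shape, anything that reads rows would call await getAllIntegrations() or await getIntegration(slug), while a template that only receives a row as a prop stays unchanged.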
Generating the routes
Next.js generates one page per row through generateStaticParams. Create a dynamic segment at app/integrations/[slug]/page.tsx and export the params function. At build time, Next.js calls it once, gets the full list of slugs, and statically renders a page for each.
// app/integrations/[slug]/page.tsx
import { notFound } from 'next/navigation';
import { getAllIntegrations, getIntegration } from '@/lib/integrations';

export const dynamicParams = false; // 404 anything not in the list

export function generateStaticParams() {
  return getAllIntegrations().map((integration) => ({
    slug: integration.slug,
  }));
}

export default async function IntegrationPage({
  params,
}: {
  params: Promise<{ slug: string }>;
}) {
  const { slug } = await params;
  const integration = getIntegration(slug);

  if (!integration) {
    notFound();
  }

  return <IntegrationTemplate integration={integration} />;
}
Two things to call out. In Next.js 15, params is a Promise, so the component is async and you await params before reading slug. Setting dynamicParams = false means any slug not returned by generateStaticParams returns a 404 instead of being rendered on demand, which keeps junk URLs out of your index.
Templating the page
The template is where every page becomes structurally the same while the content differs. Render the per-row data through real headings and prose, not a key-value dump. The goal is a page a human would find useful, because a page a human finds useful is also a page Google can tell apart from its siblings.
This template also builds and ships its JSON-LD structured data inline, so each generated page carries a machine-readable description built from its own per-row steps:
// app/integrations/[slug]/page.tsx continued
import type { Integration } from '@/lib/integrations';

function IntegrationTemplate({ integration }: { integration: Integration }) {
  const { source, target, setupMinutes, authMethod, steps, caveat } =
    integration;

  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'HowTo',
    name: `How to connect ${source} to ${target}`,
    totalTime: `PT${setupMinutes}M`, // ISO 8601 duration, e.g. PT10M
    step: steps.map((text, index) => ({
      '@type': 'HowToStep',
      position: index + 1,
      text,
    })),
  };

  // Escape "<" so row text containing "</script>" cannot close the tag early.
  const jsonLdHtml = JSON.stringify(jsonLd).replace(/</g, '\\u003c');

  return (
    <article>
      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{ __html: jsonLdHtml }}
      />
      <h1>
        How to Connect {source} to {target}
      </h1>
      <p>
        This guide walks through wiring {source} to {target}. The integration
        uses {authMethod} authentication and takes about {setupMinutes} minutes
        to set up end to end.
      </p>
      <h2>Setup steps</h2>
      <ol>
        {steps.map((step, index) => (
          <li key={index}>{step}</li>
        ))}
      </ol>
      <h2>One thing to watch</h2>
      <p>{caveat}</p>
    </article>
  );
}
Because the <script> tag lives inside the component's returned markup, Next.js renders it at build time on every generated page, and each page ships a HowTo block built from its own steps. The structured content differs per page for the same reason the visible content does: it is built from per-row data, so it reinforces the uniqueness signal rather than stamping the same JSON onto every URL.
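One way to keep the structured data checkable is to build it in a pure function that can be tested without rendering a page. The sketch below assumes two hypothetical helpers, buildHowToJsonLd and serializeJsonLd, and a trimmed HowToInput type. None of these names come from Next.js or schema.org tooling.

```typescript
// Sketch only: `HowToInput`, `buildHowToJsonLd`, and `serializeJsonLd` are
// hypothetical helper names introduced for illustration.
type HowToInput = {
  source: string;
  target: string;
  setupMinutes: number;
  steps: string[];
};

function buildHowToJsonLd({ source, target, setupMinutes, steps }: HowToInput) {
  return {
    '@context': 'https://schema.org',
    '@type': 'HowTo',
    name: `How to connect ${source} to ${target}`,
    totalTime: `PT${setupMinutes}M`, // ISO 8601 duration
    step: steps.map((text, index) => ({
      '@type': 'HowToStep',
      position: index + 1, // schema.org positions start at 1
      text,
    })),
  };
}

// Escape "<" so a value containing "</script>" cannot end the script tag.
// JSON parsers decode \u003c back to "<", so the data itself is unchanged.
function serializeJsonLd(data: unknown): string {
  return JSON.stringify(data).replace(/</g, '\\u003c');
}

const example = buildHowToJsonLd({
  source: 'Stripe',
  target: 'Slack',
  setupMinutes: 10,
  steps: [
    'Create a Stripe webhook endpoint pointing at your handler.',
    'Add a Slack incoming webhook URL.',
  ],
});

console.log(serializeJsonLd(example));
```

Escaping matters most once row text comes from a CMS or user input rather than a file you control, but it costs nothing to apply everywhere.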
Add per-page metadata with generateMetadata, which also receives params as a Promise in Next.js 15. Unique titles and descriptions per route are a baseline requirement, not an optimization.
// app/integrations/[slug]/page.tsx continued
import type { Metadata } from 'next';

export async function generateMetadata({
  params,
}: {
  params: Promise<{ slug: string }>;
}): Promise<Metadata> {
  const { slug } = await params;
  const integration = getIntegration(slug);

  if (!integration) {
    return {
      title: 'Integration not found',
    };
  }

  const { source, target, setupMinutes } = integration;

  return {
    title: `Connect ${source} to ${target} (${setupMinutes}-Minute Setup)`,
    description: `Step-by-step guide to integrate ${source} with ${target}, including auth, setup steps, and one common gotcha.`,
    alternates: {
      canonical: `/integrations/${slug}`,
    },
  };
}
The alternates.canonical line points each page at itself. That tells Google this URL is the canonical version, which is exactly what you want as long as the pages are genuinely distinct. The moment they are not, a self-referencing canonical does nothing, which brings us to the real problem.
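The canonical above is a relative path, which assumes the app sets metadataBase in its root metadata so that Next.js can resolve URL-based metadata fields to absolute URLs. The sketch below imitates that resolution with the standard URL constructor to show what should end up in the canonical tag. resolveCanonical is a hypothetical helper for illustration, not a Next.js API, and https://example.com is a placeholder origin.

```typescript
// Sketch only: `resolveCanonical` is a hypothetical helper that mimics how a
// relative canonical path is combined with a base origin.
//
// In a Next.js app the equivalent setting is the `metadataBase` field of the
// metadata object exported from the root layout, for example:
//   export const metadata = { metadataBase: new URL('https://example.com') };
const metadataBase = new URL('https://example.com'); // assumption: your origin

function resolveCanonical(path: string): string {
  // The WHATWG URL constructor resolves `path` against the base origin.
  return new URL(path, metadataBase).toString();
}

console.log(resolveCanonical('/integrations/stripe-to-slack'));
```

If the base is missing or points at a staging host, every generated page inherits the wrong canonical, so it is worth checking one rendered page by hand after the first build.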
Why hundreds of near-identical pages get merged
This is the part most programmatic SEO tutorials skip, and it is the part that decides whether the project works.
Google does not index pages it considers duplicative of pages it already has. Its guidance on duplicate content is explicit: when it finds a cluster of pages that are substantially the same, it picks one canonical version and drops the rest from results. The threshold is not "byte-identical." It is "same intent, same information, swapped nouns." A template that changes only the city name, the product attribute, or two words in a boilerplate paragraph routinely trips this. You can ship 500 routes and watch 480 of them land in the "Crawled - currently not indexed" or "Duplicate, Google chose different canonical than user" buckets in Search Console. The second is the duplicate status you would expect for pages that declare their own canonical, as these do; "Duplicate without user-selected canonical" is the equivalent for pages that declare none.
The fix is not a clever canonical tag or a sprinkle of synonyms. The fix is that each page has to carry information that genuinely does not exist on the other pages. A real stripe-to-slack page describes Stripe's webhook retry behavior and Slack's payload format. A real github-to-notion page describes Notion's rate limit and OAuth scope requirements. Those are not interchangeable sentences. They are different facts, and different facts are what make a page its own page.
That requirement is also what makes the project hard to run at scale. Writing one good integration guide is easy. Writing 500 of them, each with accurate per-pair steps and a real, verified gotcha, is a content production problem, not a templating problem. This is the wall teams hit: the Next.js code generates 500 routes in an afternoon, and then someone has to fill 500 rows with real, distinct, correct information, or the whole batch gets merged. It is one reason a managed programmatic SEO pipeline generates each variant as a real article with its own keyword data and its own accuracy pass, rather than gluing a spreadsheet to a template. Per-page uniqueness is the entire point, and it is the only thing that keeps the pages indexed.
Whether you write the content yourself or generate it, the engineering rule is the same: if you cannot point at a paragraph and say "this fact is only true on this page," that page is duplicate content waiting to be merged.
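A rough way to spot swapped-noun pages before shipping is to compare the rendered text of two pages by their overlapping word sequences. The sketch below is an illustrative heuristic only and is not how Google detects duplicates. The shingles and jaccard helpers, the three-word window, and the sample strings are all assumptions made for this example.

```typescript
// Sketch only: a rough near-duplicate heuristic, NOT Google's algorithm.
// `shingles`, `jaccard`, and the 3-word window are illustrative assumptions.

// Split text into overlapping runs of `size` consecutive lowercase words.
function shingles(text: string, size = 3): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const result = new Set<string>();
  for (let i = 0; i + size <= words.length; i++) {
    result.add(words.slice(i, i + size).join(' '));
  }
  return result;
}

// Jaccard similarity: shared shingles divided by all distinct shingles.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((item) => {
    if (b.has(item)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

const boilerplateA =
  'This guide connects Stripe to your workspace. Follow the steps below and test the connection.';
const boilerplateB =
  'This guide connects GitHub to your workspace. Follow the steps below and test the connection.';
const realFact =
  'Stripe retries failed webhooks, so make the Slack post idempotent.';

const swappedNounScore = jaccard(shingles(boilerplateA), shingles(boilerplateB));
const distinctFactScore = jaccard(shingles(boilerplateA), shingles(realFact));

console.log({ swappedNounScore, distinctFactScore });
```

The swapped-noun pair scores far higher than the pair that states a different fact. Any cutoff you pick is a judgment call, so treat the score as a prompt to review a pair of pages, not as a pass or fail verdict.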
Adding real signal per page
Three concrete practices move a generated page from "template fill" to "indexable."
First, carry real per-variant data, not placeholders. The steps and caveat fields in the data model exist for exactly this reason. A field like caveat: 'Be careful.' repeated across rows is worse than no field, because it adds duplicate text. The field is only worth having if its value differs meaningfully per row.
Second, run an accuracy pass. Programmatic content scales mistakes as fast as it scales pages. If the source data says Notion allows 10 requests per second and the real limit is 3, you have just published that error 200 times. Validate the data at the source. A build-time guard catches the obvious failures, and a cross-row check catches the subtler one: a caveat that is long enough to pass a length check but is still copy-pasted boilerplate.
// scripts/validate-integrations.ts
import { getAllIntegrations } from '../lib/integrations';

function validate() {
  const rows = getAllIntegrations();
  const seenSlugs = new Set<string>();
  const caveatCounts = new Map<string, number>();
  const errors: string[] = [];

  for (const row of rows) {
    if (seenSlugs.has(row.slug)) {
      errors.push(`Duplicate slug: ${row.slug}`);
    }
    seenSlugs.add(row.slug);

    if (row.steps.length < 2) {
      errors.push(`${row.slug}: needs at least 2 real steps`);
    }

    if (row.caveat.trim().length < 20) {
      errors.push(`${row.slug}: caveat is too thin to be unique`);
    }

    // Cross-row check: the same caveat reused across pages is duplicate text,
    // even when each copy is long enough to pass the length floor above.
    const normalized = row.caveat.trim().toLowerCase();
    caveatCounts.set(normalized, (caveatCounts.get(normalized) ?? 0) + 1);
  }

  for (const [caveat, count] of caveatCounts) {
    if (count > 1) {
      errors.push(
        `Caveat reused on ${count} rows: "${caveat.slice(0, 40)}..."`,
      );
    }
  }

  if (errors.length > 0) {
    console.error(errors.join('\n'));
    process.exit(1);
  }

  console.log(`Validated ${rows.length} integrations.`);
}

validate();
The length floor catches only the crudest thin values. The duplicate-caveat scan is what catches boilerplate that is long enough to slip past it, which is the failure mode that quietly merges pages. Wire the script into your build ("prebuild": "tsx scripts/validate-integrations.ts") so a thin or repeated row fails CI instead of shipping.
Third, drop low-demand variants. The monthlyVolume field on each row is your filter. There is no reason to generate a page for a pair that almost nobody searches for. Those pages cost crawl budget and dilute the set with no upside. Filter them out before they become routes:
// lib/integrations.ts (add a filtered accessor)
const MIN_VOLUME = 50;

export function getIndexableIntegrations(): Integration[] {
  return INTEGRATIONS.filter(
    (integration) => integration.monthlyVolume >= MIN_VOLUME,
  );
}
Then use getIndexableIntegrations() in generateStaticParams and the sitemap. The pages you do generate now all have demand, real steps, and a real caveat. That is a defensible set, not a duplicate-content liability. Treating the data layer as the place where quality is enforced (validate, dedupe, filter) is the bit of SEO automation that keeps a growing row count from degrading into thin pages.
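Wiring the filter into route generation could look like the sketch below. To keep it runnable on its own, a trimmed Row type and three inline sample rows stand in for lib/integrations.ts. The fax-to-pager row is a made-up example of a pair with no demand.

```typescript
// Sketch only: a trimmed row type and inline sample rows stand in for
// lib/integrations.ts so the example runs on its own.
type Row = { slug: string; monthlyVolume: number };

const ROWS: Row[] = [
  { slug: 'stripe-to-slack', monthlyVolume: 320 },
  { slug: 'github-to-notion', monthlyVolume: 210 },
  { slug: 'fax-to-pager', monthlyVolume: 0 }, // hypothetical no-demand pair
];

const MIN_VOLUME = 50;

function getIndexableIntegrations(): Row[] {
  return ROWS.filter((row) => row.monthlyVolume >= MIN_VOLUME);
}

// In app/integrations/[slug]/page.tsx this function would be exported.
function generateStaticParams(): { slug: string }[] {
  return getIndexableIntegrations().map((row) => ({ slug: row.slug }));
}

console.log(generateStaticParams());
```

Because dynamicParams is false, a slug that the filter drops is never rendered and a request for it returns a 404.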
A dynamic sitemap
A dynamic sitemap tells Google which generated URLs exist and how to find them. In the App Router you create app/sitemap.ts, export a default function that returns MetadataRoute.Sitemap, and Next.js serves it at /sitemap.xml. Build it from the same filtered data source so the sitemap and the routes never drift apart.
// app/sitemap.ts
import type { MetadataRoute } from 'next';
import { getIndexableIntegrations } from '@/lib/integrations';

const BASE_URL = 'https://example.com';

export default function sitemap(): MetadataRoute.Sitemap {
  const integrations = getIndexableIntegrations();

  const integrationRoutes: MetadataRoute.Sitemap = integrations.map(
    (integration) => ({
      url: `${BASE_URL}/integrations/${integration.slug}`,
      lastModified: new Date(),
      changeFrequency: 'monthly',
      priority: 0.7,
    }),
  );

  return [
    {
      url: BASE_URL,
      lastModified: new Date(),
      changeFrequency: 'weekly',
      priority: 1,
    },
    ...integrationRoutes,
  ];
}
Because the sitemap pulls from getIndexableIntegrations(), the low-demand rows you filtered out never appear, and the URLs you submit are exactly the URLs you statically generated. Submit /sitemap.xml in Search Console and watch the indexing report, not the page count, to judge whether the set is working.
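One refinement worth considering: new Date() stamps every URL with the build time, so lastModified changes on each deploy even when the content has not. Google's sitemap guidance indicates it relies on lastmod only when the value is consistently accurate, so a per-row date is likely the safer signal. The sketch below assumes a hypothetical updatedAt field, which the Integration type in this tutorial does not define, and a local SitemapEntry type that mirrors only the fields used here.

```typescript
// Sketch only: `updatedAt` is a hypothetical field that the tutorial's
// Integration type does not define. `SitemapEntry` mirrors just the fields
// used here so the example runs without Next.js installed.
type DatedRow = { slug: string; updatedAt: string }; // ISO 8601 date string
type SitemapEntry = { url: string; lastModified: Date };

const BASE_URL = 'https://example.com';

function toSitemapEntries(rows: DatedRow[]): SitemapEntry[] {
  return rows.map((row) => ({
    url: `${BASE_URL}/integrations/${row.slug}`,
    // Use the date the row's content last changed, not the build time.
    lastModified: new Date(row.updatedAt),
  }));
}

const entries = toSitemapEntries([
  { slug: 'stripe-to-slack', updatedAt: '2025-01-15' },
  { slug: 'github-to-notion', updatedAt: '2025-03-02' },
]);

console.log(entries);
```

The date would need to be maintained wherever the row data is edited, for example by the same process that updates the steps or the caveat.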
Wrapping up
The engineering for programmatic SEO is small: a typed data source, one [slug] route with generateStaticParams, per-page metadata, inline JSON-LD, and a dynamic app/sitemap.ts. You can have it running in an afternoon. The hard part is everything that keeps the pages out of the duplicate bucket: real per-variant facts, a build-time validation pass, dropping variants nobody searches for, and a self-referencing canonical on each page.
Concrete next steps to ship this safely:
- Start with 20 to 50 rows of genuinely distinct data, not 500 thin ones. Confirm they get indexed in Search Console before you scale the row count.
- Make the validation script fail the build on duplicate slugs, thin fields, or reused caveats, so quality can only go up as you add rows.
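- Watch the duplicate statuses in the indexing report, in particular "Duplicate, Google chose different canonical than user", since every page here declares its own canonical. If they climb as you add pages, the problem is rows that are too similar, not your code.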
If the content-production side is the part you would rather not build and maintain yourself (the real per-page data, the accuracy pass, the gate that refuses to publish thin pages), that is the kind of work an AI SEO agent like The SEO Agent is built to handle, leaving you to own the Next.js routing and rendering. Either way, a generated page earns its place in the index only when it says something the other pages cannot.
