Google finds most pages by following links. But links miss pages — new content not yet linked from anywhere, pages deep in your navigation, URLs with parameters. A sitemap.xml solves this. It's a direct map you hand to Google saying: "here are the pages that matter."
What Is a Sitemap?
A sitemap is an XML file, usually placed at the root of your domain, that lists the URLs Google should know about. It doesn't guarantee indexing (Google still decides what to crawl and index), but it helps Google discover pages sooner, especially on large sites and for new content.
The basic format:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yoursite.com/</loc>
    <lastmod>2025-11-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://yoursite.com/about</loc>
    <lastmod>2025-10-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The Four Tags Explained
<loc> — Required
The full URL including protocol (https://). It must be URL-encoded, XML-escaped (for example, & written as &amp;), and shorter than 2,048 characters. This is the only required tag.
<lastmod> — Recommended
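The date the page was last meaningfully changed, in W3C datetime format (YYYY-MM-DD is the simplest form; a full timestamp with time zone is also valid). Google uses this as a signal when scheduling recrawls, provided the value is consistently accurate. Don't fake this: if you set every page's lastmod to today, Google learns to ignore it.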
<changefreq> — Optional hint
How often the content typically changes. Values: always, hourly, daily, weekly, monthly, yearly, never. Google's documentation states that it ignores this value; other search engines may treat it as a hint, never as a directive. If you include it, use daily for news, weekly for regularly updated blogs, monthly for stable pages, yearly for archive content.
<priority> — Optional hint
Relative priority of this URL vs others on your site, from 0.0 to 1.0. Default is 0.5. This is only meaningful in relation to other URLs on your own site and doesn't affect ranking vs other sites. Google's documentation states that it ignores this value too, so treat it as a hint for other crawlers. If you include it, use 1.0 for your homepage, 0.8–0.9 for main sections, 0.5–0.7 for individual posts.
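As a rough illustration of the encoding and date rules above, here is a minimal TypeScript sketch that renders a single <url> element. The SitemapEntry shape and the function names are hypothetical, and the sketch assumes the input URL has not already been percent-encoded (encodeURI would encode an existing % a second time).

```typescript
// Hypothetical shape for one sitemap entry; the field names are illustrative.
interface SitemapEntry {
  url: string;          // absolute URL including https://
  lastModified?: Date;  // the page's real last-change date, if known
}

// Escape the five XML special characters so the value is safe inside <loc>.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render one <url> element. encodeURI percent-encodes spaces and non-ASCII
// characters but leaves URL syntax such as ":", "/", "?" and "&" intact.
function renderUrlEntry(entry: SitemapEntry): string {
  const lines = ["  <url>", `    <loc>${escapeXml(encodeURI(entry.url))}</loc>`];
  if (entry.lastModified) {
    // toISOString() is always UTC; its first 10 characters are YYYY-MM-DD.
    const day = entry.lastModified.toISOString().slice(0, 10);
    lines.push(`    <lastmod>${day}</lastmod>`);
  }
  lines.push("  </url>");
  return lines.join("\n");
}
```

Only <loc> and <lastmod> are emitted here; changefreq and priority could be added the same way if you want them for other crawlers.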
Where to Put It
Place your sitemap at: https://yourdomain.com/sitemap.xml
Then reference it in your robots.txt:
Sitemap: https://yourdomain.com/sitemap.xml
And submit it directly in Google Search Console: Sitemaps → Add a new sitemap.
Sitemap Index Files
A single sitemap can contain a maximum of 50,000 URLs and must be no larger than 50 MB uncompressed. For larger sites, use a sitemap index file that points to multiple sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://yoursite.com/sitemap-pages.xml</loc>
    <lastmod>2025-11-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://yoursite.com/sitemap-blog.xml</loc>
    <lastmod>2025-11-01</lastmod>
  </sitemap>
</sitemapindex>
Submit the index file URL to Google Search Console — it will discover all child sitemaps automatically.
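One way to stay under the 50,000-URL limit is to split the URL list into fixed-size chunks and generate an index that points at one child sitemap per chunk. The sketch below is illustrative only: the function names are hypothetical, it checks the URL count but not the file size limit, and it assumes the child sitemap URLs contain no characters that need XML escaping.

```typescript
// URL-count limit per sitemap file, as stated above.
const MAX_URLS_PER_SITEMAP = 50000;

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Render a sitemap index pointing at the given child sitemap URLs.
// `lastmod` is a YYYY-MM-DD string applied to every child for simplicity.
function renderSitemapIndex(sitemapUrls: string[], lastmod: string): string {
  const entries = sitemapUrls
    .map((u) =>
      ["  <sitemap>", `    <loc>${u}</loc>`, `    <lastmod>${lastmod}</lastmod>`, "  </sitemap>"].join("\n")
    )
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</sitemapindex>",
  ].join("\n");
}
```

For a site with 120,000 URLs, chunk(urls, MAX_URLS_PER_SITEMAP) yields three arrays (50,000, 50,000 and 20,000 URLs), each of which would be written to its own child sitemap and listed in the index.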
What to Include (and What Not to)
Include:
- All canonical, indexable pages
- Pages you want Google to know about
- New content that hasn't been linked to yet
Exclude:
- Pages with a noindex meta tag (contradiction)
- Duplicate content / thin pages
- Admin pages, login pages, staging URLs
- Redirect URLs (list the destination instead)
- Pages blocked in robots.txt
Including non-indexable or redirected URLs wastes crawl budget and trains Google to distrust your sitemap data.
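The include/exclude rules above can be expressed as a simple filter. This is a sketch under stated assumptions: the PageRecord shape is hypothetical, and in practice its flags would have to come from your CMS, build step, or a crawl of your own site.

```typescript
// Hypothetical page record; every field here is an assumption for illustration.
interface PageRecord {
  url: string;
  canonicalUrl: string;      // the URL the page declares as canonical
  noindex: boolean;          // true if the page carries a noindex directive
  redirectsTo?: string;      // set when the URL answers with a redirect
  blockedByRobots: boolean;  // true if robots.txt disallows the URL
}

// Keep only canonical, indexable, non-redirecting, crawlable URLs.
function selectSitemapUrls(pages: PageRecord[]): string[] {
  return pages
    .filter((p) => !p.noindex)
    .filter((p) => p.redirectsTo === undefined)
    .filter((p) => !p.blockedByRobots)
    .filter((p) => p.url === p.canonicalUrl)
    .map((p) => p.url)
    // Drop exact duplicates while preserving the original order.
    .filter((url, index, all) => all.indexOf(url) === index);
}
```

Admin, login, and staging URLs are not modelled separately here; they would normally be caught by the noindex or robots.txt flags, or left out of the input list entirely.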
Framework-Specific Generation
Next.js (App Router)
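// src/app/sitemap.ts
import type { MetadataRoute } from 'next'

// Next.js serves the returned array as /sitemap.xml
export default function sitemap(): MetadataRoute.Sitemap {
  return [
    // Use each page's real last-modified date; new Date() would mark every page as changed on every build
    { url: 'https://yoursite.com', lastModified: new Date('2025-11-01'), changeFrequency: 'weekly', priority: 1 },
    { url: 'https://yoursite.com/about', lastModified: new Date('2025-10-15'), changeFrequency: 'monthly', priority: 0.8 },
  ]
}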
Astro
// src/pages/sitemap.xml.js
// Current Astro versions expect an uppercase GET export that returns a Response
export async function GET() {
  const pages = ['/', '/about', '/blog'];
  const body = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${pages.map(p => `  <url><loc>https://yoursite.com${p}</loc></url>`).join('\n')}
</urlset>`;
  return new Response(body, { headers: { 'Content-Type': 'application/xml' } });
}
Common Mistakes
Listing redirected URLs. If /old-page redirects to /new-page, list /new-page in your sitemap — not /old-page.
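Not updating lastmod. Stale lastmod dates make the field useless as a recrawl signal. Set it programmatically from each page's actual last-modified date (for example, the content's updated timestamp) rather than stamping every URL with the deploy date.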
Including every URL. Parameter-heavy e-commerce sites often accidentally include thousands of filter URLs (?color=red&size=l). Use canonical tags and robots.txt Disallow rules to control which versions Googlebot sees, and list only the canonical URLs in the sitemap.
Setting all priorities to 1.0. If everything is highest priority, nothing is. Use priority meaningfully.
Forgetting to submit. A sitemap that exists but hasn't been submitted to Google Search Console may still be discovered eventually, but submitting it typically gets it processed sooner and lets you see any errors Google reports.
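For the filter-URL mistake in particular, a small guard in the sitemap build can help. The sketch below flags URLs whose query string contains a filter parameter so they can be dropped before the file is written. The parameter list and function name are assumptions for illustration; adjust them to whatever your site actually uses.

```typescript
// Parameter names treated as filters; this list is an assumption.
const FILTER_PARAMS = ["color", "size", "sort"];

// Return true when the URL's query string contains any filter parameter.
function hasFilterParams(url: string): boolean {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) return false;
  // Take the query string and ignore any #fragment after it.
  const query = url.slice(queryStart + 1).split("#")[0];
  return query
    .split("&")
    .some((pair) => FILTER_PARAMS.indexOf(pair.split("=")[0]) !== -1);
}

// Keep only URLs without filter parameters.
function dropFilterUrls(urls: string[]): string[] {
  return urls.filter((u) => !hasFilterParams(u));
}
```

With the list above, https://yoursite.com/shirts?color=red&size=l would be dropped, while https://yoursite.com/shirts and https://yoursite.com/blog?page=2 would be kept.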
Generate Yours in 30 Seconds
Use our Sitemap Generator to build a valid sitemap.xml. Add URLs one by one with changefreq and priority settings, or paste a bulk list of URLs. Preview the XML output, then download and deploy to your server root.