SEOlust

Sitemap Validator

Validate XML sitemaps. Check syntax, URL format, protocol compliance, and sitemaps.org standards instantly.

📚 Example Sitemaps

  • 📄 URL Set: Standard sitemap
  • 📑 Sitemap Index: Multiple sitemaps
  • 🖼️ With Images: Image extensions
  • 📰 News Sitemap: Google News

🗺️ What is a Sitemap?

  • Purpose: Tells search engines which pages exist on your site.
  • Format: XML file following sitemaps.org protocol.
  • Location: Usually at https://example.com/sitemap.xml
  • URL Limit: Maximum 50,000 URLs per sitemap file.
  • File Size: Maximum 50MB uncompressed.
  • Elements: loc (required), lastmod, changefreq, priority (optional).
  • Sitemap Index: Can reference multiple sitemap files.
  • Extensions: Supports images, videos, news, mobile annotations.

How to Use Sitemap Validator to Check XML Sitemaps

Validate XML sitemaps instantly with this free tool. It checks syntax, URL format, protocol compliance, and sitemaps.org standards, and supports URL sets, sitemap indexes, and extensions.

Getting Started

Validate sitemaps in seconds.

  • Paste Sitemap: Copy your sitemap XML content into the text area.
  • Click Validate: Tool analyzes syntax, structure, and compliance.
  • Check Results: See if sitemap is valid with errors highlighted.
  • Review Statistics: Total URLs, errors, warnings at a glance.
  • Fix Errors: Follow specific recommendations for issues.
  • Check Warnings: Optional improvements for better results.
  • View URLs: See complete list of all URLs in sitemap.
  • Test Again: Revalidate after making corrections.

What Gets Validated

Comprehensive sitemap checking.

  • XML Syntax: Checks if XML is well-formed and parseable.
  • Root Element: Verifies <urlset> or <sitemapindex> present.
  • Namespace: Checks for the correct xmlns declaration (http://www.sitemaps.org/schemas/sitemap/0.9).
  • Required Elements: <url> must have <loc>, <sitemap> must have <loc>.
  • URL Format: Validates http/https URLs shorter than 2,048 characters.
  • Date Format: lastmod must be W3C Datetime (YYYY-MM-DD, optionally followed by a time and time zone).
  • Priority: Values must be 0.0 to 1.0 if present.
  • Changefreq: Must be valid value (always, hourly, daily, etc.).
  • URL Limit: Maximum 50,000 URLs per sitemap file.
  • Duplicates: Detects repeated URLs in same sitemap.

Sitemap Types

Two main sitemap formats supported.

  • URL Set: Standard sitemap with <urlset> root containing <url> elements.
  • Sitemap Index: Master sitemap with <sitemapindex> root referencing other sitemaps.
  • URL Set Use: Single sitemap for up to 50,000 URLs.
  • Index Use: Multiple sitemaps for large sites or organization.
  • Mixed Not Allowed: Cannot combine <url> and <sitemap> in same file.
  • Tool Detects: Automatically identifies type and validates accordingly.
  • Extensions: Supports image, video, news, mobile annotations.
  • Both Valid: Can validate either type with same tool.
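
The type detection described above can be sketched in a few lines. This is an illustration only, not the tool's own code: the tool runs in the browser, while this sketch swaps in Python's xml.etree.ElementTree, and the function name detect_sitemap_type is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def detect_sitemap_type(xml_text):
    """Return "urlset" or "sitemapindex"; raise ValueError otherwise.

    Hypothetical helper for illustration. ElementTree reports a namespaced
    tag as "{namespace}localname", so a missing or wrong xmlns declaration
    also ends up in the "unexpected root" branch.
    """
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{SITEMAP_NS}}}urlset":
        kind, forbidden = "urlset", "sitemap"
    elif root.tag == f"{{{SITEMAP_NS}}}sitemapindex":
        kind, forbidden = "sitemapindex", "url"
    else:
        raise ValueError(f"unexpected root element: {root.tag}")

    # A file may not mix <url> and <sitemap> entries.
    if root.find(f"{{{SITEMAP_NS}}}{forbidden}") is not None:
        raise ValueError(f"<{forbidden}> is not allowed inside <{kind}>")
    return kind
```

Raising on a mixed file, rather than returning a third type, mirrors the rule that mixing is an error and not a separate format.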

XML Syntax Validation

First check performed on sitemap.

  • Parser: Uses JavaScript DOMParser for accurate checking.
  • Error Detection: Catches unclosed tags, missing elements, syntax errors.
  • Well-Formed: XML must follow strict syntax rules.
  • Common Issues: Missing closing tags, unescaped characters, encoding problems.
  • Deterministic: A document is either well-formed or it is not, so the syntax result is exact.
  • Must Pass: Cannot proceed to protocol validation until XML valid.
  • Error Display: Shows parsing error message from browser.
  • Fix First: Correct XML syntax before addressing other issues.
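
A well-formedness check of this kind can be reproduced outside the browser. The sketch below assumes Python's standard xml.etree.ElementTree parser in place of DOMParser, so the error messages will differ from the ones the tool shows; check_well_formed is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The message includes the line and column reported by the parser.
        return False, str(exc)
    return True, None


good = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "</urlset>"
)
# The raw "&" below is a typical unescaped-character mistake.
bad = "<urlset><url><loc>https://example.com/?a=1&b=2</loc></url></urlset>"

for label, text in (("good", good), ("bad", bad)):
    ok, message = check_well_formed(text)
    print(label, ok, message)
```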

Required Elements

Mandatory fields for valid sitemaps.

  • URL Set Requires: <urlset> root, at least one <url>, each <url> has <loc>.
  • Sitemap Index Requires: <sitemapindex> root, at least one <sitemap>, each <sitemap> has <loc>.
  • <loc> Content: Must be valid absolute URL starting with http:// or https://.
  • No Relative URLs: /page.html not allowed, must be https://example.com/page.html
  • URL Length: Each URL must be shorter than 2,048 characters.
  • Protocol: Only http and https protocols allowed.
  • Escaping: Special characters must be XML-escaped (&amp; not &).
  • Encoding: The sitemaps.org protocol requires UTF-8; declare it in the XML declaration.
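
As a rough illustration of the <loc> rules above, the following sketch walks the <url> entries of a URL set and reports missing or unusable locations. It is written in Python under stated assumptions and is not the tool's implementation: the helper names loc_problems and check_entries are hypothetical, and only <urlset> files are handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_LOC_LENGTH = 2048  # a <loc> value must be shorter than this


def loc_problems(loc):
    """Return a list of problems found in one <loc> value."""
    problems = []
    parts = urlsplit(loc)
    if parts.scheme not in ("http", "https"):
        problems.append("not an absolute http or https URL")
    elif not parts.netloc:
        problems.append("missing host name")
    if len(loc) >= MAX_LOC_LENGTH:
        problems.append("URL too long")
    return problems


def check_entries(xml_text):
    """Return (entry number, problem) pairs for every <url> in a <urlset>."""
    root = ET.fromstring(xml_text)
    report = []
    for number, url in enumerate(root.findall("sm:url", NS), start=1):
        loc = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        if not loc:
            report.append((number, "missing <loc>"))
        else:
            report.extend((number, problem) for problem in loc_problems(loc))
    return report
```

Entries are numbered from 1 so that a report line can be matched against the order of <url> elements in the file.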

Optional Elements

Recommended but not mandatory fields.

  • lastmod: Last modification date in W3C format (YYYY-MM-DD or ISO 8601).
  • changefreq: How frequently page changes (always, hourly, daily, weekly, monthly, yearly, never).
  • priority: Priority 0.0 to 1.0 relative to other URLs on your site.
  • lastmod Purpose: Tells search engines when content was updated.
  • changefreq Purpose: Hints at crawl frequency (not binding).
  • priority Purpose: Relative importance within your site only.
  • Not Required: Can omit all optional elements.
  • Recommended: lastmod helps search engines prioritize crawling.

URL Count Limits

Maximum URLs allowed per sitemap.

  • Standard Limit: 50,000 URLs maximum per sitemap file.
  • File Size: 50MB uncompressed maximum.
  • Compression: Can gzip sitemap to reduce size.
  • Large Sites: Use sitemap index with multiple sitemaps.
  • Example Split: 100,000 URLs = 2 sitemaps + 1 index.
  • Tool Checks: Validates URL count and shows error if exceeded.
  • Best Practice: Keep under 40,000 URLs for safety margin.
  • No Minimum: Can have single URL in sitemap.
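
The split arithmetic is simple ceiling division. The sketch below shows one way to plan the files for a large URL list; the function names are hypothetical and the 50,000 figure is the per-file limit stated above.

```python
MAX_URLS_PER_SITEMAP = 50_000


def sitemap_files_needed(url_count, per_file=MAX_URLS_PER_SITEMAP):
    """Number of sitemap files needed for url_count URLs (at least one)."""
    # -(-a // b) is integer ceiling division, which avoids float rounding.
    return max(1, -(-url_count // per_file))


def split_urls(urls, per_file=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most per_file."""
    return [urls[i:i + per_file] for i in range(0, len(urls), per_file)]


print(sitemap_files_needed(100_000))  # 2 sitemap files, plus one index file
print(sitemap_files_needed(100_001))  # 3, because one URL spills over
```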

Duplicate URL Detection

Finding repeated URLs in sitemap.

  • Problem: Same URL listed multiple times wastes crawl budget.
  • Detection: Tool checks for exact URL matches.
  • Case Sensitive: /Page and /page are different URLs.
  • Parameters Matter: /page?id=1 and /page?id=2 are different.
  • Warning Shown: Displays each duplicate URL found.
  • Impact: Search engines may ignore duplicates.
  • Fix: Remove duplicate entries, keep most recent lastmod.
  • Common Cause: Merging sitemaps without deduplication.
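
Exact-match duplicate detection, as described above, needs nothing more than a counter. This is a minimal Python sketch with a hypothetical function name; it deliberately does no normalization, so case and query parameters are significant.

```python
from collections import Counter


def find_duplicates(locs):
    """Return URLs that occur more than once, in first-seen order."""
    counts = Counter(locs)
    return [url for url, count in counts.items() if count > 1]


urls = [
    "https://example.com/page",
    "https://example.com/Page",       # different case, so a different URL
    "https://example.com/page?id=1",
    "https://example.com/page?id=2",  # different parameter value
    "https://example.com/page",       # exact repeat of the first entry
]
print(find_duplicates(urls))
```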

Date Format Validation

lastmod date format requirements.

  • W3C Datetime: YYYY-MM-DD format required minimum.
  • Full ISO 8601: YYYY-MM-DDThh:mm:ss+00:00 also valid.
  • Examples Valid: 2024-05-17, 2024-05-17T10:30:00Z.
  • Invalid: May 17 2024, 17/05/2024, 2024-5-17.
  • Timezone: W3C Datetime requires a time zone designator (Z or +hh:mm) whenever a time is included.
  • Tool Validates: Regex pattern matching for format.
  • Search Engines: May ignore incorrect date formats.
  • Best Practice: Use YYYY-MM-DD for simplicity.
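
The tool's own pattern is not shown here, so the following is only a sketch of what regex-based lastmod checking could look like in Python. It accepts a full date, optionally followed by a time with a time zone designator as in the W3C Datetime profile, and adds a calendar check that a plain regex cannot do. The names W3C_DATETIME and is_valid_lastmod are hypothetical.

```python
import re
from datetime import date

# YYYY-MM-DD, optionally followed by Thh:mm[:ss[.fraction]] and Z or +hh:mm.
# The pattern checks shape only; hour and minute ranges are not verified.
W3C_DATETIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"
    r"(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2}))?"
)


def is_valid_lastmod(value):
    """Return True if value looks like a W3C Datetime with a real date."""
    match = W3C_DATETIME.fullmatch(value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects dates such as 2024-02-30
    except ValueError:
        return False
    return True


for candidate in ("2024-05-17", "2024-05-17T10:30:00Z", "May 17 2024", "2024-5-17"):
    print(candidate, is_valid_lastmod(candidate))
```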

Priority and Changefreq

Optional hint values for search engines.

  • Priority Range: 0.0 (lowest) to 1.0 (highest) only.
  • Priority Default: 0.5 if not specified.
  • Priority Relative: Only matters within your own site.
  • Changefreq Values: always, hourly, daily, weekly, monthly, yearly, never.
  • Changefreq Hint: Tells crawlers expected update frequency.
  • Not Binding: Search engines may ignore both values.
  • Tool Validates: Checks priority in range and changefreq is valid value.
  • Recommendation: Use lastmod instead of changefreq when possible.
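
Both checks reduce to a range test and a set lookup. The sketch below is a Python illustration with hypothetical function names, using the value list given above.

```python
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def is_valid_priority(text):
    """Return True if text is a number between 0.0 and 1.0 inclusive."""
    try:
        value = float(text)
    except ValueError:
        return False
    # NaN and infinity fail this comparison, so they are rejected too.
    return 0.0 <= value <= 1.0


def is_valid_changefreq(text):
    """Return True if text is one of the allowed changefreq values."""
    return text in CHANGEFREQ_VALUES


print(is_valid_priority("0.8"), is_valid_priority("1.5"))
print(is_valid_changefreq("daily"), is_valid_changefreq("fortnightly"))
```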

Sitemap Index

Master sitemap for multiple sitemaps.

  • Purpose: References multiple sitemap files from one index.
  • Root Element: <sitemapindex> instead of <urlset>.
  • Contains: <sitemap> elements with <loc> pointing to sitemaps.
  • Can Have lastmod: Optional last modification date.
  • Use Case: Sites with 50,000+ URLs needing multiple sitemaps.
  • Organization: Can split by section (products, blog, pages).
  • Tool Validates: Checks all sitemap references are valid URLs.
  • Cannot Mix: Index cannot contain <url> elements, only <sitemap>.
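
For reference, an index with this structure can be generated with a few lines of Python. This is a sketch under stated assumptions: the file URLs are placeholders, build_sitemap_index is a hypothetical helper, and the same lastmod is applied to every entry for brevity.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Return sitemap index XML (as a string) listing the given sitemap URLs."""
    # Serialize the sitemap namespace as the default xmlns, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        if lastmod is not None:
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


index_xml = build_sitemap_index(
    ["https://example.com/sitemap1.xml", "https://example.com/sitemap2.xml"],
    lastmod="2024-05-17",
)
print(index_xml)
```

ElementTree escapes special characters such as & in the URL text when serializing, so the output stays well-formed.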

Common Errors

Frequent sitemap mistakes detected.

  • Invalid XML: Unclosed tags, missing quotes, syntax errors.
  • Missing <loc>: Every URL must have location.
  • Invalid URL: Malformed URLs, relative paths, wrong protocol.
  • Wrong Root: Using <urls> instead of <urlset>.
  • Missing Namespace: xmlns attribute not present.
  • Too Many URLs: Exceeding 50,000 URL limit.
  • Invalid Dates: lastmod in wrong format.
  • Invalid Priority: Value outside 0.0-1.0 range.
  • Mixed Elements: Combining <url> and <sitemap> in same file.
  • Duplicate URLs: Same URL listed multiple times.

FAQ

How accurate is this sitemap validator?
95-100% accurate for sitemap validation. XML syntax checks are exact (100%), while protocol compliance and URL format checks are rated at about 95% against the sitemaps.org specification. The tool cannot check URL accessibility, because that would require fetching each URL.
Can this fetch my live sitemap automatically?
No, browser security (CORS) prevents fetching from external sites. Visit yoursite.com/sitemap.xml, copy the XML content, and paste here. This keeps validation private and instant.
What is the difference between urlset and sitemapindex?
urlset is a standard sitemap with <url> elements listing pages. sitemapindex is a master sitemap with <sitemap> elements pointing to other sitemap files. Use an index for sites with more than 50,000 URLs.
Why does my sitemap show warnings but is still valid?
Warnings are recommendations for improvement (like adding lastmod dates) and do not make the sitemap invalid. Errors must be fixed; warnings are optional but recommended for better SEO.
Can I have more than 50,000 URLs?
Not in a single sitemap file. Split into multiple sitemaps and create a sitemap index. Example: 100,000 URLs = sitemap1.xml (50,000) + sitemap2.xml (50,000) + sitemap-index.xml.
What does "duplicate URL" warning mean?
The same URL appears multiple times in your sitemap. This wastes crawl budget and can confuse search engines. Remove duplicates, keeping the entry with the most recent lastmod date.
Are priority and changefreq required?
No, both are optional. Search engines may ignore them. lastmod is more useful than changefreq. Priority only matters relative to other pages on your site, not across different sites.
Can I use relative URLs like /page.html?
No, all URLs must be absolute with full domain: https://example.com/page.html. Relative URLs are invalid in sitemaps and will cause errors.
What date format should I use for lastmod?
Use YYYY-MM-DD format (e.g., 2024-05-17) for simplicity. Full ISO 8601 with time is also valid (2024-05-17T10:30:00Z) but not required. Avoid formats like May 17, 2024.
Does validation guarantee Google will index my pages?
No. A valid sitemap helps search engines discover pages but does not guarantee indexing. Google considers content quality, robots.txt, canonical tags, and other factors when making indexing decisions.