XML Sitemaps Are Overrated. Do You Really Need Them?
XML sitemaps are valuable for many websites, but they are unnecessary for others. A quick Google search for “sitemap XML” will return tons of sites promoting the significance of XML sitemaps and claiming that any site “serious” about SEO needs one.
But I’m here to show you why that blanket prescription might be garbage — and why many small and medium-sized sites do not need XML sitemaps. For sites with a strong internal linking strategy and sound architecture, and for teams with limited developer bandwidth, XML sitemaps are simply overrated.
What Is Sitemap XML?
Before I share why XML sitemaps are overrated, for those of you new to SEO, it might be helpful to go over what they actually do. By Google’s own account: “A sitemap is a file where you provide information about the pages, videos and other files on your site and the relationships between them. Search engines like Google read this file to more intelligently crawl your site. A sitemap tells Google which pages and files you think are important in your site and also provides valuable information about these files: for example, for pages, when the page was last updated, how often the page is changed and any alternate language versions of a page.”
In other words, an XML sitemap is a map of your site that helps search engines, and Googlebot, Google’s crawler, find and read your important pages (with the hope of indexing and ranking them highly). Therefore, if your site implements an XML sitemap, make sure any page you want indexed in Google is included in it. Notice that I did not say all pages of your site. For example, if you have two very similar pages that are necessary for users — but not for Google — do not include both in your sitemap XML, as doing so may cause cannibalization: Google will get confused as to which page to rank for the relevant queries.
XML vs. HTML Sitemaps
In the SEO world, you will hear about both XML and HTML sitemaps. These are different sitemaps, and while they serve similar purposes, they have different functions.
HTML sitemaps are intended primarily for users, but search engines still use them to find URLs and distribute PageRank, or SEO “link juice,” among those URLs. Their formatting lends itself to users, as your audience can actually view the page. At Guaranteed Rate, our HTML sitemap focuses on getting users to our most important pages, as well as some other insightful pages on our site.
Most webmasters link to their HTML sitemap in the footer, which tells Google that the sitemap is an important page. Since Googlebot can reach it easily from the homepage and from the footer on every page, Google can quickly crawl the links on your HTML sitemap (if they are follow links), see the content on those pages and pass link juice.
XML sitemaps, by contrast, are for search engines. The file lists your URLs, often thousands of them, and usually includes a last modification date for each, telling Google when the page was last updated. Some XML sitemaps also include a priority number and a change frequency, but Google openly says that it ignores those two fields.
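To make the format concrete, here is a minimal sketch of a single-entry sitemap file following the sitemaps.org protocol (the URL and dates are hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/mortgage-rates</loc>
    <lastmod>2021-06-01</lastmod>
    <!-- Optional fields; Google says it ignores these two -->
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

Sites that do use a sitemap typically point search engines at it with a `Sitemap:` line in robots.txt or by submitting it in Google Search Console.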
Where an HTML sitemap serves users first, an XML sitemap exists to tell Google and other search engine bots which pages on a site are important and should be crawled. To see what one looks like, check out one of Built In’s XML sitemaps.
Why Many Sites Do Not Use XML Sitemaps
Gary Illyes is a webmaster trends analyst at Google. In other words, he is one of the top SEOs at Google. As recently as 2019, Illyes himself said XML sitemaps are the second discovery option for Googlebot.
You know what’s first? Hyperlinks! That’s right.
Illyes, who works as one of the top two SEOs at the company many of us are trying to impress, is telling you that hyperlinks (and good site architecture) are the first place Googlebot looks to discover new content and crawl updated existing content.
In addition to the above, below are more reasons why small and medium-sized sites do not need to invest in XML sitemaps:
- Hyperlinks matter more.
- Site architecture helps Googlebot and users.
- Building dynamic XML sitemaps is difficult.
- Google says only “really large” sites need a sitemap XML.
Let’s unpack each of these in detail.
Hyperlinks Matter More
As you learned above, Google uses hyperlinks before XML sitemaps to find new URLs and re-crawl existing URLs. That means that, as long as you are internally linking to all important pages — including new pages — Google will be able to find your content, crawl it and hopefully index and rank it.
This is why it’s vital to train your content teams on internal linking and anchor text. If a writer mentions a phrase or topic that is covered on another blog post or evergreen page on your site, they should hyperlink the text. The text being linked, also called anchor text, should reflect a keyword phrase the target page is targeting. This not only helps Google find and crawl the URL but can also help the page rank for that anchor text term. It also tells the user a bit more about the type of page they are going to if they tap the link.
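As a quick illustration (the URL and phrasing are hypothetical), a well-chosen internal link looks like this in the page’s HTML:

```html
<!-- The anchor text "refinance your mortgage" tells both users and
     Google what the linked page is about; avoid generic anchors
     like "click here" -->
<p>If rates drop, it may be a good time to
  <a href="/resources/refinance-guide">refinance your mortgage</a>.</p>
```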
If you have not already, as an SEO, it’s your job to train all content teams on effective writing and SEO best practices — and internal linking is chief among them.
Site Architecture Helps Googlebot and Users
With blogging, it’s easy to link to related articles. However, when you have dynamic pages (such as thousands of tech companies, products with various sizes and so on), it simply is too much work to manually add internal links to every single one of those pages. This is where good site architecture and linkpacks come in.
Proper site architecture ensures Googlebot and users can get to every page on your site as quickly as possible. This is why you will often see many links in a header or content hubs, like you see on Built In’s Tech Topics page. When a user or bot gets to the main page, they can easily find all pages related to that topic.
As a rule of thumb, users and Googlebot should be able to get to any page within seven clicks (or taps) from the homepage. However, as John Mueller, the other top SEO at Google, said in a 2018 Google webmaster office hours session, the fewer clicks it takes to reach a URL (in his example, a store page), the better.
“So especially if your homepage is generally the strongest page on your website, and from the homepage it takes multiple clicks to actually get to one of these stores, that makes it a lot harder for us to understand that these stores are pretty important,” Mueller explained. “On the other hand, if it’s one link from the homepage to one of these stores, then that tells us that these stores are probably pretty relevant and that probably we should be giving them a little bit of weight in the search results as well.”
In addition to site architecture, linkpacks have been a rising star tactic for SEOs over the last 10 years. As you can see at the bottom of Built In’s NYC job board, the site uses a linkpack to link to the most important job pages on the site. Linkpacks help users quickly find relevant pages, but, more important for search engines, they fast-track bot crawling to the most valuable job categories on the site and help bots find other related URLs.
Building Dynamic XML Sitemaps Is Difficult
As someone who has worked at both large companies (more than 1,000 employees) and small ones (fewer than 20 employees), I know what it’s like to fight for developer resources. Even at large companies, you are competing with SEM, social, email, brand and various other departments to get your technical SEO items in. More often than not, you have to prioritize or push things out.
Unfortunately, building dynamic XML sitemaps is very difficult. An accurate one must update whenever a page is created or altered, include certain pages on your site while excluding others, pull in URLs from subdomains and honor various other rules. I have seen developer teams take anywhere from three weeks to three months to build one.
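To see why the estimate runs to weeks rather than hours, consider that rendering the file itself is the easy part. Here is a deliberately simplified Python sketch of just that piece, with a single exclusion rule; the hard part is everything around it: reacting to every page creation and edit, applying inclusion rules across subdomains and keeping the output accurate. All URLs and rules below are hypothetical.

```python
from datetime import date
from xml.sax.saxutils import escape


def build_sitemap(pages, excluded_paths=()):
    """Render a minimal XML sitemap from (url, last_modified) pairs,
    skipping any URL whose path is on the exclusion list."""
    entries = []
    for url, last_modified in pages:
        # e.g. near-duplicate pages you don't want competing in Google
        if any(url.endswith(path) for path in excluded_paths):
            continue
        entries.append(
            "  <url>\n"
            f"    <loc>{escape(url)}</loc>\n"
            f"    <lastmod>{last_modified.isoformat()}</lastmod>\n"
            "  </url>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>"
    )


# Hypothetical pages, as they might come from a CMS database
pages = [
    ("https://www.example.com/", date(2021, 6, 1)),
    ("https://www.example.com/blog/new-post", date(2021, 6, 3)),
    ("https://www.example.com/print-version", date(2021, 6, 3)),
]
sitemap = build_sitemap(pages, excluded_paths=("/print-version",))
```

In a real build, the `pages` list would have to be regenerated from the CMS on every publish or edit, which is where the developer time goes.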
Given the level of effort required to build one, and the fact that sound site architecture and internal linking accomplish the same goal, I firmly believe SEOs are better off focusing on other proven tactics.
Google Says Only Really Large Sites Need a Sitemap XML
Google has sent mixed messages in this regard. In 2011, Mueller said smaller sites do not need XML sitemaps: “With a site of that size, you don’t really need a sitemap file. We’ll generally be able to crawl and index everything regardless.” However, in 2020, Mueller said making an XML sitemap is a minimal baseline for a serious site. Well, small sites can be serious too.
In addition, if you look at Google’s documentation, they say you might need a sitemap if:
- Your site is really big.
- Your site has a large archive of content pages that are isolated or not well linked to each other.
- Your site is new and has few external links to it.
- Your site has a lot of rich media content (video and images) or is shown in Google News.
On the same page, Google says you might not need an XML sitemap if:
- Your site is “small”: By small, we mean about 500 pages or less on your site.
- You’re on a simple site hosting service like Blogger or Wix.
- Your site is comprehensively linked internally.
- You don’t have many media files (video, image) or news pages that you need to appear in the index.
Given that a top SEO at Google once said Google can handle smaller sites without XML sitemaps, and their own documentation clarifies the types of sites that need them, I would think long and hard before dedicating developer resources to building one. Instead, focus on producing great content with a strong site architecture and internal linking strategy.
XML Sitemaps Will Not Help With Indexing
XML sitemaps help search engines find and crawl URLs. However, just because a search engine crawls a page does not mean it will index it. After Googlebot crawls a page, it decides whether the page belongs in the index and how well it should rank. Crawling does not guarantee indexing.
As you can imagine, there are a lot of bad and thin content sites on the web. There are also sites that copy content from other sites. If all search engines indexed all pages on the web, users would have a hard time weeding out the crap. As such, Google, Bing and other search engines have standards for content.
Given all this, if you have an indexing problem (Google is choosing not to index your content, which you can see in Google Search Console), creating an XML sitemap will not help. Instead, you should focus on creating pages that are more valuable than your competitors’.
Make no mistake: Many sites need and should invest time in building an XML sitemap or using an XML sitemap generator. However, there are many small to medium-sized sites out there where the effort is not worth the return. Rather than focusing on helping Google crawl your pages with a sitemap XML, focus on internal linking, site architecture and making sure your content is the best out there.