The abbreviation XML stands for eXtensible Markup Language. A markup language is a set of codes, or tags, used to describe the structure and meaning of digital content. The most widely known markup language is HyperText Markup Language (HTML), which is used to format web pages.

An XML sitemap is a file that describes the structure of a website and highlights its most significant pages. It is essentially a table of contents, but when correctly arranged it can help search engines to understand how the website is organised and to find key pages during regular crawling operations. Search engines such as Google crawl the internet looking for pages to add to their indexes, and a sitemap is an effective way of helping them find those pages and see how they fit into the site as a whole.

XML sitemaps offer some SEO benefit to all sites, but they are indispensable for very large sites simply because of the size and complexity of the site structure. They can also compensate for weak internal linking, for orphan pages (pages that no other page within the site links to), and for the absence of strong external links.

Search engines also respond to new content. A site that is not regularly updated may fade in the rankings, but even if it is refreshed frequently there is no guarantee that the changes will be picked up promptly. An XML sitemap can tell search engines when content has been renewed or replaced and can indicate the relative significance of a page. Pages that are updated often tend to be recrawled more frequently, which can help new content reach the index sooner.

The Structure of an XML Sitemap

For each page, the sitemap gives its location (its URL), the date it was last updated, how frequently it is expected to change, and its priority relative to other pages on the site. Not only does this help search engine crawlers make sense of the site but, given the proliferation of borrowed content on the internet, specifying the date of the most recent update may also help to signal that content is original and not duplicated from elsewhere.
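The fields described above can be illustrated with a short Python sketch that builds a sitemap using only the standard library. The element names and namespace follow the public sitemap protocol; the function name build_sitemap, the dictionary keys and the example.com URLs are assumptions made for illustration, not part of any particular tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as UTF-8 bytes for a list of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional (hypothetical input format).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Only write the optional tags that were actually supplied.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages for illustration.
example_pages = [
    {
        "loc": "https://www.example.com/",
        "lastmod": "2024-01-15",
        "changefreq": "weekly",
        "priority": 0.8,
    },
    {"loc": "https://www.example.com/about"},
]
sitemap_xml = build_sitemap(example_pages)
```

Writing sitemap_xml to a file named sitemap.xml in the site root would be a typical next step, although the file name and location are conventions, not requirements.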

The sitemap uses a number of tags to communicate with the search engines: location (loc), last modified (lastmod), change frequency (changefreq) and priority (priority). There are some limits on sitemaps: a single sitemap file can contain a maximum of 50,000 URLs and must not exceed 50MB when uncompressed. Compression can save bandwidth, but the size limit applies to the unzipped file.
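A site with more URLs than one file allows has to spread them across several sitemap files. The following Python sketch shows one way to do that and to check the size limit. The names split_urls and within_size_limit are hypothetical, and 50MB is taken here as 50 x 1024 x 1024 bytes.

```python
MAX_URLS_PER_FILE = 50_000
# 50MB, measured on the uncompressed file.
MAX_BYTES_PER_FILE = 50 * 1024 * 1024


def split_urls(urls, max_urls=MAX_URLS_PER_FILE):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def within_size_limit(xml_bytes, max_bytes=MAX_BYTES_PER_FILE):
    """Check the uncompressed size of a generated sitemap file."""
    return len(xml_bytes) <= max_bytes


# Hypothetical site with 120,000 pages.
all_urls = [f"https://www.example.com/page-{i}" for i in range(120_000)]
chunks = split_urls(all_urls)
```

In this example the URLs would fall into three files. Multiple sitemap files are usually tied together with a sitemap index file, which is outside the scope of this sketch.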

Once a sitemap has been created, it needs to be submitted to the search engines through tools such as Google Search Console and Bing Webmaster Tools. It should also be referenced in the site’s robots.txt file, which will direct search engine crawlers to the sitemap.
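As a quick check that robots.txt really does point to the sitemap, the Python standard library can read the Sitemap lines from the file. This is a minimal sketch that assumes Python 3.8 or later (for the site_maps method); the function name sitemaps_in_robots and the example.com robots.txt content are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt content.
example_robots = (
    "User-agent: *\n"
    "Allow: /\n"
    "Sitemap: https://www.example.com/sitemap.xml\n"
)
declared = sitemaps_in_robots(example_robots)
```

An empty result would suggest that the Sitemap line is missing from robots.txt and needs to be added.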

While there is no guarantee that an XML sitemap will deliver every potential benefit, creating one improves the chances of better search rankings.
