What Is Duplicate Content?

Content on one webpage that is identical or very similar to content on another webpage is considered duplicate content. The similar content could be on the same website or on different websites, but either way it would still be classed as duplicate content.

Duplicate content can cause SEO issues, as search engines generally want to show distinct and unique results to their users. This problem manifests in four major ways:

Your webpages with duplicate content get filtered out of search results.
Search engines pick which webpage to rank out of every one that has duplicate content, taking control out of your hands.
Link equity for your webpages with duplicate content gets diluted through each page instead of being focused on just one page.
Your website may be penalised, either with search ranking demotions or with complete removal from a search engine’s index.

Google, in particular, has stated on numerous occasions that they don’t have a duplicate content penalty. However, it’s important to recognise that their definition of “duplicate content” is reserved for content that isn’t intentionally duplicated to manipulate search rankings. Google will penalise websites that have duplicate content for the purpose of abusing their search algorithm.

There are multiple reasons for having duplicate content issues, but they can all be boiled down to two major causes: poor onsite architecture and offsite content duplication.

Onsite Duplicate Content

The root cause of onsite duplicate content is that search engines consider every URL unique. Fortunately, this is entirely under your control and is thus easy to fix.

Slight URL Changes

Even the tiniest variations in a URL are treated as wholly unique URLs by search engines, including:

http://website.com and https://website.com
www.website.com and website.com
m.website.com and www.website.com
website.com/page and website.com/page/
website.com/page and website.com/PAGE

Such variations also stack on top of each other, exponentially increasing the number of duplicate pages.

http://website.com/page/
https://www.website.com/page
https://website.com/PAGE
www.website.com/page/
m.website.com/page

…and so on and so forth can all be counted as completely unique URLs. So if all addresses load up a page that looks exactly the same, search engines will think there is a duplicate content issue.
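To illustrate how quickly these variations multiply, and how they can be collapsed back to a single address, here is a minimal Python sketch. The normalisation policy it applies (https, no "www." or "m." subdomain, lowercase path, no trailing slash) is only an assumption for the example; the right policy depends on how your own site is set up.

```python
from itertools import product
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Collapse scheme, subdomain, case and trailing-slash variants into one form."""
    # Bare addresses such as "www.website.com/page" have no scheme; add one so they parse.
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumed policy: treat the "www." and "m." hosts as the bare domain.
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    # Assumed policy: lowercase path without a trailing slash.
    # Lowercasing is only safe if the server treats paths case-insensitively.
    path = parts.path.lower().rstrip("/") or "/"
    return urlunsplit(("https", host, path, parts.query, ""))


# Four independent two-way choices already produce 2 * 2 * 2 * 2 = 16 addresses.
schemes = ("http://", "https://")
hosts = ("website.com", "www.website.com")
paths = ("/page", "/PAGE")
slashes = ("", "/")
variants = [s + h + p + t for s, h, p, t in product(schemes, hosts, paths, slashes)]

print(len(variants))                     # 16
print({normalise(u) for u in variants})  # {'https://website.com/page'}
```

All sixteen addresses collapse to one, which is the outcome the fixes later in this article aim for.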

Products and Parameters

Ecommerce websites often run into duplicate content problems because of how they generate product pages through onsite search results, sorting, and product categorisation, for example:

website.com/boots.html and website.com/boots.html?sort=price&style=hiking
website.com/outdoors/shoes/boots.html and website.com/shoes/outdoors/boots.html

They may be different URLs, but they have the same or mostly similar content. This type of duplicate content issue can happen to any website that has deep categorisation, a search function, and sortable search results.

Parameter problems also extend to URLs with tracking codes, for example:

website.com/page and website.com/page?utm_source=twitter
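One way to reason about parameter duplicates is to strip the parameters that do not change what the page shows and see which URLs end up identical. The sketch below does this in Python. The names IGNORED_PARAMS and IGNORED_PREFIXES are hypothetical, and which parameters belong in them is an assumption you would need to check against your own site (a filter such as style may or may not change the content enough to count as a separate page).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical lists: parameters assumed not to change the content of the page.
IGNORED_PARAMS = {"sort"}
IGNORED_PREFIXES = ("utm_",)


def strip_ignored_params(url):
    """Return the URL without sorting and tracking parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS and not key.startswith(IGNORED_PREFIXES)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_ignored_params("https://website.com/page?utm_source=twitter"))
# https://website.com/page
print(strip_ignored_params("https://website.com/boots.html?sort=price&style=hiking"))
# https://website.com/boots.html?style=hiking
```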

Fixes for Onsite Duplicate Content

301 Redirects

Set up a 301 (permanent) redirect on each page that has duplicate content, so that users and search engine bots who visit those pages are sent to the “original” page.
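In practice, redirects are usually configured in the web server or CMS. Purely as an illustration of the logic, here is a small WSGI application in Python that answers any non-canonical address with a 301. The host name in CANONICAL_HOST and the "https, no trailing slash" policy are assumptions made for the example.

```python
# Assumed canonical form for this sketch: https://website.com/path (no trailing slash).
CANONICAL_HOST = "website.com"


def redirect_app(environ, start_response):
    """WSGI app: 301-redirect duplicate addresses to the canonical one."""
    scheme = environ.get("wsgi.url_scheme", "http")
    host = environ.get("HTTP_HOST", "")
    path = environ.get("PATH_INFO", "/")
    canonical_path = path.rstrip("/") or "/"

    if scheme != "https" or host != CANONICAL_HOST or path != canonical_path:
        location = "https://" + CANONICAL_HOST + canonical_path
        query = environ.get("QUERY_STRING", "")
        if query:
            location += "?" + query
        # 301 tells browsers and crawlers that the move is permanent.
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"original page"]
```

With this in place, a request for http://www.website.com/page/ would be answered with a redirect to https://website.com/page, while the canonical address itself is served normally.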

Rel="canonical" Attribute

Add a link element with rel="canonical" to the HTML head section of each page with duplicate content, and set its href to the URL of the original page, so that search engines understand which page to direct link equity to.
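As a rough sketch of what that tag looks like, the Python helper below builds it for a given original URL. The function name canonical_link and the example URL are illustrative only; most CMSs and templating systems have their own way of emitting this tag.

```python
from html import escape


def canonical_link(original_url):
    """Build the canonical link tag to place in the head of a duplicate page."""
    # Escape the URL so characters such as & and " cannot break the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(original_url, quote=True))


print(canonical_link("https://website.com/boots.html"))
# <link rel="canonical" href="https://website.com/boots.html">
```

Every duplicate, such as the sorted or tracked versions of a product page, would carry the same tag pointing at the one original URL.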

Meta “noindex,follow”

Add a meta robots tag with content="noindex,follow" to the HTML head section of each page with duplicate content, so that these pages don’t get included in search engines’ indices but still get crawled by search engine bots.
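To check that a duplicate page really carries the tag, you could parse its HTML and read the robots directives back out. The following is a minimal sketch using Python's standard html.parser module; the class and function names are made up for this example, and the sample page is a stand-in for your own markup.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html_text):
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return finder.directives


page = '<html><head><meta name="robots" content="noindex,follow"></head><body></body></html>'
print(sorted(robots_directives(page)))  # ['follow', 'noindex']
```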

Offsite Duplicate Content

Sometimes other sites will have the exact same or similar content as some of your site’s pages. This typically happens out of a lack of effort, but there are rare cases where other sites do it with malicious intent.

Manufacturers’ Product Descriptions

It’s quite common for ecommerce sites to sell a number of similar products. It becomes a problem when online retailers rely on product descriptions provided by manufacturers. These pre-written texts can show up on many different ecommerce sites and are considered duplicate content.

Content Copiers and Scrapers

There are unscrupulous websites that try to quickly work their way up search rankings by republishing content from other websites without permission or by directly copying content and passing it off as their own. Fortunately, this problem tends to fix itself, because search engines nowadays can easily tell when this is happening and penalise offenders automatically.

Fixes for Offsite Duplicate Content

Unique product descriptions

Take the time to have unique product descriptions written for your product pages. The effort is worth it to avoid the risk of running into duplicate content problems.
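If you want a rough sense of how close your own description still is to the manufacturer's text, you can compare the two programmatically. The sketch below uses word shingles and Jaccard similarity, which is a common general-purpose heuristic and not a description of how any search engine actually measures duplication. The sample texts and the shingle size of three words are assumptions for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given length."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(first_text, second_text, size=3):
    """Shared shingles divided by all shingles: 0.0 is distinct, 1.0 is identical."""
    first, second = shingles(first_text, size), shingles(second_text, size)
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)


manufacturer = "Waterproof leather hiking boots with a cushioned sole."
copied = "Waterproof leather hiking boots with a cushioned sole."
rewritten = "These boots keep your feet dry on muddy trails thanks to sealed leather uppers."

print(jaccard_similarity(manufacturer, copied))
print(jaccard_similarity(manufacturer, rewritten))
```

A score near 1.0 suggests the description is still essentially the manufacturer's text, while a score near 0.0 suggests the wording is your own.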

Linking to original content

Include a link to your original content within the content itself, so that low-effort content copiers and scrapers end up linking back to your site when they steal it. You can do this with a visible link in the text or with a rel="canonical" tag in the page’s HTML.

Address Duplicate Content

Search engines have gotten better at detecting duplicate content without penalising unintentional occurrences. However, you should still take steps to avoid the issue altogether or address it immediately. There are still negative consequences that your site might face, such as diluted link equity and suboptimal search rankings.