On-page SEO: How to deal with duplicate content

Keystone CopySEO

SEO & Duplicate Content

 

Duplicate content may not be what you think it is.

But it is a big problem.

 

In these excerpts from my Fast Track SEO Course we’ve skipped a lot of essential on-page SEO guides to get to this issue.

(But don’t worry: the complete course is available for just £4.99 from Amazon.)

 

The reason we’re in such a rush is simply that the issue of duplicate content is huge.

And most of us are not even aware that we have a problem.

 


 

 

Let’s not mince words: Google hates duplicate content.

With a vengeance.

Much of its Panda update involved uprooting such content from its indexes as fast as a black and white bear can chomp through a bamboo shoot. And many big brands got caught in the net and saw their rankings crushed by its paw print.

Why was this – had they all been relentlessly plagiarising other sites?

Not necessarily.

 

Let’s try to understand how duplicate content can appear on your site and how you can deal with it.

 

Here’s the thing: duplicate content covers much more than you may imagine. For starters, there is:

  • Content plagiarised from other sites.
  • ‘Lazy content’, for example, where product descriptions have been used verbatim from a supplier’s database by you and probably numerous other vendors.
  • The same copy being used on different pages of your own site.

But that’s not all.

 

The fact is that a lot of content is duplicated inadvertently at the URL level of your site.

And we mean a lot.

It is far from unusual for an SEO audit to reveal that a site with just 30 pages is actually serving up over 500 URLs. And, of course, it is actually the same 30 pages of content that are being duplicated endlessly all over the 500.

Enter Google, stage left, pulling its index from under your site’s feet. The curtain drops on your search traffic.

The end.

(Until you do something about it.)

 

How does duplicate content become a problem?

 

 

A URL can have numerous subtle variants.

Let’s count some:

  1. http://www.duplicatecontent.com

AND http://duplicatecontent.com

  2. http://www.duplicatecontent.com/oops

AND http://www.duplicatecontent.com/oops/

  3. http://www.duplicatecontent.com/oops

AND http://www.duplicatecontent.com/oops?affiliate=thanksmate

  4. http://www.duplicatecontent.com/oops

AND http://www.duplicatecontent.com/oops?ref=thanksmate

 

Now multiply them by, say, a site of 200 pages and you have a problem.

A BIG problem.
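
To put a number on that multiplication, here is a minimal Python sketch. It assumes each page can be reached on two hosts, with or without a trailing slash, and with no tracking code or one of two tracking codes. The function name url_variants and the 200-page figure are illustrative only; your own site may produce more or fewer variants.

```python
from itertools import product


def url_variants(path,
                 hosts=("www.duplicatecontent.com", "duplicatecontent.com"),
                 queries=("", "?affiliate=thanksmate", "?ref=thanksmate")):
    """Return every assumed URL variant that serves the same page."""
    slashes = ("", "/")  # without and with a trailing slash
    return [
        f"http://{host}/{path}{slash}{query}"
        for host, slash, query in product(hosts, slashes, queries)
    ]


variants = url_variants("oops")
print(len(variants))        # 12 URLs for a single page
print(len(variants) * 200)  # 2400 URLs for a 200-page site
```

Under these assumptions one page turns into 12 URLs (2 hosts x 2 slash options x 3 query options), so 200 pages turn into 2,400.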

 

The issue here is that to us all these variants are really the same page.

But to a bot a webpage is any unique URL that it happens to meet on its crawl.

It treats a URL with a tracking code (e.g. ?affiliate=thanksmate), a URL with a trailing slash ( / ) and a URL on a different subdomain ( www ) each as a separate page.
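
To see how those variants collapse back into one page, here is a rough Python sketch that normalises a URL. It is only an illustration: the choice to drop www, the TRACKING_PARAMS list and the function name normalise are all assumptions, and you could just as well standardise on the www version.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; swap in the ones your own site uses.
TRACKING_PARAMS = {"affiliate", "ref"}


def normalise(url):
    """Collapse host, trailing-slash and tracking-code variants of a URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: the non-www host is the preferred one
    path = parts.path.rstrip("/") or "/"  # the homepage keeps a single slash
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, host, path, urlencode(kept), ""))


urls = [
    "http://www.duplicatecontent.com/oops",
    "http://www.duplicatecontent.com/oops/",
    "http://www.duplicatecontent.com/oops?affiliate=thanksmate",
    "http://www.duplicatecontent.com/oops?ref=thanksmate",
]
print({normalise(u) for u in urls})  # a set with a single URL
```

Running a crawl export through something like this and counting the distinct results gives a quick feel for how many real pages sit behind all those URLs.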

 

What does Google do with duplicate content?

 

 

It used to be that Google would place any duplicate content in a separate index (its supplemental index) for what it considered ‘junk’. It did not bother ranking anything stored there, nor did it assign these pages any authority.

Phew, you say: so, all those extra URLs just in effect get ignored?

That was then, but this is now: post-Panda duplicate content can have a negative effect on how Google views your entire site.

Those seemingly harmless ‘extra’ URLs could be harming your site greatly. They may be dragging down the quality and authority that Google assigns to your domain, regardless of how hard you have worked to optimise your pages in every other way.

Ouch!

Pick yourself up – it’s easy to fix.

 

Using the canonical tag

 

 

The canonical tag is used in the head section of your pages to tell search engines that certain pages are related to other pages.

What’s more, it identifies the daddy page and asks the search engine to pass any authority over to the big boss and to ignore those other mere striplings of pages.

Here’s what you do.

 

Into the head section of an offending page, place:

<link rel="canonical" href="your preferred URL, aka your canonical URL, goes here" />

 

Job done and problem solved.
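
If you want to check that the tag has actually landed in a page, a short script can read it back out. The sketch below uses Python’s built-in html.parser module; the names CanonicalFinder and find_canonical are our own, made up for illustration, and the sample page is invented.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_canonical = (attributes.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.duplicatecontent.com/oops" />'
    "</head><body>Oops</body></html>"
)
print(find_canonical(page))  # http://www.duplicatecontent.com/oops
```

A page that returns None here has no canonical tag, so it is worth a second look.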

 

To www or not to www?

 

 

We mentioned above that duplication can occur by serving up pages with www (http://www.yoursite.com) and without (http://yoursite.com).

This is a problem that can actually be solved using Google’s Search Console. (If you don’t have access – sign up now here: it’s a really handy set of tools for your site.)

The truth is it doesn’t matter if you use www or not – as long as you are consistent.

And the best way to ensure you are consistent is to tell Google your preference. You can find instructions to do this here.

The rest will simply happen!
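
Before you state a preference, it can help to see which version your own links are using. This is a minimal Python sketch, assuming you already have a list of your site’s internal links; the function name hosts_in_use and the sample links are hypothetical.

```python
from urllib.parse import urlsplit


def hosts_in_use(urls):
    """Return the set of distinct hosts that appear in a list of URLs."""
    return {urlsplit(url).netloc.lower() for url in urls}


links = [
    "http://www.yoursite.com/",
    "http://www.yoursite.com/about",
    "http://yoursite.com/contact",
]
hosts = hosts_in_use(links)
if len(hosts) > 1:
    print("Inconsistent hosts:", sorted(hosts))
```

More than one host in the result means your links are mixing the www and non-www versions.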

 

 

Find out more

  • The complete guide to Panda, Penguin and Hummingbird is here.
  • The canonical guide to canonicalisation from Moz can be found here.
  • Jon Earnshaw shows how canonical tags have a direct effect on the SERPs here.

All the posts in this series can be found here.

For the complete version of this Fast Track SEO Course, head over to Amazon.

It’s yours for less than a fiver!

 
