There is a reason a book has no duplicate pages and a machine has no duplicate parts: they’re not needed. Search engines view content the same way. A results page that contains multiple copies of the same thing is neither usable nor informative. This is why your website may not be listed if a search query turns up other pages with content exactly like yours. There are grey areas, however, and you may not even be aware that it’s happening to you.

What You Need to Know

Here are some of the main types of duplicate content that can occur on a website and what you can do to avoid them.

1. Copyright Violation

> What It Is: Using content that is neither yours nor free to use
> Severity: High
> Consequences: Potential lawsuit and blacklisting from search engines
> Remedy: Just Say No to Plagiarism. It’s always best to write unique content, but in cases where you want to reprint someone else’s content, remember to credit your source and pay fees where necessary. (And don’t expect the version on your site to show up in the search results.)

2. Thin Affiliate Sites

> What It Is: Thin affiliate sites include websites built out of nothing but product listings.
> Severity: Medium/High
> Consequences: Some professionals have observed that Google’s slap on the wrist for this is a drop of 50 places in the search results, without blacklisting (unless link farming is at play).
> Remedy: The rule of thumb here is that an affiliate site offering no new content is deemed ‘thin’. So offer something up front, such as reviews, blog posts or tips.

3. Massive Duplication

> What It Is: Multiple copies of the same page, intentional or otherwise
> Severity: Medium
> Consequences: When search engines start to sense foul play such as spam tactics, a site runs the risk of being penalized in the search results. If over 70% of your content is duplicated, this could set off alarm bells with search engines.
> Remedy: Simply try to avoid it. Write additional unique content on pages that are the same or very similar to reduce the impact. A good way to avoid duplicate pages when you are switching to a new website domain is with 301 redirects, which tell browsers and search engines that a page has permanently moved from one URL to another.
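
As a rough illustration of the 301 redirect mentioned above, here is a minimal Python sketch of how a server might decide what to answer for a moved page. The `REDIRECTS` table, the example URLs and the `resolve_redirect` function are hypothetical names made up for this sketch; a real site would normally set this up in its web server or CMS configuration instead.

```python
from http import HTTPStatus

# Hypothetical mapping from paths on the old domain to their new URLs.
REDIRECTS = {
    "/old-page": "https://www.example.com/new-page",
}


def resolve_redirect(path):
    """Return (status_code, headers) for a moved page, or None if not moved."""
    target = REDIRECTS.get(path)
    if target is None:
        return None  # no redirect registered; serve the page as usual
    # 301 tells the client that the move is permanent.
    return int(HTTPStatus.MOVED_PERMANENTLY), {"Location": target}
```

Calling `resolve_redirect("/old-page")` gives the 301 status together with a `Location` header that points at the new URL, so only the new address needs to stay in the index.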

4. Unintentional Site Architecture / Pagination

> What It Is: Multiple versions of the same content, such as long and short versions, product sort orders, etc.
> Severity: Medium
> Consequences: There is a risk that search engines won’t crawl your entire site efficiently. A typical example is sort orders on product pages, such as price or review rating. This one is less obvious, and often only a developer can fix it.
> Remedy: Talk to your web developer about using correct canonical tags, so that only one version of a page is indexed. Google’s Matt Cutts defines canonicalization as “the process of picking the best URL when there are several choices.” So help them out.
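
To make the canonical tag idea concrete, here is a small Python sketch that strips sort-order parameters from a product URL and builds the matching `<link rel="canonical">` tag. The parameter names in `PRESENTATION_PARAMS` and the example URLs are assumptions for illustration; your developer would need to decide which parameters actually leave the content unchanged on your site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters only change presentation, not content.
PRESENTATION_PARAMS = {"sort", "order", "view"}


def canonical_url(url):
    """Drop presentation-only query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRESENTATION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the tag that goes in the <head> of every variant of the page."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url(url), quote=True))
```

With this sketch, every sort order of the same product listing points back to one URL, which is the version you are asking search engines to index.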

5. Snippets

> What It Is: Often, summary content for menu pages
> Severity: Low
> Consequences: So long as these are part of a larger page of content, they should not be an issue with search engines.
> Remedy: Mix up your snippets and keep them relevant to the larger picture.

6. Slogans

> What It Is: Taglines for branding, often found in a website header or footer. If it’s text on a page, it can end up looking like duplicated content.
> Severity: Low
> Consequences: Repeating a primary key phrase that you want to perform well for could dampen its performance and result in the ‘wrong’ landing page being served up in search results.
> Remedy: Use your slogan in an image, in a header, so that it can’t be spidered by search engines. In most cases, use it only once as a text link, either on the home page or an about page, especially if it contains target keywords.

7. Duplicate Title Tags

> What It Is: Using the same keywords across title tags
> Severity: Low
> Consequences: Using the same keywords across title tags is called keyword cannibalization, whereby search engines have to choose between multiple pages to serve up in the results page.
> Remedy: Make sure you have unique titles, metadata and page content. You can also ask your developer to keep particular pages out of the index by adding a NoIndex tag to them. Also watch out for printer-friendly pages, which are built with session IDs in the URL, so these need a NoIndex tag as well.
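
One way to spot duplicate titles is to group your pages by title and look for any title that appears more than once. The Python sketch below assumes you already have a list of page URLs and their titles (for example from a site crawl); `find_duplicate_titles` and the sample data are hypothetical. The `NOINDEX_TAG` constant shows the standard robots meta tag your developer would add to pages that should stay out of the index.

```python
from collections import defaultdict

# Standard robots meta tag that asks search engines not to index a page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def find_duplicate_titles(titles_by_url):
    """Return {normalized_title: [urls]} for titles used on more than one page."""
    groups = defaultdict(list)
    for url, title in titles_by_url.items():
        # Ignore differences in case and spacing when comparing titles.
        normalized = " ".join(title.split()).lower()
        groups[normalized].append(url)
    return {title: sorted(urls) for title, urls in groups.items() if len(urls) > 1}


# Hypothetical crawl result: two pages share what is effectively the same title.
pages = {
    "/a": "Red Shoes | Shop",
    "/b": "red shoes  |  shop",
    "/c": "Blue Shoes | Shop",
}
duplicates = find_duplicate_titles(pages)
```

Each group it reports is a set of pages that either need distinct titles or, for variants such as printer-friendly copies, a NoIndex tag.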

I’d appreciate any comments, counterpoints or other examples I might have missed!