Duplicate content is an important aspect of website performance that should not be underestimated. It plays an essential part in search engine algorithms, and webmasters should be especially aware of it in order to make the most of their website. It is, in essence, a measure of the originality, and thus the individual value, of web content.
What is Duplicate Content?
According to Google, duplicate content consists of blocks of text that are identical to, or closely resemble, text elsewhere on the same website or on other websites:
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match or are appreciably similar. Mostly, this is not deceptive in origin.”
In short, duplicate content is the same content repeated and reachable via multiple URLs, either within a website (internal duplicate content) or across different websites (external duplicate content).
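To get a feeling for what "appreciably similar" can mean in practice, here is a minimal sketch that compares two texts by the overlapping three-word sequences ("shingles") they share. This is one common way to estimate text overlap, not Google's actual method; the function names, the shingle size and the sample sentences are all assumptions made for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts.
original = "Our smartphone has a long battery life and a bright display."
copy = "Our smartphone has a long battery life and a bright display."
near_copy = "Our new smartphone has a long battery life and a very bright display."
unrelated = "The recipe needs two eggs, some flour and a pinch of salt."

print(jaccard_similarity(original, copy))       # identical text
print(jaccard_similarity(original, near_copy))  # lightly edited text
print(jaccard_similarity(original, unrelated))  # different text
```

An identical copy scores 1.0, a lightly edited copy lands somewhere in between, and unrelated text scores 0.0. Note that a measure like this only catches overlapping wording; it will not recognise texts that say the same thing in entirely different words.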
Why does duplicate content pose a problem?
A user probably would not notice that the same content appears elsewhere. For search engines, however, duplication interferes with an important part of the process they use to determine rankings. When dealing with duplicate content, a search engine must make two important decisions:
- Which content version is most relevant for a particular search query
- Which content version should be shown in the search results
In a video on duplicate content, Google’s Matt Cutts explains that Google does not view duplicate content as spam and does not penalise it. However, he also stresses that Google only wants one of these versions in its search results, meaning the other, which may well be the original version, can be left out of the top rankings.
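The effect of "only one version is shown" can be illustrated with a small sketch that groups pages with identical text and keeps a single URL per group. This is purely an illustration: the fingerprinting approach, the function names and the sample pages are assumptions, and a real search engine weighs many more signals when choosing which version to show.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def group_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose page text has the same fingerprint."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return list(groups.values())


# Hypothetical crawl result: URL -> extracted page text.
pages = {
    "http://examplesite.com/phone": "A great phone with a bright display.",
    "http://examplesite.com/phone?ref=mail": "A great  phone with a bright display.",
    "http://examplesite.com/about": "We are a small online shop.",
}

for group in group_duplicates(pages):
    # Arbitrary choice for this sketch: the first URL seen is the one shown.
    shown, left_out = group[0], group[1:]
    print(f"shown: {shown}, left out: {left_out}")
```

Here the first URL in each group is kept simply because it was seen first. That arbitrary choice mirrors the risk described above: the version that ends up being shown is not necessarily the original.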
Why does Google not like duplicate content?
Google’s aim is to show its users only the content that is most relevant and most valuable for their search. Because of this, it is not only identical duplicate content that poses a problem, but also texts that repeat the same information in slightly different wording without adding any new value. Eric Enge presents a good example of this in his article ‘The Concept of Sameness and Why It Should Matter’ on searchengineland.com. There he demonstrates how three different text examples are ‘semi’-duplicates: despite being worded differently, none of the three adds reader value beyond the others.
As a webmaster, it is important to bear in mind that your content should be unique and useful, both to your target audience and to Google itself.
3 Reasons Why One Should Avoid Duplicate Content:
- Poor or no rankings
- Less web traffic
- Unsatisfactory user experience
When is it NOT duplicate content?
Content in multiple languages is not considered to be duplicate. The same is true of direct quotes and passages, as long as they are correctly marked up as such. Proper semantic markup for the quote and its source should never be forgotten:
<blockquote>Quote- <cite>Name of the author or source</cite></blockquote>
Reasons for Duplicate Content
Here are a few instances in which duplicate content can appear, grouped by whether it is internal or external.
External:
- Through the publishing of manufacturer product information on several different websites.
- When the content of one domain is ‘scraped’ (plagiarised) by another.
Internal:
- Through the use of large portions of repeated text on multiple pages (often the case in online shops)
- Through the same content being served on a domain and its subdomains (http://examplesite.com and http://www.examplesite.com)
- When multiple URLs lead to the same page:
  - http://examplesite.com
  - http://examplesite.com/
  - http://examplesite.com/index.htm
- Through the use of filter and sort functions (size, colour, brand etc.) in online shops, or printable page versions:
  - http://examplesite.com/category/smartphones/?sort=price (sorting by price)
  - http://examplesite.com/category/smartphones/?sort=brand (sorting by brand)
  - http://examplesite.com/category/smartphones/print (printable version)
- Through session IDs appended to URLs to track website visitors
- Through tracking links, where tracking code is added to the URL to measure where traffic comes from
- Scraper Websites
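Several of the internal causes listed above (www and non-www hosts, trailing slashes, index files, sort parameters, session IDs and tracking parameters) come down to one page being reachable under many URLs. The following sketch shows how such variants could be mapped to a single form, for example when auditing a list of crawled URLs. The parameter names in `IGNORED_PARAMS`, the index file names and the function name are assumptions; adjust them to what your own site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names that do not change which content is shown.
IGNORED_PARAMS = {"sort", "sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}
# Assumed default documents that are served for a directory URL.
INDEX_FILES = {"index.htm", "index.html"}


def normalise_url(url: str) -> str:
    """Map common URL variants of one page to a single comparable form."""
    parts = urlsplit(url)

    # Treat the www and non-www host as the same site (an assumption).
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Drop empty segments (trailing slashes) and a trailing index file.
    segments = [s for s in parts.path.split("/") if s]
    if segments and segments[-1].lower() in INDEX_FILES:
        segments.pop()
    path = "/" + "/".join(segments)

    # Remove sorting, session and tracking parameters; keep the rest.
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in IGNORED_PARAMS]
    )
    return urlunsplit((parts.scheme.lower(), host, path, query, ""))


variants = [
    "http://examplesite.com",
    "http://examplesite.com/",
    "http://examplesite.com/index.htm",
    "http://www.examplesite.com",
]
print({normalise_url(u) for u in variants})  # prints {'http://examplesite.com/'}
```

The two sorted category URLs from the list both normalise to http://examplesite.com/category/smartphones. The printable version keeps its own URL, because it differs by path rather than by parameter and would need a separate rule.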
As you can see, there are many points to take into account to ensure your website content is original and unique. In our next article, we will give you additional information and some helpful tips on how to uncover, address and avoid content duplication.