While technical SEO is essentially a logical discipline, it can sometimes present problems for which there seems to be no one right answer.
A great example of this comes in a problem most SEOs will encounter at some point: how do you avoid duplicate content issues between a legitimate series of paginated pages?
One of the most common issues faced by large eCommerce sites is that their paginated content, a very commonly used feature allowing a large number of products to be listed, is likely to trigger a duplicate content flag.
Often these pages will have a filtering system, using parameters to generate new content (e.g. colour, price range, etc.). In turn, this creates the potential for the number of duplicate pages to increase exponentially if improperly managed. Since duplicate content can be seen as a way of manipulating a search engine to improve rankings, this can result in some severe problems.
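To get a feel for how quickly this grows, here is a rough back-of-the-envelope sketch in Python. The filter names, option counts and page count are made up for illustration; they are not taken from any real site.

```python
from math import prod

# Hypothetical filters and the number of values each can take.
# Each filter can also be left unset, hence the "+ 1" below.
filter_options = {"colour": 10, "price_range": 5, "brand": 20}
paginated_pages = 8  # assumed number of pages in the unfiltered series

# Every combination of filter states yields a distinct URL.
filter_combinations = prod(n + 1 for n in filter_options.values())

# Rough upper bound: each filtered view could itself be paginated.
url_variants = filter_combinations * paginated_pages

print(filter_combinations)
print(url_variants)
```

Each extra filter multiplies the total rather than adding to it, which is why a handful of parameters can turn one category page into thousands of crawlable URLs.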
Canonical tags are the traditional way of telling a search engine which pages are important and which should be ignored. But there is special markup for specifying that a page is part of a series and should be treated as such. Does this markup solve the duplicate content issue? If not, what else is needed?
Many people, many minds
The question is this: when using pagination markup, is it also necessary to use a canonical tag? And if so, to which URL should the canonical tag point?
Google doesn’t give us much insight into this. The only section linking pagination markup with canonicals is cryptic:
So you can include both – but should you? It turns out the answers out there can be conflicting.
The technical support for a tool we use extensively recommended that we use both pagination and a canonical – and the canonical should point at page one in each case.
Various SEO specialists contributing to this (very useful) Moz Q&A forum say no canonical is needed – Google knows how to treat your pages with just the pagination markup.
Then there are others saying you should use canonicals, but they should be self-referring. This might seem counter-intuitive, because you’re telling the search engine that each page is important, possibly confusing it with regards to which page to rank.
However, this is actually the most commonly used markup, and the option we use on all the eCommerce projects we work with. There’s a good reason for this, well put by one contributor to the Moz forum:
When you have parameters used to filter products, you can produce thousands of iterations of a single page. Here’s an example of an old URL slug for a site we manage, which has since undergone a migration to a flatter structure:
With the possibility of filtering for brand and price, as well as category and other things, you can end up with a huge number of pages. If these get indexed, you’ll have some problems. The self-referring canonicals prevent further duplicates being created, while the pagination tags tell search engines to treat the pages as a linked series.
This suggests the following is sensible markup for paginated pages that can be altered using parameters:
On page 1:
<link rel="next" href="http://www.mysite.com/category/?cp=2">
<link rel="canonical" href="http://www.mysite.com/category/">
On page 2:
<link rel="next" href="http://www.mysite.com/category/?cp=3">
<link rel="prev" href="http://www.mysite.com/category/">
<link rel="canonical" href="http://www.mysite.com/category/?cp=2">
On page 3:
<link rel="next" href="http://www.mysite.com/category/?cp=4">
<link rel="prev" href="http://www.mysite.com/category/?cp=2">
<link rel="canonical" href="http://www.mysite.com/category/?cp=3">
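If you are generating these tags from a template, the pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `pagination_links` and its arguments are our own invention, the `cp` parameter is taken from the example URLs, and we assume page 1 lives at the bare category URL.

```python
def pagination_links(base_url, page, last_page, param="cp"):
    """Return the <link> tags for one page of a paginated series."""
    if not 1 <= page <= last_page:
        raise ValueError("page out of range")

    def url_for(n):
        # Assumption: page 1 is the bare category URL, later pages add ?cp=N.
        return base_url if n == 1 else f"{base_url}?{param}={n}"

    tags = []
    if page < last_page:
        # Every page except the last points forward.
        tags.append(f'<link rel="next" href="{url_for(page + 1)}">')
    if page > 1:
        # Every page except the first points back.
        tags.append(f'<link rel="prev" href="{url_for(page - 1)}">')
    # Self-referring canonical: each page points at itself.
    tags.append(f'<link rel="canonical" href="{url_for(page)}">')
    return tags


for tag in pagination_links("http://www.mysite.com/category/", 2, 5):
    print(tag)
```

The sketch ignores filter parameters entirely; how a filtered URL should be reflected in its canonical is a separate decision for each site.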
This markup is implemented in probably the majority of well-constructed eCommerce sites. And yet…
Is it really necessary?
So what’s Google’s first recommendation when it comes to paginated content?
Since pagination is so common, crawlers have become very good at understanding it, so perhaps the markup isn’t really that necessary after all.
Not only that, another Moz contributor suggests here (unconfirmed as yet) that Google doesn’t index paginated pages anymore, removing more or less any chance of a real duplicate content issue (as opposed to one that is simply flagged by a crawling tool).
So what do we do?
Google doesn’t really seem to care much what SEOs do. They’re trying to deliver search results that are good for users. The impression is often given that for questions like this, you can do what you want really – and if you’re really trying to trick them, they’ll know.
However, we like to play it safe. A properly coded and elegantly made site irons out the possibility of technical issues, creating more room to concentrate on other things (content, link building, PPC). And since Google doesn’t like to give too much away, there could be unforeseen problems should you go down the “do nothing” route – for small but widely experienced issues such as this, we feel it’s a good idea to err on the side of caution.