Excluding Category & Tag Pages from Google Index for Increased Authority

Google is leading you on.

Not all the time, but in the case of excluding certain pages from being indexed, it is best not to heed Google’s advice. Google claims that unless you have been duplicating content deliberately to manipulate the search engines, your site will not be penalized. Actually, Google’s exact words were “It’s unlikely that one of those ways (indexing category pages and tag pages) will be a penalty”. You see, Google claims that many CMSs like WordPress are arranged in such a way that there will automatically be pages with duplicate content (including category pages and tag pages). Most people do not create this sort of duplicate content on purpose to manipulate the search engines; therefore, if all goes as planned, you will not be penalized for it. Instead, Google will automatically choose the best page with the most authority to add to the SERPs. But if you don’t want to take any chances with Google penalties, we recommend that most sites exclude category and tag pages from Google’s index.

The article that I was referencing from the Google Webmaster Central Blog is dated September 12, 2008. Clearly, the issue of duplicate content in SEO is not a new one. But back in 2008, things were different (other than the fact that we could not access the internet from our cell phones… how did we live!?). Back then, it was best to have the greatest number of indexed pages possible on your site for great SEO results. The reasoning was that if there were more indexed pages, Google bots would come back faster and more often, helping these sites rank, well, faster and more often.

SEO is much different today. It has transformed into a field of overall marketing and relationship building rather than just deploying a few tricks for great ranking and traffic. Matt Cutts even released a video yesterday questioning whether we should even continue to call it Search Engine Optimization. He suggested a new name: “Search Experience Optimization”. That has a nice ring to it, if you ask me. SEO really began to transform into search experience optimization after the first Panda and Penguin algorithm updates. Thin content and spammy link building would no longer bring the results that they once did. Google needed these rules to be able to sort through the massive amount of content on the web and improve the relevance and authority of the content listed on the first page of the SERPs.

Most of us have seen these viral images of Google’s data center. This is why there need to be rules. As an engineer at Google, your job is to answer these questions:

1) How do we index these pages more efficiently? And,

2) How do we rank them more authoritatively?

The answers came in the form of updates like Panda and Penguin, microdata and microformats, etc.

So, the question still remains: Should you de-index category pages and tag pages for increased authority? As with most SEO questions, the answer is, it depends. But generally, if your site has fewer than 300 posts on the blog and/or scores less than 30 on SEOMoz Trust, de-indexing category pages and tag pages should increase your authority overall. If you have an e-commerce site with a lot of products in a lot of different categories, we recommend indexing only the most important and valuable category pages. De-indexing the rest will increase your authority because those pages most likely do not have any authority of their own and are not adding anything to your site. If anything, they would be pulling down the overall authority.

Let’s get specific.

Here is a detailed game plan for de-indexing these low authority pages from your site.

1. Remove all tag pages from the dynamic sitemap.

  • WordPress: This is extremely simple if your site is on a WordPress CMS. Simply go to the settings for your XML Sitemap and uncheck Tag Pages.

[Image: excluding tag and category pages, Google XML sitemap]

  • Static Site: If you have a static site, you can keep Google’s crawler away from the tag pages by editing your robots.txt file. Simply copy and paste the following into the file (found through your FTP):

User-agent: *
Disallow: /tag/
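
Before uploading the file, it can be worth checking that the rule blocks what you expect. The following is a minimal sketch using Python’s standard `urllib.robotparser` module. It assumes the Disallow rule sits under a `User-agent: *` line so that it applies to every crawler, and the example.com URLs are placeholders, not real pages. Keep in mind that it only checks whether a URL may be crawled, which is what robots.txt controls.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules under test. The "User-agent: *" line is an assumption
# added here so that the Disallow rule applies to all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    for url in ("https://example.com/tag/seo/", "https://example.com/my-first-post/"):
        status = "crawlable" if is_crawlable(url) else "blocked"
        print(f"{url}: {status}")
```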

2. De-index all category pages that are low authority

  • WordPress: Instead of de-indexing ALL of the category pages by changing the settings on your XML Sitemap, you can de-index specific ones by adding a noindex robots meta tag to each page or a Disallow rule to your robots.txt file.
  • Static Site: For static sites, you can use the same approach.
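
A noindex robots meta tag goes in the page’s head and typically looks like `<meta name="robots" content="noindex, follow">`. As a rough way to confirm that a category page actually carries the tag, here is a small sketch using Python’s standard `html.parser` module. The class and function names are made up for this illustration, and the sample HTML is a stand-in for your own page source.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list, e.g. "noindex, follow".
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page source contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Sample page source for illustration only.
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(sample))
```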

3. Cross-reference all of these pages with Google Analytics to see if they are getting any significant traffic.

  • No traffic? Keep these pages de-indexed and you should see an increase in authority for your site.
  • Getting some traffic? You have two choices:

      a) Add relevant content to these pages. Provide interesting and helpful information for your readers that is different from any other content on your site. This will actually make them authoritative pages, and you will not have to worry about de-indexing them.

      b) Or, 301 redirect these high-traffic tag or category pages to existing content that is very similar. You don’t want to simply redirect to the homepage because that does not foster a good user experience. Redirect to a specific page on your site relating to the category or tag and you will be able to effectively shift the attention away from your category and tag pages and still keep readers on your site.
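
The traffic check in step 3 can be sketched in a few lines of Python once you have exported visit counts from your analytics tool. Everything here is illustrative: the function name, the example paths, and especially the `MIN_VISITS` cutoff are assumptions, since what counts as “significant traffic” is a judgment call for your own site.

```python
# Hypothetical cutoff for "significant traffic" over the reporting period.
# This number is an assumption; pick one that fits your own site.
MIN_VISITS = 50


def triage_pages(visits_by_url, min_visits=MIN_VISITS):
    """Map each tag or category URL to the action suggested by its traffic."""
    decisions = {}
    for url, visits in visits_by_url.items():
        if visits < min_visits:
            decisions[url] = "keep de-indexed"
        else:
            decisions[url] = "add content or 301 redirect"
    return decisions


if __name__ == "__main__":
    # Example visit counts, as might be exported from an analytics report.
    exported = {"/tag/seo/": 3, "/category/guides/": 420}
    for url, action in triage_pages(exported).items():
        print(f"{url}: {action}")
```

Pages flagged “add content or 301 redirect” still need a manual decision between options a) and b).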

Rules of Thumb

For every lesson, there are rules of thumb to follow. Generally, I would suggest following these rules of thumb for indexing or de-indexing category and tag pages from your website:

1. Tag Pages: Each indexed tag page should have a minimum of 5 blog posts listed under that tag. In other words, only keep a tag page if it is an important page with quality content.

2. Category Pages: If you have category pages, only index them if they have 3 good, legitimate backlinks each. You can check your backlinks per page on Open Site Explorer.
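
If you have a lot of tags and categories, the two rules of thumb above can be applied in bulk. This is only a sketch: the function names and the sample counts are invented for illustration, and you would need to supply the post counts and backlink counts yourself (for example, from your CMS and a backlink tool).

```python
# Thresholds taken from the two rules of thumb above.
MIN_POSTS_PER_TAG = 5
MIN_BACKLINKS_PER_CATEGORY = 3


def tags_to_deindex(posts_per_tag):
    """Return the tags that have fewer than 5 posts, sorted by name."""
    return sorted(tag for tag, count in posts_per_tag.items() if count < MIN_POSTS_PER_TAG)


def categories_to_deindex(backlinks_per_category):
    """Return the categories that have fewer than 3 good backlinks, sorted by name."""
    return sorted(
        name
        for name, backlinks in backlinks_per_category.items()
        if backlinks < MIN_BACKLINKS_PER_CATEGORY
    )


if __name__ == "__main__":
    # Sample counts for illustration only.
    print(tags_to_deindex({"seo": 12, "misc": 2, "news": 5}))
    print(categories_to_deindex({"guides": 3, "updates": 0}))
```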

Conclusion

There is no easy or direct answer. To find the right one for your specific situation, look into the traffic for your category and tag pages, whether they have any authority, and how many blog posts your site has. By answering these questions, you should be able to find the best answer for your individual situation. If you are looking for a more in-depth analysis of your site authority, get an instant report or contact us today.

November 26, 2012

Written by Sonja Stein

Sonja Stein is an SEO Marketing Specialist at Optimum7 in charge of research, analysis, and implementation of SEO strategies. Her main responsibilities include extensive keyword research, niche term research, backlink analysis, and competitive analysis for both local and full-blown SEO clients. Sonja graduated magna cum laude from Northeastern University with a Bachelor of Arts Dual Degree in Economics and International Affairs. While completing her 5-year bachelor’s degree program, she completed 3 challenging 6-month internship programs. In her most recent co-op experience, Sonja was hired as a Marketing Analyst in Miami and has been hooked on online marketing ever since. Since moving from Boston to Miami to begin her career at Optimum7, she has focused her attention outside of work on enjoying all of the outdoor activities that Miami has to offer. She also enjoys cooking and photography.


2 Comments

  1. Posted January 10, 2013 at 11:54 am

    Although there’s some good advice here, you shouldn’t be recommending use of robots.txt to stop Google indexing pages:
    a) This will stop the page being crawled, but Google can still index pages it hasn’t crawled, if there are links to the page indicating its content.
    b) It can stop link juice flowing around the site, harming your rankings and authority.

    Instead, the noindex robots meta tags should be used, which avoids both these issues.

  2. Posted February 9, 2013 at 2:08 am

    Thanks for this really informative post. I’ve been debating whether or not to index my tag pages, partly because I normally create tags based around the title (and not the h2 tags, which I let Google generate for me). I’ve been doing this for about 2 years, but I’m probably not as experienced or knowledgeable as you in this field. I’ve been at dmoz.org quite a few times, and have a decent idea of some of the in-depth ideas that float about.

    It’s known among most that simply writing awesome content with a good, strong title keyword is the best way to go. H2 tags can be leveraged as well, but only if the writer knows what they’re doing… OR just doesn’t care enough to worry about it and treats it more as a dessert. But after reading this, I think I’m just going to index the posts only and leave the tags out of the sitemap. I’m just going to leave this part up to Google in the long run, and let them do what they want. The main thing is for them to immediately index each of the posts.

    This includes Bing as well.
