Making Sense of Duplicate Content and Page Titles in WordPress (WordPress Setup Part 2)

So you’ve read WordPress Setup Part 1 and set up WordPress so it has nice, pretty, descriptive URLs. Now you’re done, right? Well, not exactly. WordPress default installs are great for crawlability: because there are links all over the place, the search engines can always find a path to any article. On the bad side, they can often find six or ten paths to the text of any article. Once upon a time (okay, before WordPress 2.3), you had to worry about actual posts having multiple URLs, but that issue has pretty much disappeared. There is typically only one URL for the post itself, but this doesn’t mean you can’t end up with duplicate content and wasted link juice.

So when viewed from the point of view of the post, there is no duplicate content. From the point of view of the text on those pages, though, there is: that text can appear at many addresses, even though there is only one that you want to come up in Google’s search results for that material. Because of the way WordPress lists the most recent posts on the front page, in the category pages, in the archives pages and so forth, the text, or at least the text above the <!--more--> comment, shows on every one of those pages (the <!--more--> comment defines how much of the post text ends up on those pages).

This means that you effectively have duplicate content, that is, identical content that appears on multiple URLs. In a bad case, this will get some semi-random URL listed in the search engine instead of the one canonical (that is “authoritative, recognized, accepted”) URL that you want the search engines to use to get to that specific page on your site. It might also list both your preferred canonical URL and one or more of the others. That sounds good, because you could just take over the Google listings with your ten different URLs for your page of elephant jokes, but the problem is that it will split the power of those pages (call this PageRank if you want). This might be even worse than listing the wrong page, because rather than one page in the top 10 in Google, you’ll have a page back at number 50 and another back at number 75 and so on. Nobody reads those pages. Why does this happen? Because you’ve ended up dividing up your inbound links and confusing the search engine robot. It’s just a robot: don’t make it think too hard!

For example, let’s say you just wrote a post on The Big Bad List of Elephant Jokes and you assign it a post slug of “elephant-jokes” and you put it in the categories “elephants” and “jokes” and you tag it as “humor”. You write it in June of 2020. This means that Goohoo! finds it at

  • (b/c it shows up on the home page as the most recent post)
  • (b/c it’s the most recent post in that category)
  • (ditto)
  • (ditto)
  • (because it’s at the top of your June 2020 archives)
  • (because this is the actual URL).
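
To make the list concrete, here is a minimal Python sketch that enumerates the addresses where the same excerpt would probably show up. The domain example.com, the /category/ and /tag/ bases and the post-name permalink structure are all assumptions for illustration; your own setup from Part 1 may well differ.

```python
def duplicate_urls(base, slug, categories, tags, year, month):
    """List every address where the post's excerpt is likely to appear.

    Assumes pretty permalinks with 'category' and 'tag' bases; these are
    illustrative defaults, not taken from any real site.
    """
    base = base.rstrip("/")
    urls = [f"{base}/"]                                     # home page
    urls += [f"{base}/category/{c}/" for c in categories]   # category archives
    urls += [f"{base}/tag/{t}/" for t in tags]              # tag archives
    urls.append(f"{base}/{year:04d}/{month:02d}/")          # monthly archive
    urls.append(f"{base}/{slug}/")                          # the canonical post URL
    return urls


urls = duplicate_urls(
    "http://example.com", "elephant-jokes",
    categories=["elephants", "jokes"], tags=["humor"],
    year=2020, month=6,
)
for url in urls:
    print(url)
```

Only the last address in the list is the one you want in the search results; the others merely repeat the excerpt.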

You don’t really want to do this. You want one canonical URL that reaches any given chunk of content. It’s better for you, your visitors and the search engines. So basically, you want to only index the “real”, that is canonical, URL.

Sorting the Canonical URL and Duplicate Content Issues

How do you do that? You could disallow the search engines from your archive and category pages using a robots.txt file. This will work, but the problem is that if you don’t get crawled before a post gets pushed off your home page, you might never get that post indexed (unless you generate a sitemap perhaps).
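
If you want to see what such a robots.txt would actually block, Python's standard urllib.robotparser module can check it for you. This is only a sketch: the rules, the example.com domain and the URL layout below are assumptions, not a recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks category pages and the 2020 date archives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /2020/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url, agent="Googlebot"):
    """Return True if the rules above let the given crawler fetch the URL."""
    return parser.can_fetch(agent, url)


for url in (
    "http://example.com/category/jokes/",
    "http://example.com/2020/06/",
    "http://example.com/elephant-jokes/",
):
    print(url, "crawlable" if crawlable(url) else "blocked")
```

Note the drawback described above: a blocked archive page is never fetched, so the crawler cannot use the links on it to discover posts that have already dropped off the home page.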

So what do you do? Simple, you install the incredible Headspace2 plugin. I used to use and recommend a hacked combination of the SEO Title Tag plugin and the All-in-one SEO Pack. That’s a powerful combo too, but not as powerful as Headspace2 and they need a minor hack (actually just a manual database change) to work together. I don’t say Headspace2 is incredible lightly, but this is just a great idea that is well-executed.

I got a fatal error when I installed H2, version 3.3.16, but that’s because the headspace/plugins.php file needed to be executable by “owner” and I had the wrong file permissions on it. You can change that simply from your FTP client (try FileZilla if you don’t have an FTP client). If you’ve been using AIOSP, by the way, you can import all your data via the Headspace2 options.
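
If you have shell access rather than an FTP client, the same permission fix can be scripted. The following is a rough Python sketch using only the standard library; the helper name make_owner_accessible is made up for this example, and you would pass it the path to the plugin file on your own server.

```python
import os
import stat


def make_owner_accessible(path):
    """Add read, write and execute permission for the file's owner.

    Permissions for group and others are left as they were.
    Returns the resulting permission bits.
    """
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IRWXU)
    return stat.S_IMODE(os.stat(path).st_mode)


# Usage (the path is a placeholder, substitute your own):
# print(oct(make_owner_accessible("path/to/the/plugin/file.php")))
```

This is the equivalent of ticking the owner read, write and execute boxes in an FTP client's permissions dialog.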

Once you install this plugin (installs like any WP plugin; instructions in the readme file that comes with the download), you need to go in and enable some modules. This is a complex and powerful plugin and not all of it is enabled by default.

  • From your WordPress admin area, go to Options » Headspace2 » Modules
  • Look over at the “Disabled” list. Drag and drop any of these modules into the “Simple” section. I have the following activated currently:
    • No Index/No Follow: essential for sorting the duplicate content issue.
    • Page Title: essential for the second part of this how-to.
    • Page Description: lets you create a custom meta description, which we’ll get to in a second.
    • More Text: instead of a generic “Read more” for a continued article, you can customize the text so it’s something like “Read more about sorting out duplicate content…”
    • Tags: lets you tag your pages and puts these tags in your meta keywords.
  • Now that you have the modules enabled, you’ll be able to control the indexing of all your pages. At edit or creation time, you can keep a single page out of the search indexes, which is useful for things like Contact pages. More importantly, though, we’ll get rid of all those category and archive pages and make them more or less invisible to the search engines.
    • Go back to the Headspace2 “Page Settings”. You should see a list that includes:
      • Archives
      • Categories
      • Search Pages
      • Tag Pages
    • For each of those listed above (not all the ones listed by Headspace2), click on it and, at the bottom of the options, you can see two check boxes. Check the No Index box, but not the No Follow box. Save. This tells the search engine (Google, Yahoo, etc) that it shouldn’t even bother to keep a record of the content of that page, but that it should follow those links on through to the actual pages you want indexed. If you check the No Follow box, you would prevent the search engine from even finding those pages that you really want indexed.
    • Note that you can also edit the page title and other information for those pages. We won’t bother right now, but it’s something to keep in mind in case you want to customize any of this.
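
Under the hood, the check boxes above boil down to a robots meta tag in the head of each page. Here is a small Python sketch of the logic; the page type names and the exact tag formatting are assumptions for illustration, not what the plugin literally emits.

```python
# Page types the steps above keep out of the index (names are hypothetical).
NOINDEX_PAGE_TYPES = {"archive", "category", "search", "tag"}


def robots_meta(page_type):
    """Return a robots meta tag for the given page type."""
    if page_type in NOINDEX_PAGE_TYPES:
        # Don't index the listing itself, but do follow its links to the posts.
        content = "noindex,follow"
    else:
        content = "index,follow"
    return f'<meta name="robots" content="{content}" />'


print(robots_meta("category"))
print(robots_meta("post"))
```

The important part is the combination: noindex keeps the listing page out of the results, while follow still lets the crawler walk through to the posts you do want indexed.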

Sorting out Meta Titles

H2 has another great utility: it lets you set unique meta titles (that’s the one that appears in the upper browser title bar, not the one the reader sees on the page) that are different from your H1 heading title. You can also craft meta descriptions and meta keywords and, in fact, any meta information. It will add additional text entry boxes that let you set your keywords, description and title on the post edit/creation screen.

The meta title is really key and the only one that really really really matters. This is what appears in the big bold text in the search results. This is the first thing about your page that most people will see. You want to make it count and you don’t want to simply duplicate what you have for the post heading. Above all, under no circumstances should the average blogger have a site where the meta title looks like this: My Site Name | Name of My Post. Nobody cares about the name of your stupid site and it’s also not descriptive in the least if you have a name like mine. It makes your titles look less unique and harder to tell apart if your visitor has several pages of your site open in different browser tabs or windows.

Why would you want your meta title to be different from your post title? Well, Google’s top search quality engineer, Matt Cutts, pointed out in his WordPress SEO video that varying these two gives you two chances to match terms. You can use subtly different wording, looking to use alternate spelling (changes and changing in Matt’s example) or related terms (photos and pictures and images for example).

This is actually not why I do it, though. The meta title appears in the search results, so it needs to give the user some information scent. There’s only so much room to be clever. However, in your RSS feed or on the page, where you’ve already got the users there, you might want to just give them something funny or clever that perhaps does not make the general idea of the article immediately obvious. In many cases, such as a how-to article like this, my two titles might be similar. But when I write some humor or political commentary, I might want to have an H1 heading that is engaging, but not necessarily descriptive in the same way the meta title is.

Meta Title
Longer, more descriptive title that should say: “I answer your question. I am the page that you’re looking for. Come look at me.”
H1 Heading Title
Might be even longer (on this page I’ve added the “WordPress Setup Part 2”) or very short. It might be pithy, ironic or a mystery whose real meaning is only revealed as the reader goes down the page. The user is on the page already and has a view of the text that follows. The H1 text should say “Read on! I’m funny. I’m interesting. I’m good for a laugh or a solution.” It’s not necessarily a summary.
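
If you want to check how the two titles differ on a page you already have, a few lines of Python with the standard html.parser module will pull them out. This is just a sketch: the sample markup is invented, and the meta title in it is a made-up example rather than the one this page really uses.

```python
from html.parser import HTMLParser


class TitleAndHeading(HTMLParser):
    """Collect the text of the first <title> and the first <h1> in a page."""

    def __init__(self):
        super().__init__()
        self.found = {}        # tag name -> collected text
        self._current = None   # tag we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1") and tag not in self.found:
            self._current = tag
            self.found[tag] = ""

    def handle_data(self, data):
        if self._current:
            self.found[self._current] += data

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None


# Invented sample page: descriptive meta title, longer on-page heading.
PAGE = """<html><head>
<title>Fixing Duplicate Content and Page Titles in WordPress</title>
</head><body>
<h1>Making Sense of Duplicate Content and Page Titles in WordPress (WordPress Setup Part 2)</h1>
</body></html>"""

extractor = TitleAndHeading()
extractor.feed(PAGE)
print("Meta title:", extractor.found["title"])
print("H1 heading:", extractor.found["h1"])
```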

What if I already have pages without unique titles?

So now if you’ve never written a post and you don’t want to set titles for categories, you’re all set, but what if you are trying to fix up an old site, or you want to attach titles to category pages? Simple. Just leave the Options panel and head on over to Manage » Meta-data. You’ll see that H2 gives you a list with the Post Title (what appears on the page) fixed and the Page Title (what appears in the browser bar) editable. Now, look at the upper right corner of the screen. Headspace lets you mass edit almost everything—page title, post-slug, custom “more” text and everything. This is an amazing management tool.

Other Meta Tags

Meta Keywords

Who cares about these? The search engines don’t pay attention anymore, so it’s just a waste of bandwidth, right? Perhaps, but things change and you may someday find these useful for your own internal search algorithms or what have you. I do this for my benefit, not the search engines’. I write my title first, which keeps me on topic. I write keywords last, to see how I did. But of course you can ignore it. Since you’re using Headspace, you just generate your tags, which help your visitors find related posts and so forth, and these will become meta keywords, so why not? (If it’s not worth being a tag, I don’t bother to add extras.)

Meta Description

Search engines don’t use this either, right? Probably not for ranking (how high you are in the results), but they might use it for relevance (trying to figure out the actual content of your post, assuming the description matches the rest of the page). More importantly, they will use it for the snippet that appears in the search results in some cases. An example would be where the algorithm tells the engine that your page is about elephant jokes, but it doesn’t find those words on the page, so it can’t find a relevant snippet. What does it use? If you have no meta description, it might use nothing or it might just start grabbing your navigation text (I’ve had that happen on image pages). If you have the description, you control what appears in these cases instead of depending on SE magic.
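
For the curious, this is roughly what a description tag looks like when it is built by hand. The Python sketch below escapes the text and trims it at a word boundary; the 155-character limit is an assumed rule of thumb, not a documented cutoff, and the function name is made up for this example.

```python
from html import escape


def meta_description(text, limit=155):
    """Build a meta description tag, trimming long text at a word boundary.

    The default limit is an assumption about how much a search engine
    is likely to display, not an official figure.
    """
    text = " ".join(text.split())           # collapse newlines and extra spaces
    if len(text) > limit:
        text = text[:limit].rsplit(" ", 1)[0] + "..."
    return f'<meta name="description" content="{escape(text, quote=True)}" />'


print(meta_description('The "big bad" list of elephant jokes, updated for 2020.'))
```

Escaping matters because a stray quotation mark in the description would otherwise end the content attribute early.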


By using Headspace2, you save yourself tons of headaches, lots of theme-hacking, and make your site more usable for visitors and search engines alike. If done right, your duplicate content issues and duplicate title issues will be totally resolved.

13 Responses to “Making Sense of Duplicate Content and Page Titles in WordPress (WordPress Setup Part 2)”

  1. > I used to use and recommend a hacked combination of the SEO Title Tag plugin and the All-in-one SEO Pack

    Why did you need the combination and not only one of them?

  2. Hey uberdose! I’m honored that you stopped by. The All-in-one SEO Pack is one of the great WordPress plugins of all time and can save so many headaches. So I hope no offense is taken because I’m now using Headspace2! There’s a lot of overlapping functionality.

    Anyway, maybe I missed something with the AIOSP, but I didn’t think it had any functionality to batch edit the titles. So I used to add in SEO Title Tag for the batch operations. This is not really necessary if you install the AIOSP right from the get go, but for an older site where you might be changing lots of titles, it’s nice.

    I’m traveling at the moment, but the minor DB “hack” (not really a hack, just a setting change) is because AIOSP offers the ability to change the label for storing the title, but in the admin interface the field is read-only, so you have to go into the MySQL command line and change the option directly in the DB. For some WordPress users, that might actually be beyond them.

    As for AIOSP versus Headspace2, I have to say it’s a pretty close contest, but as far as I can tell H2 does everything that AIOSP does and more. Notably

    • Batch operations on everything – titles, descriptions, post slugs and more
    • one-click installation for Google Analytics, Mint, Statcounter. Personally, this is not enough to get me to choose one over the other, but it does save a little effort modifying the theme
    • Ability to turn plugins and javascript on and off for single pages (don’t ever see myself doing this, so I don’t actually care)
    • Modify the “more” text. I think this is sort of nice actually.
  3. Thanks for the write-up! I’m the author of HeadSpace and I’m curious about the installation problems you mentioned. There should be no need to change any permissions to get the plugin working. It’s possible that something to do with the host environment, or even the transfer method, may have caused the issue. If there’s anything you can identify then let me know and I’ll try and modify the plugin so it doesn’t happen.


  4. Hi Tom, no offense taken :) But if you’re missing something I want to know so I can improve. Thanks for your points, they are very likely to end up in a release.

  5. Sorry for the delayed response. I’m traveling right now and don’t have regular net access.

    The installation problem I had was a permissions problem that caused a fatal error. For some reason, when I unzipped the Headspace2 archive and FTPed it to the site, everything had read-only perms (i.e. no write and no execute, even for “owner”). That caused, if I recall, a “failed to open stream” fatal PHP error. I’m not sure why the permissions were wrong, but of course that can happen when moving something from Windows to *nix because the permission systems are incompatible (and I don’t even remotely understand the Windows permissions system; I’m not even sure it’s a system!). I don’t know that there’s anything you can do about it, but it’s not something that usually happens. When I changed permissions, it went fine.
    As for any other installation problems with Headspace2, I only know what I saw in your comments.

    >>Uberdose. I’m not exactly a high traffic site and don’t expect to ever become one, but I’m happy to have you announce any releases in the comments.

    I guess as this post gets outdated I’ll try to update it for future versions of both, since I think these are just killer plugins and I’m sorry more people don’t know about them. Forget about SEO – they make it more pleasant to visit sites.

  6. Russell

    Hi Tom,

    As far as I can tell, the latest HeadSpace2 has no “No-Index” option, only “No Follow” for links.

    Are you still using HeadSpace2 and the “No-Index” option? I wonder why it has been removed from HeadSpace2, and if you think it still necessary, what’s an alternative plugin?

    Many thanks.

  7. Hey Russell,

    You have to enable the Meta Robots module under the Page Modules. Then when you go back to the Page Settings screen for the Headspace2 settings, it will let you set defaults for various types of pages and when you edit/create a page/post it will let you set various robots meta tags.

    As for whether it’s necessary or not, it depends. Do a search and see what comes up. Sometimes some of your boilerplate pages, like the contact page, can come up pretty high because it’s linked from every other page. If you don’t want that, you can take it out.

    People debate back and forth on keeping category pages and archive pages out of the index. The key thing is that if certain types of junk content are clouding up your search results, this is one way to get them out of the index. In general, though, I use it only for things that I never want in the index (I recently put up a demo of a site that will eventually live on another domain, so I blocked crawling entirely at the robots.txt level).

  8. Russell

    Ah, Meta Robots! That did it, thanks a lot Tom. Can now follow your great advice. Cheers.

  9. Well thanks for the question, which will hopefully make this information better for future readers!

  10. I just wanted to let everyone know, as I had trouble with it, that editing meta titles en masse is now done by clicking “Tools” from the sidebar and then “Meta-data.” It took me a while to find it and hopefully I can spare some other people the same waste of time.
