Category in Permalinks Considered Harmful

Until recently, I didn’t realize how few people know about this. Since it seems to be little known, I thought I’d write a post on the topic.

In WordPress, you should never start the custom permalink string with any of these: %postname%, %category%, %tag%, or %author%. (Unless you know what you’re doing, of course. :) )

Meaning that “%category%/%postname%” is a bad custom permalink string. So is just “%postname%”, for that matter.

Why? Well, it has to do with how the WordPress Rewrite system works.

Rewriting Explained

See, when you request a URL from a WordPress site, WordPress gets the URL and then has to parse it to determine what it is that you’re actually asking for.

It does this by using a series of rules that are built whenever you add new content to WordPress. Generally the list of rules is pretty small, but there are specific cases that can cause it to balloon way out of control.

Normal Rules

Let’s say you’re using a normal permalink string, like my preferred “%year%/%postname%”. The rules that are generated will look like this:

/robots.txt (for the privacy settings)
/feed/* (for normal feeds of any kind)
/comments/* (for comments feeds)
/search/* (pretty url for searches, not often used)
/category/%category% (category archives)
/tag/%tag% (tag archives)
/author/%author% (author archives)
/%year%/%month%/%day% (with each of those after year being optional)
/%year%/%postname% (this is the permalink string you define)
/%pagename% (any Page)

The way the system works is that WordPress compares the URL to each of those rules in turn, from top to bottom. When one of them fits, WordPress knows what to display and how to display it.

Note that the order I listed those in is significant. Each one, from the top down, is less specific than the previous one. For example, “robots.txt” matches only that, while “/feed/*” matches anything starting with /feed/. And so on down the list. The %postname%, %category%, %tag%, %author%, and %pagename% tags will match any string, but the other WordPress % tags will only match numeric fields. For example, %year% is always a number.

Notice that the last one is %pagename%. This basically matches everything, because %pagename% can be anything at all. Even hierarchical pages like /plugins/whatever/something will cause this to match. It’s the fall-through position. And then, if that page doesn’t actually exist on your site, then this causes the query to trigger the 404 condition internally, which causes your theme’s 404.php to load up.
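To make the top-to-bottom matching concrete, here is a minimal sketch in Python. The patterns are simplified stand-ins that I made up to mirror the list above; they are not the regular expressions WordPress actually generates, and the `resolve` function is a hypothetical name.

```python
import re

# Simplified stand-ins for the rule list above, most specific first.
# These patterns are illustrative assumptions, not WordPress's real rules.
RULES = [
    (r"^robots\.txt$", "robots"),
    (r"^feed/.*$", "feed"),
    (r"^comments/.*$", "comments feed"),
    (r"^search/.*$", "search"),
    (r"^category/.+$", "category archive"),
    (r"^tag/.+$", "tag archive"),
    (r"^author/.+$", "author archive"),
    # %year%/%month%/%day%, with month and day optional (numbers only)
    (r"^[0-9]{4}(/[0-9]{1,2}(/[0-9]{1,2})?)?/?$", "date archive"),
    # %year%/%postname%: a number, then any single path segment
    (r"^[0-9]{4}/[^/]+/?$", "post"),
    # %pagename%: the fall-through rule, matches anything at all
    (r"^.+$", "page"),
]

def resolve(path):
    """Return the kind of the first rule that matches, or None."""
    path = path.lstrip("/")
    for pattern, kind in RULES:
        if re.match(pattern, path):
            return kind
    return None

print(resolve("/robots.txt"))                  # robots
print(resolve("/2010/hello-world"))            # post
print(resolve("/2010/05"))                     # date archive
print(resolve("/plugins/whatever/something"))  # page
```

Because the first match wins, the order of the list is what keeps the catch-all page rule from swallowing everything else.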

Pretty simple and straightforward, really.

Problem Rules

The problem comes in when you try to use a non-number for the beginning of your permalink string. Let’s examine those last two rules more closely:

/%year%/%postname% (this is the permalink string you define)
/%pagename% (any Page)

What if you used “%category%/%postname%” for your custom permalink string? Now those last two rules are these:

/%category%/%postname% (this is the permalink string you define)
/%pagename% (any Page)

That violates our main rule, doesn’t it? That each one should be less specific than the one above it? Because %category% can match any string too, just like the %pagename% can… With this set of rules, there’s no way to view any of the Pages. Not good.
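A small sketch of the collision, under the same kind of simplifying assumptions: the two patterns below are invented stand-ins for the last two rules, and `resolve` is a hypothetical helper. In this toy model the collision shows up for a two-level Page, which the post rule claims before the page rule ever gets a look.

```python
import re

# Assumed, simplified patterns; WordPress's real rules are more involved.
POST_RULE = (r"^[^/]+/[^/]+/?$", "post")         # /%category%/%postname%
YEAR_POST_RULE = (r"^[0-9]{4}/[^/]+/?$", "post")  # /%year%/%postname%
PAGE_RULE = (r"^.+$", "page")                    # /%pagename%

def resolve(path, rules):
    """Return the kind of the first matching rule, or None."""
    path = path.strip("/")
    for pattern, kind in rules:
        if re.match(pattern, path):
            return kind
    return None

# With %category%/%postname%, a hierarchical Page is mistaken for a post.
print(resolve("/plugins/whatever", [POST_RULE, PAGE_RULE]))       # post

# With %year%/%postname%, the same URL falls through to the page rule.
print(resolve("/plugins/whatever", [YEAR_POST_RULE, PAGE_RULE]))  # page
```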

So, WordPress detects this condition and works around it. Internally, this sets a flag called “use_verbose_page_rules”, and that triggers the rewrite rebuild to make this set of rules instead:

/robots.txt (for the privacy settings)
/AAA
/BBB
/CCC (one of these for each of your Pages)
/feed/* (for normal feeds of any kind)
/comments/* (for comments feeds)
/search/* (pretty url for searches, not often used)
/category/%category% (category archives)
/tag/%tag% (tag archives)
/author/%author% (author archives)
/%year%/%month%/%day% (with each of those after year being optional)
/%category%/%postname% (this is the permalink string you define)

Now we have basically the same set of rules, except for those new ones at the top. Every Page now gets its own very specific rule, and this satisfies our main condition once again.
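Here is a rough sketch of what that rebuilt list amounts to. It assumes one exact-match rule per Page, placed ahead of the generic rules; the function names and patterns are hypothetical, and the generic list is shortened for brevity.

```python
import re

def build_rules(page_paths):
    """Build a verbose rule list: one exact rule per Page, then generic rules."""
    rules = [(r"^robots\.txt$", "robots")]
    # One very specific rule for every Page on the site.
    for page in page_paths:
        rules.append((r"^" + re.escape(page.strip("/")) + r"/?$", "page"))
    # Generic rules follow; the permalink rule comes last.
    rules += [
        (r"^feed/.*$", "feed"),
        (r"^category/.+$", "category archive"),
        (r"^[^/]+/[^/]+/?$", "post"),  # /%category%/%postname%
    ]
    return rules

def resolve(path, rules):
    """Return the kind of the first matching rule, or None."""
    path = path.strip("/")
    for pattern, kind in rules:
        if re.match(pattern, path):
            return kind
    return None

rules = build_rules(["about", "plugins/whatever"])
print(resolve("/plugins/whatever", rules))  # page
print(resolve("/news/hello-world", rules))  # post
```

The Page is found again, but only because the list now grows by one rule for every Page.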

Pages

But what if you have a lot of Pages? I once read a post by a person who had over 50,000 Pages on his site. That is a special case obviously, but consider our lookup system. We’re going through these rules one at a time. With our first method, our rule list was only 10 rules, maximum. With this new method, you add a rule for every single Page you make. Going through 50,000 rules takes a lot longer than going through 10. And even just building that list of rules can take a long time.

Basically you’ve created a performance issue. Your Pages now won’t scale to unlimited numbers. Your site’s speed is linearly dependent on the number of Pages you have.
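The linear growth is easy to see in a toy model. The sketch below counts how many rules get checked before a post URL finds its match, assuming one rule per Page in front of a short generic list. The page names, the three generic patterns, and `checks_needed` are all made up for illustration.

```python
import re

# A shortened, assumed generic rule list; the permalink rule is last.
GENERIC = [r"^feed/.*$", r"^category/.+$", r"^[^/]+/[^/]+/?$"]

def checks_needed(path, num_pages):
    """Count rule checks until a match, with num_pages per-Page rules first."""
    path = path.strip("/")
    checks = 0
    # Per-Page rules are exact matches, so plain comparison models them.
    for i in range(num_pages):
        checks += 1
        if path == f"page-{i}":
            return checks
    for pattern in GENERIC:
        checks += 1
        if re.match(pattern, path):
            return checks
    return checks

print(checks_needed("/news/hello-world", 0))      # 3
print(checks_needed("/news/hello-world", 10))     # 13
print(checks_needed("/news/hello-world", 50000))  # 50003
```

Every post request pays for every Page rule before it reaches the rule that actually matches it.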

This is a bad thing.

Conclusion

Firstly, it’s really not any better for SEO to have the category in there, or to have just the postname there by itself. And anybody who tells you differently is wrong. If you disagree with me, then no, I’m not interested in arguing this point with you; you’re just wrong, period, end of discussion.

Secondly, shorter links are great and all, but hell, why not use a real shortlink? WordPress 3.0 now has a Shortlink API that defaults to using ?p=number links on your own domain. These will actually work for any WordPress site, even ye back unto WordPress 2.5. WordPress 3.0 just makes it nicer and easier to use these with the Shortlink API (as well as allowing plugins to make this automagically use services like wp.me or bit.ly). So use that instead.

The conclusion is, in general, just don’t do it. Leave a number, or something static, at the beginning of your permalink string and you’ll never have any sort of problems. But if you really MUST do this sort of thing, then keep your number of Pages low. Don’t try it when you have more than, say, 30-50 Pages.
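If you want to check a permalink string against this rule of thumb, a tiny helper is enough. This sketch only encodes the rule as stated in this post (does the string start with one of the four tags?). It is not WordPress’s own detection logic, so verify the behavior on your version; the function name is hypothetical.

```python
# The four tags this post warns against using at the start of the string.
RISKY_TAGS = ("%postname%", "%category%", "%tag%", "%author%")

def starts_with_risky_tag(structure):
    """True if the permalink string begins with one of the four tags."""
    return structure.lstrip("/").startswith(RISKY_TAGS)

print(starts_with_risky_tag("%category%/%postname%"))  # True
print(starts_with_risky_tag("/%postname%/"))           # True
print(starts_with_risky_tag("%year%/%postname%"))      # False
```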

Addendum

Okay, so I actually simplified things for this post. It’s actually worse than this, as verbose page rules can add much more than one rule per page, as this post demonstrates (he gets 11 per extra page!).

65 Comments

  1. Ryan says:

    if you have a website with the keyword “seo” in the domain then a post written on effective article titles under the category “article marketing”, how can using the %category%/%post%/ not be of benefit to seo? I completely disagree with you on that one. Keywords in the url are taken into consideration by Google…

    • Otto says:

      Google cares about content, not keywords. And the URL is just an identifier to them. You’d better focus spending your time on making your content semantically structured properly, with things like H1 and H2 and H3 and so forth. Google DOES assign more weight to headers and such and those will give you far more SEO benefit than screwing with your URL structure ever will.

      Content matters. Context does not.

      • BobF says:

        Check out the talk given by Google’s Matt Cutts at WordCamp 2009 (a video of the talk is available at Matt’s blog). Matt was very clear: Google looks at the words in the URLs of your posts and uses them as factor in determining relevance, and therefore he advises using %postname% in your WordPress permalinks.

        • Otto says:

          I’ve seen that and traded emails with Matt Cutts on this topic. He doesn’t advise using ONLY %postname%.

          There’s nothing wrong with using postname. Just use something else at the beginning of the string. Look at this site, I use %year%/%postname%. Works fine.

  2. Mac_Boy says:

    Hey Otto, I understand your advice. Thanks for explaining a potential problem.

    For those of us who use WordPress as a CMS for web sites and for catalogs, your numeric-first approach is awkward to do.

    Would a plugin (action hook) be possible to catch all that filtering? The plugin could avoid all the rules with its own set of rules.

    Just wondering how we could scale to a catalog that would have more than a 1000-records of custom post types. (Such as a Movie catalog).

    Thanks! :-)

    • Otto says:

      It’s not necessarily “numeric-first”. I just prefer it that way. Really, it’s “anything but those 4 items” first. A custom permalink of “post/%postname%” would be perfectly acceptable, as it fits the “less specific” rule as well.

  3. FurfurRising says:

    woops I’ve always used just the url followed by the postname, guess I’ll have to revise that.
    Nice post Otto.
    Jay

  4. [...] permalink structures, and without me trying to babble my way through, i’d rather refer you here to Otto’s post regarding permalinks and the rewrite rules, which certainly has taken my [...]

  5. Chris says:

    Hey Otto,

    Now you said don’t use the %category%/%postname% unless “you know what you’re doing”.

    If you look at this site, for example, http://net.tutsplus.com/, to me it looks like they use the above structure to display their posts. This site is massive, so how are they getting away with the performance issues?

    Also, I have seen on large sites that the page doesn’t seem to fully load until you scroll down. Is this for performance?

    Cheers
    Chris

    • Otto says:

      They’re probably not using many (or any) Pages, so they don’t really have to worry about it too much.

      It’s really the large numbers of Pages that cause the issue. If you don’t have any Pages, then this doesn’t matter.

      • oh so its Pages, if your site is built of mainly posts then could /%category%/%postname%/ be used?

        Could you force pages to use a different permalink to the posts?

        • Otto says:

          No, it’s sort of a combo deal… Having large numbers of Pages AND using one of those things at the start of the permalink string is the cause of the issue. Don’t do both of those together and you have no real problem.

          • cool thanks for clearing that up :)

            Cant wait to test out the Twitter Connect i havent had chance to look at the SFC or twitter one in a while :)

            I have a new project on that is aimed at Students so want the best use of Social Media tools

            Do you know if in WP3 URL’s are different, I have read about use of sub domains and built in WPMU – im hanging on for WP3 before i launch my next site

          • Otto says:

            The Multi-Site stuff in 3.0 does behave differently, but the same basic issue with verbose page rules still exists.

  6. Will says:

    I never knew this, so thanks for writing about it. The trouble I am having is that if I update the permalink structure on my site with 758 posts and 17 pages, the posts can’t be found. You get the “Firefox has detected that the server is redirecting the request for this address in a way that will never complete” message. I also tried using Dean’s permalink migration plugin and that did not help. So I set it back to /%category%/%postname%/ and all is fixed for now. Why does it not let me change the permalink to one with the recommended numerical start? Thanks!

  7. Grant Palin says:

    This is an interesting subject. I’m doing some work on updating my WP-driven site, and had contemplated adjusting the default permalinks to include ‘blog’ at the start, so that post URLs would appear to be within a blog section of the site. If the text at the start of the custom permalink is static, does that avoid the performance issue that would occur if using %category%?

    At the same time, would that affect _page_ permalinks? I have a number of pages already, and may be adding a bit of a hierarchy, possibly making use of custom types. Would these be affected by the permalink settings, or are they independent?

  8. ScruffyDan says:

    Thanks for a good write up of the problem. This is the clearest I have seen.

    One quick question: if I add some static text to my permalink structure so that I end up with domain.com/archive/%postname%, will that solve the issue?

    The reason I ask is that when looking here:
    http://core.trac.wordpress.org/ticket/8958

    Denis-de-Bernardy comments that there is a separate bug that affects such permalinks, but doesn’t elaborate. Any idea what he was talking about?

  9. Bill Frank says:

    I like the idea of Permalinks for Posts to start with blog and then the title of the blog. But I would like to have my Page Permalinks with names based on the page names rather than just the default numbers. My site has about 50 pages now. Perhaps it will grow to 100 over time, but not much more. How would I do this?

    • Otto says:

      I don’t understand the question. Pages are always at the root level. Like example.com/pagename. This is fixed and unalterable.

      If you want your posts to start with blog, then set the permalink string to blog/%postname% or what have you.

  10. pkazmercyk says:

    I’m a complete WP novice, but I read the article and other people’s responses with great interest. I believe this was touched on, but want to be sure before I proceed with my first site. I was considering the following for permalinks:

    /%year%/%post_id%/%postname%/

    Because I plan on referencing posts on my site from printed materials, it would be easier for someone to type in sitename.com/2010/1234/ than a long, hyphenated URL, since I believe that even if you leave the postname off of the URL, it will take you to the same post as if you included the postname (sitename.com/2010/1234/this-is-the-name-of-the-post/). And when clicking from within the site, the postname would be in the URL (as above), helping SEO.

    I could also leave off %year% or include %month%, no?

    Is this logical, or am I making a newbie error with dire consequences?

  11. Brian says:

    Hi Otto – it’s Brian again. Another excellent article on WP nitty-gritty setup.

    It seems that almost everyone else is suggesting /%category%/%postname%/ for a Permalink structure, and I am glad that you have taken the time to explain why this is a bad idea.

    Would /%post_id%/%category%/%postname%/ be a better Permalink structure, for those people who insist on having /%category%/%postname%/ in their URL?

    This structure begins with a number, and does not seem to be unadvisable in the Codex: http://codex.wordpress.org/Using_Permalinks?PHPSESSID=b2c1824c59a06e7f2f2697e4b499a974

    Doesn’t the Post ID already tell WP exactly what content to find in the database and display to the user? Then the additional /%category%/%postname%/ at the end of the link would simply be for show, SEO, etc.

    Thanks again!

    • Chris says:

      This would be a good way of getting around it, but now with WordPress 3 and its custom post types, what happens?

      The way I have seen custom post types work is, for example, you have a Real Estate website and post a property to Rent; the structure would be http://www.example.com/rent/postname/ ?

      And that doesn’t have numbers.

      • Otto says:

        Custom post type URLs are like Pages and subject to the same issues as Pages are. The Custom Permalink string only applies to real posts, but if you use custom post types, and start your custom string with one of those four, you will get this same performance issue.

        Just don’t do it. It’s that simple.

      • Brian says:

        an even better idea, might be:
        /%MLS_id%/%category%/%postname%/

        Where %MLS_id% is the Multiple Listing Service id number of the property, and %category% is ‘for-rent’, ‘for-sale’, or ‘residential’, ‘commercial’, etc.

        Now, how would you go about defining a %MLS_id% structure tag in WordPress?

        • Otto says:

          add_rewrite_tag('%MLS_id%', '([0-9]+)') would do the trick, I think. (The tag name has to be wrapped in % signs, and the pattern needs a capturing group.) You’d have to add it using a function hooked to the init action. Your code could then retrieve the incoming variable with get_query_var('MLS_id'). This is a bit of advanced rewrite programming, of course.

    • Otto says:

      As long as you don’t start with one of those four things, it’s fine.

  12. Will says:

    The advice here is great for new blogs, but for long existing sites that used “%category%/%postname%” from the start it is not so easy. I have talked to a few other people besides myself that tried to change this and we all have the problem that all posts published previous to the permalink change can’t be found. Does anyone have a link to instructions to changing the permalink string on an existing site? Thanks.

  13. [...] in your WordPress permalinks is not a smart idea. In fact, it can do more harm than good. Otto explains in “Category in Permalinks Considered Harmful:”In WordPress, you should never start the custom permalink string with any of these [...]

  14. Randy says:

    Would there be any drawback/issue to using a permalink structure of %post_id%-%postname%? I see people often talking about using %post_id%/%postname%, but not-so-much the former. I think it would be beneficial in a couple ways.. 1) The link starts with a numerical value. 2) It points to the root level, and not inside a “folder” at the root level (like the latter arrangement — example.com/444/my-blog-post). So what’s the problem with example.com/444-my-blog-post?

    I’ve tested it out and it SEEMS to work, but I’m sorta cautious since I can’t find a lot of support for this structure. My thought was that it might get confused if you had numbers in the title of a post. But I tested this out too and it worked. I had something like “Hello World” >> example.com/1-hello-world. And I created a post with the title “1 Hello World”. It came out as example.com/2-1-hello-world.

    Is there a problem that could arise I’m just not thinking about?

  15. Mark says:

    Hi Otto

    For us relative newcomers to WP and just starting up our sites, would
    “%post_id%/%category%/%postname%” perhaps be the best approach in that it eliminates the internal WP problems you speak of, has some SEO value, and would be understood?

    Thanks Otto

    Mark

  16. Paul says:

    Hi Otto

    If u don’t mind, please consider looking at this for a bit.

    - I have around 300 Pages and Zero Post.
    - with only %postname% structure.
    - My sites is on a cheap shared host and I have WP super cache installed.

    After reading your article here, I wonder why my site still operates normally.

    because it doesn’t have any post in it ( just yet ) ?
    or it’s WP super cache ?

    and suppose, I decide to have posts as well in the future.
    Can I use /0/%postname% ?
    ( the number zero and then postname )

    or it must be more than one digit number ?

    Thanks

    • Otto says:

      Oh no, it will work fine, it’ll just be rather slow because of the work it has to do with that sort of structure. At some point, you’ll run into a wall where it just bogs down too much.

      • Paul says:

        Thank you very much for this article.

        I admit that after reading your article, I’m afraid my site will go down if I keep adding Pages (now almost 300). I will have to change the permalink structure at some point soon to avoid the performance issue mentioned above.

        But, in case, if WordPress come up with a new rewrite system ( I saw a few tickets about this in the trac ) that would be really really great because then a lot of people including me don’t have to change anything to the sites that they already have made.

  17. Robin Macrae says:

    Otto, what do you think of this technique as a way to avoid the performance issue and use the /%category%/%postname% structure?

    Don’t specify a category base. Use WP No Category Base. Add what you would have used as a category base to the permalink structure as a text string prefix.

    In my case, I want to use as the base the string topics so the structure would be topics/%category%/%postname%.html.

  18. Robin Macrae says:

    Top Level Categories (Filipe Fortes) is an alternative to WP No Category Base.

    Any chance of a reply, Otto?

    • Otto says:

      That doesn’t solve the problem. The performance issue will be there regardless, if you start the permalink string with one of the things I mention above.

      Note that for it to be a problem also requires you to have lots of Pages. If you don’t have those too, then it’s not really an issue.

  19. Robin Macrae says:

    I don’t understand why starting my permalink structure with topics (i.e., /topics/%category%/%postname%.html) with no Category base doesn’t implement your recommendation to leave a number, or something static, at the beginning of the permalink string.

    Why doesn’t that qualify as something static?

    What am I missing?

  20. Naweed says:

    I run a local business directory, my current permalink uses the postname. Now I’m worried about what to use because the structure of the post should really be “location”, “business category” “business name”.. now how do I get this to work. I’m also using the new wordpress multi-site and the annoying thing is “/blog/” keeps getting in the way and ruining my permalink structure causing the post to go in 404.php.

    I really need help with this.

  21. [...] Category in Permalinks Considered Harmful » Otto on WordPress. Bookmark, Share and Enjoy: [...]

  22. john says:

    Thank you very much otto for this very insightful article.

    You mentioned about starting the permalink of a blog post with a number. I have tried using mysite.com/%post_id%/%postname%/ and I think there is a bug. As the blog post can be viewed on mysite.com/%post_id%/%postname%/ as well as mysite.com/%post_id%/. It doesn’t redirect.

    Am I missing something or is there a work around on this?

    • Otto says:

      If you use post id, then that’s specific enough for it to be able to find the post, and so it will work with just the number.

      It should redirect due to the canonical URL handler, but perhaps you have a plugin or something else that disabled that.

      • john says:

        I tried this with a fresh install, without any plugins activated whatsoever and using the latest wordpress v3.01 + the 2010 theme.

        Using this permalink /%post_id%/%postname%/ gives your post duplicate content as it is viewable on the following two url’s:

        1.) mysite.com/1/hello-world/
        2.) mysite.com/1

        Try it on a fresh install yourself and see what you think. Only if it’s worth bothering you, that’s all.

        Cheers mate.

  23. Trent says:

    Otto I have a question about building a large site structure.

    The structure has 3 levels
    1) Buy / Rent
    2) State (%postname% Or Static)
    3) City (%postname% Or Static)

    What I was planning to do would be to create pages within pages for every level using custom permalinks like Rent/Arizona/Phoenix. I am trying to understand what would be the best way to set up the site structure using Categories, Pages and Posts. Is it beneficial to use them all together, such as:

    1) Category
    2) Page
    3) Post

    Or to keep them all as Pages and not use custom links at all and let tiny links control them?

    The site will have 1500+ Pages and I would like to do what will best perform for the site.

    Any help with this question will be greatly appreciated!

  24. Wham says:

    Do these same rules apply to taxonomies and post-type structures? I feel there is no documentation or follow-up or warning for taxonomy and post-type pages, and these will be used a whole lot more in the future and become a problem for some people (who are now starting to use WP as a CMS).

    Just wanted to know. Thx, Otto!

  25. [...] Category in Permalinks Considered Harmful [...]
