How the Postname Permalinks in WordPress 3.3 Work

December 22, 2011, 4:49 pm

So, I first wrote about this topic on the wp-hackers list back in January 2009, explaining some of the scaling issues involved with having ambiguous rewrite rules and loads of static Pages in WordPress. A year later the same topic came up again in the WPTavern Forums, and later I wrote a blog post about the issue in more detail. That post generated lots of questions and responses.

In August 2011, thanks to highly valuable input from Andy Skelton, which gave me a critical insight needed to make it work, and with Jon Cave and Mark Jaquith doing testing (read: breaking my patches over and over again), I was able to create a patch which fixed the problem. (Note: my final patch was slightly over-complicated; Jon Cave later patched it again to simplify some of the handling.) This patch is now in WordPress 3.3.

So I figured I’d write up a quick post explaining the patch, how it works, and the subsequent consequences of it.

Quick Summary of the Problem

The original underlying problem is that WordPress relies on a set of rules to determine what type of page you’re looking for, and it uses only the URL itself to do this. Basically, given /some/url/like/this, WordPress has to figure out a) what you’re trying to see and b) how to query for that data from the database. The only information it has to help it do this is called the “rewrite rules”, which is basically a big list of regular expressions that turn the “pretty” URL into variables used for the main WP_Query system.
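To make that concrete, here is a minimal sketch of the idea, not the actual core code. The two rules and the resolve_url() helper are made up for illustration; the real list is generated by WP_Rewrite and is much longer.

```php
<?php
// Hypothetical, simplified rule list: regex => (query variable => capture group).
// These rules are assumptions for illustration, not the rules WordPress generates.
$rules = array(
    '([0-9]{4})/([0-9]{1,2})/([^/]+)/?$' => array('year' => 1, 'monthnum' => 2, 'name' => 3),
    'category/(.+?)/?$'                  => array('category_name' => 1),
);

// Return the query variables for the first rule that matches, or null.
function resolve_url(array $rules, $path) {
    $path = trim($path, '/');
    foreach ($rules as $regex => $map) {
        if (preg_match('#^' . $regex . '#', $path, $m)) {
            $vars = array();
            foreach ($map as $var => $group) {
                $vars[$var] = $m[$group];
            }
            return $vars; // first match wins, so rule order matters
        }
    }
    return null;
}

print_r(resolve_url($rules, '/2011/12/hello-world/'));
```

The point to notice is that the first matching rule wins and nothing but the URL string is consulted, which is exactly why ambiguous rules are a problem.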

The user of the WordPress system has direct access to exactly one of these rewrite rules, which is the “Custom Structure” on the Settings->Permalink page. This custom string can be used to change what the “single post” URLs look like.

The problem is that certain custom structures will interfere with existing structures. If you make a custom structure that doesn’t start with something easily identifiable, like a number, then the default rewrite rules wouldn’t be able to cope with it.
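A small sketch of what "ambiguous" means here. The two post rules below are simplified stand-ins I wrote for this example (the regexes WordPress really generates are different), and is_ambiguous_for() is a hypothetical helper.

```php
<?php
// Assumed, simplified post rules for two custom structures.
$post_rules = array(
    '/%year%/%postname%/' => '#^([0-9]{4})/([^/]+)/?$#',
    '/%postname%/'        => '#^([^/]+)/?$#',
);

// True if the post rule for a structure would also match a Page URL.
function is_ambiguous_for($page_path, $post_rule) {
    return preg_match($post_rule, trim($page_path, '/')) === 1;
}

foreach ($post_rules as $structure => $rule) {
    $clash = is_ambiguous_for('/about/', $rule) ? 'clashes with' : 'is distinct from';
    echo $structure . ' ' . $clash . " the Page URL /about/\n";
}
```

With the year in front, a Page URL like /about/ can never be mistaken for a post. With a bare postname, the URL alone can't tell you which one it is.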

To work around this problem, WordPress detected it and used a flag called “verbose_page_rules”, which switched the list of rules to more verbose ones, turning the ambiguous rules into unambiguous ones. It did this by the simple method of making all Pages into static rules.

This works fine, but it doesn’t scale to large numbers of Pages. Once you have about 50-100 static Pages or so, and you’re using an ambiguous custom structure, then the system tends to fall apart. Most of the time, the ruleset grows too large to fit into a single MySQL query, meaning that the rules can no longer be properly saved in the database and must be rebuilt each time. The most obvious effect when this happens is that the number of queries on every page load rises from below 50 to 2000+, and the site slows down to snail speed.

The “Fix”

The solution to this problem is deeper than simple optimizations. Remember that I said “WordPress relies on a set of rules to determine what type of page you’re looking for, and it uses only the URL itself to do this”. Well, to fix the problem, we have to give WordPress more input than just the URL. Specifically, we make it able to find out what Pages exist in the database.

When you use an ambiguous custom structure, WordPress 3.3 still detects that, and it still sets the verbose_page_rules flag. However, the flag now doesn’t cause the Pages to be made unambiguous in the rules. Instead, it changes the way the rules work. Specifically, it causes the following to happen:

The Page rules now get put in front of the Post rules, and
The actual matching process can do database queries to determine if the Page exists.

So now what happens is that the Page matching rules are run first, and for an ambiguous case, they’ll indeed match the Page rule. However, for all Page matches, a call to the get_page_by_path function is made, to see if that Page actually exists. If the Page doesn’t exist in the database, then the rule gets skipped even though it matched, and then the Post’s custom structure rules take over and will match the URL.
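Roughly, the new matching order looks like the sketch below. This is an illustration under assumptions, not the core code: page_exists() is a stand-in for the get_page_by_path call (which really queries the database), the $existing_pages list is made up, and both regexes are simplified.

```php
<?php
// Hypothetical stand-in for the database: the Page paths that exist.
$existing_pages = array('about', 'about/team');

// Stand-in for get_page_by_path(); the real function runs a database query.
function page_exists($path, array $existing_pages) {
    return in_array($path, $existing_pages, true);
}

// Page rule first, then the post rule, as happens when the flag is set.
function resolve($path, array $existing_pages) {
    $path = trim($path, '/');
    // The Page rule matches almost anything, so confirm the Page really exists.
    if (preg_match('#^(.+?)/?$#', $path, $m) && page_exists($m[1], $existing_pages)) {
        return array('pagename' => $m[1]);
    }
    // Rule skipped: fall through to the post rule for /%postname%/.
    if (preg_match('#^([^/]+)/?$#', $path, $m)) {
        return array('name' => $m[1]);
    }
    return null;
}

print_r(resolve('/about/', $existing_pages));
print_r(resolve('/hello-world/', $existing_pages));
```

The first call resolves to a Page, the second falls through to the post rule because no Page called hello-world exists.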

The Insight

The first patch I made while at WordCamp Montreal used this same approach of calling get_page_by_path, but the problem with it was that get_page_by_path was a rather expensive function to call at the time, especially for long page URLs. It was still better than what existed already, so I submitted the patch anyway, but it was less than ideal.

When I was at WordCamp San Francisco in August, hanging around all these awesome core developers, Andy Skelton commented on it and suggested a different kind of query. His suggestion didn’t actually work out directly, but it did give me the final idea which I implemented in get_page_by_path. Basically, Andy suggested splitting the URL path up into components and then querying for each one. I realized that you could split the path up by components, query for all of them at once, and then do a loop through the URL components in reverse order to determine if the URL referred to a Page that existed in the database or not.

So basically, given a URL like /aaa/bbb/ccc/ddd, get_page_by_path now does this:

SELECT ID, post_name, post_parent FROM $wpdb->posts WHERE post_name IN ('aaa','bbb','ccc','ddd') AND (post_type = 'page' OR post_type = 'attachment')

The results of this are stored in an array of objects using the ID as the array keys (a clever trick Andrew Nacin pointed out to me at the time).

By then looping through that array only once with a foreach, and comparing to the reversed form of the URL (ddd, ccc, bbb, aaa) you can make an algorithm that basically works like this:

foreach(results as res) {
  if (res->post_name = 'ddd') {
    get the parent of res from the results array
     (if it's not in the array, then it can't be the parent of ddd, which is ccc and should be in the array)
    check to make sure parent is 'ccc',
    loop back to get the parent of ccc and repeat the process until you run out of parents
  }
}

This works because all the Pages in our /aaa/bbb/ccc/ddd hierarchy must be in the resulting array from that one query, if /aaa/bbb/ccc/ddd is a valid page. So you can quickly check, using that indexed ID key, to see if they are all there by working backwards. If they are all there, then you’ll eventually get to parent = zero (which is the root) and the post_name = ‘aaa’. If they’re not there, then the loop exits and you didn’t find the Page because it doesn’t actually exist.

So using this one query, you can check for the existence of a Page any number of levels deep fairly quickly and without lots of expensive database operations.
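Here is the backwards walk written out as runnable PHP. It is a sketch of the technique rather than a copy of what is in core: the $rows array stands in for the query results (keyed by ID, with made-up IDs and integer post_parent values), and find_page_id() is a name I picked for the example.

```php
<?php
// Stand-in for the query result: rows keyed by ID. IDs and slugs are made up.
$rows = array(
    1 => (object) array('ID' => 1, 'post_name' => 'aaa', 'post_parent' => 0),
    2 => (object) array('ID' => 2, 'post_name' => 'bbb', 'post_parent' => 1),
    3 => (object) array('ID' => 3, 'post_name' => 'ccc', 'post_parent' => 2),
    4 => (object) array('ID' => 4, 'post_name' => 'ddd', 'post_parent' => 3),
    5 => (object) array('ID' => 5, 'post_name' => 'ddd', 'post_parent' => 0),
);

// Return the ID of the Page at $path, or 0 if no such Page exists.
function find_page_id(array $rows, $path) {
    $rev  = array_reverse(explode('/', trim($path, '/')));
    $last = count($rev) - 1;
    foreach ($rows as $row) {
        if ($row->post_name !== $rev[0]) {
            continue; // not a candidate for the final path component
        }
        $p = $row;
        $i = 0;
        // Climb towards the root while each parent matches the next slug.
        while ($i < $last
            && isset($rows[$p->post_parent])
            && $rows[$p->post_parent]->post_name === $rev[$i + 1]) {
            $p = $rows[$p->post_parent];
            $i++;
        }
        // Valid only if every component was consumed and we ended at the root.
        if ($i === $last && $p->post_parent === 0) {
            return $row->ID;
        }
    }
    return 0;
}

var_dump(find_page_id($rows, '/aaa/bbb/ccc/ddd'));
```

Note the second 'ddd' row with no parent: it shares a slug with the nested Page, but the parent check is what tells /ddd apart from /aaa/bbb/ccc/ddd.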

Consequences

There are still some drawbacks though.

In theory, you could break this by making lots and lots of Pages, if you also made their hierarchy go hundreds of levels deep and thus made the loop operation take a long time. This seems unlikely to me, or at least way more unlikely than somebody making a mere couple hundred Pages. Also, WordPress won’t let you use the same Page name twice on the same level, so you’d really have to try for it to make this take too long.

If you try to make a URL longer than around 900K or so, the query will break. Pretty sure it’d break before that though, and anyway most people can’t remember URLs with the contents of a whole book in them. 😉

This also adds one SQL operation to every single Post page lookup. However, this is still better than having it break and try to run a few thousand queries every time in order to build rewrite rules which it can’t ultimately save. And the SQL being used is relatively fast, since post_name and post_type are both indexed fields.

Basically, for the very few and specific cases that had the problem, the speedup is dramatic and immediate. For the cases that use unambiguous rules, nothing has changed at all.

There are still some bits that need to be fixed. Some of the code is duplicated in a couple of places and that needs to be merged. The pagename rewrite rule is a bit of a hack to avoid clashing, but it works everywhere even if it does make the regexp purist groan with dismay (for critics of this, please know that I did indeed try to do this using a regexp comment to make the difference instead of the strange and silly expression, but it doesn’t work because the regexp needs to be in a PHP array key).

Anyway, there you have it. I wrote the patch, but at least 5 other core developers contributed ideas and put in the grunt work in testing the result. A lot of brain power from these guys went into what is such a small little thing, really. A bit obscure, but I figured some people might like to read about it. 🙂


65 Comments

Daniel says:

December 22, 2011 at 5:35 pm

So does this mean that I can get rid of the post id from the url and it will still run fast? Or is this for pages only?

- Otto says:
  
  December 22, 2011 at 8:16 pm
  
  Yes. To the first question.
  
Anatol Broder says:

December 22, 2011 at 5:46 pm

You should more frequently meet your colleagues. A real life talk can be a huge inspiration.

What is about the speed of %year%/%postname%? This rule still untouched in 3.3, right?

- Otto says:
  
  December 22, 2011 at 8:15 pm
  
  I just got back from that sort of a meetup, actually.
  
  And yes, the normal methods are unchanged.
  
Jacob says:

December 23, 2011 at 1:50 am

I love the “if (res->post_name = ‘ddd’)” part 😉

Nathan Briggs says:

December 23, 2011 at 8:14 am

Thank you, Otto & gang for all your hard work

Matthijs says:

December 23, 2011 at 12:38 pm

Thanks a lot for the hard work on this. I dealt with this issue back then. Was happy with all the help back then, even more happy now that it’s fixed.

And great you wrote this down. Gives everybody some insight in the problem, its history, the solution and last but not least the process and work involved in fixing such an issue.

Barrett Golding says:

December 24, 2011 at 9:23 am

Otto, 1st: thanks for this excellent work.

2nd, you mentioned “WordPress won’t let you use the same Page name twice on the same level”. i’ve often wondered what logic uses re: when to allow identical slugs and when to force-add a “-2” to the slug.

are you saying, this is allowed?:
/example.com/page-parent-name/example-slug-name
/example.com/cpt-name/cpt-child-pg-name/example-slug-name

but if on “same level”, it’ll result as?:
/example.com/page-parent-name/example-slug-name
/example.com/cpt-name/example-slug-name-2

and does that unique-slug-per-level rule apply only to pages, or would it apply to both posts & pages on same level, eg, with a permalink structure like %category%/%postname%? i should probably just read thru core to find these answers, but if time permits, thanks for any insight you can provide.

- Otto says:
  
  December 24, 2011 at 9:36 am
  
  The rules are varied.
  
  – Attachment slugs must be unique across all attachments.
  – Page slugs, or the slugs for any hierarchical post_type, must be unique within their particular tree.
  – Post slugs, or the slugs for any non-hierarchical post_type, must be unique among all posts.
  
  - Barrett Golding says:
    
    December 26, 2011 at 10:01 am
    
    thanks, Otto, and looks like the function that does this is wp_unique_post_slug() in /wp-includes/post.php.
    
    (also slug-related: wp_unique_term_slug() in /wp-includes/taxonomy.php)
    
Suzette says:

December 27, 2011 at 6:56 pm

I did have the problem that this upgrade fixes. I run over 200+ WordPress websites, and one of them has over 900 pages without a blog. I am so glad this is fixed, because it caused us a lot of grief until we figured out the problem with the postname permalink.

How the Postname Permalinks in WordPress 3.3 Work - Monday By Noon - Monday By Noon says:

December 29, 2011 at 7:20 am

[…] How the Postname Permalinks in WordPress 3.3 Work » Otto on WordPress. […]

Mark says:

December 30, 2011 at 11:35 am

Just so I fully grasp what you’re saying (sorry newbie here)…so if I create 2000+ pages on my site and keep hierarchy between 1 to 3 levels deep, I shouldn’t run into any major issues?

Based on your explanation, it seems like the only way it would break is if my hierarchy started getting out of control (hundreds of levels deep) and not necessarily the number of pages on my site.

Thanks for your explanation!

- Otto says:
  
  December 30, 2011 at 11:38 am
  
  Right. The bottleneck is no longer on the number of pages, but on the depth of the hierarchy, and on the number of pages sharing the same slugs but in different trees. It should scale to a lot more pages, basically. We tested the original patch with a test site containing 4000 pages and had no troubles with the patch applied.
  
  - Mark says:
    
    December 31, 2011 at 11:06 am
    
    Thanks for the reply. So if I have a hierarchy of pages that uses the same slugs in different trees, that could cause a problem?
    
    Example…
    example.com/blue/green/red
    example.com/purple/green/red
    example.com/yellow/green/red
    example.com/pink/green/red
    
    So in the example above, the last two levels (slugs) are the same across the different trees. If I’m understanding correctly, this could potentially cause an issue?
    
    Thanks for your help! 🙂
    
    - Otto says:
      
      December 31, 2011 at 11:11 am
      
      Not unless you had hundreds (or thousands) of them, no.
      
      - Mark says:
        
        January 1, 2012 at 7:56 pm
        
        And by hundreds or thousands, are we talking about the slugs themselves or the number of trees? Thanks for all your help! 🙂
        
      - Julian Weisz says:
        
        January 12, 2014 at 3:39 pm
        
        Hello there, but if i use different custom post types with same slug and /%postname%/, it’s interfering. So that does not work in my case observed on 3.8
        
        Is that not working for CPT?
        
        Julian
        
        
        Otto says:
        
        January 12, 2014 at 3:41 pm
        
        Custom Post Types have a higher priority than Pages. You can’t have the same slug for a Page and a CPT.
        
        
        Julian Weisz says:
        
        January 12, 2014 at 4:00 pm
        
        Ok thank you, didn’t recognize that you already answered. But that’s not the issue. I actually didn’t setup a page. There are two different custom post types. But when i use the same name in the post of each then one permalink is not going to be solved.
        
        
        Julian Weisz says:
        
        January 12, 2014 at 3:49 pm
        
        Example:
        I use CPT ‘Building’ and CPT ‘Room’ and set a first post to ‘Green’ of type Building and a second post to ‘Green’ of type Room i will get two permalinks
        
        example.com/building/green and example.com/room/green but both point to one url
        
        
        Otto says:
        
        January 12, 2014 at 3:51 pm
        
        No, that works fine for me. You must have registered the post types incorrectly somehow.
        
        
        Julian Weisz says:
        
        January 12, 2014 at 4:02 pm
        
        I used the Easy Custom Content Types plugin from Pippin Williams. Ok, will raise this issue there. Thank you though!
        
        Julian Weisz says:
        
        January 12, 2014 at 5:08 pm
        
        Hello Sir, figured it out, it seems not to be plugin related. It is dependent on the way you use ‘with_front’.
        
        First i registered post types with ‘with_front’ => true. After a while i registered post types with ‘with_front’ => false. Then that issue appeared. After flushing rules everything worked well. So i didn’t know that you have to flush when you register new CPTs in between with a different ‘with_front’ config.
        
        Julian
        
        Otto says:
        
        January 12, 2014 at 5:20 pm
        
        You have to flush whenever the rewrite rules change in any way. Building the rewrite rules is an expensive operation which takes a lot of time and queries. So they are saved in the database. Flushing dumps them and forces it to rebuild them. They can also be rebuilt just by visiting the Settings->Permalink page (without saving anything).
        
Link Picks: Tweepi, SEO, Social Media and WordPress Permalinks says:

January 1, 2012 at 5:33 am

[…] there was an issue of a performance problem with sites that have large numbers of pages. This Postname Permalinks issue was solved with WordPress 3.3, and Yoast explains in his post How to change your WordPress Permalinks. He has a button on that […]

R. Richard Hobbs says:

January 2, 2012 at 11:24 am

Thank you for the informative article and valuable work on WordPress.

I hope my question is not off-topic: My permalinks are currently defined as postname per your article – /category/ and /tag/ are currently 404 not found altho */tag/whatever/ or */category/whatever/ both work just fine. That said, is it safe to create some sort of archive pages using these permalinks? …or other suggestions? many thanks.

Monte Martin says:

January 2, 2012 at 9:37 pm

I just read your original article on why pretty permalinks could be harmful. I was shocked to find out that my default /%postname%/ structure wasn’t the perfect SEO tool I thought.

I am having mixed emotions now. Obviously, I am very glad to see the issue has been fixed, but did it really have to wait until right before I FINALLY figured out why I have speed issues with larger sites?

Otto Explains Permalinks In WordPress 3.3 says:

January 3, 2012 at 12:01 pm

[…] the ability to use %postname% as the permalink setting without taking a hit in performance. Otto goes into in-depth detail with regards to the patch he wrote to fix the problem which involved lots of help from […]

Paul says:

January 3, 2012 at 2:33 pm

Hi Otto

How about the page’s slug begins with number, now fixed too ?

mysite.blah/book01/chapter-1/007

Is this a yes or no ?

- Otto says:
  
  January 3, 2012 at 3:05 pm
  
  No, slugs cannot be all numeric. This causes interference with many, many of the built in rules. Date segments are recognized by being numbers, as are page numbers. Changing that would mean making a much more fundamental change in the way the whole thing works.
  
  Best to just not use all-numeric slugs.
  
  - Paul says:
    
    January 3, 2012 at 10:45 pm
    
    Your answer seems to imply that there are 2 cases.
    
    All numeric slug.
    mysite.blah/001
    
    and
    
    Slug begins with numbers.
    mysite.blah/001-chapter
    
    You are saying NO to the first one, but YES to the second one
    
    am I correct ?
    
Japh says:

January 3, 2012 at 5:34 pm

Thanks for the run through of the process, Otto. The problem is something I came across by accident and later on mentioned at WordCamp Melbourne 2011. So glad it’s fixed, and even happier to know how 🙂

Mike Schinkel says:

January 3, 2012 at 11:47 pm

@Otto: You didn’t happen to ask Andy where he got the insight about “splitting the URL path up into components and then querying for each one,” per chance? I only ask because of how vocal somebody was about doing it this way in WordCamp San Fran…

- Otto says:
  
  January 4, 2012 at 5:40 am
  
  Pretty sure it’s unrelated to your URL routing ideas. Frankly, I think such a system is complete overkill, along with being ultimately confusing and unnecessary.
  
  Of course, this may be related solely to the way you are presenting the idea. That ticket, for example, is confusing as hell. Arrays within arrays within arrays, tons of magic variables using undefined methodologies, several layers of objects and classes.. It’s craziness.
  
  A large scale change of this nature is highly unlikely to get anybody interested. Iteration of the existing system, on the other hand, is more likely to get taken seriously.
  
  The patch I made was simply improvements to what was already there. What you’re proposing is a wholesale redesign and reworking, along with all the breakage and testing and pain that that implies. This probably won’t fly when people don’t see what we have as broken in some way. Heck, I don’t think it’s broken enough to use your ideas.
  
  If you want something like your ticket implies, then you should sell it as enhancements and/or improvements to existing code. Small steps, not giant leaps.
  
  - Mike Schinkel says:
    
    January 4, 2012 at 8:00 am
    I’m pretty sure it was my presentation of my ideas in those tickets that has confused you. Admittedly it was less than coherent. I blame my lack of understanding the exact architecture needed for a backward compatible solution; when I don’t fully understand something I write about it in order to gain insight. And yes, my writing in those cases can be very hard to follow.
    
    My key and original insight 21 months ago was to analyze URLs not as a whole like WordPress’ rewrite currently does with a linear list of regular expressions but instead that we should look at each path segment, and that’s exactly what you mentioned above when you said "splitting the URL path up into components". Two secondary insights I had at the time were to 1.) leverage semantic context generically in addition to syntax, which is exactly what you did with Pages albeit not generically, and to allow for many and extensible methods for matching, i.e. by hook, by MySQL lookup, by caching, by transient lookup, by option lookup, by testing array keys for existence, and finally by RegEx. Everything else about my tickets was merely attempts at implementation. So by now please separate those insights from the rest, which understandably confused you.
    
    So I’m pretty sure it was my URL routing work that got Andy thinking about looking at each page segment because he unexpectedly-to-me referenced my ticket in his 2011 SF WordCamp talk after which he was gracious enough to sit down with me and discuss my URL routing ideas and then he posted this comment in support of the foundational ticket I’ll discuss next.
    
    Ultimately with my comments on this post I’m trying to get you to reconsider something you were against when we spoke face-to-face at the pre-WordCamp SF 2011 hackfest. We discussed this ticket of mine which only seeks to get a single 'wp_parse_request' hook added. I’m not sure if you remember but you objected to the ticket, and I quote from memory: "Because people might use it." But let me be clear, I am not looking for credit on my insight; as I know you are highly influential among the core developers, I just want your support in 3.4 for adding capabilities that currently don’t exist in WordPress to allow us to extend WordPress’ URL routing system. Allowing $wp->parse_request() to be extended would let many people experiment with better ways of URL routing, ways that could take the insight you applied specifically to Pages and instead apply it generically.
    
    So ironically given your "Small steps, not giant leaps" comment that’s exactly what I have plans to do with my own efforts. I had pushed forward with using the code I posted on ticket #18726 and found that the specific implementation had significant downsides and so I closed my own ticket as "won’t fix". My new goal in my code is to continue subclassing WP to support a replaceable parse_request() and then implement much simpler code in the replacement parse_request() that I posted in the ticket. I also realized while implementing my code that the URL needs to be matched starting with the right-most path segment not starting from the root and that was the same as your insight.
    
    One additional insight I’ve had is we could create a URL caching plugin to cache the resultant $wp->query_var array that is normally processed by parse_request(). By implementing a routing cache for the top "N" most visited web pages, most page loads could omit the need for any additional MySQL lookups. Why force the re-evaluation of the URL for each page load for popular URLs? Using your example the following array could be cached and then if matched bypass $wp->parse_request():
    
    array( 'aaa/bbb/ccc/ddd' => array( 'pagename' => 'aaa/bbb/ccc/ddd' ), ... )
    
    Of course it would be so much better if we didn’t have to subclass WP in order to bypass parse_request() since two plugins that subclass WP can’t both run on the same site. If we had the hook I’ve requested then others could explore similar ideas giving us a much better chance that we’d be able to evolve to a robust and flexible URL routing system from the efforts of many developers before, say WordPress 4.0. And that’s exactly what happened with you and Andy at the hackfest. Why not empower the rest of us to collaborate on this instead of just you and Andy?
    Pretty please…? 🙂
    - Otto says:
      
      January 4, 2012 at 8:21 am
      
      I’m not sure what the benefit would be of splitting the path into components specifically for the case of routing. I used it in this case not for routing, but for determining whether or not a given page exists with a faster SQL query.
      
      Whenever I read your posts on the topic, it seems like you think that the URL path should determine what is displayed, and I disagree with that. See, I think that the URL should determine what is queried for, and then display should be based on the content that results from that query.
      
      In other words, URL path becomes query variables. The means by which you want to get from A to B is thus far unclear to me. I prefer simpler solutions over complex ones, and an array of regexp’s (with the occasional special case) seems pretty darned simple to me.
      
      What is it that you want to do which cannot be accomplished with this method? See, you’ve presented a lot of code in various tickets, but the problem is that all that code is dense and obscure. I don’t know what it is that you’re trying to accomplish. Often you seem to be trying to make things abstract and generalized, and you end up writing dozens of wrapper functions making things simpler and simpler to the point of obscurity. If I can’t tell what it’s doing and why, then I can’t really argue for it.
      
      And no, I still reject the idea of a hook to pre-empt the rewrite rules, because such a thing doesn’t lead to improvement. If you want rewrites to be better, tell us a) in what way they don’t work now and b) what you want them to do. Don’t lead with the chin here, try to explain why they don’t work and how they could work better without scrapping the whole thing or bypassing the system completely. Incremental improvements, not world-changers.
      
      - Mike Schinkel says:
        
        January 4, 2012 at 9:07 am
        
        If you break the URL by path segments, one benefit is to reduce the number of preg_match()s that need to be attempted, as the current approach causes an explosion of match patterns the more post types and endpoints you have to match. Even better you can use other matching methods for each segment which opens up the ability to have different items at the same level in the site, i.e.
        
        [pre]
        /about/ post_type=”page”
        /mike-schinkel/ post_type=”person”
        /public-finance/ post_type=”practice-area”
        /news/ taxonomy=”category”
        /featured/ taxonomy=”post_tag”
        [/pre]
        
        If you think I’m saying that the URL path should determine what is displayed vs. what is queried for then either I’ve done a horrible job of explaining or you really have never read/contemplated what I have been writing (you did answer the above comment really fast… Thanks for formatting it, btw. 🙂)
        
        “In other words, URL path becomes query variables.”
        
        That’s EXACTLY what my stuff does. The ; how about we instead focus on this fact?
        
        [pre]
        $wp->parse_request() -> $wp->query_vars
        [/pre]
        
        When you boil it down, everything I’m working on is premised on that individual fact; everything else including your array of regexps are all just implementation details.
        
        But even so, that doesn’t matter. All we need is that 'wp_parse_request' hook and then each of us can try out ways to make it better, including URL caching where the results of a cache hit is a value to be assigned to $wp->query_vars.
        
        What I want to do that we can’t already do is be able to bypass execution of $wp->wp_parse_request(). Ignore all the code you think is dense and obscure; just focus on the hook needed to bypass $wp->query_vars.
        
        At this point I don’t want to bypass the rewrite system, I learned that would be too hard; I simply want to front-end it for special use-cases and fall back to it if the special use-cases don’t match. After all, isn’t that what hooks are generally for? Two use cases, one that I’ve already explained:
        
        1.) Caching of frequently used URLs
        2.) Ability to control URL routing for special sections of the site without first having to fail on all existing URLs.
        
        For an example of #2 here is a set of URLs that we needed for a project; %person%, %practice-area% and %news-item% are slugs for 3 different post types, and I’m only showing a small subset of the URLs we needed to route:
        
        [pre]
        /%person%/
        /%person%/news/
        /%person%/news/%news-item%/
        /%practice-area%/
        /%practice-area%/news/
        /%practice-area%/news/%news-item%/
        [/pre]
        
        My code needed to match those URLs did lookups to see if it was a person or a practice area or a news-item, and if it was a news item verified it was associated with the person or practice area, and if a /%..%/news/ URL looked for a post type of 'microsite_page' that had as its parent a post with a slug of person or practice-area post type, and if just a /%..%/ then it verified it was indeed a valid person or practice area. And if none of those matched, then just delegate to WordPress’ standard $wp->parse_request().
        
        But don’t first force me to run all the standard match attempts before I get to see if my match attempts work.
        
        And you can’t do the above without bypassing the current WordPress URL routing system.
        
        I’m AM asking . I always hear “Do it in the plugin first then we can look at it.” ALL I’m asking for here is incremental improvement in the ability to do it in a plugin first. How you could view that as a bad thing is completely foreign to me.
        
    - Otto says:
      
      January 4, 2012 at 8:27 am
      
      BTW, have you considered using the after_setup_theme action to replace the $GLOBALS['wp_rewrite'] with your own version, subclassed from the WP_Rewrite class? Using that, you could implement whatever rewrite system you want, in a plugin. Your caching notion could be implemented in that manner.
      
      - Mike Schinkel says:
        
        January 4, 2012 at 9:19 am
        
        I considered it early on and the issue was it still ran code I was trying to bypass. I could look at it again, but my subclassing is currently working so I don’t need to, and if you propose that why not just support a 'wp_parse_request' hook?
        
Dustin says:

January 4, 2012 at 9:26 am

Thanks for the update Otto. Quick couple questions:

-Did the changes in WP 3.3 resolve performance issues with other structures such as /%category%/%postname%/ or did it just improve /%postname%/?

-Also, which do you think is a better option (with 3.3) for performance and SEO between /%postname%/ and /%category%/%postname%/? I currently use /category/postname, but am considering switching to just postname, mainly because I routinely assign posts to multiple categories, which can result in some confusing permalinks. Could switching to postname still improve my site speed? I am a bit worried about the effect on SEO though (but who knows..maybe just postname would be better for SEO).

Thanks!

Kurt says:

January 4, 2012 at 10:31 am

Thanks for all the work you put into this update!

Thanks for the explanation too.

roswel says:

May 3, 2012 at 5:00 pm

There seems to be a bug with Permalinks in 3.3. When we use postname, the top-level pages throw a 404 error because the script does not know whether the URL is a page or a post.

The query string sets name instead of pagename for the top-level pages, causing is_page to be set to false. Not sure if rewrite.php needs some fixing, but my fix was to add an extra line in the parse_query function, as below

if ( '' == $qv['pagename'] ) $qv['pagename'] = $qv['name'];

Also set $this->is_page as below

$this->queried_object =& get_page_by_path($qv['pagename']);

if ( !empty($this->queried_object) ) {
    $this->queried_object_id = (int) $this->queried_object->ID;
    $this->is_page = true;
}

- Otto says:
  
  May 3, 2012 at 5:16 pm
  
  I can’t reproduce this. If you only use the %postname% as the custom slug, then the code we added into 3.3 takes care of this. See, the page_rewrite rules are in front of the post_rewrite rules. So it will first match as a Page because the URL pattern matches.
  
  So any URL with just the top-level will first match against Pages and have pagename set. The code in rewrite.php then uses the get_page_by_path function to actually check to see if that is a real Page. If the Page exists, then pagename is set and the query happens as normal. If the Page does not exist, then the rewrite match is thrown away and it continues on with the normal Post rewrite rules.
  
  Your change to parse_query wouldn’t have any effect, because parse_query doesn’t happen until *after* the rewrites have already taken place. So if it makes it to the parse query and name is set instead of pagename, then get_page_by_path doesn’t think the page exists, in which case you have other underlying problems.
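The matching order described above can be sketched in plain PHP, outside WordPress. This is only an illustration under stated assumptions: the $pages array stands in for the get_page_by_path lookup, and the $rules array and the resolve_request function are hypothetical names, not the actual code in rewrite.php.

```php
<?php
// Hypothetical stand-in for the database: slugs of Pages that really exist.
$pages = ['about', 'contact'];

// Ordered rules: with a %postname% structure, the Page rule sits in front of
// the Post rule, and both match the same single-segment URL pattern.
$rules = [
    ['regex' => '#^([^/]+)/?$#', 'var' => 'pagename'],
    ['regex' => '#^([^/]+)/?$#', 'var' => 'name'],
];

/**
 * Return the query variables for a request path, or null when no rule matches.
 * A 'pagename' match is kept only if the Page really exists (the one extra
 * lookup); otherwise the match is thrown away and the next rule is tried.
 */
function resolve_request(string $path, array $rules, array $pages): ?array
{
    $path = trim($path, '/');
    foreach ($rules as $rule) {
        if (!preg_match($rule['regex'], $path, $m)) {
            continue;
        }
        if ($rule['var'] === 'pagename' && !in_array($m[1], $pages, true)) {
            continue; // not a real Page: discard the match, keep going
        }
        return [$rule['var'] => $m[1]];
    }
    return null;
}

print_r(resolve_request('/about/', $rules, $pages));       // kept as a Page match
print_r(resolve_request('/hello-world/', $rules, $pages)); // falls through to the Post rule
```

In this sketch, a request that ends up with name set instead of pagename means the Page lookup itself came back empty, which is why a later change in parse_query cannot rescue it.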
  
Arie Putranto says:

May 23, 2012 at 4:21 pm

The simplest and fastest way to get better performance would be to add a field to the *_posts table of the WordPress db that stores the COMPLETE permalink in use, rather than only the page name slug. For each request, WP would then look in this field for an exact match before crawling around using regex. Then add a function so that any time a user updates the permalink structure, the stored data is updated too.

That way, things will be easier …

- Otto says:
  
  May 23, 2012 at 4:29 pm
  
  Most permalinks don’t point to “posts” at all, actually. There’s things like tag and category links, custom post types have their own modifiable link structures, author links, custom taxonomies, etc. Storing the permalink list as an add-on to the posts table wouldn’t really save you any lookup time for anything except a very narrow use-case, and even if you stored the permalinks in their own table, the additional queries to rebuild this table and restructure it as new things were added would be a big performance killer. The dynamic system being used with PHP and regular expressions is a lot more complex, but definitely faster and more flexible, especially when you want to restructure things.
  
CY says:

July 27, 2012 at 7:35 am

Hi there

I am not a technical person. I am just setting up my new WP blog today. I’m looking at the permalink settings and don’t know what to do. I read all the analysis above and I am LOST.

Otto, could you please let me know what permalink structure to use that is best for performance? I really just want to dive into the site and populate my content quickly.

Btw, I have installed WP version 3.4.1. Not sure if it helps with your assessment. Thank you in advance. I hope to hear from you soon.

Thank you again.

- Otto says:
  
  July 27, 2012 at 7:40 am
  
  The point of all of the above is that it no longer matters which setup you use, they’re all pretty much the same now from a performance standpoint.
  
  - CY says:
    
    July 27, 2012 at 7:48 am
    
    Thanks very much for your prompt reply Otto. If it’s no longer about performance, then could you please advise which permalink structure I should use for SEO purposes?
    
    My initial thoughts were on performance after all the hooha in the discussion. Now that performance is no longer a hindrance, then SEO is next in my priority.
    
    Thanks again Otto.
    
  - gaurav kaushik says:
    
    December 31, 2012 at 7:02 am
    
    Hi,
    Can anybody explain what the permalink of a custom post type should be? I think custom post types make WordPress a complete CMS, and they are becoming very popular on the web, but unfortunately there is no detailed information from the WordPress team about their permalinks. Otto, can you explain something about performance for custom post type permalinks? If I have more than one post type and there are thousands of posts in each post type, what would be the best structure for performance?
    
  - santiago says:
    
    September 30, 2013 at 9:42 pm
    
    Is performance completely the same for all of the structures? Even for very large sites with any number of posts?
    
    Thanks for this great content!
    
Navneet says:

July 28, 2012 at 1:31 am

I am looking for the best permalink structure for my new blog. I intend to use the category slug followed by the postname slug. But if that would slow down WordPress, would putting the post ID at the end of the permalink help, or would it affect my permalink SEO? Waiting for a quick reply!

Mike says:

July 31, 2012 at 9:58 am

Can anyone clarify; is this patch now included in a default WP install or do we have to download it and apply?

thanks

- Otto says:
  
  September 11, 2012 at 8:30 am
  
  The patch is already in core and has been since WordPress 3.3.
  
Martin says:

September 11, 2012 at 10:36 am

Hi Otto,
thanks for your diligent report. Surely many WP users are a bit happier now. One thing still confuses me; maybe you can clear it up. On the one hand you say “The point of all of the above is that it no longer matters which setup you use, they’re ALL (!!!) pretty much the same now from a performance standpoint.”
On the other hand you answer Anatol with “And yes, the normal methods are unchanged.”

Let’s assume I choose the custom structure with “postname” and, when creating a new page, I EDIT the permalink, for example replacing the last two words with a keyword. What influence do those edits have? Does the page still profit from your 3.3 improvement?

Thanks,

Martin

- Otto says:
  
  September 11, 2012 at 10:39 am
  
  I don’t understand your question. This doesn’t have anything to do with editing the permalink (post-slug). That wouldn’t make any difference from any angle, and there never was any performance problem associated with it.
  
  - Martin says:
    
    September 11, 2012 at 11:38 am
    
    Thanks for the fast response.
    So whatever I change by editing, it still remains a “postname-permalink” in the eyes of WP, profiting from the performance improvements. O.k., fine.
    
    What permalink-structures STILL can cause slow speed with a lot of pages? All permalinks using textbased structure tags (i.e. category), except “postname”?
    
    - Otto says:
      
      September 11, 2012 at 11:40 am
      
      Umm.. None of them. That’s pretty much what this post explains.
      
      Using an ambiguous custom permalink structure for the single-post URLs will now only result in 1 extra query. No slowdowns of any kind.
      
      In other words, this is now considered a fixed problem. This post details the technical behavior behind the “fix”.
      
      - santiago says:
        
        September 30, 2013 at 9:54 pm
        
        You use the year and then the postname. Is this only the post slug style you like, or is there any advantage in performance with your structure?
        
Martin says:

September 11, 2012 at 3:28 pm

Everywhere (also in the codex) the fix is referred to as a “postname” fix, so people like me might not understand why ALL formerly slow structure tags are faster now (as you say in the last comment).

Sorry if it is just due to my lack of knowledge.

What is the Best WordPress Permalink Settings for Your Website? says:

October 12, 2012 at 3:34 pm

[…] One of the WordPress developers blogged about the performance improvement in WordPress 3.3. […]

The Best Wordpress Permalink Settings For SEO | Build Website Tips says:

January 17, 2013 at 4:19 am

[…] it is not a case anymore. Otto himself already update this problem about a year later in his post How the Postname Permalinks in WordPress 3.3 Work. But if you still cautious about this then you can follow the /%post_id%/%category%/%postname%/ […]

Removing the Slug in a Custom Post Type Permalink / Cross Eye Coder says:

February 26, 2014 at 9:25 pm

[…] in WordPress shared. Though the performance issues that plagued using the postname permalink was resolved with WordPress 3.3, it was still sort of tricky to implement it. I found a few tutorials online but I decided to find […]

How the Postname Permalinks in WordPress 3.3 Work » Otto on WordPress · Japh says:

January 24, 2017 at 2:00 am

[…] How the Postname Permalinks in WordPress 3.3 Work » Otto on WordPress […]

The Best WordPress permalink structure for Optimal SEO and Performance says:

April 7, 2020 at 4:12 pm

[…] one of the core contributors in the WordPress development team confirms it on his blog when he […]
