Archive for December 2011

Still getting emails about this one, so here’s a quick rundown on how to do it.

First, if you were already using a Fan Page, then you are not affected at all and don’t have to do anything. Please stop emailing me and asking for confirmation. Thanks. 🙂

Now, if you were using your Application’s Wall as your Page (like I was doing and even recommended), then Facebook is killing off the “Wall” of your Application. This is not a big deal, actually, and you can migrate your Fans to a new Page rather easily.

Step One: Create a new Page. Visit this page to do so. Note: You MUST select “Brand or Product”, and in the dropdown you MUST select “App”. This is not optional. You have to do this to migrate your Fans.

Also note that you must make the name of the Page EXACTLY THE SAME as the name of the application. This is important, don’t try to rename your stuff yet.

Step Two: After you’ve created the page, you’ll want to connect it to your site (using SFC, naturally). First, get the ID number of your new Page. You can find this in the URL of the “Edit Page” link on that Facebook Page. Once you have the ID number, put it into the “Facebook Fan Page” field on the SFC Settings screen and save. While you’re on this Edit Page link on Facebook, you can upload your logos, configure it, etc. Note: Do NOT select a new Vanity URL. The migration will migrate your old one if you had one.

Step Three: Configure SFC. If you’re using the Publisher, for example, you may have to click the grant permissions button again to have it get the new access token for the page. You may need to turn on auto-publishing to the page. Stuff like that. For the most part, SFC is pretty good at configuring itself for this, the Fan Box will automagically switch over, etc.

Step Four: Test. Make a new Post and see if it publishes to your Page. Try the Manual Publisher boxes. Verify that it’s working, basically. While you’re at it, you might go and manually publish some of your older posts to the Page, since the migration will not migrate the content on the wall.

Step Five: Migrate. Visit your application’s profile page. If you don’t see the box below, wait a day or two and it will eventually appear:

Use that migrate link and you’ll get a popup box allowing you to select a Page.

WARNING: If you get a popup that says “You don’t have any eligible Facebook Pages to migrate to”, then STOP RIGHT NOW. Do NOT click migrate. You only get one chance at this, if you mess it up then it’s broken forever.

If you have a Page, and it’s a “Brands or Products/App” page, and it has the EXACT same name as your Application, then you will be given a dropdown to select that Page. Otherwise, you’ll get the bad message. Click Cancel in such a case, fix your Page, and then try again. Only when you have the dropdown and have selected your page should you continue.

Step Six: Patience. Once you’ve selected your new Page and clicked Migrate (and remember, you only get one shot at this!), then after a while, a few things will happen:

a) Your Fans of the Application will slowly be migrated to be Fans of the new Page instead.

b) If you had a vanity URL on the Application Page and did not have one on the Fan Page, then the vanity URL will get migrated too.

c) Your Application Wall will disappear forever (this happens instantly) and any links to it will redirect to your Fan Page.

And that’s it. You’re done. Works fine with SFC. The next version of SFC will remove the publishing to Application Pages entirely, as well as the (now misleading) wording.

 

Shortlink:

Facebook is getting rid of Application Profile Pages, and allowing people who are using them to transfer their subscribers to normal Fan Pages. SFC will be changing soon to adapt to this change, but the existing Fan Page support in SFC works fine and can be used right now.

I’ve tested out this migration process on one of my pages, and it works fine. Here’s what you have to do to make it work with SFC if you were not using a Fan Page already (note, if you were using a Fan Page already, then you’re done and must change nothing at all).

1. Create a new Fan Page in the Brands/Product -> App category.

2. Give it the same name as your App (exactly the same, mind you).

3. Set up the new fan page however you like. Take its ID number and put that into SFC, then use the Manual publisher to fill out the wall with some of the older posts (the wall content will NOT be migrated when you do the migration).

4. When you visit your app’s wall, you’ll get the migration message (eventually). You can use this to migrate all the people who have liked your application to having liked the new Fan Page. If you used a vanity URL, this will transfer also *if* you don’t put a vanity URL on the Fan Page.

After you’ve migrated the likes and changed SFC to be publishing to the Page, you can continue on as normal. Nothing else about SFC changes. Since Facebook will be eliminating App Profile Walls entirely in February, I’ll be removing support for them from SFC entirely before then. Expect that change to be in SFC 1.3.

Shortlink:

So, I first wrote about this topic on the wp-hackers list back in January 2009, explaining some of the scaling issues involved with having ambiguous rewrite rules and loads of static Pages in WordPress. A year later the same topic came up again in the WPTavern Forums, and later I wrote a blog post about the issue in more detail. That post generated lots of questions and responses.

In August 2011, thanks to highly valuable input from Andy Skelton which gave me a critical insight needed to make it work, and with Jon Cave and Mark Jaquith doing testing (read: breaking my patches over and over again), I was able to create a patch which fixed the problem (note: my final patch was slightly over-complicated, Jon Cave later patched it again to simplify some of the handling). This patch is now in WordPress 3.3.

So I figured I’d write up a quick post explaining the patch, how it works, and the subsequent consequences of it.

Now you have two problems.Quick Summary of the Problem

The original underlying problem is that WordPress relies on a set of rules to determine what type of page you’re looking for, and it uses only the URL itself to do this. Basically, given /some/url/like/this, WordPress has to figure out a) what you’re trying to see and b) how to query for that data from the database. The only information it has to help it do this is called the “rewrite rules”, which is basically a big list of regular expressions that turn the “pretty” URL into variables used for the main WP_Query system.

The user of the WordPress system has direct access to exactly one of these rewrite rules, which is the “Custom Structure” on the Settings->Permalink page. This custom string can be used to change what the “single post” URLs look like.

The problem is that certain custom structures will interfere with existing structures. If you make a custom structure that doesn’t start with something easily identifiable, like a number, then the default rewrite rules wouldn’t be able to cope with it.

To work around this problem, WordPress detected it and uses a flag called “verbose_rewrite_rules”, which triggers everything into changing the list of rules into more verbose ones, making the ambiguous rules into unambiguous ones. It did this by the simple method of making all Pages into static rules.

This works fine, but it doesn’t scale to large numbers of Pages. Once you have about 50-100 static Pages or so, and you’re using an ambiguous custom structure, then the system tends to fall apart. Most of the time, the ruleset grows too large to fit into a single mySQL query, meaning that the rules can no longer be properly saved in the database and must be rebuilt each time. The most obvious effect when this happens is that the number of queries on every page load rises from the below 50 range to 2000+ queries, and the site slows down to snail speed.

The “Fix”

The solution to this problem is deeper than simple optimizations. Remember that I said “WordPress relies on a set of rules to determine what type of page you’re looking for, and it uses only the URL itself to do this”. Well, to fix the problem, we have to give WordPress more input than just the URL. Specifically, we make it able to find out what Pages exist in the database.

When you use an ambiguous custom structure, WordPress 3.3 still detects that, and it still sets the verbose_page_rules flag. However, the flag now doesn’t cause the Pages to be made unambiguous in the rules. Instead, it changes the way the rules work. Specifically, it causes the following to happen:

  1. The Page rules now get put in front of the Post rules, and
  2. The actual matching process can do database queries to determine if the Page exists.

So now what happens is that the Page matching rules are run first, and for an ambiguous case, they’ll indeed match the Page rule. However, for all Page matches, a call to the get_page_by_path function is made, to see if that Page actually exists. If the Page doesn’t exist in the database, then the rule gets skipped even though it matched, and then the Post’s custom structure rules take over and will match the URL.

The Insight

The first patch I made while at WordCamp Montreal used this same approach of calling get_page_by_path, but the problem with it was that get_page_by_path was a rather expensive function to call at the time, especially for long page URLs. It was still better than what existed already, so I submitted the patch anyway, but it was less than ideal.

When I was at WordCamp San Francisco in August, hanging around all these awesome core developers, Andy Skelton commented on it and suggested a different kind of query. His suggestion didn’t actually work out directly, but it did give me the final idea which I implemented in get_page_by_path. Basically, Andy suggested splitting the URL path up into components and then querying for each one. I realized that you could split the path up by components, query for all of them at once, and then do a loop through the URL components in reverse order to determine if the URL referred to a Page that existed in the database or not.

So basically, given a URL like /aaa/bbb/ccc/ddd, get_page_by_path now does this:

SELECT ID, post_name, post_parent FROM $wpdb->posts WHERE post_name IN ('aaa','bbb','ccc','ddd') AND (post_type = 'page' OR post_type = 'attachment')

The results of this are stored in an array of objects using the ID as the array keys (a clever trick Andrew Nacin pointed out to me at the time).

By then looping through that array only once with a foreach, and comparing to the reversed form of the URL (ddd, ccc, bbb, aaa) you can make an algorithm that basically works like this:

foreach(results as res) {
  if (res->post_name = 'ddd') {
    get the parent of res from the results array
     (if it's not in the array, then it can't be the parent of ddd, which is ccc and should be in the array)
    check to make sure parent is 'ccc',
    loop back to get the parent of ccc and repeat the process until you run out of parents
  }
}

This works because all the Pages in our /aaa/bbb/ccc/ddd hierarchy must be in the resulting array from that one query, if /aaa/bbb/ccc/ddd is a valid page. So you can quickly check, using that indexed ID key, to see if they are all there by working backwards. If they are all there, then you’ll eventually get to parent = zero (which is the root) and the post_name = ‘aaa’. If they’re not there, then the loop exits and you didn’t find the Page because it doesn’t actually exist.

So using this one query, you can check for the existence of a Page any number of levels deep fairly quickly and without lots of expensive database operations.

Consequences

There are still some drawbacks though.

In theory, you could break this by making lots and lots of Pages, if you also made their hierarchy go hundreds of levels deep and thus make the loop operation take a long time. This seems unlikely to me, or at least way more unlikely than somebody making a mere couple hundreds of Pages. Also, WordPress won’t let you use the same Page name twice on the same level, so you’d really have to try for it to make this take too long.

If you try to make a URL longer than around 900K or so, the query will break. Pretty sure it’d break before that though, and anyway most people can’t remember URLs with the contents of a whole book in them. 😉

This also adds one SQL operation to every single Post page lookup. However, this is still better than having it break and try to run a few thousand queries every time in order to build rewrite rules which it can’t ultimately save. And the SQL being used is relatively fast, since post_name and post_type are both indexed fields.

Basically, for the very few and specific cases that had the problem, the speedup is dramatic and immediate. For the cases that use unambiguous rules, nothing has changed at all.

There’s still some bits that need to be fixed. Some of the code is duplicated in a couple of places and that needs to be merged. The pagename rewrite rule is a bit of a hack to avoid clashing, but it works everywhere even if it does make the regexp purist groan with dismay (for critics of this, please know that I did indeed try to do this using a regexp comment to make the difference instead of the strange and silly expression, but it doesn’t work because the regexp needs to be in a PHP array key).

Anyway, there you have it. I wrote the patch, but at least 5 other core developers contributed ideas and put in the grunt work in testing the result. A lot of brain power from these guys went into what is such a small little thing, really. A bit obscure, but I figured some people might like to read about it. 🙂

 

Shortlink:

Missed the news last week or so, but Twitter added oEmbed provider support to their API. While previous methods have existed to easily post tweets (such as Blackbird Pie), oEmbed is built into the WordPress core.

However, since Twitter didn’t implement oEmbed discovery, and WP has discovery off by default anyway, you have to resort to a small bit of code to make it work. Here’s that bit of code:

add_filter('oembed_providers','twitter_oembed');
function twitter_oembed($a) {
	$a['#http(s)?://(www\.)?twitter.com/.+?/status(es)?/.*#i'] = array( 'http://api.twitter.com/1/statuses/oembed.{format}', true);
	return $a;
}

Here’s what happens when you put that code in a plugin (for example) and just paste a twitter URL into a post:

It handles RT’s pretty neatly, I think. 🙂

This may make it into WordPress by default in the next version. Too bad they came out with it too late for inclusion in WordPress 3.3.

Shortlink: