Posts tagged ‘filter’

My post about how to customize WordPress images with tricks like greyscale and such got me lots of feedback, so I figured I might as well turn it into a plugin.

The ImageFX plugin allows you to customize the image sizes from WordPress or custom ones for your theme, by applying filters to them.

The way it works is basically identical to my original post on the topic, only it allows the filters to be defined on a per-image-size level. It also allows the addition of a “slug” to be appended to the image filename, which is useful for cases where you want to have two images at the same size, but with different filters.

Since it was easy to do, I went ahead and created several other simple image filters that you can use for your images:

  • Greyscale (black and white)
  • Sepia tone (old-timey!)
  • Colorize with red, yellow, green, blue, or purple
  • Photonegative
  • Emboss
  • Brighten
  • Greyscale except red, green, or blue (classy!)

Here’s some examples. This a pic of me, Nacin, Rose, and Matt at WordCamp San Francisco. I ran it through the sepia, blue colorize, and grey-except-red filters.



These are some of the default filters included, but since I could, I went ahead and made it easily expandable too. All you have to do to define a filter is to create a function to do the image filtering you want, then call the imagefx_register_filter() function to add it.

To implement your own custom filter, you can do it like this:

imagefx_register_filter('custom-name','my_custom_filter');
function my_custom_filter(&$image) {
 // modify the $image here as you see fit
}

Note that the $image is passed by reference, so you don’t have to return it. This is because the $image resource takes up a lot of memory, so to save on memory usage, you are manipulating it in place, sort of thing.

You can use any of the image functions in PHP to change the image however you like. The filters I’ve implemented are mostly pretty simple. You can see them all in the filters.php file, in the plugin.

Caveats: The plugin will only filter JPG images, to avoid the overhead of recompressing PNGs and to avoid breaking animated GIF files. Also note that I haven’t tested these filters extensively. They’re only a starting point, sort of thing. I spent all of about 20 minutes writing them, so don’t expect miracles. 🙂

You can download version 0.1 of the plugin from the WordPress plugin directory: http://wordpress.org/extend/plugins/imagefx/

Enjoy!

Shortlink:

Seems that some people don’t know much about kses. It’s really not all that complicated, but there doesn’t seem to be a lot of documentation around for it, so what the hell.

The kses package is short for “kses strips evil scripts”, and it is basically an HTML filtering mechanism. It can read HTML code, no matter how malformed it is, and filter out undesirable bits. The idea is to allow some safe subset of HTML through, so as to prevent various forms of attacks.

However, by necessity, it also includes a passable HTML parser, albeit not a complete one. Bits of it can be used for your own plugins and to make things a bit easier all around.

Note that the kses included in WordPress is a modified version of the original kses. I’ll only be discussing it here, not the original package.

Filtering

The basic use of kses is as a filter. It will eliminate any HTML that is not allowed. Here’s how it works:

$filtered = wp_kses($unfiltered, $allowed_html, $allowed_protocols);

Simple, no?

The allowed html parameter is an array of HTML that you want to allow through. The array can look sorta like this:

$allowed_html = array(
	'a' => array(
		'href' => array (),
		'title' => array ()),
	'abbr' => array(
		'title' => array ()),
	'acronym' => array(
		'title' => array ()),
	'b' => array(),
	'blockquote' => array(
		'cite' => array ()),
	'cite' => array (),
	'code' => array(),
	'del' => array(
		'datetime' => array ()),
	'em' => array (), 'i' => array (),
	'q' => array(
		'cite' => array ()),
	'strike' => array(),
	'strong' => array(),
);

As you can see, it’s rather simple. The main array is a list of HTML tags. Each of those points to an array of allowable attributes for those tags. Each of those points to an empty array, because kses is somewhat recursive in this manner.

Any HTML that is not in that list will get stripped out of the string.

The allowed protocols is basically a list of protocols for links that it will allow through. The default is this:

array ('http', 'https', 'ftp', 'ftps', 'mailto', 'news', 'irc', 'gopher', 'nntp', 'feed', 'telnet')

Anything else goes away.

That $allowed_html I gave before may look familiar. It’s the default set of allowed HTML in comments on WordPress. This is stored in a WordPress global called $allowedtags. So you can use this easily like so:

global $allowedtags;
$filtered = wp_kses($unfiltered, $allowedtags);

This is so useful that WordPress 2.9 makes it even easier:

$filtered = wp_kses_data($unfiltered);
$filtered = wp_filter_kses($unfiltered); // does the same, but will also slash escape the data

That uses the default set of allowed tags automatically. There’s another set of defaults, the allowed post tags. This is the set that is allowed to be put into Posts by non-admin users (admins have the “unfiltered_html” capability, and can put anything they like in). There’s easy ways to use that too:

$filtered = wp_kses_post($unfiltered);
$filtered = wp_filter_post_kses($unfiltered); // does the same, but will also slash escape the data

Note that because of the way they are written, they make perfect WordPress filters as well.

add_filter('the_title','wp_kses_data');

This is exactly how WordPress uses them for several internal safety checks.

Now, this is all very handy, but what if I’m not filtering? What if I’m trying to get some useful information out of HTML? Well, kses can help you there too.

Parsing

As part of the filtering mechanism, kses includes a lot of functions to parse the data and to try to find HTML in there, no matter how mangled up and weird looking it might be.

One of these functions is wp_kses_split. It’s not something that is useful directly, but it is useful to understand how kses works. The wp_kses_split function basically finds anything that looks like an HTML tag, then passes it off to wp_kses_split2.

The wp_kses_split2 function takes that tag, cleans it up a bit, and perhaps even recursively calls kses on it again, just in case. But eventually, it calls wp_kses_attr. The wp_kses_attr is what parses the attributes of any HTML tag into chunks and then removes them according to your set of allowed rules. But here’s where we finally find something useful: wp_kses_hair.

The wp_kses_hair function can parse attributes of tags into PHP lists. Here’s how you can use it.

Let’s say we’ve got a post with a bunch of images in it. We’d like to find the source (src) of all those images. This code will do it:

global $post;
if ( preg_match_all('/<img (.+?)>/', $post->post_content, $matches) ) {
        foreach ($matches[1] as $match) {
                foreach ( wp_kses_hair($match, array('http')) as $attr)
                	$img[$attr['name']] = $attr['value'];
                echo $img['src'];
        }
}

What happened there? Well, quite a bit, actually.

First we used preg_match_all to find all the img tags in a post. The regular expression in preg_match_all gave us all the attributes in the img tags, in the form of a string (that is what the “(.+?)” was for). Next, we loop through our matches, and pass each one through wp_kses_hair. It returns an array of name and value pairs. A quick loop through that to set up a more useful $img array, and voila, all we have to do is to reference $img[‘src’] to get the content of the src attribute. Equally accessible is every other attribute, such as $img[‘class’] or $img[‘id’].

Here’s an example piece of code, showing how kses rejects nonsense:

$content = 'This is a test. <img src="test.jpg" class="testclass another" id="testid" fake fake... / > More';
if ( preg_match_all('/<img (.+?)>/', $content, $matches) ) {
        foreach ($matches[1] as $match) {
                foreach ( wp_kses_hair($match, array('http')) as $attr)
                	$img[$attr['name']] = $attr['value'];
                print_r($img); // show what we got
        }
}

The resulting output from the above:

Array
(
    [src] => test.jpg
    [class] => testclass another
    [id] => testid
    [fake] =>
)

Very nice and easy way to parse selected pieces of HTML, don’t you think?

Overriding kses

Want to apply some kind of filter of your own to things? WordPress kindly adds a filter hook to all wp_kses calls: pre_kses.

function my_filter($string) {
	// do stuff to string
	return $string;
}
add_filter('pre_kses', 'my_filter');

Or maybe you want to add your own tags to the allowed list? Like, what if you wanted comments to be able to have images in them, but (sorta) safely?

global $allowedtags;
$allowedtags['img'] = array( 'src' => array () );

What if you want total control? Well, there’s a CUSTOM_TAGS define. If you set that to true, then the $allowedposttags, $allowedtags, and $allowedentitynames variables won’t get set at all. Feel free to define your own globals. I recommend copying them out of kses.php and then editing them if you want to do this.

And of course, if you only want to do a small bit of quick filtering, this sort of thing is always a valid option as well:

// only allow a hrefs through
$filtered = wp_kses($unfiltered, array( 'a' => array( 'href' => array() ) ));

Hopefully that answers some kses questions. It’s not a complete HTML parser by any means, but for quick and simple tasks, it can come in very handy.

Note: kses is NOT 100% safe. It’s very good, but it’s not a full-fledged HTML parser. It’s just safer than not using it. There’s always the possibility that somebody can figure out a way to sneak bad code through. It’s just a lot harder for them to do it.

Shortlink: