Tags vs. Categories

WordPress features both tags and categories.

This tends to confuse new users (and rightly so), so here’s a quick overview of both, and of when to use each.

On Tags And Categories

In WordPress jargon, tags and categories are a sexy disguise for things that used to be called keywords.

Historically, WordPress (or rather, the software it was based on) has featured categories because… per specs, that’s what keywords are called in RSS feeds. No, honestly. If you open this site’s feed in your browser, you’ll find things like this:

<category>keyword</category>

Frequently, they’re tags, not categories.

You’ll read here and there that the two differ in that categories are keywords you can organize in a hierarchy. However…

Category Hierarchies Are Misleading

Suppose that you have a parent category and a child category on a site. Place a post in the child category. Then that post will also show in the parent category. Quite natural, isn’t it? Maybe not. It certainly isn’t for most of my customers; nor was it for me. I, for one, would expect that post to not show in the parent category. The debate rages on…

On related debates, consensus does get reached. Suppose you have two child categories and place your posts in one or the other, but never in the parent category. Place a category widget configured to show the hierarchy in a sidebar. Until very recently, and in spite of the behavior outlined above, that widget would have hidden the parent category on the grounds that it had zero posts in it. Go figure…
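The two behaviors described above can be sketched with a toy model. This is not WordPress code: the category names, the post title and both function names are made up for illustration.

```python
# Toy model, not WordPress code: categories and the post title are invented.
children = {"News": ["Releases", "Events"], "Releases": [], "Events": []}
posts_in = {"News": [], "Releases": ["Version 2.3 is out"], "Events": []}


def archive(category):
    """Posts listed on a category page: its own posts plus its descendants' posts."""
    listed = list(posts_in.get(category, []))
    for child in children.get(category, []):
        listed.extend(archive(child))
    return listed


def direct_count(category):
    """Posts filed directly in the category, ignoring its descendants."""
    return len(posts_in.get(category, []))
```

Here archive("News") lists the post filed under Releases, while direct_count("News") is zero. A widget that hides categories based on the second number contradicts what the category page itself displays.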

Then, the WordPress API related to categories is — at the very best — dysfunctional.

Users who binge on categories, for instance, frequently use posts to organize their site’s important content (they should be using static pages instead), and either /%postname%/ or /%category%/%postname%/ as their permalink structure. Both structures are buggy. Avoid them like the plague.

Then, at the time I’m writing this, you cannot order categories in an arbitrary manner: to do so, you need to edit WordPress core files, or bypass the WordPress API altogether.

Lastly, WordPress 2.3, which introduced tags, also introduced the idea that categories and tags should share the same database tables. In the future, someone will likely reverse that decision.

Why? Sparing you the gory details, using three tables to do something you could do with two plus two tables is very questionable. It adds a needless join in queries, and it creates collisions. If, for instance, you’d like to have a WordPress category (with caps) and a wordpress tag (without), renaming one will also rename the other.
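To make the collision concrete, here is a minimal sketch using an in-memory SQLite database. The schema is a simplified assumption, not the actual WordPress one: the table and column names are illustrative, the table that links posts to terms is left out, and the rule that a lowercased name serves as the unique slug is assumed for the example.

```python
import sqlite3

# Simplified, hypothetical schema: one shared table of names, and one table
# recording whether a name is used as a category or as a tag.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE terms (term_id INTEGER PRIMARY KEY, name TEXT, slug TEXT UNIQUE);
    CREATE TABLE term_taxonomy (term_id INTEGER, taxonomy TEXT);
""")


def add_term(name, taxonomy):
    """Reuse the existing row when the slug (assumed: lowercased name) exists."""
    slug = name.lower()
    db.execute("INSERT OR IGNORE INTO terms (name, slug) VALUES (?, ?)", (name, slug))
    (term_id,) = db.execute(
        "SELECT term_id FROM terms WHERE slug = ?", (slug,)
    ).fetchone()
    db.execute("INSERT INTO term_taxonomy VALUES (?, ?)", (term_id, taxonomy))
    return term_id


def names():
    """Every (taxonomy, name) pair; a join is needed just to read a name."""
    rows = db.execute("""
        SELECT tt.taxonomy, t.name FROM term_taxonomy tt
        JOIN terms t ON t.term_id = tt.term_id ORDER BY tt.taxonomy
    """)
    return rows.fetchall()


category_id = add_term("WordPress", "category")
tag_id = add_term("wordpress", "post_tag")

# Renaming the category updates the shared row, so the tag is renamed too.
db.execute("UPDATE terms SET name = ? WHERE term_id = ?", ("WP", category_id))
```

In this sketch the category and the tag end up pointing at the same row, which is why a rename of one shows up on the other. With separate tables for categories and tags, the two names would live in different rows and the rename would stay local.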

Category Hierarchies Are Meaningless

Adding to this, the idea of introducing a tree-based hierarchy in a cloud of keywords is questionable. Some experts would certainly disagree with the points that follow, but I feel comfortable writing them, having delved into linguistics, cognitive science, and artificial intelligence at one point in my life.

To make a long story very short, the interesting aspect of keywords is not so much that you’re organizing them in a hierarchy; it is that you’re organizing them at all.

Any competent bookstore vendor will tell you that some keywords (e.g., psycholinguistics) belong in several categories (linguistics and psychology). Most keywords really have several parents; and several children. So, to start with, a graph of keywords is more meaningful to consider than a tree of keywords. (As an aside, the case for weighted relations (0% to 100%) is very strong as well.)
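As a rough illustration of the difference, here is a small sketch of a weighted keyword graph. The keywords beyond the bookstore example, the weights and the function names are all invented for the purpose of the example.

```python
# Hypothetical data: each keyword maps to its parents, with a weight from 0 to 1.
parents = {
    "psycholinguistics": {"linguistics": 0.6, "psychology": 0.4},
    "linguistics": {"social sciences": 1.0},
    "psychology": {"social sciences": 0.7, "medicine": 0.3},
}


def ancestors(keyword):
    """Every keyword reachable by following parent links, at any depth."""
    found = set()
    stack = [keyword]
    while stack:
        for parent in parents.get(stack.pop(), {}):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_tree():
    """A tree allows at most one parent per keyword."""
    return all(len(links) <= 1 for links in parents.values())
```

With this data, is_tree() is false because psycholinguistics and psychology each have two parents. A category hierarchy would force you to pick one parent and drop the other relation.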

Now, from a semiotics standpoint, a language is a set of keywords (shape would be a more appropriate term, since gestures, odors, touch and unnamed concepts count as well) to which an individual has assigned rules (grammar, meaning, both as in what they imply). And from a linguistics standpoint, a language is a subset of individuals’ languages that a community has agreed upon in order to communicate information.

The idea that the English language is just a bunch of keywords whose implications we’ve agreed on might be disturbing to you. After all, at school, we learn the age-old idea that grammar and vocabulary are different animals. But scratch the surface, and you’ll spot that the recent breakthroughs in AI, algorithms that learn, mimic or adapt, involve algorithms that, to an extent, take this for granted.

In any event, this ultimately raises one question: Can any such graph of keywords be suitable at all to organize your environment, or a subset thereof?

There’s an inconvenient mathematical theorem (Gödel) that, in layman’s terms, implies it can’t: No matter how you organize things, there will always be something that you’ll need a new keyword for to make it fit. For a real life example, look no further than the law: Ever noticed how the quantity thereof inflates? (By the way, where the above-mentioned AI algorithms find their limits lies in their inability to define new keywords in a non-random manner.)

The point here was to highlight another recipe for premature hair loss: If you’ve had sleepless nights trying to find the “perfect” hierarchy for your site, stop looking. It’s painfully hopeless.

On Whether To Use Tags Or Categories

Leaving the (broken) hierarchical aspect aside, tags and categories are the same. So the question of whether to use one or the other arises.

In theory, it’s a matter of taste, since posts are the only pieces of data that you can assign tags and categories to. The user interface to enter the first is a text field that lets you create tags on the fly; the user interface to enter the second assumes you’ll click on checkboxes.

In theory still, there is a catch. Whether you use tags or not, you need to use categories. This is because WordPress always assigns a category to blog posts. Even if you do not link to the resulting category pages in your template, Google finds them via your Google sitemap. If you do not spread your posts across several categories, you’ll need to worry about duplicate content.

In practice, I recommend using tags, especially if you’re a Semiologic Pro user.

To start with, Autotag, Related Widgets, and Silo Widgets all implement page tags. And Related Widgets will build on these tags to generate lists of related posts and pages. So there is a clear benefit in tagging your posts and your pages.

Then, Opt-In Front Page lets you treat your main category (Blog) as your blog itself. As a bonus, that category’s URL will point to your blog’s main page. If it’s the only category on your site, you don’t need to worry about the duplicate content issues described above.

You could still create a few extra categories, of course. In particular, you may want to create asides feeds, or build content-specific feeds for specific purposes. One such feed, for Semiologic SEO users, could be a Highlights category: Posts from that category are then automatically highlighted in archives lists. Browse this site’s WordPress-related blog archives to see this work.

Lastly, you may eventually have thousands of keywords on your site. On the one hand, having throngs of tags will slow down the editor because of WordPress’ autocomplete scripts. On the other, having throngs of categories will slow down the editor because it’ll take a while to load them. You cannot turn off the category editor. But you can disable tag autocompletion.