Taxonomy. That's what sets Drupal apart, and makes it so much more useful than many of its alternatives. But it's unduly intimidating at first: let's peek under the hood to see how to take advantage of it.
The taxonomy mechanism is at the heart of what makes Drupal so different from most other content management systems. But experience on the Drupal support channel shows that it is not always well understood.
Taxonomy data model
First things first: although the service provided by the taxonomy mechanism ("categories" at various places in the Drupal UI) is simple, the implementation requires no fewer than six tables (see diagram) for basic features:
- The main table is term_data. This is where the terms used for classification are defined. Every term is given a unique term identifier, or
- The second most important table is vocabulary. Each of the terms in term_data belongs to exactly one vocabulary, to which it is linked by the
- For vocabularies allowing it, term hierarchy is defined, obviously enough, in term_hierarchy, in which each tid has one row for each of its parent tids, or one row with (virtual) parent tid 0.
- Terms are mostly used to classify the basic unit of content in Drupal, the node. This is the purpose of the term_node table, which implements the term/node relationship. Note that terms can be used for other purposes, like user classification (more on that below).
- Synonyms are handled through the term_synonym table, in which each row links to an existing tid and defines a new name for it.
- For vocabularies in which this option has been enabled, the term_relation table allows for the inter-linking of terms: each row defines a pair of tid values as describing related terms.
- Drupal allows vocabularies to be limited to some node types. This is implemented by the
There is an implied integrity relationship:
node/type must match
for every instance of
term_node/tid. This is currently enforced in code by Drupal modules.
In the current implementation (4.6.4), even if hierarchy is not used,
each tid will still have at least one row in the term_hierarchy table, with
parent = 0, to show it is a "root" term.
A term may have more than one parent, which prevents replacing the
term_hierarchy table by just a
parent column in the
IMHO, since this is essentially an implementation artefact that
costs significant data space, it does not seem poised to remain
in place for very long.
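To make the relationships between these tables more concrete, here is a minimal sketch using Python and an in-memory SQLite database. It is not Drupal's actual schema: only the table names and the tid and parent columns come from the description above, and every other column name (vid, nid, name, tid1, tid2) is an assumption made for the example. The node-type restriction table is left out.

```python
import sqlite3

# Table names follow the description above. Column names other than
# "tid" and "parent" are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE vocabulary     (vid INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE term_data      (tid INTEGER PRIMARY KEY, vid INTEGER NOT NULL, name TEXT NOT NULL);
CREATE TABLE term_hierarchy (tid INTEGER NOT NULL, parent INTEGER NOT NULL, PRIMARY KEY (tid, parent));
CREATE TABLE term_node      (nid INTEGER NOT NULL, tid INTEGER NOT NULL, PRIMARY KEY (nid, tid));
CREATE TABLE term_synonym   (tid INTEGER NOT NULL, name TEXT NOT NULL);
CREATE TABLE term_relation  (tid1 INTEGER NOT NULL, tid2 INTEGER NOT NULL);
"""


def build_demo():
    """Create the sketch schema and load a tiny made-up data set."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO vocabulary VALUES (1, 'Topics')")
    db.executemany(
        "INSERT INTO term_data VALUES (?, ?, ?)",
        [(1, 1, "Software"), (2, 1, "CMS"), (3, 1, "Web")],
    )
    # Root terms get one row with the virtual parent 0.
    # Term 2 (CMS) has two parents, hence two rows.
    db.executemany(
        "INSERT INTO term_hierarchy VALUES (?, ?)",
        [(1, 0), (3, 0), (2, 1), (2, 3)],
    )
    db.execute("INSERT INTO term_synonym VALUES (2, 'Content management system')")
    db.executemany("INSERT INTO term_node VALUES (?, ?)", [(10, 2), (11, 1)])
    return db


def parents_of(db, tid):
    """Return the parent tids of a term, 0 meaning 'no real parent'."""
    rows = db.execute(
        "SELECT parent FROM term_hierarchy WHERE tid = ? ORDER BY parent", (tid,)
    )
    return [row[0] for row in rows]


def nodes_tagged(db, term_name):
    """Return the nids of the nodes classified under the named term."""
    rows = db.execute(
        "SELECT tn.nid FROM term_node tn "
        "JOIN term_data td ON td.tid = tn.tid "
        "WHERE td.name = ? ORDER BY tn.nid",
        (term_name,),
    )
    return [row[0] for row in rows]
```

Calling parents_of(build_demo(), 2) returns both parents of the "CMS" term, which is exactly the case a single parent column could not represent.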
As questions on the support channel suggest, the use of Drupal categories, as implemented by the taxonomy module, may not be guided enough. I had a case yesterday where a user wanted additional code to prevent terms from one category (i.e. vocabulary) from being used on a node along with terms from another category. In most cases, this points to an information architecture problem at a higher level: if terms are mutually dependent, like these terms that had to be exclusive, then they belong on the same classification axis, meaning in the same vocabulary.
This is where the hierarchical nature of Drupal classifications comes in handy: instead of defining a set of specialized vocabularies with dependence on other vocabularies, all it takes is for one to define a hierarchical vocabulary, within which specialized subtrees will be implemented as children of higher level terms, thus ensuring mutual exclusion.
In short, if there is only one word to remember when designing an information taxonomy, or in layman's terms when configuring categories in Drupal, that word is orthogonality. Proper orthogonal category design will often save a lot of time otherwise spent implementing case-specific rules in code.
There is an introduction to the concept in the "Derived meanings/Computer Science" section of the orthogonality article on Wikipedia.
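As an illustration of the single hierarchical vocabulary approach, the sketch below models one vocabulary as a mapping from each tid to its parent tids, in the spirit of term_hierarchy. The terms and the helper name top_level_terms are made up for the example. It shows that each specialized term sits under exactly one higher-level term, so choosing a term also chooses its subtree.

```python
# Hypothetical hierarchical vocabulary: tid -> list of parent tids.
# Parent 0 is the virtual root, as in term_hierarchy.
HIERARCHY = {
    1: [0],  # Hardware
    2: [0],  # Software
    3: [1],  # Printers (under Hardware)
    4: [2],  # Editors  (under Software)
    5: [2],  # Browsers (under Software)
}


def top_level_terms(tid, hierarchy):
    """Return the set of root-level tids that the given term descends from."""
    roots = set()
    seen = set()
    stack = [tid]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for parent in hierarchy[current]:
            if parent == 0:
                # current has no real parent: it is a top-level term
                roots.add(current)
            else:
                stack.append(parent)
    return roots
```

Assuming the vocabulary allows only one term per node, a node tagged with term 4 falls under "Software" and nowhere else, so no extra code is needed to keep "Hardware" terms off that node.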
Taxonomy beyond nodes
Although the taxonomy system in Drupal 4.6.x is geared towards use in nodes,
it can be put to other uses. As a proof of concept, Karoly Negyesi
has created the "userstag" module enabling the use of taxonomies on users.
Use your favorite search engine to query for
drupal userstag chx
for the current URL. This module uses a term_
similar in purpose to the
term_node table in regular taxonomy use.
Note that this is NOT supported code, or even contributed code,
and as such should not be used on a production system unless you are
ready to maintain it or have it maintained.
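The general idea of classifying users instead of nodes can be sketched as follows. This is not the module's code: the link table name sketch_term_user and its columns are hypothetical, chosen only to mirror the role term_node plays for nodes.

```python
import sqlite3


def build_user_tagging_demo():
    """Create a tiny in-memory database linking users to terms."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE term_data (tid INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Hypothetical link table: the real module's table and columns may differ.
        CREATE TABLE sketch_term_user (
            uid INTEGER NOT NULL,
            tid INTEGER NOT NULL,
            PRIMARY KEY (uid, tid)
        );
        """
    )
    db.executemany(
        "INSERT INTO term_data VALUES (?, ?)",
        [(1, "Developer"), (2, "Translator")],
    )
    db.executemany(
        "INSERT INTO sketch_term_user VALUES (?, ?)",
        [(7, 1), (7, 2), (8, 2)],
    )
    return db


def users_with_term(db, term_name):
    """Return the uids of the users classified under the named term."""
    rows = db.execute(
        "SELECT tu.uid FROM sketch_term_user tu "
        "JOIN term_data td ON td.tid = tu.tid "
        "WHERE td.name = ? ORDER BY tu.uid",
        (term_name,),
    )
    return [row[0] for row in rows]
```

The terms themselves still live in term_data; only the link table changes, which is why the same vocabularies could in principle classify both nodes and users.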
Warning: the previous version of this post contained an error (confusion between synonyms and related terms), noticed and fixed by Killes. Thanks to him.
2005-11-28 - Update: this page, as well as others in the Grokking Drupal series of my blog, is now available on drupal.org. The version on drupal.org will probably be updated over time, whereas this one probably won't be.