For some months now, I've been noticing that the Audean wiki, which I use as a live documentation site for various aspects of my sites, appeared comparatively rarely in Google search results, although it was referenced in various places and Google cache info (cache:-prefixed queries) showed the site was indexed.
Now, the Audean Wiki is based on Splitbrain's Dokuwiki, a very convenient Open Source wiki that is often used alongside Drupal for documentation purposes. It appears there are three problems with a default Dokuwiki installation that prevent effective search engine optimization.
Here's how to overcome these hurdles.
By default, pages are addressed with URLs of the form doku.php?<path>. Although Google itself has been able to follow this type of link for years, other engines are not as efficient. It is therefore advisable to turn clean URLs on in Dokuwiki. This is achieved using either Apache configuration with mod_rewrite, for instance in a .htaccess file, or a variable set in conf/local.php. Setting this variable to 1 enables "internal" processing of URLs by the wiki engine, producing URLs without the question mark that introduces parameters. First hurdle crossed.
The second hurdle is likewise handled through a setting in conf/local.php. Note this will only work if the first step of using rewrite has already been implemented.
For the third hurdle, there is a specific explanation page for this problem, along with various server-dependent solutions. In Dokuwiki's case, though, the solution is simple: it is achieved using a setting in conf/local.php. The new page creation dialog will now be returned along with a 404 status.
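To check that the setting took effect, one can request a page that does not exist yet and look at the status code rather than at the rendered page. The following is only a sketch: the function name, the injectable opener parameter and the example URL are illustrative assumptions, not part of Dokuwiki.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code the server sends for `url`.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without network access.
    """
    try:
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx answers; the code is on the exception.
        return error.code


# Hypothetical usage against a wiki page that has not been created yet:
#   http_status("http://wiki.example.com/no_such_page")
# A wiki configured as described above should answer 404 here, not 200.
```

A search engine only sees the status code, so this is a closer approximation of what a crawler learns about a missing page than what a browser displays.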
There is a catch
In this last case of new page creation, well-behaved browsers like Opera and Firefox have no problem with the 404 status and display the page creation dialog normally. However, there remains a broken browser, MSIE, which ignores the RFC 2616 HTTP standard. Quoting from section 10.4, first paragraph of the RFC:
User agents SHOULD display any included entity to the user.
Granted, the semantically proper status code wouldn't really be 404, but 409 Conflict. To quote RFC 2616:
This code is only allowed in situations where
it is expected that the user might be able to resolve the conflict
and resubmit the request. The response body SHOULD include enough
information for the user to recognize the source of the conflict.
Ideally, the response entity would include enough information for the
user or user agent to fix the problem; however, that might not be
possible and is not required.
This is exactly what is happening: the user can recognize the source of the conflict (missing content) and fix it (by creating the content). But I've yet to see code 409 actually used.
Oh, well. Audean is for coders anyway, and these types don't normally use MSIE. For others, a browser detection script could be used to return the wrong results MSIE needs, and the proper results to other user agents.
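The browser detection mentioned above could look something like the following sketch. It is not Dokuwiki code: the function name and the choice of answering MSIE with 200 are assumptions made for illustration. It relies on MSIE user agent strings containing the token "MSIE", and on the fact that Opera has historically sent that token too, while also identifying itself as Opera.

```python
def status_for_missing_page(user_agent):
    """Pick the status code to send with the page creation dialog.

    Assumption: MSIE gets 200 so it displays the dialog; every other
    user agent, crawlers included, gets the correct 404.
    """
    agent = user_agent or ""
    # Opera may claim "MSIE" for compatibility, but it handles 404 bodies
    # properly, so it is excluded from the workaround.
    if "MSIE" in agent and "Opera" not in agent:
        return 200
    return 404
```

Search engine crawlers do not announce themselves as MSIE, so under these assumptions they would still receive the 404 that matters for indexing.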