Categorization with field-value pairs

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Evan (talk | contribs) at 19:38, 10 December 2003 (Added a namespaced category). It may differ significantly from the current version.

One prominent application of field-value pairs is for automatically categorizing articles in Wikipedia or other MediaWiki sites. This proposal outlines how to make this happen.

Acknowledgements

Most of this work has already been done by Magnus Manske, by the way, and is included in MediaWiki; he used links in a pseudo-namespace Category:, so there's a minor variation in syntax and the back end, but the idea remains essentially the same.

Rationale

It's tedious to maintain long lists of articles pertaining to a particular category. This proposal would automate some of this work.

Markup

Article editors would mark up articles with one or more fields indicating the category they belong to. For example, w:William Carlos Williams might have markup like this:

 [[category=American poets]]
 [[category=20th-century Americans]]

Articles for categories would have markup to indicate that they are categories, and may have markup for categories they belong to. For example, w:American poets might have:

 [[type=category]]
 [[category=Americans]]
 [[category=poets]]

Articles for categories may also have Wikitext for explanations of the category or for describing the idea in general.

Category names are just article titles. They can be used for other namespaces, too. Say, for w:Wikipedia:User preferences help, we could have:

 [[category=Wikipedia:Help]]

(NOTE: Category names here are examples; good category naming and categorization is something to be set by policy in the MediaWiki installation, and probably also to grow organically from practice.)
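
As a rough illustration of how the markup above might be read out of an article, here is a minimal Python sketch. The proposal does not define an exact grammar for field-value pairs, so the regular expression, and the function names extract_fields and strip_fields, are assumptions made for this example only.

```python
import re
from collections import defaultdict

# Assumed pattern for [[field=value]] pairs. Ordinary links such as
# [[American poets]] contain no "=" and are left alone.
FIELD_RE = re.compile(r"\[\[([^\[\]=|]+)=([^\[\]|]+)\]\]")


def extract_fields(wikitext):
    """Return a dict mapping each field name to its values, in page order."""
    fields = defaultdict(list)
    for name, value in FIELD_RE.findall(wikitext):
        fields[name.strip()].append(value.strip())
    return dict(fields)


def strip_fields(wikitext):
    """Remove field-value pairs so they can be shown out-of-page instead."""
    return FIELD_RE.sub("", wikitext)


text = """American poets started a new tradition...
[[type=category]]
[[category=Americans]]
[[category=poets]]
"""
fields = extract_fields(text)
print(fields["type"])      # ['category']
print(fields["category"])  # ['Americans', 'poets']
```

Because the value part may contain a colon, a namespaced category such as [[category=Wikipedia:Help]] is picked up the same way.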

Article rendering

For all articles, the categories will be extracted and shown as links out-of-page, like interlanguage links. For example, for William Carlos Williams:

   Other languages: Deutsch Français Español
   Categories: [[American poets]] [[20th-century Americans]]

The links would be displayed like in-page links: a broken link (one whose target article does not exist) would appear in red or with a question mark.
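
The rendering rule for the category line can be sketched in a few lines of Python. This is only an illustration: render_category_line and the existing_titles set are hypothetical names, and a trailing "?" stands in for whatever red-link or question-mark styling the site actually uses.

```python
def render_category_line(categories, existing_titles):
    """Build the out-of-page "Categories:" line for one article.

    Titles missing from existing_titles are treated as broken links and
    get a trailing "?" (an assumed stand-in for red-link styling).
    """
    links = []
    for title in categories:
        marker = "" if title in existing_titles else "?"
        links.append(f"[[{title}]]{marker}")
    return "Categories: " + " ".join(links)


# Assume only one of the two category articles has been written so far.
existing = {"American poets"}
line = render_category_line(["American poets", "20th-century Americans"], existing)
print(line)  # Categories: [[American poets]] [[20th-century Americans]]?
```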

Category rendering

In addition, articles tagged with [[type=category]] would have extra output after the article text, listing the articles in that category and its sub-categories. For w:American poets, we might see:

    American poets started a new tradition separate from their English
    counterparts in the early 18th century, and have continued... (etc.)
    Subcategories:
    * w:Language poets
    * w:Romantic poets
    * w:Beat poets
    Articles:
    * w:William Carlos Williams
    * w:Edgar Allan Poe
    * w:Elizabeth Bishop

Only articles that directly reference this category will be displayed. For example, if w:Allen Ginsberg were tagged only with [[category=Beat poets]], his name would appear on the Beat poets page but not on the American poets page. (If, however, his page were tagged with both categories, it would appear on both pages. This leaves room for different kinds of article-category relationships.)
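
The direct-reference rule can be made concrete with a small Python sketch. It assumes each page's fields have already been extracted into a dict of lists (a hypothetical in-memory structure, not the actual MediaWiki back end), and it sorts titles alphabetically, which the proposal does not specify.

```python
def category_listing(category, pages):
    """Return (subcategories, articles) tagged directly with `category`.

    `pages` maps a page title to its fields, for example
    {"type": ["category"], "category": ["American poets"]}.
    """
    subcategories, articles = [], []
    for title, fields in pages.items():
        if category not in fields.get("category", []):
            continue  # only direct references count; no transitive lookup
        if "category" in fields.get("type", []):
            subcategories.append(title)
        else:
            articles.append(title)
    # Alphabetical order is an assumption made for this sketch.
    return sorted(subcategories), sorted(articles)


pages = {
    "Beat poets": {"type": ["category"], "category": ["American poets"]},
    "William Carlos Williams": {
        "category": ["American poets", "20th-century Americans"],
    },
    "Allen Ginsberg": {"category": ["Beat poets"]},
}
subs, arts = category_listing("American poets", pages)
print(subs)  # ['Beat poets']
print(arts)  # ['William Carlos Williams']
```

Allen Ginsberg is absent from the American poets listing because his page references only Beat poets; adding "American poets" to his category list would make him appear on both pages.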

Note that category articles can also be in categories themselves, and their categories would be listed in the same place (next to the interlanguage links) as on any other page. So American poets might have:

     Other languages: Dansk Hindi
     Categories: w:poets w:Americans