Commons:Categories: Difference between revisions

From Wikimedia Commons, the free media repository
separate out category structure issues to Commons:Category_structure
Line 1: Line 1:
{{header|lang-cat|COM:C|COM:Cat}}
{{TOCright}}
A '''category''' is a software feature of [[w:MediaWiki|MediaWiki]], a special page which is intended to group related pages and media. In practice, it implies that you'll associate a '''single subject''' with a given category. The category name would be enough to guess the subject, but some extra text can be useful to precisely define it.

When creating new categories, or renaming categories to resolve naming issues, it is important to understand the [[Commons:Category structure]].


==Quick guide==
Line 15: Line 17:
#*or: use HotCat (see [[Help:Gadget-HotCat]])
#*or: use Cat-a-lot (see [[Commons:Cat-a-lot]])

== Purpose of categories in Wikimedia Commons ==
The [[Commons:Guide to layout/Category pages|category]] structure is the primary way to organize and find files on the Commons. It is essential that every file can be found by browsing the category structure. To allow this, each file must be put into a category directly. Each category should itself be in more general categories, forming a hierarchical structure.

== Category structure in Wikimedia Commons ==

The category structure is (ideally) a [[w:directed acyclic graph|multi-hierarchy]] with a single root category, [[:Category:CommonsRoot]].
* All categories (except ''CommonsRoot'') should be contained in at least one other category
* There should be no cycles (i.e. a category should not contain itself, directly or indirectly).
* The category structure should reflect a hierarchy of concepts, from the most generic one down to the very specific.
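The structural rules above (a single root, every other category contained in at least one parent, no cycles) can be checked mechanically. The following is a minimal sketch, assuming the category graph has already been exported into a plain dictionary that maps each category name to the list of its parent categories; the function names <code>find_cycle</code> and <code>orphans</code> are hypothetical and not part of MediaWiki or of any Commons tool.

```python
from typing import Dict, List, Optional


def find_cycle(parents: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of category names, or None if there is none.

    `parents` maps each category to the categories that contain it
    (an assumed export format, not a MediaWiki data structure).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(cat: str) -> Optional[List[str]]:
        state[cat] = GREY
        path.append(cat)
        for parent in parents.get(cat, []):
            colour = state.get(parent, WHITE)
            if colour == GREY:
                # The parent is already on the current path: a loop is closed.
                return path[path.index(parent):] + [parent]
            if colour == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[cat] = BLACK
        return None

    for cat in list(parents):
        if state.get(cat, WHITE) == WHITE:
            found = visit(cat)
            if found:
                return found
    return None


def orphans(parents: Dict[str, List[str]], root: str = "CommonsRoot") -> List[str]:
    """Categories other than the root that are not contained in any category."""
    all_cats = set(parents) | {p for ps in parents.values() for p in ps}
    return sorted(c for c in all_cats if c != root and not parents.get(c))
```

The depth-first search is recursive, so this sketch is only suitable for small samples of the category tree; a very deep hierarchy would need an iterative version.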

=== Major categories ===
The top-most categories (the ones contained directly in CommonsRoot) divide the category structure by the purpose of the contained categories:

* '''[[:Category:Topics]]''' - This category is the '''global common''' root of the media files categorized by the '''TOPIC'''. '''ALL media files''' should be categorized under this category for the sake of allowing others to find them by topic. Topical categories shouldn't be included through templates.
* '''[[:Category:Copyright statuses]]''' - This category is the '''global common''' root of the media files categorized by the '''LICENSE'''. '''ALL media files''' should be categorized under this category with the appropriate [[Commons:Copyright_tags|license tag]]. This type of category is added by including it in the templates.
* '''[[:Category:Image sources]]''' - This category is the '''global common''' root of the media files categorized by the '''SOURCE''', where they come from (books, collections, sites, etc.). This type of category is generally added by template.
* '''[[:Category:Media types]]''' - This category is the '''global common''' root of the media files categorized by the '''Media TYPE'''. Please note that this type of categorization is sometimes omitted for images, since the vast majority of files on the commons are images of some sort.
* '''[[:Category:Commons]]''' - This category is the '''global common''' root of categorizing Commons' '''maintenance tasks''' and '''pages (Commons:-, and Help:-)''' except for media files. The translated pages in each language should be categorized under their language categories, using the "Category:Commons-[[w:ISO 639|ISO-LANGUAGE-CODE]]" style. The structure of '''[[:Category:Commons-en]]''' is the sample hierarchy for every other language sub category. Do not use two colons in category or page names. See [[Category talk:Commons batch uploading|this discussion]] and [[Help:Namespaces]].
::There is a sub category [[:Category:Commons maintenance content]], which is for the special maintenance of Wikimedia Commons' '''global common contents''' and which does not get translated. '''ALL media files''' should be categorized under the first four categories listed above, but ONLY files having problems and needing to be fixed should also be in the sub-category [[:Category:Commons maintenance content]].
* '''[[:Category:Users]]''' - this is for categories that contain commons users galleries, images and texts, sorted by things like the language they speak. This also contains the [[:Category:User galleries]], which is for user specific (i.e. non-topic) galleries that don't need to be in English language.

== How to use categories ==
You should always put your uploads into categories and/or gallery pages according to topic, so your contributions can be found and used by others.

It is rarely necessary to create a new category (there are exceptions, such as uploading a new text; see also ''[[#People|People]]'' below). Before doing so, make sure you are familiar with the existing category structure, and with the customs and policies of the Commons. Please see if there exists a [[:Category:Commons category schemes|category scheme]] or a [[:Category:Commons projects|commons project]] for your topic, and follow the conventions described there.

=== Category names ===
Category names should generally be in English (see [[Commons:Language policy]]). However there are exceptions.

Category names that refer to types of objects or groups of people should generally be in plural form: [[:Category:Tools]], [[:Category:Artists]], [[:Category:Lakes]], [[:Category:Paintings]], [[:Category:Sculptures]] etc, as opposed to general themes or activities such as ([[:Category:History]], [[:Category:Weather]], [[:Category:Music]], [[:Category:Painting]], [[:Category:Sculpture]]) or to a particular individual object (a specific building, monument, artwork etc.). See a proposal of '''[[Commons:Naming categories|Naming categories]]''' for more information.

Categories grouping subcategories by name should generally be named "by name" rather than "by alphabet" (e.g. [[:Category:Ships by name]]).

We still lack internationalization for category names, but this issue should be resolved with appropriate changes to the MediaWiki software (see [[bugzilla:5638]]). Creating intermingled category structures in different languages would only make things worse.

'''For a general discussion of MediaWiki's category feature, see the [[m:Help:Categories|manual page on categories]].'''


=== Categorizing pages ===

To add a page (be it an image, a gallery page, or a category page) to a category, add the following code to the end of the page.


Line 70: Line 34:


=== Creating a new category ===

To create a new category:
# Do a thorough search, to be sure there isn't an existing category that will serve the purpose.
Line 82: Line 45:
* If the category should be sorted according to a different string than the category title, add a <nowiki>{{DEFAULTSORT:}}</nowiki>. For instance, the title of a category about a person would not be the right sort string. For such categories, insert after the interwiki links a line like ''<nowiki>{{DEFAULTSORT:Lastname, firstname}}</nowiki>'' with the correct sort string. See [[:meta:Categories#Sort key]] for more information.


See also [[#How to categorize: guidance by topic]] for guidance on specific classes of category, e.g. categories about [[#People]].
See also [[Commons:Category structure#How to categorize: guidance by topic]] for guidance on specific classes of category, e.g. categories about [[#People]].


=== Renaming or moving categories ===
Please see [[Commons:Rename a category]].


===Commons category structure===
== For more appropriate categorization ==
'':see [[Commons:Category structure]]''
Pages (including category pages) are categorized according to their '''subject''', and not to their '''contents''', because the contents are generally not a permanent feature of the category page; in particular, you can momentarily find inappropriate contents in a category page.


Example: Assume that [[:Category:Spheres]] contains only pictures of crystal balls. You must not add [[:Category:Glass]] in the category page, according to the current contents, because you can have spheres made with a great variety of materials. Normally, any picture showing a glass object would be already categorized in [[:Category:Glass]] (or in a category of its substructure). So, if the [[:Category:Spheres]] is really crowded with crystal balls pictures, it would be a better idea to create a new category page, like '''Category:Glass spheres''' or '''Category:Crystal balls''', categorized in [[:Category:Spheres]] and [[:Category:Glass]].

Generally, files should only be in the most specific category that exists for a certain topic. For example, files in [[:Category:Looking up the center of the Eiffel Tower]] should not also be in [[:Category:Paris]] ''(see [[#Over-categorization|over-categorization]] below)''. If you do not find a category that fits your purpose, you can create it, but carefully read the section about [[#Using Categories|using categories]] first.

This does not mean that an image only belongs in one category; it just means that images should not be in ''redundant'' or non-specific categories. For instance, an image of a polar bear being rescued from an iceberg by a helicopter should be in [[:Category:Ursus maritimus]], [[:Category:Icebergs]], [[:Category:Helicopters]], and [[:Category:Search and rescue]]. It should not, however, be in [[:Category:Ursidae]] or [[:Category:Aircraft]].

=== Categorization tips ===
The categories (or galleries) you choose for your uploads should answer as many as possible of the following questions:
* '''what? / whom?:''' what or whom does the file show? What is the main subject? What are the noteworthy features of the image? For instance [[:Category:Ferrari 575]] or [[:Category:Jimmy Wales]]
* '''where?:''' where was the image taken? What is the location of the subject? What is the location of the camera? This is especially important for pictures of places. E.g. [[:Category:Basin Street, New Orleans]]
*: also use {{tl|location}}
* '''when?:''' when did the depicted events happen, or when was the image created? When was the image taken? This is especially important for historical images. An example would be [[:Category:Warsaw in September 1939]], [[:Category:April 2010 in Northern Ireland]]
* '''who?:''' who is the author? This is especially important for works of well known artists and for historical images, for example [[:Category:Paintings by Rembrandt]]. You can also use the pages from the ''Creator'' namespace as templates to achieve this.
* '''how?:''' how does the file (or the image) do that? Specifically:
** '''what view?:''' what type of view does the image show? e.g. [[:Category:Plan views]], [[:Category:Panoramics]].
** '''what color?:''' which is the general color? e.g. [[:Category:Yellow animals]]
** '''what photography technique?:''' If the image uses a specific [[:Category:Photographic techniques|technique]] or [[:Category:Photographic effects|effect]], apply the corresponding category: e.g. [[:Category:Tone-mapped HDR images]], [[:Category:Black and white photographs]], [[:Category:Double exposure]]


The above questions cover the main aspects of the image to be categorized. For some images it makes sense to use all, for other images only one or two are reasonable. In addition there are several other aspects of the images that can be used to categorize the image:
* '''what source?:''' where did the image come from? For example [[:Category:Images from the German Federal Archive]]
* '''what format?:''' information about the unusual media type, like [[:Category:Audio]] or [[:Category:Animated GIF]], [[:Category:SVG]]
* '''what software?:''' information about software used to create the image. For example [[:Category:Created with Hugin]]
* '''what camera?:''' information about the camera. For example [[:Category:Taken with Nikon D80]]
This last set is useful and important but should always be applied in addition to the main set of criteria, never instead of it.

=== Find an appropriate category ===
To find appropriate categories for your uploads, you should navigate the category structure starting from a generic category. Narrow your search down to subcategories until you find the most specific category that fits the file you uploaded. You can navigate the category structure by following links to subcategories, or by expanding the tree of subcategories by clicking on the little + symbols next to subcategory names. The ''[[#Major_categories|Major categories]]'' section above provides a starting point, and the ''[[#How to categorize: guidance by topic|How to categorize: guidance by topic]]'' section covers some topics in more detail. You can also try '''[http://tools.wikimedia.de/~daniel/WikiSense/CommonSense.php CommonSense]''', a tool that is designed to help with categorization based on keywords.

=== Over-categorization ===
[[Image:Over-categorization.svg|thumb|Don't place an item into a category ''and'' its parent (''e.g.'', Put it in 'Black and white photographs of Tour Eiffel', not in 'Black and white photographs of Tour Eiffel' ''and'' 'Paris')]]{{shortcut2|COM:OVERCAT}}
Over-categorization is what happens when an image is placed in several categories within the same tree. The general rule is ''always place an image in the most specific categories, and not in the levels above those.'' An example:

We'll assume that yellow spheres are spheres with a yellow color. We can think about ''Category:Yellow spheres'' and [[:Category:Spheres]]. The picture to be categorized shows yellow marbles. We categorize the file in '''Category:Yellow spheres'''. Now, if we also categorize the image file in [[:Category:Spheres]], this is ''over-categorization'': because we already know that the yellow marbles are spheres. This applies to most images: As mentioned above files in [[:Category:Black and white photographs of Tour Eiffel]] should not also be in [[:Category:Paris]], files in [[:Category:Albert Einstein]] should not be in [[:Category:Physicists from Germany]] and so on.

Visually, it is the same problem as the red arrow shown in the chapter above.
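The rule "most specific categories only" can be expressed as a small check: a category on a file is redundant if it is an ancestor of another category on the same file. The following is a minimal sketch under the assumption that the parent links are available as a dictionary; the helper names and the sample parent categories in the usage note are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def ancestors(cat: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All categories reachable by following parent links upward from `cat`."""
    seen: Set[str] = set()
    stack = list(parents.get(cat, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def redundant_categories(file_cats: Iterable[str],
                         parents: Dict[str, List[str]]) -> List[str]:
    """Categories on a file that are ancestors of another category on that file."""
    cats = set(file_cats)
    redundant: Set[str] = set()
    for cat in cats:
        redundant |= ancestors(cat, parents) & cats
    return sorted(redundant)
```

With an assumed parent map <code>{"Yellow spheres": ["Spheres"]}</code>, a file placed in both ''Yellow spheres'' and ''Spheres'' would have ''Spheres'' reported as redundant, while a file in ''Ursus maritimus'', ''Icebergs'' and ''Helicopters'' would report nothing, because none of those categories contains another.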

==== Why is over-categorization a problem ====
It's often assumed that the more categories an image is in, the easier it will be to find it. Another example: By that logic, every image showing a man should be in [[:Category:Men]], because even if you know nothing more about the person you're looking for than that he is a man, you'll be able to find him. The result is that the top category fills up, making it necessary to go through hundreds, or in this case more likely thousands, of images to find the one you want. You probably won't find what you're looking for, and what's more, those who are looking for a generic picture of a man to illustrate an article like [[:en:Man]] will find that such pictures are drowned out among the movie stars, scientists and politicians.

On lower levels, the problem becomes less acute, since the number of images will be smaller — they can still easily reach into the hundreds, though. But there is still a problem: Let's go back to Einstein. I know that he's a physicist, so I'll look there. I find an image among the hundreds in the category, which I'm not too happy with, but it's the only one there. Since there was an image there, I assume that there are no more hidden elsewhere, rather than look further in [[:Category:Physicists from Germany]] and thus find [[:Category:Albert Einstein]] where there might be a better one. So over-categorization has led to two problems: The top category is cluttered, and users will stop looking for the most relevant category since they've reached one that has a relevant image.

==== Improper categorization of categories is a cause of over-categorization ====
Strange as it may sound, under-categorization can actually be a cause of over-categorization. This happens when a category is not properly categorized, leading users to over-categorize an image to get it into the relevant categories. An example of this: [[:Category:Eivør Pálsdóttir]] was categorized only in [[:Category:People by name]]. So if I add an image of her, and know who she is, I would also place the image in [[:Category:People of the Faroe Islands]] and [[:Category:Vocalists]]. This is over-categorizing: I've caused clutter in the top categories by adding images directly to them.

A related problem is erroneous categorization: [[:Category:Notting Hill]] was for more than a month placed in [[:Category:London]]. When adding an image, it would be very tempting to add that image to [[:Category:Royal Borough of Kensington and Chelsea]], which is where you'll find Notting Hill. Instead, each image should be placed only in the most specific categories, and those categories should in turn be placed in their most specific categories.

When you encounter this, please categorize the categories properly if you are able to do so. That will not only help avoid over-categorization, but also make it easier to move through the category tree.

== How to categorize: guidance by topic ==
For some categories, there is special guidance on how best to sort content within that category. This guidance can be found in a [[:Category:Commons category schemes|category scheme]] or a [[:Category:Commons projects|commons project]] for your topic. There is also some categorizing information in this section and sometimes there is guidance at the top of the category's page, in the Category [[meta:Namespace|namespace]]. So, for instance, some guidance on categorizing content depicting people is at the top of [[:Category:People]], and some is in the section '''[[#People|People]]''' below.

=== People ===
Content depicting people can be put in [[Commons:Guide to layout/Category pages|categories]] and/or [[Commons:Guide to layout/Gallery pages|galleries]] which describe them, such as [[:Category:Economists from the United States]]. Start exploring at [[:Category:People]].

Please see [[Commons:Category scheme People]] for details on how to name and organize these categories.

===Landscapes, outdoor views ===
{| style="float:right" class="wikitable"
| A <br /> > Views of B from A
| B <br /> > Views of B from A
|-
| A <br /> > Views from A <br /> > Views of B from A
| B <br /> > B skylines (or similar) <br /> > B skylines from A
|}
If there is a series of similar views of "B", these can be categorized in a category "Views of B from A". This category should be a subcategory of both "A" and "B".

Sometimes it makes sense to have an intermediate category in one or the other hierarchy: e.g. [[:Category:Views from the Empire State Building]] or [[:Category:Seattle skylines]].

=== Texts ===
Texts, such as scans of books, should normally have a category for each version of the scan and each edition of the text. Thus a book published in three separate editions would have a parent category for the book, three subcategories (one for each edition), and further subcategories for the text as a jpeg, a DjVu, etc., assuming each version has actually been uploaded (categories should not be created for editions not held on Commons). This is particularly important for files in formats other than DjVu and PDF, where the category is the only practical means of keeping the scans together; see e.g. [[:Category:The Chronicles of England, Scotland and Ireland, Holinshed, 1587]], which contains 2857 jpeg images of page scans.


== Categorization workflow ==
Currently, a "[[Commons:Bot|bot]]" (an automated tool) checks if newly uploaded files are categorized in topical categories and attempts to categorize files that are not.

Currently, a bot checks if newly uploaded files are categorized in topical categories and attempts to categorize files that are not.


The workflow is the following:

Revision as of 15:57, 10 June 2011


A category is a software feature of MediaWiki, a special page which is intended to group related pages and media. In practice, it implies that you'll associate a single subject with a given category. The category name would be enough to guess the subject, but some extra text can be useful to precisely define it.

When creating new categories, or renaming categories to resolve naming issues, it is important to understand the Commons:Category structure.

Quick guide

  1. How to find the appropriate categories
    • Find categories with the search engine (see #Categorization tips)
    • or check how similar images are categorized (some may not be categorized though)
    • or try tools:~daniel/WikiSense/CommonSense.php
    • or start from the main topical category (Category:Topics)
    • Starting from these categories, check their parent or sub-categories to find an appropriate category. Avoid picking too general categories.
  2. Add the categories to the file/image

Categorizing pages

To add a page (be it an image, a gallery page, or a category page) to a category, add the following code to the end of the page.

[[Category:Category Name]]

For example, if you are uploading a diagram showing the orbit of comets, you could add the following to the image description page:

[[Category:Astronomy diagrams]]
[[Category:Comets]]

This will make the diagram show up in the categories Astronomy diagrams, Comets.
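When many pages need the same treatment, the step of appending category links can be scripted. The following is a minimal sketch that works on the raw wikitext of a page as a string; the function name <code>add_categories</code> is hypothetical, and the sketch only sees links written directly in the text, not categories added by templates.

```python
import re
from typing import Iterable

# Matches [[Category:Name]] and [[Category:Name|sortkey]], but not the
# plain link form [[:Category:Name]], which does not categorize the page.
CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)


def _normalize(name: str) -> str:
    """Underscores equal spaces, and the first letter is case-insensitive."""
    name = name.strip().replace("_", " ")
    return name[:1].upper() + name[1:]


def add_categories(wikitext: str, categories: Iterable[str]) -> str:
    """Append [[Category:...]] links for categories not already on the page."""
    existing = {_normalize(m.group(1)) for m in CATEGORY_LINK.finditer(wikitext)}
    new_links = []
    for name in categories:
        key = _normalize(name)
        if key not in existing:
            existing.add(key)
            new_links.append("[[Category:" + key + "]]")
    if not new_links:
        return wikitext
    return wikitext.rstrip("\n") + "\n\n" + "\n".join(new_links) + "\n"
```

Calling <code>add_categories(text, ["Astronomy diagrams", "Comets"])</code> on a description page that already contains the ''Comets'' link would only append the ''Astronomy diagrams'' link.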

For information on how to find good categories for your uploads and galleries, read the section Find an appropriate category below.

Creating a new category

To create a new category:

  1. Do a thorough search, to be sure there isn't an existing category that will serve the purpose.
  2. Find images (or a gallery or other pages) which should be put in the new category. Edit one of these pages, and at the end insert the new category reference, e.g. [[Category:Title]]. Save the edited page. The new category appears as a red link at the bottom of the page.
  3. Click on that red link. The new, empty, category page appears for editing. You can now edit the category like any other wiki page.

A category page should contain the following information (in order of importance):

  • Category-links that put it into one or more parent categories. At the bottom of the new page, insert lines of the form [[Category:Relevant categories]].
  • A short description text that explains what should be in the category. English is the preferred language for the description, other languages can be added (with the template {{ab|...}} for description in Abkhazian, {{en|...}} for description in English, etc, as listed in Commons:Templates for galleries).
  • Interwiki links to the article or category with the same topic in Wikipedia (i.e. interwiki link [[ab:...]] to the page in Abkhazian Wikipedia, [[en:...]] to the page in English Wikipedia, etc.).
  • If the category should be sorted according to a different string than the category title, add a {{DEFAULTSORT:}}. For instance, the title of a category about a person would not be the right sort string. For such categories, insert after the interwiki links a line like {{DEFAULTSORT:Lastname, firstname}} with the correct sort string. See meta:Categories#Sort key for more information.
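As an illustration of the sort-key point, the following minimal sketch builds a DEFAULTSORT line in the "Lastname, firstname" form from a category title. The function name <code>default_sort_line</code> is hypothetical, and the sketch assumes that the last space-separated word is the family name, which is wrong for many names, so the result should be treated as a suggestion to review by hand.

```python
def default_sort_line(category_title: str) -> str:
    """Build a DEFAULTSORT line of the form 'Lastname, firstname'.

    Assumption: the last space-separated word is the family name.
    Single-word titles are used unchanged as the sort key.
    """
    words = category_title.split()
    if len(words) < 2:
        key = category_title.strip()
    else:
        key = words[-1] + ", " + " ".join(words[:-1])
    return "{{DEFAULTSORT:" + key + "}}"
```

For the title ''Albert Einstein'' this yields <code>{{DEFAULTSORT:Einstein, Albert}}</code>.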

See also Commons:Category structure#How to categorize: guidance by topic for guidance on specific classes of category, e.g. categories about #People.

Renaming or moving categories

Please see Commons:Rename a category.

Commons category structure

:see Commons:Category structure


Categorization workflow

Currently, a "bot" (an automated tool) checks if newly uploaded files are categorized in topical categories and attempts to categorize files that are not.

The workflow is the following:

  1. User uploads a new file and adds categories (or not)
  2. CategorizationBot checks if the file is categorized
  3. Users categorize files further (e.g. category diffusion below)

See also: User:CategorizationBot#Process, categorization statistics
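The check in step 2 can be pictured with a short sketch. This is not CategorizationBot's actual code, which is not shown on this page; it is an illustration under the assumption that a file counts as uncategorized when none of the category links written in its description belong to a topical category. The function names and the <code>non_topical</code> set are hypothetical.

```python
import re
from typing import List, Set

CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)


def listed_categories(wikitext: str) -> List[str]:
    """Category names linked directly in the wikitext (templates are not expanded)."""
    return [m.group(1).strip().replace("_", " ")
            for m in CATEGORY_LINK.finditer(wikitext)]


def needs_topical_category(wikitext: str, non_topical: Set[str]) -> bool:
    """True if no listed category falls outside the known non-topical set."""
    return all(cat in non_topical for cat in listed_categories(wikitext))
```

A description with no category links at all, or with only a license category such as ''PD NASA'', would be reported as still needing a topical category.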


Other, manual, categorization workflows are possible:

  • Category filling: Use appropriate keywords in the search engine to find the files that should be in a given category, and put them there.
  • Category diffusing: Go to Category:Categories requiring diffusion, select a crowded category, create appropriate subcategories if needed, and move the files to the subcategories. Gadgets like Cat-a-lot and HotCat can help.

Categories marked with "HIDDENCAT"

Many non-topical categories are marked with __HIDDENCAT__ on the category page (view e.g. Category:PD_NASA).

While categories are generally visible on every page, categories marked __HIDDENCAT__ are only visible:

  • on the edit screen (at the end of the screen, below the edit box)
  • on category pages
  • on file description pages and gallery pages for logged-in users: each user can choose to see them in a separate "Hidden categories" list, by checking "Show hidden categories" in the "Appearance" section of Special:Preferences. Try any file in Category:PD_NASA.

This feature is generally used for template-based categories, such as license tag based categories. Sample: {{PD-old-100}} on a file description page adds the image to Category:Author died more than 100 years ago public domain images. These categories are generally non-topical and facilitate maintenance.
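To illustrate how the marker separates the two kinds of category, the following minimal sketch splits a file's categories into visible and hidden lists by looking for the marker in each category page's wikitext. The function names and the dictionary of page texts are hypothetical, and the plain substring test is a simplification: it does not expand templates, so a category hidden through a template would be missed.

```python
from typing import Dict, Iterable, List, Tuple


def is_hidden(category_page_text: str) -> bool:
    """True if the category page's own wikitext contains the marker."""
    return "__HIDDENCAT__" in category_page_text


def split_visible_hidden(categories: Iterable[str],
                         page_texts: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split category names into (visible, hidden), keeping their order.

    `page_texts` maps a category name to the wikitext of its category page;
    categories without an entry are treated as visible.
    """
    visible: List[str] = []
    hidden: List[str] = []
    for cat in categories:
        target = hidden if is_hidden(page_texts.get(cat, "")) else visible
        target.append(cat)
    return visible, hidden
```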

See also: mw:Help:Categories#Hidden categories

Tools

See also