Help:Sorting

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Patrick (talk | contribs) at 15:11, 7 April 2007 (→‎See also). It may differ significantly from the current version.

Tables can be made sortable via client-side JavaScript with class="sortable". This works in MediaWiki 1.9, which is installed in all Wikimedia projects. Sortable tables are identified by the arrows in each of their header cells. Clicking an arrow sorts the table rows by the selected column, in ascending order first, and subsequently toggles between ascending and descending order. Links and other wiki-markup are not possible in headers.

Note that all of the below is subject to change due to improvements in the script.

Sorting modes

The sorting modes (the data types, which, in addition to the choice "ascending" or "descending", determine the sorting order) are:

  • string
    • criterion: the first non-blank element is not of type numeric, date or currency;
    • order: after conversion of capitals to lowercase the order is ASCII - partial list showing the order: !"#$%&'()*+,-./09:;<=>?@[\]^_`az{|}~é— (see also below; a blank space comes before every other character; an nbsp code counts as a space; two adjacent ordinary blank spaces count as one; for multiple blank spaces one can use nbsps or alternate nbsps and ordinary blank spaces)
  • numeric
    • criterion: the first non-blank element consists of just digits, points, and commas; hence a number with thousands separators is now recognized as numeric; not recognized as numeric are:
      • negative numbers (!)
      • numbers in scientific notation
        proposed remedy: for the criterion, allow also -, +, spaces, e and E
    • order: if the string starts with a number (where spaces and nbsp's at the start are ignored) the order is numeric according to the first number in the string (parseFloat is applied) after removing the first comma, if any; if it does not (parseFloat returns NaN), the element is positioned like 0; negative numbers and numbers in scientific notation are properly sorted (if, as said, the first element is not like that), and numbers with one comma separator too; numbers with more comma separators or with space separators are not: they are sorted like the number before the second comma or first space.
      proposed remedy: ignore spaces and all commas in evaluating numbers to determine the sorting order
      proposed internationalisation: in German, treat the comma as a decimal point
  • date (see also below)
    • criterion: the first non-blank element is of the form "dd-dd-dddd", "dd-dd-dd", or "dd aaa dddd"
    • order: the string abcdefghij of length 10 is positioned as ghijdeab; the string abcdefgh of length 8 as 19ghdeab if gh>=50 (string comparison) and 20ghdeab otherwise (i.e., the assumed format is DD-MM-YYYY or DD-MM-YY); and the string "dd aaa dddd", with aaa an abbreviated month name, chronologically
  • currency
    • criterion: the first non-blank element starts with "$" or "?" (the latter criterion seems a bug)
      • The omission of other currencies such as £, €, ¥, needs fixing
    • order: numeric, ignoring all characters except digits and points
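The numeric and currency orders described above can be sketched as follows. This is a minimal illustration written from the description in the list, not the actual sorting script; the function names numericSortKey and currencySortKey are hypothetical, and positioning an unreadable currency value like 0 is an assumption carried over from the numeric case.

```javascript
// Sketch of the numeric order described above (not the actual script).
function numericSortKey(text) {
  // Spaces and nbsps at the start are ignored.
  const trimmed = text.replace(/^[ \u00a0]+/, "");
  // Only the first comma is removed; parseFloat then reads the leading number.
  const value = parseFloat(trimmed.replace(",", ""));
  // Text that does not start with a number is positioned like 0.
  return Number.isNaN(value) ? 0 : value;
}

// Sketch of the currency order: all characters except digits and points are ignored.
function currencySortKey(text) {
  const value = parseFloat(text.replace(/[^0-9.]/g, ""));
  // Assumption: a cell without any digits is positioned like 0.
  return Number.isNaN(value) ? 0 : value;
}
```

With this sketch, numericSortKey("1,234") gives 1234, but numericSortKey("1,234,567") also gives 1234 and numericSortKey("12 345") gives 12, which is the mis-sorting of numbers with several separators noted above.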

The sorting mode is determined by the table element that is currently in the first non-blank row below the header. Thus it may change after sorting, which can give a cycle of four or even more instead of two.
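The following sketch restates the four criteria and the "first non-blank row" rule in code. It is an approximation built only from the criteria listed above; the regular expressions and the names detectSortMode and columnSortMode are assumptions, and the real script may accept other separators or formats.

```javascript
// Sketch: choose a sorting mode from one cell's text, following the listed criteria.
function detectSortMode(cellText) {
  const s = cellText.trim(); // trim() also removes nbsps
  const isDate =
    /^\d\d-\d\d-\d{4}$/.test(s) ||       // "dd-dd-dddd"
    /^\d\d-\d\d-\d\d$/.test(s) ||        // "dd-dd-dd"
    /^\d\d [A-Za-z]{3} \d{4}$/.test(s);  // "dd aaa dddd"
  if (isDate) return "date";
  if (/^[$?]/.test(s)) return "currency";    // starts with "$" or "?"
  if (/^[\d.,]+$/.test(s)) return "numeric"; // just digits, points and commas
  return "string";
}

// The mode of a column is taken from its first non-blank cell below the header.
function columnSortMode(cellTexts) {
  const first = cellTexts.find((text) => text.trim() !== "");
  return first === undefined ? "string" : detectSortMode(first);
}
```

For example, columnSortMode(["", "1,234", "-5"]) gives "numeric", while columnSortMode(["-5", "1,234"]) gives "string". The same column can therefore switch modes once sorting moves a different cell to the top.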

The most versatile is alphabetic sorting using a sortkey which due to CSS is not displayed:

<span style="display:none">...</span>
(however, this more cumbersome method is less often needed after the described fixes are applied)

Javascript sorting is based on the text inside and outside the tags, without the tags themselves. The sortkey comes at the start and is separated from the displayed text in such a way that the latter does not affect the sorting order. For example, if a sortkey system is used where there are no blank spaces in any sortkey, then a blank space can be used for separation. If a single blank space is possible in a sortkey, two nbsps can be used. For table elements for which the text to be displayed is equal to the sortkey, no duplication is needed, of course.
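As a rough illustration of why a hidden sortkey works, the sketch below builds a cell with a hidden key and then derives the text that sorting would see. Both helper names are hypothetical, and stripping tags with a regular expression is a simplification of what a browser does when it reads a cell's text.

```javascript
// Sketch: build cell content with a hidden sortkey, separated by one blank space.
// Assumes the sortkey itself contains no blank spaces.
function cellWithSortkey(sortkey, displayed) {
  return '<span style="display:none">' + sortkey + " </span>" + displayed;
}

// Sketch: the text used for sorting is the text inside and outside the tags,
// without the tags themselves (simplified tag removal).
function sortText(cellHtml) {
  return cellHtml.replace(/<[^>]*>/g, "");
}
```

Here sortText(cellWithSortkey("Smith,John", "John Smith")) gives "Smith,John John Smith": the key comes first, and the separating space keeps the displayed text from influencing the order.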

If the text inside and outside the tags together is of a form that would cause a sorting mode other than alphabetic (if and when the element is at the top), a character can be appended at the end of the sortkey to avoid this, again making sure it does not affect the sorting order by putting a space or two nbsps. This can be dispensed with if the element can never be at the top, but this can be complicated to assess as that can be caused by sorting other columns, with varying sorting modes, and it can change when deleting a row, adding a column, etc.

Instead of "display:none" another way is using a font color equal to the background, e.g. <font color="#f9f9f9">999</font> gives "999". With this method the hidden code can be seen in selected text (e.g. with the mouse). Also the hidden text is included when copying the rendered text. The first may be an advantage or a disadvantage, the second seems only a disadvantage. A further complication is that if a user uses a background color different from the default, the specified text color may not match it; to make sure they are the same the background color can be specified also.

Making variable-length numbers with thousands separators sortable

(this more cumbersome method is not needed after one of the described fixes is applied)

If a column consists of non-negative numbers, of which those >= 1000 have thousands separators, alphabetic sorting can be made to correspond with numeric sorting by leading "&nbsp;" codes which render as blank spaces (or with leading zeros) to equalize the number of characters before the explicit or implicit decimal separator. (However, this does not seem to work in all browsers: it works in IE, but reportedly not in Firefox.)

However, if at any time a number less than 1000 would be at the top, the sorting mode would be numeric, even with leading "&nbsp;" codes. Hence subsequent sorting would not work properly due to the thousands separators. One possible workaround is to force alphabetic sorting mode, by writing either all numbers, or just those in the range 0 - 1000, with a "+" in front. (This no longer applies when a plus in the first number no longer prevents numeric sorting mode.)

If all numbers get a plus, the absolute position of the plus has to be a non-increasing function of the number, e.g. the pluses are in the same absolute position, or in the same position relative to the first digit. If only the numbers in the range 0 - 1000 get a plus, the position of the first non-space character (plus or digit) has to be a non-increasing function of the number, so if there is a number in the range 1000 - 10000, pluses should be at most at the fifth position from the right to preserve the sorting order (+ 999 comes before 1,000).
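The variant in which only numbers below 1000 get a plus, placed directly before the first digit, can be sketched for non-negative integers as below. The function names are hypothetical, and ordinary spaces stand in for the nbsps, on the assumption (stated in the list of sorting modes) that an nbsp counts as a space when sorting.

```javascript
// Insert thousands separators into a string of digits.
function groupThousands(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Sketch: right-align a non-negative integer to a fixed width so that
// alphabetic order matches numeric order. Numbers below 1000 get a "+"
// to force alphabetic sorting mode.
function alignedKey(n, width) {
  const grouped = groupThousands(String(n));
  const text = n < 1000 ? "+" + grouped : grouped;
  return text.padStart(width, " ");
}
```

With a width of 11, the values 6, 123 and 1234 become "         +6", "       +123" and "      1,234". Because a blank space comes before every other character, a shorter number always sorts before a longer one.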

Within a column, either all or no numbers in the range 0 - 1 should have a zero before the decimal point.

In the case where the width of a number is not fixed or not known, as in the case where the number depends on parameters or templates, and/or is the result of a computation, an automatic way of padding with nbsps is needed. This can be done with Template:Lsc11 which also provides thousands separators, and the pluses necessary in the range 0 - 1000 (see above).

We cannot use variable padleft with the nbsp code, because &, the first character of the code, is used for padding, even if the code is put in a template.

Padding with zeros

Example:

  • 000156

Formatnum can be combined with padleft:

Integer:

{{formatnum:{{padleft:299792458|16|0}}}} gives:

  • 0,000,000,299,792,458

Real:

{{formatnum:{{padleft:{{#expr:((299792458.056 - .5) round 0)}}|16|0}}}}.{{padleft:{{#expr:(1000000*(299792458.056 - ((299792458.056 - .5) round 0))) round 0}}|6|0}} gives:

  • 0,000,000,299,792,458.056000
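For comparison, the same zero padding can be sketched outside wikitext. This is only an illustration of the result that the padleft and formatnum combination produces for non-negative numbers; zeroPadded and groupDigits are hypothetical helpers, not MediaWiki functions.

```javascript
// Insert thousands separators into a string of digits.
function groupDigits(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Sketch: pad the integer part of a non-negative number with zeros to
// intDigits digits, group it in thousands, and keep fracDigits decimals.
function zeroPadded(value, intDigits, fracDigits = 0) {
  const [intPart, fracPart] = value.toFixed(fracDigits).split(".");
  const padded = groupDigits(intPart.padStart(intDigits, "0"));
  return fracPart === undefined ? padded : padded + "." + fracPart;
}
```

zeroPadded(299792458, 16) gives "0,000,000,299,792,458" and zeroPadded(299792458.056, 16, 6) gives "0,000,000,299,792,458.056000", matching the two wikitext results above. Since all such strings have the same length, alphabetic and numeric order coincide.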

Corresponding alphabetic and numeric sorting

(This more cumbersome method is less often needed after the described fixes are applied)

Alphabetic and numeric sorting can be made to correspond for all numbers between -1e100 and 1e100 in arbitrary precision as follows:

  • where scientific notation is used, it is normalized such that the absolute value of the mantissa is between 1 and 10; the exponent is put first
  • scientific notation is used for all negative numbers, and all positive numbers outside some interval (below: 1e-9 to 1e9), and not inside that interval
  • where the absolute value of the exponent and/or the mantissa is a decreasing function of the number, the notation uses its complement with respect to 99 for exponents and 10 for mantissas; the code "c" is added in these cases
  • numbers 0 ≤ x < 1000 get a "+" in front
  • positive numbers in scientific notation with a negative exponent get "+0" in front
  • spaces and nbsps are added where needed:
    • for numbers not in scientific notation the positions of all explicit and implicit decimal points are aligned
    • for the starting position, i.e. the position of the first "-", "+", or "e", of other numbers, see the example table

For readability a plain notation can be added after the coded form. If a blank space is used for separation, and no number code has a space at this position, then adding the plain notation does not affect the sorting order. If a number code can have a space at this position, two nbsps can be used. Moreover, display of the first part, acting as sortkey, can be avoided using CSS (this does not affect its functioning for sorting):

<span style="display:none">...</span>

(However, on some projects, notably Ontoworld, a page with this wikitext cannot be saved, as spam protection.)

In the following table the left column shows the code for alphabetic sorting; where the code is cryptic, it is followed by the regular notation. The second column contains the same (hence sorts the same), but with the code hidden with CSS. The third column shows the corresponding plain numbers, equal to what the second column shows, except that commas have been removed to allow numeric sorting mode. Thus this column also provides numeric sorting, this time using numeric sorting mode, but only when the first element is detected as numeric, i.e., when it is a non-negative number which is not in scientific notation. As a result sorting toggles between ascending numeric and descending alphabetic order.

full code for alphabetic sorting | display form | plain number
         +6 |          +6 | 6
         +7 |          +7 | 7
  1,048,576 |   1,048,576 | 1048576
      1,234 |       1,234 | 1234
       +123 |        +123 | 123
 16,777,216 |  16,777,216 | 16777216
     65,536 |      65,536 | 65536
 67,108,864 |  67,108,864 | 67108864
e23 6 6e23 | e23 6 6e23 | 6e23
e09 1 1e9 | e09 1 1e9 | 1e9
         +0 ec89 9.999,99 9.999,99e-10 |          +0 ec89 9.999,99 9.999,99e-10 | 9.99999e-10
         +0.000,000,001 |          +0.000,000,001 | 1e-9
         +0 ec87 6 6e-12 |          +0 ec87 6 6e-12 | 6e-12
         +0 ec86 7 7e-13 |          +0 ec86 7 7e-13 | 7e-13
         +0 ec87 5 5e-12 |          +0 ec87 5 5e-12 | 5e-12
          -e-10 c0.000,01 -9.999,99e-10 |           -e-10 c0.000,01 -9.999,99e-10 | -9.99999e-10
          -e-08 c6.8 -3.2e-8 |           -e-08 c6.8 -3.2e-8 | -3.2e-8
           -ec86 c0.3 -9.7e13 |            -ec86 c0.3 -9.7e13 | -9.7e13
           -ec99 c7.7 -2.3 |            -ec99 c7.7 -2.3 | -2.3
999,999,999.999,99 | 999,999,999.999,99 | 999999999.99999
         +0 |          +0 | 0
         +0.3 |          +0.3 | 0.3

If no scientific notation is used then alternatively, if a column contains only a few negative numbers we can preserve the correspondence between alphabetic and numeric order by putting the minuses at the position of the right-most plus sign in the column, and inserting a number of nbsp codes after the minus sign, the closer to zero the more.

A column of plain numbers (not in scientific notation), partly negative, sorts as follows:

  • if the first number is non-negative, sorting alternates between ascending numeric order and descending alphabetic order
  • if the first number is negative, sorting first gives ascending alphabetic order, and then alternates again between ascending numeric order and descending alphabetic order (starting with the latter)

Dates

Date sorting mode
07 Apr 2007
16 Apr 2007
18 Mar 2007
27 Mar 2007
20 Aug 2006
22 Jul 2006
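The date order given in the list of sorting modes can be sketched as a function that turns each recognised form into a YYYYMMDD string, which is then compared as text. This is an illustration of the description only: dateSortKey and the MONTHS table are hypothetical, and English three-letter month abbreviations are assumed.

```javascript
const MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
                "jul", "aug", "sep", "oct", "nov", "dec"];

// Sketch: map "dd aaa dddd", "dd-dd-dddd" or "dd-dd-dd" to a YYYYMMDD key.
function dateSortKey(text) {
  const s = text.trim();
  const named = s.match(/^(\d\d) ([A-Za-z]{3}) (\d{4})$/);
  if (named) {
    // An unknown abbreviation yields month "00" in this sketch.
    const month = MONTHS.indexOf(named[2].toLowerCase()) + 1;
    return named[3] + String(month).padStart(2, "0") + named[1];
  }
  if (s.length === 10) {
    // abcdefghij is positioned as ghijdeab (DD-MM-YYYY).
    return s.slice(6, 10) + s.slice(3, 5) + s.slice(0, 2);
  }
  // abcdefgh is positioned as 19ghdeab if gh >= "50", else 20ghdeab (DD-MM-YY).
  const yy = s.slice(6, 8);
  return (yy >= "50" ? "19" : "20") + yy + s.slice(3, 5) + s.slice(0, 2);
}
```

Applied to the example column above, the keys place 22 Jul 2006 first and 16 Apr 2007 last in ascending order.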

Example: (edit to view source)

date
2006 a
2006-12 December 2006
!9936-04 April 64 BC
!9900-07-13-0099-07-13
!9937-09-23-0062-09-23
!9937-10-08-0062-10-08
!9998-12-21-0001-12-21

For dates, the sorting mode is based on the rendered date format. Unfortunately, none of the standard formats for MediaWiki's date-formatting feature match any of the formats for the "date" sorting mode. Thus, if dates are entered in one of those standard formats, the sorting mode would be "string"; only dates formatted as YYYY-MM-DD will result in true chronological sorting.

However, like above we can put a sortkey in front which, due to CSS, is not displayed. With a hidden sortkey one can simply use the non-wikilinked format YYYY-MM-DD for years AD followed by any choice of displayable text, including Mediawiki date formatting. The Wikipedia template w:Template:Dts provides a convenient way of applying this method while using the date-formatting feature for display.

For years BC we can use, for example, !9937-09-23 for -0062-09-23 (subtract the year number BC from 10000, or the absolute value of the astronomical year from 9999).

If some or all of the dates in a table column are incomplete, this will not cause sorting problems. If only a year and month are given, that incomplete date is positioned alphabetically before the first day of the month in question. Likewise, if only a year is given, the date is positioned before the first month or day given for that year.
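The sortkey scheme for complete and incomplete dates, including years BC, can be sketched as follows. The function name dateSortkey is hypothetical; it assumes astronomical year numbering (1 BC is year 0, 64 BC is year -63), so that the key for a non-positive year is 9999 minus the absolute value of the year.

```javascript
// Sketch: build a YYYY[-MM[-DD]] sortkey; month and day are optional.
// The year is astronomical; non-positive years get a "!" prefix and a
// complemented year so that earlier years sort first.
function dateSortkey(year, month, day) {
  const pad = (n, width) => String(n).padStart(width, "0");
  let key = year > 0 ? pad(year, 4) : "!" + pad(9999 + year, 4);
  if (month !== undefined) key += "-" + pad(month, 2);
  if (day !== undefined) key += "-" + pad(day, 2);
  return key;
}
```

dateSortkey(-62, 9, 23) gives "!9937-09-23" and dateSortkey(-63, 4) gives "!9936-04" (April 64 BC). Since "!" comes before every digit, all such keys sort before the keys for years AD, and "2006" sorts before "2006-12".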

If at some point (i.e., after possible previous sorting) the form [[YYYY]] is at the top with a non-negative year, sorting would be numerical; in this case, after toggling between ascending and descending there would be no proper sorting within each year (because parseFloat is applied, finding the first number in the string, and basing sorting on only that number). Also, years BC would not be sorted properly. Therefore, alphabetic sorting has to be enforced. This can be done by putting a non-displayed character after the year, separated by a space.

Limitations

Javascript sorting may not work properly on tables with cells extending over multiple rows and/or columns. In some cases the table gets messed up when attempting to sort, in other cases some of the sorting buttons work while others don't.

Empty cells

If the first cell below the header of a column of numbers or dates is blank, the sort mode will be alphabetic. In a column of numbers this can be avoided by putting a hyphen (minus sign) in the cells without a number. A hyphen should not be used this way in a column that should be sorted alphabetically; instead, apart from leaving the cell blank, one can use another dash.

Sorting the wikitext of a table

Unfortunately it does not seem possible to directly and automatically sort the wikitext itself, according to one of the sortkeys. This would, after saving, directly produce a table sorted as required.

However, if for a given table, we make an auxiliary sortable table rendering as wikitext for the original table, we can sort the wikitext of the original table.

Example:

Original table:

demo
9
12
11

Auxiliary table:

{|class="sortable" style="width:100%"
!demo

header
|-
| 9
|-
|12
|-
|11

|}

After copying the rendered text to the edit box, and deleting the header line, this renders as:

demo
9
11
12

Alphabetic sorting order

demo
!
"
#
$
%
&
'
(
)
*
+
,
-
.
/
0
9
:
;
<
=
>
?
@
[
\
]
^
_
`
A
Z
a
z
A1
Z1
a1
z1
{
|
}
~
É
é
É1
é1

The two-character entries such as A1 demonstrate that A and a are at the same position.
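The alphabetic order described under "Sorting modes" can be approximated by the comparator below. It is a sketch under stated assumptions rather than the script's own code: stringSortKey and compareStrings are hypothetical names, and the handling of spaces follows the description (adjacent ordinary spaces count as one, an nbsp counts as a space).

```javascript
// Sketch: normalise a cell's text for alphabetic comparison.
function stringSortKey(text) {
  return text
    .replace(/ {2,}/g, " ")   // adjacent ordinary spaces count as one
    .replace(/\u00a0/g, " ")  // an nbsp counts as a space
    .toLowerCase();           // capitals are converted to lowercase
}

// Compare two cells by the character codes of their normalised text.
function compareStrings(a, b) {
  const ka = stringSortKey(a);
  const kb = stringSortKey(b);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
}
```

Under this sketch compareStrings("A1", "a1") is 0, while "_" sorts before "a" and "z" sorts before "{", in line with the column above.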
