API

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Jacobolus (talk | contribs) at 08:25, 10 September 2006 (→‎Sample Requests). It may differ significantly from the current version.

Attention visitors

This page discusses the future API for MediaWiki software.

MediaWiki at present has three interfaces:

  • Query API for retrieving any information in xml/json/php formats.
  • Special:Export feature (bulk export of xml formatted data)
  • Regular Web-based interface

Implementation Strategy

File/Module Structure

  • api.php is the entry point, located at the root of the MediaWiki installation
  • includes/api will contain all files related to the API, but none of them will be allowed as entry points
    • files related to formatting the output will be named Output*.php

Internal Data Structures

  • The Query API has had a very successful structure: one global nested array() that is passed around. Various modules add pieces of data at many different points of that array until, finally, it is rendered for the client by one of the printers (output modules). For the API, I suggest wrapping this array in a class with helper functions to append individual leaf nodes.
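
The wrapper suggested above could look roughly like the following. This is a minimal sketch in Python rather than the PHP the implementation would use; the class name ApiResult and the method names add_value and get_data are assumptions made for illustration, not names from the proposal.

```python
class ApiResult:
    """Hypothetical wrapper around one nested result structure."""

    def __init__(self):
        self._data = {}

    def add_value(self, path, name, value):
        """Set leaf `name` to `value` under the nested keys listed in `path`."""
        node = self._data
        for key in path:
            # Create intermediate nodes on demand so modules can add data anywhere.
            node = node.setdefault(key, {})
        node[name] = value

    def get_data(self):
        """Return the whole nested structure for an output module to render."""
        return self._data


result = ApiResult()
result.add_value(["query", "pages", "ArticleA"], "id", 12345)
result.add_value(["query", "pages", "ArticleA"], "lastrev", 67890)
```

Each module only needs to know the path it writes to, and the printer receives the finished structure from get_data().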

Error/Status Reporting

For now we decided to include error information inside the same structured output as the normal result (option #2).

For the result, we may either use the standard HTTP error codes or always return properly formatted data:

Using HTTP code
void header( string reason_phrase [, bool replace [, int http_response_code]] )

header() can be used to set the return status of the operation. We can define all possible values of the reason_phrase, so for a failed login we may return code=403 and phrase="BadPassword", whereas for any success we would simply return the response without altering the header.

Pros: It's a standard. The client always has to deal with HTTP errors, so using the HTTP code for the result would remove any separate error handling the client would have to perform. Since the client may request data in multiple formats, an invalid format parameter would still be properly handled, as it would simply be another HTTP error code.

Cons: ...

Include error information inside a proper response

This method would always return a properly formatted response object, but the error status / description will be the only values inside that object. This is similar to the way the current Query API returns status codes.

Pros: HTTP error codes are used only for networking issues, not for data (logical) errors. We are not tied to the existing HTTP error codes.

Cons: If the data format parameter is not properly specified, what is the format of the output data? The application has to parse the object to learn of an error (perf?). Error-checking code will have to exist at both the connection and data-parsing levels.
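
To illustrate the last point, here is a rough sketch of the two levels of checking a client would need under this option. It assumes JSON output and a hypothetical 'error' key directly under 'api'; the proposal does not fix either of these.

```python
import json


def check_response(http_status, body_text):
    """Return (ok, reason) after checking the connection and data levels."""
    # Level 1: networking problems are reported through the HTTP status.
    if http_status != 200:
        return False, f"http:{http_status}"
    # Level 2: logical errors are reported inside the formatted response.
    try:
        data = json.loads(body_text)
    except ValueError:
        return False, "unparseable"
    api = data.get("api") if isinstance(data, dict) else None
    if not isinstance(api, dict):
        return False, "unparseable"
    error = api.get("error")  # hypothetical key name
    if error is not None:
        return False, error
    return True, None
```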

Sample Requests

Login

Request:
  api.php ? action=login & name=Yurik & password=12345 [& domain=wikipedia.org]
Result:
  api :
    login :
      result : 'Success'        Other values: 'NoName', 'Illegal', 'WrongPluginPass',
                                              'NotExists', 'WrongPass', 'EmptyPass'
      token : 1234567890ABCDEF  Also returned as a cookie (e.g. enwikiToken)
      userName : Yurik          Also returned as a cookie (e.g. enwikiUserName)
      userID : 12345            Also returned as a cookie (e.g. enwikiUserID)
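
A client-side sketch of the login request above, in Python. The helper names build_login_url and login_succeeded are assumptions, and the response is assumed to have already been parsed into nested dictionaries shaped like the sample result.

```python
from urllib.parse import urlencode


def build_login_url(base, name, password, domain=None):
    """Build the login request; `domain` is optional, as in the sample."""
    params = {"action": "login", "name": name, "password": password}
    if domain is not None:
        params["domain"] = domain
    return base + "?" + urlencode(params)


def login_succeeded(response):
    """True only for 'Success'; every other value listed above is a failure."""
    return response["api"]["login"]["result"] == "Success"


url = build_login_url("api.php", "Yurik", "12345", domain="wikipedia.org")
```

The sample passes the password in the query string; a real client would probably send these parameters in a POST body instead.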

Query

General
Title Normalization
Converts improper page titles to their proper form. Capitalizes first character, replaces '_' with ' ', changes canonical namespace names to their localized alternatives, etc.
Request: Note: articleA's first letter is not capitalized
  api.php ? action=query & titles=Project:articleA|ArticleB
Result:
  'api' :
    'query' :
      'pages' :
        'Wikipedia:ArticleA' :                       Project: is converted to Wikipedia: when running on en-wiki.
          'ns' : 4                                   Show title's namespace except when ns=0
        'ArticleB' : None
      'badtitles' :                                  This element needs a better name
        'Project:articleA' : 'Wikipedia:ArticleA'
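
The normalization steps described above can be sketched as follows. The namespace table is a hypothetical stand-in for the en-wiki configuration, and the sketch ignores details such as case-insensitive namespace prefixes.

```python
# Hypothetical mapping: canonical namespace name -> (localized name, ns id).
NAMESPACES = {"Project": ("Wikipedia", 4)}


def normalize_title(title):
    """Return (normalized title, namespace id) for a raw page title."""
    title = title.replace("_", " ")
    prefix, sep, rest = title.partition(":")
    if sep and prefix in NAMESPACES:
        localized, ns = NAMESPACES[prefix]
        # Capitalize the first character of the part after the namespace.
        return f"{localized}:{rest[:1].upper()}{rest[1:]}", ns
    return title[:1].upper() + title[1:], 0
```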
Handling Redirects
Redirects can be resolved by the server, so that the target of the redirect is returned instead of the given title. This example is not very useful without an additional what=... element, but it shows the usage of the redirect function. The 'redirects' section will contain the target of the redirect and a non-zero namespace code. Both normalization and redirection may take place. In the case of a redirect to a redirect, all redirections will be resolved, and in the case of a circular redirection, there might not be a page in the 'pages' section.
Request:
  api.php ? action=query & titles=Main page & redirects
Result:
  'api' :
    'query' :
      'pages' :
        'Main Page' : None
      'redirects' :
        'Main page' : 'Main Page'
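
Resolving chained redirects while guarding against circular ones might be done along these lines. This is only a sketch: the redirect table is a plain dictionary here, whereas the server would look redirects up in the database.

```python
def resolve_redirect(title, redirects):
    """Follow `redirects` (title -> target) starting from `title`.

    Returns the final target, or None when a circular redirect is found,
    in which case no page would be reported in 'pages'.
    """
    seen = set()
    while title in redirects:
        if title in seen:
            return None
        seen.add(title)
        title = redirects[title]
    return title
```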
Content
Returns wiki markup for the given list of articles. Requesting content is the same as requesting the last revision with content, so it is equivalent to what=revisions & rvprop=content, except that content can be requested for multiple pages, whereas revisions can be requested for only one.
Request: 
  api.php ? action=query & what=content & titles=ArticleA|ArticleB
Result:
  'api' :
    'query' :
      'pages' :
        'ArticleA' :
          'id' : 12345
          'lastrev' : 67890
          'revisions' :
            '67890' :
              'content' : '...raw page content...'
        'ArticleB' :
          'id' : 0                           ID=0 when title does not exist
Revisions
Returns revisions for a given article based on the selection criteria. Revisions may only be requested for a single title. By default shows only the id of the last revision.
Request: 
  api.php ? action=query & what=revisions & titles=ArticleA & rvprop=timestamp|user|comment|content
Result:
  'api' :
    'query' :
      'pages' :
        'ArticleA' :
          'id' : 12345
          'lastrev' : 67890
          'revisions' :
            '67890' :
              'timestamp' : '20060908025739'
              'user' : 'UserX'
              'comment' : '...change comment...'
              'content' : '...raw revision content...'
Additional 'revisions' samples
Get the timestamps of up to 10 revisions, beginning at 2006-09-01 and moving forward in time.
  api.php ? action=query & what=revisions & titles=ArticleA & rvprop=timestamp & rvlimit=10 & rvdir=newer & rvstart=20060901000000
Get the timestamps of all revisions for the entire month of September 2006. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain 'continue':'rvstart=20060920122343' with the timestamp to continue from.
  api.php ? action=query & what=revisions & titles=ArticleA & rvprop=timestamp & rvstart=20060901000000 & rvend=20061001000000
Get the timestamps of up to 10 revisions, beginning at revision 12345 and moving back in time. If more than 10 revisions are available, the 'revisions' element will contain 'continue':'revids=23512', where the revid is the next revision ID in order.
  api.php ? action=query & what=revisions & revids=12345 & rvprop=timestamp & rvlimit=10 & rvdir=older
Get the timestamps of all revisions between two given revision IDs. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain 'continue':'rvstartid=23512' with the revid to continue from. Both rvstartid & rvendid must belong to the same title. The titles= parameter is not required, but if given, it must be set to the title the revision IDs belong to.
  api.php ? action=query & what=revisions & rvprop=timestamp & rvstartid=12345 & rvendid=67890
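
A client could fold the 'continue' value from the samples above into its next request roughly as follows. The function name apply_continue is an assumption, and the 'revisions' element is assumed to have been parsed into a dictionary.

```python
from urllib.parse import parse_qsl


def apply_continue(params, revisions):
    """Return the parameters for the next request, or None when done."""
    cont = revisions.get("continue")
    if cont is None:
        return None
    next_params = dict(params)  # leave the caller's parameters untouched
    # 'continue' holds a query fragment such as 'rvstart=20060920122343'.
    next_params.update(parse_qsl(cont))
    return next_params


params = {
    "action": "query", "what": "revisions", "titles": "ArticleA",
    "rvprop": "timestamp",
    "rvstart": "20060901000000", "rvend": "20061001000000",
}
next_params = apply_continue(params, {"continue": "rvstart=20060920122343"})
```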
Lists
All pages
Returns a list of pages starting at ... Up to the limit.
Watchlist
Get a list of pages on the user's watchlist but only if they were changed within the given time period.
Generators

Save

Request:
  api.php ? action=save & title=Project:articleA & edittoken=abc123 & summary=...lalala... & content=...wikitext...
Result:
  'api' :
    'save' :
      'status' : 'Success'             Other values: 'Prohibited', 'Conflict', 'DbLocked', 'BadToken'
      'title' : 'Wikipedia:ArticleA'   Always returns normalized title
      'ns' : 4                         Show title's namespace except when ns=0
      'revid' : 67891                  On success, the new latest revision id

Wikimania 2006 API discussion

The whiteboard these notes were taken from

A simple-as-possible API was quickly designed at the Wikimania Hacking Days on 4 August 2006. This minimal API only supports the most essential functions, with the goal of allowing a quick implementation, with other functionality (moving pages, uploading images, etc.) to be specified once a basic API has been implemented.

Login

This function logs the user into MediaWiki using a username and password. Other authentication methods could possibly be added in the future.

in
username, password, (API key)
API key is not needed but should be introduced now, as it would be very difficult to add later. --Yurik 14:52, 6 August 2006 (UTC)
out
success, authentication failure, or temporary failure
We can use standard HTTP error codes (500-something is a security error, 200ish - accept, need to look up the exact codes). --Yurik 14:52, 6 August 2006 (UTC)

Retrieving Data

Please note that a much more extensive interface, the Query API, is already available on all MediaWiki servers. It allows developers to minimize the server load and bandwidth consumption by returning only the data specifically requested. By total coincidence, its usage is similar to what Yahoo just released at its Python dev center.

read revision

This function requests the full text of a single article revision, either by article title/page ID, in which case the most recent revision is fetched, or by revision ID. The return value is XML that includes the name of the page, the name of the most recent editor, a revision ID and timestamp, the full wikitext of the revision, and a flag indicating whether the article is a redirect, together with the name of the page to which it redirects.

in
article title or page ID, or revision ID
out
A format mostly identical to the current Special:Export, but also including some metadata about whether the page is a redirect, and the page to redirect to, so that the client does not need to parse the wikitext


Each property requested in the what=... parameter incurs an additional performance penalty, some more than others.

in article title, with redirect info

http://meta.wikimedia.org/w/query.php?titles=Main_Page&what=namespaces%7Cinfo%7Credirects%7Crevisions&rvcontent&rvcomments&rvlimit=1

in article ID, with redirect info

http://meta.wikimedia.org/w/query.php?pageids=12631&what=namespaces%7Cinfo%7Credirects%7Crevisions&rvcontent&rvcomments&rvlimit=1

in revision ID, with redirect info

http://meta.wikimedia.org/w/query.php?revids=406406&what=namespaces%7Cinfo%7Credirects%7Crevisions&rvcontent&rvcomments&rvlimit=1

in article title

http://meta.wikimedia.org/w/query.php?titles=Main_Page&what=namespaces%7Cinfo%7Crevisions&rvcontent&rvcomments&rvlimit=1

in article ID

http://meta.wikimedia.org/w/query.php?pageids=12631&what=namespaces%7Cinfo%7Crevisions&rvcontent&rvcomments&rvlimit=1

in revision ID

http://meta.wikimedia.org/w/query.php?revids=406406&what=namespaces%7Cinfo%7Crevisions&rvcontent&rvcomments&rvlimit=1
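
The URLs above all follow one pattern, which a client could generate with a small helper. This is a sketch; build_query_url and its argument names are assumptions. The valueless switches such as rvcontent are appended as-is because urlencode would always add an '='.

```python
from urllib.parse import quote


def build_query_url(base, selector, value, props, flags=(), **extra):
    """Build a query.php URL like the ones listed above.

    `selector` is one of titles, pageids or revids; `flags` are
    valueless switches such as rvcontent.
    """
    parts = [f"{selector}={quote(str(value))}", "what=" + quote("|".join(props))]
    parts.extend(flags)
    parts.extend(f"{key}={quote(str(val))}" for key, val in extra.items())
    return base + "?" + "&".join(parts)


url = build_query_url(
    "http://meta.wikimedia.org/w/query.php",
    "titles", "Main_Page",
    ["namespaces", "info", "redirects", "revisions"],
    flags=["rvcontent", "rvcomments"],
    rvlimit=1,
)
```

Dropping "redirects" from the property list gives the cheaper variants without redirect info.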

read history

This function requests the edit history of an article, referenced either by title or by page ID. The range start parameter determines the first edit returned, referenced by revision ID or ISO date. The limit parameter determines how many revisions before or since that edit should be in the list, up to 5000 revisions or some other maximum to be determined.

in
article title or page ID, range start, +- limit
range start can be ISO date or revision ID.
if negative range start, fetch last limit revisions bounded by some maximum of revisions: currently 5000
always return up to 5000 revisions
out
export xml metadata (see readrevision)
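
One possible reading of the signed limit, sketched in Python. The interpretation that a negative limit means "before the range start" and a positive one means "since" is an assumption, as are the names used here; only the 5000 cap comes from the notes above.

```python
MAX_REVISIONS = 5000  # maximum quoted above; the notes say it may change


def history_window(limit):
    """Return (direction, count) for a signed limit.

    Assumption: a negative limit asks for revisions before the range
    start, a positive limit for revisions since it.
    """
    direction = "older" if limit < 0 else "newer"
    return direction, min(abs(limit), MAX_REVISIONS)
```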

By default 10 revisions are returned; more are returned when the rvlimit=? parameter is added.

in article title

http://meta.wikimedia.org/w/query.php?titles=Main_Page&what=revisions&rvcontent&rvcomments

in page ID

http://meta.wikimedia.org/w/query.php?pageids=12631&what=revisions&rvcontent&rvcomments

Submitting Data

This function attempts to save an article to the MediaWiki server. The revision ID of the revision the edit is based on is provided so that in the case of an edit conflict, the function will fail. If another edit has been made to the page since the page was loaded, but a simple merge is possible, then the function will succeed.

in
last revision ID or 0 (if the page is new), page text, page title, edit comment, minor edit flag, edit token
out
success(version ID), edit conflict fail, permission failure, temp failure

Notes

Date format:

UTC only, ISO 8601
Date = 0 means the current revision

Simple merges on write:

Because simple merges will be undertaken automatically by MediaWiki, the client cannot assume that the submitted text is authoritative when the write article function succeeds. If further edits are to be made, a new copy of the most recent revision should be fetched from the server.
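
In client code, this advice amounts to re-reading the page after every successful write. The sketch below uses hypothetical save and fetch callables in place of the write article and read revision functions, and a tiny in-memory stand-in for a server that merges a concurrent edit.

```python
def save_and_refresh(save, fetch, title, text):
    """Save `text`, then return the server's copy instead of trusting `text`."""
    outcome = save(title, text)
    if outcome != "success":
        return outcome, None
    return outcome, fetch(title)


# In-memory stand-in: the "server" merges in another user's line on save.
stored = {}


def fake_save(title, text):
    stored[title] = text + "\nline added by someone else"
    return "success"


def fake_fetch(title):
    return stored[title]


status, current = save_and_refresh(fake_save, fake_fetch, "Sandbox", "my text")
```

After the call, current holds the merged text, which differs from what the client submitted.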

preexisting

http://en.wikipedia.org/w/query.php