API: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Revert to the revision prior to revision 22874674 dated 2022-02-22 05:23:35 by Ogmommy3 using popups
{{MovedToMediaWiki|API}}
<div style="background: #f0f0e0; border: 2px solid #aaa; margin: 1em 10%; padding: 4px 4px 4px 15px; width:60%; text-align: center;">
'''Attention visitors'''


Moved preserving full histories. '''[[User:Robchurch|robchurch]]''' | [[User_talk:Robchurch|talk]] 07:09, 14 May 2007 (UTC)
This page discusses the future API for MediaWiki software.

<div style="text-align: left;">
MediaWiki at present has three interfaces:
* [http://en.wikipedia.org/w/query.php Query API] for retrieving any information in xml/json/php formats.
* [[Special:Export]] feature (bulk export of xml formatted data)
* Regular Web-based interface
</div>
</div>

{{MoveToMediaWiki}}

The goal of this API is to provide direct, high-level access to the data contained in the MediaWiki databases. A client program must be able to log in, get data, and post changes. The API must support thin web-based JavaScript clients such as Popup (link needed), applications running on the user's machine (such as vandal fighter), and access from other web sites (such as the tool server's utilities).

All output will be available in a structured tree format such as XML, JSON, YAML, WDDX, or PHP serialized. A strongly typed RSS or WSDL-style format might also be implemented using wrappers.

Each API module uses a set of parameters. To prevent name collisions, each module has a two-letter abbreviation, and each parameter name begins with those two letters. For example, action=login uses the prefix '''lg''' for all of its parameters: lgname and lgpassword.
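
As an illustration of the prefix convention, a client could build module parameters by prepending the module's abbreviation to each name. This is only a sketch: the helper name <code>prefixed</code> is hypothetical and not part of the API.

```python
def prefixed(prefix, **params):
    """Prepend a module's two-letter prefix to each parameter name."""
    return {prefix + name: value for name, value in params.items()}


# action=login uses the prefix "lg", so name/password become lgname/lgpassword.
login_params = {"action": "login", **prefixed("lg", name="Yurik", password="12345")}
print(login_params)
```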

== Using API internally by other code (done) ==
Sometimes other parts of the code may wish to use the data access and aggregation functionality of the API. Here are the steps needed to accomplish such usage:

1) Prepare the request parameters using the FauxRequest class. All parameters are the same as if the request were made over the web.
 $params = new FauxRequest(array (
     'action' => 'query',
     'list' => 'allpages',
     'apnamespace' => 0,
     'aplimit' => 10,
     'apprefix' => $search
 ));

2) Create and execute an ApiMain instance. ApiMain will recognize the FauxRequest object and will not handle any internal errors.
 $module = new ApiMain($params);
 $module->execute();

3) Get the resulting data array. Optional sanitizing may be done to remove any "_element" keys that are used internally for proper XML generation.
 $result = & $module->getResult();
 $result->SanitizeData();
 $data = & $result->GetData();
Alternatively, you may use GetResultData() to get raw, unsanitized data.
 $data = & $module->GetResultData();

== Login / lg (done) ==
Login returns several tokens that the server needs in order to recognize a logged-in user. In every call to api.php, the three values must be passed either as additional parameters or as cookies within the request header. If any of the login values are given as part of the request, all cookie values are ignored. Please note that the user name is passed in as '''lgname''', but returned in normalized form as '''lgusername'''. The first is used for authentication, whereas the second may be passed together with '''lgtoken''' and '''lguserid''' as tokens when making calls to other modules.

'''Note:''' In this and other examples, all parameters are passed in a GET request just for the sake of simplicity. In your application, make sure all large and/or security sensitive parameters are given as part of the POST request.
'''Request:'''
api.php ? action=login & lgname=Yurik & lgpassword=12345 [& lgdomain=wikipedia.org]
'''Result:'''
 api:
   login:
     result: Success       ''Other values: NoName, Illegal, WrongPluginPass, NotExists, WrongPass, EmptyPass''
     lgtoken: 123ABC       ''Also returned as a cookie (e.g. enwikiToken)''
     lgusername: Yurik     ''Normalized lgname, also returned as a cookie (e.g. enwikiUserName)''
     lguserid: 12345       ''Also returned as a cookie (e.g. enwikiUserID)''

To use the above values, pass them without alteration to any api.php call in addition to the other parameters. Here, a rollback token is acquired for the Main Page (a restricted operation):
api.php ? action=query & lgtoken=123ABC & lgusername=Yurik & lguserid=12345
& prop=info & intokens=rollback & titles=Main Page

;Example
: http://en.wikipedia.org/w/api.php?action=login&lgname=user&lgpassword=password
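
The login flow described above might be sketched on the client side as follows. This is an illustration under stated assumptions, not a reference client: the helper names <code>login_request</code> and <code>with_login</code> are hypothetical, the login result is assumed to have been decoded into a dictionary, and no network call is made.

```python
from urllib.parse import urlencode

API_URL = "http://en.wikipedia.org/w/api.php"


def login_request(name, password):
    """Return (url, body) for a login call; credentials go in the POST body."""
    body = urlencode({"action": "login", "lgname": name, "lgpassword": password})
    return API_URL, body


def with_login(params, login_result):
    """Add the three login values, unaltered, to another module's parameters."""
    merged = dict(params)
    for key in ("lgtoken", "lgusername", "lguserid"):
        merged[key] = login_result[key]
    return merged


# Values as they appear in the sample login result.
login_result = {"result": "Success", "lgtoken": "123ABC",
                "lgusername": "Yurik", "lguserid": 12345}
query = with_login({"action": "query", "prop": "info",
                    "intokens": "rollback", "titles": "Main Page"}, login_result)
print(API_URL + "?" + urlencode(query))
```

Keeping the password in the POST body follows the note above about security-sensitive parameters.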

== OpenSearch support (done) ==
This module gives web browsers (Firefox 2.0 at this time) auto-suggest functionality in the search box. The module needs to be extremely fast, and to provide simple JSON-formatted output in the form of
["search", ["suggestion1", "suggestion2", ...]]
Since the server might be hit on every user keystroke, the potential server load might be so heavy as to move this feature to separate server(s).
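
A client consuming this output only needs to unpack a two-element JSON array. A minimal sketch, assuming the response body has already been fetched as a string (the function name <code>parse_suggestions</code> is hypothetical):

```python
import json


def parse_suggestions(payload):
    """Split the two-element JSON array into (search term, suggestions)."""
    term, suggestions = json.loads(payload)
    return term, list(suggestions)


term, suggestions = parse_suggestions('["ma", ["Main Page", "Mathematics"]]')
print(term, suggestions)
```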

== WatchList RSS/ATOM feeds (done) ==
This module returns watchlist data in a feed format. The potential performance impact is still being evaluated.

== Query - General ==
=== Overview ===
Query API module allows applications to get needed pieces of data from the MediaWiki databases, and is loosely based on the [http://en.wikipedia.org/w/query.php Query API] interface currently available on all MediaWiki servers. All data modifications will first have to use query to acquire a token to prevent abuse from malicious sites.

=== Title Normalization (done) ===
: Converts improper page titles to their proper form. Capitalizes first character, replaces '_' with ' ', changes canonical namespace names to their localized alternatives, etc.
'''Request:''' ''Note: articleA's first letter is not capitalized''
api.php ? action=query & titles=Project:articleA|ArticleB
'''Result:'''
 api:
   query:
     pages:
       Wikipedia:ArticleA:      ''Project: is converted to Wikipedia: when running on en-wiki.''
         ns: 4                  ''Show title's namespace except when ns=0''
       ArticleB:
     normalized:                ''Any requested titles not in the "proper" form will be here''
       Project:articleA: Wikipedia:ArticleA

; Example
: http://en.wikipedia.org/w/api.php?action=query&titles=Project:articleA|ArticleB
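
Because pages are keyed by their normalized titles, a client that wants to find the page it asked for has to consult the 'normalized' section first. A small sketch, assuming the result has been decoded into nested dictionaries shaped like the sample result (the function name <code>page_for</code> is hypothetical):

```python
def page_for(requested_title, query):
    """Look up a page by the title the client asked for, honoring 'normalized'."""
    title = query.get("normalized", {}).get(requested_title, requested_title)
    return title, query["pages"].get(title)


# Decoded form of the sample result for Project:articleA|ArticleB.
query = {
    "pages": {"Wikipedia:ArticleA": {"ns": 4}, "ArticleB": {}},
    "normalized": {"Project:articleA": "Wikipedia:ArticleA"},
}
print(page_for("Project:articleA", query))
print(page_for("ArticleB", query))
```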

=== Redirects (done) ===
: Redirects can be resolved by the server, so that the target of the redirect is returned instead of the given title. This example is not very useful without an additional prop=... element, but it shows the usage of the redirects parameter. The 'redirects' section will contain the target of each redirect and its non-zero namespace code. Both normalization and redirection may take place. In case of a redirect to a redirect, all redirections will be resolved, and in case of a circular redirection, there might not be a page in the 'pages' section.
'''Request:'''
api.php ? action=query & titles=Main page & redirects
'''Result:'''
 api:
   query:
     pages:
       Main Page:
     redirects:
       Main page: Main Page
: Same request without the "redirects" parameter would treat "Main page" as a regular page, so revisions and other information may be obtained. In order to see that it is a redirect, the basic page info must be requested using prop=info.
'''Request:'''
api.php ? action=query & titles=Main page & prop=info
'''Result:'''
 api:
   query:
     pages:
       Main page:
         id: 12342
         redirect:

; Example
: http://en.wikipedia.org/w/api.php?action=query&titles=Main%20page&redirects
: http://en.wikipedia.org/w/api.php?action=query&titles=Main%20page

=== Circular Redirects (done) ===
: Assume Page1 &rarr; Page2 &rarr; Page3 &rarr; Page1 (circular redirect). Also, in this example a non-normalized name 'page1' is used.
'''Request:'''
api.php ? action=query & titles=page1 & redirects
'''Result:'''
 api:
   query:
     redirects:
       Page1: Page2      ''Redirects are present, but not the 'pages' element.''
       Page2: Page3
       Page3: Page1
     normalized:
       page1: Page1
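
A client can detect the circular case from the 'redirects' and 'normalized' maps alone. The following sketch assumes both sections have been decoded into dictionaries; the function name <code>resolve</code> is hypothetical.

```python
def resolve(title, normalized, redirects):
    """Follow normalization, then redirects; return None if the chain loops."""
    current = normalized.get(title, title)
    seen = {current}
    while current in redirects:
        current = redirects[current]
        if current in seen:
            return None  # circular redirect: no final target exists
        seen.add(current)
    return current


redirects = {"Page1": "Page2", "Page2": "Page3", "Page3": "Page1"}
normalized = {"page1": "Page1"}
print(resolve("page1", normalized, redirects))
print(resolve("Main page", {}, {"Main page": "Main Page"}))
```

The first call returns None for the circular chain; the second follows a single redirect to its target.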

=== Limits ===
: To prevent server overloads, each query imposes a limit on how many items it can process. Anonymous and logged-in users have one limit, while bots have a considerably higher limit because they are trusted by the community. At present, each query simply lists the maximum request size it allows. For example, the allpages list will allow aplimit= to be set no higher than 500, or no higher than 5000 for a bot.
: '''Drawbacks:''' Currently all limits are additive, so if the user requests both allpages and backlinks, the user will get 500 of each. This is not ideal: the more items are combined into one request, the heavier the load on the server. Instead, some sort of weighted mechanism could be developed, where each request item has a certain "cost" associated with it, and each user is allocated a fixed allowance per request. The more information the user requests, the lower the limit for that request becomes. Unfortunately, that makes it very hard to figure out the maximum limits before executing the query, so it might not be a workable solution.
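
The weighted mechanism is only an idea at this point. One possible reading of it is sketched below, assuming each list has a per-item cost and all requested lists share one allowance. The cost table, the allowance value, and the function name <code>shared_limit</code> are all hypothetical; the proposal itself gives no numbers.

```python
# Hypothetical per-item costs and allowance; the proposal gives no numbers.
COSTS = {"allpages": 1, "backlinks": 2}
ALLOWANCE = 1000


def shared_limit(requested, costs=COSTS, allowance=ALLOWANCE):
    """Largest per-list item count whose combined cost fits the allowance."""
    total_cost_per_item = sum(costs[name] for name in requested)
    return allowance // total_cost_per_item


print(shared_limit(["allpages"]))
print(shared_limit(["allpages", "backlinks"]))
```

With these made-up numbers, asking for allpages alone allows 1000 items, while asking for allpages and backlinks together allows 333 of each.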

== Query - Meta-Information ==
Meta queries allow clients to retrieve data about the MediaWiki site and its settings.

To get meta information, clients will use '''meta=''' parameter:
api.php ? action=query & meta=siteinfo|userinfo & ...

=== siteinfo / si (done) ===
: Returns overall site information.
: '''Parameters:''' siprop=namespaces|general
; Example
: http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo

=== userinfo / ui ===
: Returns information about the current user.
: '''Parameters:''' uiprop=isblocked|hasmsg|rights|groups, uioptions=<opt name>|...
; Example
: http://en.wikipedia.org/w/api.php?action=query&meta=userinfo

== Query - Page Information ==
Page information items are used to get various data about a list of pages specified with either the '''titles=''', '''pageids=''', or '''revids=''' parameters, or by using [[#generators]]. Content, links, interwiki links, and other information may be obtained.

=== info / in (done except tokens) ===
: Gets the basic page information such as pageid, last revid, redirect, last touched, etc. Limit: 500/5000.
: '''Parameters:''' intokens=edit|rollback|delete|protect|move
: '''Issues:''' Should tokens for rollback/delete/protect/move be available in this way, as opposed to having an ''action='' for each task? There is a potential for abuse: someone might place a link to the wiki on their website, and that link could contain a "delete" action. If a logged-in admin clicks on that link, the API will recognize them because of their cookie and will allow the deletion.
'''Request:'''
api.php ? action=query & prop=info & titles=TitleA
'''Result:'''
 api:
   query:
     pages:
       TitleA:
         id: 12341
         lastrev: 23456
         touched: 20060908025739

=== categories / cl ===
: Gets a list of all categories the provided pages belong to. Limit: 200/1000.
: '''Parameters:''' clprop=sortkey|timestamp
'''Request:'''
api.php ? action=query & prop=categories & titles=TitleA
'''Result:'''
 api:
   query:
     pages:
       TitleA:
         categories:
           Category:Cat1:
           Category:Cat2:

=== Content (done) ===
: Requesting content should be done by requesting the last revision with content property.
api.php ? action=query & prop=revisions & rvprop=content & titles=ArticleA|ArticleB

=== imageinfo / ii ===
: Gets image information for any titles in the image namespace (#6).
: '''Parameters:''' iiprop=url|history|comment|stats|user|timestamp, iisource=local/shared/all (dflt=local)
:: url - path to the image, history - include all old image versions, stats - image size/type, user - uploader, iisource - look at the local or shared (commons) image repository, or both.
: '''Example:''' Get comments for all image uploads, both local and in the commons repository. Here, ImageA was uploaded 3 times to the local wiki, and 2 times to the shared (commons) repository.
'''Request:'''
api.php ? action=query & prop=imageinfo & titles=Image:ImageA & iiprop=comment|history & iisource=all
'''Result:'''
 api:
   query:
     pages:
       Image:ImageA:
         ns: 6
         imageinfo:
           local:
             comment: last update comment
           localhistory:
             -                          ''history is an unordered list of items''
               comment: some update
             -
               comment: another update
           shared:
             comment: last update on commons
           sharedhistory:
             -
               comment: some update on commons

=== langlinks / ll ===
: Gets a list of all language links (interwikis) from the provided pages to other languages. Limit: 200/1000.

=== links / pl ===
: Gets a list of all links from the provided pages. Limit: 200/1000.
: '''Parameters:''' plnamespace (flt).

=== templates / tl ===
: Gets a list of all templates used on the provided pages. Limit: 200/1000.

=== imagelinks / il ===
''In the Query API interface, this command found pages that embedded the given image. That function has been renamed to [[#imgembeddedin / ie|imgembeddedin]].''
: Gets a list of all images used on the provided pages. Limit: 200/1000.

== Query - Revisions (done) ==
Returns revisions for a given article based on the selection criteria. Revisions may be used with multiple titles only when working with the latest revision. When using rvlimit, rvdir=newer, rvstart, or rvend parameters, titles= must have only one title listed. By default, revisions shows only the id of the last revision.
'''Request:'''
api.php ? action=query & prop=revisions & titles=ArticleA & rvprop=timestamp|user|comment|content
'''Result:'''
 api:
   query:
     pages:
       ArticleA:
         id: 12345
         lastrev: 67890
         revisions:
           67890:
             timestamp: 20060908025739
             user: UserX
             comment: ...change comment...
             content: ...raw revision content...

; Additional 'revisions' samples
: Get the timestamps of up to 10 revisions, beginning at 2006-09-01 and moving forward in time.
api.php ? action=query & prop=revisions & titles=ArticleA
& rvprop=timestamp & rvlimit=10 & rvdir=newer & rvstart=20060901000000
: Get the timestamps of all revisions for the entire month of September 2006. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain ''' 'continue':'rvstart=20060920122343' ''' with the timestamp to continue from.
api.php ? action=query & prop=revisions & titles=ArticleA
& rvprop=timestamp & rvstart=20060901000000 & rvend=20061001000000
: Get the timestamps of up to 10 revisions, beginning at 12345 and moving back in time. If more than 10 revisions are available, 'revisions' element will contain ''' 'continue':'revids=23512' ''', where revid is the next revision id in order.
api.php ? action=query & prop=revisions & revids=12345
& rvprop=timestamp & rvlimit=10 & rvdir=older
: Get the timestamps of all revisions between two given revision IDs. rvlimit is optional. If the number of revisions exceeds the limit, the 'revisions' element will contain ''' 'continue':'rvstartid=23512' ''' with the revid to continue from. Both rvstartid & rvendid must belong to the same title. The titles= parameter is not required, but if given, it must be set to the same title as revision IDs.
api.php ? action=query & prop=revisions & rvprop=timestamp & rvstartid=12345 & rvendid=67890
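
In each of these samples, the 'continue' value is itself a parameter assignment, so a client can fold it back into the original request. A minimal sketch, assuming the value arrives as a string such as <code>rvstart=20060920122343</code> (the function name <code>next_request</code> is hypothetical):

```python
from urllib.parse import parse_qsl


def next_request(params, continue_value):
    """Merge a 'continue' string such as 'rvstart=20060920122343' into params."""
    updated = dict(params)
    updated.update(parse_qsl(continue_value))
    return updated


params = {"action": "query", "prop": "revisions", "titles": "ArticleA",
          "rvprop": "timestamp", "rvstart": "20060901000000",
          "rvend": "20061001000000"}
print(next_request(params, "rvstart=20060920122343"))
```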

=== Examples ===
; Get data with content for the last revision of titles "API" and "Main Page":
: http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=API|Main%20Page&rvprop=timestamp|user|comment|content
; Get last 5 revisions of the "Main Page":
: http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment
; Get first 5 revisions of the "Main Page":
: http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment&rvdir=newer
; Get first 5 revisions of the "Main Page" made after 2006-05-01:
: http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvlimit=5&rvprop=timestamp|user|comment&rvdir=newer&rvstart=20060501000000

== Query - Lists ==
Lists differ from other properties in two respects. First, instead of appending data to the elements under the 'pages' element, each list has its own separate branch under the 'query' element. Second, list output is limited by number of items and may be continued using a "paging" technique. Even when no limit is provided, the query will return only a set number of items, and will also provide a string indicating the point from which to continue paging. See the allpages list for an example.

=== allpages / ap (done) ===
: Returns a list of pages in a given namespace starting at ''from'', ordered by page title.
: '''Parameters:''' apfrom (paging), apnamespace (dflt=0), apredirect (flt), aplimit (dflt=10, max=500/5000)

: '''Example:''' Request a list of 3 pages from namespace 10 (templates) beginning at the first available page.
'''Request:'''
api.php ? action=query & list=allpages & apnamespace=10 & aplimit=3
'''Result:'''
 api:
   query:
     allpages:
       Template:A-Article:
         id: 12341
         ns: 10
       Template:B-Article:
         id: 12342
         ns: 10
       Template:C-Article:
         id: 12343
         ns: 10
   query-status:
     allpages:
       continue: apfrom=D-Article     ''The next item in this list would have been Template:D-Article.''
: The client may now make another request using the ''continue'' value as a parameter:
api.php ? action=query & list=allpages & apnamespace=10 & aplimit=3 & apfrom=D-Article
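
The paging technique can be written as a loop that repeats the request until no ''continue'' value comes back. The sketch below makes several assumptions: the result is decoded into nested dictionaries with 'query-status' next to 'query', and <code>fake_fetch</code> is a stand-in for the HTTP call that serves titles from an in-memory list. Both function names are hypothetical.

```python
from urllib.parse import parse_qsl


def fetch_all(fetch, params):
    """Yield every title, re-issuing the request until no 'continue' is returned."""
    params = dict(params)
    while True:
        result = fetch(params)
        yield from result["query"]["allpages"]
        cont = result.get("query-status", {}).get("allpages", {}).get("continue")
        if not cont:
            return
        params.update(parse_qsl(cont))


# Stand-in for the HTTP call: serves an in-memory list of template titles.
TITLES = ["A-Article", "B-Article", "C-Article", "D-Article", "E-Article"]


def fake_fetch(params):
    names = [t for t in TITLES if t >= params.get("apfrom", "")]
    limit = int(params["aplimit"])
    pages = {"Template:" + t: {"ns": 10} for t in names[:limit]}
    result = {"query": {"allpages": pages}}
    if len(names) > limit:
        result["query-status"] = {"allpages": {"continue": "apfrom=" + names[limit]}}
    return result


pages = list(fetch_all(fake_fetch, {"action": "query", "list": "allpages",
                                    "apnamespace": 10, "aplimit": 3}))
print(pages)
```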

=== backlinks / bl ===
: Lists pages that link to the given page. Ordered by linking page title.
: '''Parameters:''' bltitle, blfrom (paging), blnamespace (flt), blredirect (flt), bllimit (dflt=10, max=500/5000)
api.php ? action=query & list=backlinks & bltitle=ArticleA

=== categorymembers / cm ===
: List of pages that belong to a given category, ordered by page title.
: '''Parameters:''' cmtitle (if title is in NS 0, treats it as category NS), cmfrom (paging), cmnamespace (flt), cmlimit (dflt=10, max=500/5000)
api.php ? action=query & list=categorymembers & cmtitle=category:title

=== embeddedin / ei ===
: Lists pages that include ''template:title'' as a template, that is, pages that include the given page using <nowiki>{{title}}</nowiki>. Ordered by including page title.
: '''Parameters:''' eititle, eifrom (paging), einamespace (flt), eiredirect (flt), eilimit (dflt=10, max=500/5000)
api.php ? action=query & list=embeddedin & eititle=template:title

=== imgembeddedin / ie ===
''This was renamed from imagelinks to avoid confusion. [[#imagelinks / il|imagelinks]] will now be used to get all images used on a given page.''
: List of pages that include a given image. Ordered by page title.
: '''Parameters:''' ietitle (if image title is in NS 0, treats it as an image NS), iefrom (paging), ienamespace (flt), ielimit (dflt=10, max=500/5000)
api.php ? action=query & list=imgembeddedin & ietitle=image:title

=== logevents / le (semi-complete) ===
: List log events, filtered by time range, event type, user type, or the page it applies to. Ordered by event timestamp.
: '''Parameters:''' letype (flt), lefrom (paging timestamp), leto (flt), ledirection (dflt=older), leuser (flt), letitle (flt), lelimit (dflt=10, max=500/5000)
api.php ? action=query & list=logevents - ''List last 10 events of any type''

=== recentchanges / rc ===
: Gets a list of pages recently changed, ordered by modification timestamp.
: '''Parameters:''' rcfrom (paging timestamp), rcto (flt), rcnamespace (flt), rcminor (flt), rcusertype (dflt=not|bot), rcdirection (dflt=older), rclimit (dflt=10, max=500/5000)
api.php ? action=query & list=recentchanges - ''List last 10 changes''

=== usercontribs / uc ===
: Gets a list of pages modified by a given user, ordered by modification time.
: '''Parameters:''' ucuser, ucfrom (paging timestamp), ucto (flt), ucnamespace (flt), ucminor (flt), uctop (flt), ucdirection (dflt=older), uclimit (dflt=10, max=500/5000)
api.php ? action=query & list=usercontribs & ucuser=User:UserA - ''List last 10 changes made by userA''

=== users / us ===
: Gets a list of registered users, ordered by user name.
: '''Parameters:''' usfrom (paging), uslimit (dflt=10, max=500/5000)

=== watchlist / wl (done) ===
: Get a list of pages on the user's watchlist but only if they were changed within the given time period. Ordered by time of the last change of the watched page.
: '''Parameters:''' wlfrom (paging timestamp), wlto (flt), wlnamespace (flt), wldirection (dflt=older), wllimit (dflt=10, max=500/5000)

== Query - Generators (done) ==
A generator is a way to use one of the above [[#lists]] instead of the titles= parameter. The output of the list must be a list of pages, whose titles are automatically used instead of the titles=/revids=/pageids= parameters. Other queries such as content, revisions, etc., will treat those pages as if they had been provided by the user in the titles= parameter. Only one generator is allowed, and while it is possible to have both generator= and list= parameters in the same call, they may not contain the same values.

=== Using allpages as generator ===
Use the allpages list as a generator, to get the links and categories for all titles returned by allpages.
'''Request:'''
api.php ? action=query & generator=allpages & apnamespace=10 & aplimit=10 & apfrom=A & prop=links|categories
'''Result:'''
 api:
   query:
     pages:
       Template:A-Article:
         id: 12341
         ns: 10
         links:
           Linked Article1:          ''Linked Article1 is in the main namespace''
           Talk:Linked Article2:     ''For non-main ns, list it as a sub-element''
             ns: 1
           ...
         categories:
           Category:Cat1:
           Category:Cat2:
           ...
       Template:B-Article:
         ...
       Template:C-Article:
         ...
   query-status:
     allpages:
       continue: apfrom=D-Article    ''The next item in this list would have been Template:D-Article.''


=== Generators and redirects ===
Here, we use the "links" page property as a generator. This query will get all the links from all the pages that are linked from Title. For this example, assume that Title has links to TitleA and TitleB. TitleB is a redirect to TitleC. TitleA links to TitleA1, TitleA2, and TitleA3; TitleC links to TitleC1 and TitleC2. The redirect is resolved because of the "redirects" parameter.

: The query will execute the following steps:
:# Resolve titles parameter for redirects
:# For all pages specified in titles=...|... parameter, get all links, and substitute original with the new titles=...|... parameter.
:# Resolve new titles list for redirects
:# Execute regular prop=links query using the internally created list of titles.

'''Request:'''
api.php ? action=query & generator=links & titles=Title & prop=links & redirects
'''Result:'''
 api:
   query:
     pages:
       TitleA:
         links:
           TitleA1:
           TitleA2:
           TitleA3:
       TitleC:
         links:
           TitleC1:
           TitleC2:
     redirects:
       TitleB: TitleC
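
The four steps can be simulated on an in-memory link table to check that they produce the result shown. This is only a sketch of the described behavior, not the server implementation: the <code>LINKS</code> and <code>REDIRECTS</code> tables encode the example's assumptions, and the function names are hypothetical.

```python
LINKS = {
    "Title": ["TitleA", "TitleB"],
    "TitleA": ["TitleA1", "TitleA2", "TitleA3"],
    "TitleC": ["TitleC1", "TitleC2"],
}
REDIRECTS = {"TitleB": "TitleC"}


def resolve_all(titles, redirects, used):
    """Replace each redirect with its target, recording the ones followed."""
    resolved = []
    for title in titles:
        seen = set()
        while title in redirects and title not in seen:
            seen.add(title)
            used[title] = redirects[title]
            title = redirects[title]
        resolved.append(title)
    return resolved


def links_generator(titles, links, redirects):
    used = {}
    titles = resolve_all(titles, redirects, used)                  # step 1
    generated = [t for src in titles for t in links.get(src, [])]  # step 2
    generated = resolve_all(generated, redirects, used)            # step 3
    pages = {t: {"links": links.get(t, [])} for t in generated}    # step 4
    return {"pages": pages, "redirects": used}


result = links_generator(["Title"], LINKS, REDIRECTS)
print(result)
```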

=== Examples ===
; Show info about 4 pages starting at the letter "T"
: http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=4&gapfrom=T&prop=info
; Show content of first 2 non-redirect pages beginning at "Re"
: http://en.wikipedia.org/w/api.php?action=query&generator=allpages&gaplimit=2&gapfilterredir=nonredirects&gapfrom=Re&prop=revisions&rvprop=content

== Posting Data / needs major editPage.php rewrite ==
action=submit allows data to be posted back to the MediaWiki servers. For this to work, the client must first obtain an edit token by using a ''prop=info & intokens=edit'' query call. Both the lastrev and the token have to be sent to the server, together with the title of the page, its content, and the summary comment. The ''disablemerge'' parameter stops the save operation if the article has been modified after the query call. The ''testrun'' parameter attempts the save operation by merging the content with the newer changes (if needed) and returns what the page would look like if it were saved, but without actually changing any data.

Note: The parameters should be modified to allow for a controlled merge. For example: rev #1 is received, an attempt is made to save changes to it, but rev #2 has been created in the meantime. The client decides to allow a merge with rev #2, but while that decision is being made, rev #3 is published. The client should have the option to allow merging only with rev #2, which it has verified, and not with rev #3, which it has not yet seen.

'''Request:'''
api.php ? action=submit & title=Project:articleA & edittoken=abc123 & revid=12345
& summary=edit_comment & content=wikitext
[& minorEdit] [& disablemerge] [& testrun]
'''Result:'''
 api:
   save:
     status: Success            ''Other values: Prohibited, Conflict, DbLocked, BadToken, MergeRequired''
                                ''(for the testrun: CanMerge, CanSaveAsIs)''
     title: Wikipedia:ArticleA  ''Always returns normalized title''
     ns: 4                      ''Show title's namespace except when ns=0''
     id: 12345                  ''On success, the ID of the page''
     revid: 67891               ''On success, the new latest revision id''
     redirect:                  ''On success, when saved page is now treated as a redirect''
     content: wiki content      ''When used with testrun, this field will be set to the merge result''
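
The controlled-merge idea from the note above could be expressed as a decision on three inputs: the revision the client edited, the current latest revision, and the set of newer revisions the client has verified. The sketch below is one possible interpretation only. The parameter <code>allowed_merge_revs</code> and the function name are hypothetical, and the mapping of cases onto the status names is an assumption.

```python
def save_decision(base_rev, current_rev, allowed_merge_revs=()):
    """Decide the outcome of a save under the controlled-merge idea."""
    if current_rev == base_rev:
        return "CanSaveAsIs"    # nothing changed since the client's query
    if current_rev in allowed_merge_revs:
        return "CanMerge"       # the newer revision was verified by the client
    return "MergeRequired"      # an unseen revision exists; ask the client again


# Rev 1 was edited, rev 2 was verified, but rev 3 has since been published.
print(save_decision(1, 3, allowed_merge_revs={2}))
print(save_decision(1, 2, allowed_merge_revs={2}))
print(save_decision(1, 1))
```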

== Moving/Renaming Pages ==
'''Request'''
api.php ? action=move & mvfrom=OldTitle & mvto=NewTitle & mvtoken=123ABC [& mvoverride]


== Implementation Strategy ==
''See [[/Implementation Strategy]]''.

== Wikimania 2006 API discussion ==
''See [[/Wikimania 2006 API discussion]].''

Latest revision as of 06:02, 22 February 2022
