{{MovedToFoundationGovWiki|Policy:User-Agent policy|translations=true}}
<languages />
{{notice|<translate><!--T:1-->
This page is purely informative, reflecting the current state of affairs. To discuss this topic, please use the wikitech-l [[<tvar name="mail-lists">Special:MyLanguage/Mailing lists</tvar>|mailing list]].</translate>}}

<translate>
As of February 15, 2010, Wikimedia sites require an '''HTTP [[w:User-Agent|User-Agent]] header''' for all requests. This was an operative decision made by the technical staff and was announced and discussed on the technical mailing list.<ref>[<tvar name="ref1url">https://lists.wikimedia.org/pipermail/wikitech-l/2010-February/thread.html#46764</tvar> The Wikitech-l February 2010 Archive by subject]</ref><ref>[<tvar name="ref2url">https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/R4RU7XTBM5J3BTS6GGQW77NYS2E4WGLI/</tvar> User-Agent: - Wikitech-l - lists.wikimedia.org]</ref> The rationale is that clients that do not send a User-Agent string are mostly ill-behaved scripts that cause a lot of load on the servers without benefiting the projects. User-Agent strings that begin with non-descriptive default values, such as <code>python-requests/x</code>, may also be blocked from Wikimedia sites (or parts of a website, e.g. <code>api.php</code>).

<!--T:3-->
Requests (e.g. from browsers or scripts) that do not send a descriptive User-Agent header may encounter an error message like this:

<!--T:4-->
:''Scripts should use an informative User-Agent string with contact information, or they may be blocked without notice.''

<!--T:5-->
Requests from disallowed user agents may instead encounter a less helpful error message like this:

<!--T:6-->
:''Our servers are currently experiencing a technical problem. Please try again in a few minutes.''

<!--T:7-->
This change is most likely to affect scripts (bots) accessing Wikimedia websites such as Wikipedia automatically, via api.php or otherwise, and command line programs.<ref>[<tvar name="ref3url">//www.mediawiki.org/w/index.php?title=API:FAQ#do_I_get_HTTP_403_errors.3F</tvar> API:FAQ - MediaWiki]</ref> If you run a bot, please send a User-Agent header identifying the bot with an identifier that isn't going to be confused with many other bots, and supplying some way of contacting you (e.g. a userpage on the local wiki, a userpage on a related wiki using interwiki linking syntax, a URI for a relevant external website, or an email address), e.g.:</translate> |

<pre>
User-Agent: CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) generic-library/0.0
</pre>

<translate>

<!--T:22-->
The generic format is <tvar name="fmt"><code><client name>/<version> (<contact information>) <library/framework name>/<version> [<library name>/<version> ...]</code></tvar>. Parts that are not applicable can be omitted.

<!--T:14-->
If you run an automated agent, please consider following the Internet-wide convention of including the string "bot" in the User-Agent string, in any combination of lowercase or uppercase letters. This is recognized by Wikimedia's systems, and used to classify traffic and provide more accurate statistics.
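</translate>

For instance, a User-Agent that follows the generic format and also contains the string "bot" might look like the one below. The bot name, URL, email address, and library version are all placeholders, not a real registered agent:

<syntaxhighlight lang="python">
# Placeholder values throughout; substitute your own bot's details.
# Format: <client>/<version> (<contact information>) <library>/<version>
user_agent = (
    'ExampleWikiBot/1.2 '
    '(https://example.org/examplewikibot; examplewikibot@example.org) '
    'python-requests/2.31'
)

# The convention only asks for "bot" somewhere in the string, in any case:
assert 'bot' in user_agent.lower()
</syntaxhighlight>

<translate>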

<!--T:8-->
Do not copy a browser's user agent for your bot, as bot-like behavior with a browser's user agent will be assumed malicious.<ref>[<tvar name="ref4url">//lists.wikimedia.org/pipermail/wikitech-l/2010-February/046783.html [Wikitech-l]</tvar> User-Agent:]</ref> Do not use generic agents such as "curl", "lwp", "Python-urllib", and so on. For large frameworks like pywikibot, there are so many users that just "pywikibot" is likely to be somewhat vague. Including detail about the specific task/script/etc. would be a good idea, even if that detail is opaque to anyone besides the operator.<ref>{{cite web|url=<tvar name="ref5url">https://lists.wikimedia.org/pipermail/mediawiki-api/2014-July/003308.html</tvar>|title=Clarification on what is needed for "identifying the bot" in bot user-agent?|publisher=Mediawiki-api|author=Anomie|date=31 July 2014}}</ref>

<!--T:10-->
Web browsers generally send a User-Agent string automatically; if you encounter the above error, please refer to your browser's manual to find out how to set the User-Agent string. Note that some plugins or proxies for privacy enhancement may suppress this header. However, for anonymous surfing, it is recommended to send a generic User-Agent string instead of suppressing it or sending an empty string. Note that other features are much more likely to identify you to a website; if you are interested in protecting your privacy, visit the [<tvar name="eff-url">https://coveryourtracks.eff.org/</tvar> Cover Your Tracks project].
<!--T:11--> |
|||
Browser-based applications written in JavaScript are typically forced to send the same User-Agent header as the browser that hosts them. This is not a violation of policy, however such applications are encouraged to include the <tvar name="header-code"><code>Api-User-Agent</code></tvar> header to supply an appropriate agent. |
|||
<!--T:13--> |
|||
As of 2015, Wikimedia sites do not reject all page views and API requests from clients that do not set a User-Agent header. As such, the requirement is not automatically enforced. Rather, it may be enforced in specific cases as needed.</translate><ref>gmane.science.linguistics.wikipedia.technical/83870 ([http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/83870/ deadlink])</ref><translate> |
|||

== Code examples == <!--T:17-->

<!--T:18-->
On Wikimedia wikis, if you don't supply a <tvar name="1"><code>User-Agent</code></tvar> header, or you supply an empty or generic one, your request will fail with an HTTP 403 error. Other MediaWiki installations may have similar policies.

=== JavaScript === <!--T:23-->

<!--T:19-->
If you are calling the API from browser-based JavaScript, you won't be able to influence the <tvar name="1"><code>User-Agent</code></tvar> header: the browser will use its own. To work around this, use the <tvar name="header-code"><code>Api-User-Agent</code></tvar> header:
</translate>

<syntaxhighlight lang="javascript">
// Using XMLHttpRequest
xhr.setRequestHeader( 'Api-User-Agent', 'Example/1.0' );
</syntaxhighlight>

<syntaxhighlight lang="javascript">
// Using jQuery
$.ajax( {
	url: 'https://example/...',
	data: ...,
	dataType: 'json',
	type: 'GET',
	headers: { 'Api-User-Agent': 'Example/1.0' }
} ).then( function ( data ) {
	// ...
} );
</syntaxhighlight>

<syntaxhighlight lang="javascript">
// Using mw.Api
var api = new mw.Api( {
	ajax: {
		headers: { 'Api-User-Agent': 'Example/1.0' }
	}
} );
api.get( ... ).then( function ( data ) {
	// ...
} );
</syntaxhighlight>

<syntaxhighlight lang="javascript">
// Using Fetch
fetch( 'https://example/...', {
	method: 'GET',
	headers: new Headers( {
		'Api-User-Agent': 'Example/1.0'
	} )
} ).then( function ( response ) {
	return response.json();
} ).then( function ( data ) {
	// ...
} );
</syntaxhighlight>

<translate>

=== PHP === <!--T:24-->

<!--T:20-->
In PHP, you can identify your user agent with code such as this:
</translate>

<syntaxhighlight lang="php">
ini_set( 'user_agent', 'CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org)' );
</syntaxhighlight>

<translate>

=== cURL === <!--T:25-->

<!--T:21-->
Or if you use [[wikipedia:cURL|cURL]]:
</translate>

<syntaxhighlight lang="php">
curl_setopt( $curl, CURLOPT_USERAGENT, 'CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org)' );
</syntaxhighlight>
<translate> |
|||
=== Python === <!--T:26--> |
|||
<!--T:27--> |
|||
In Python, you can use the [[wikipedia:Requests_(software)|Requests]] library to set a header: |
|||
</translate> |
|||
<syntaxhighlight lang="python"> |
|||
import requests |
|||
url = 'https://example/...' |
|||
headers = {'User-Agent': 'CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org)'} |
|||
response = requests.get(url, headers=headers) |
|||
</syntaxhighlight> |
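The same header can also be set with just the standard library; a sketch using <code>urllib</code> (the URL below is a placeholder and is not fetched):

<syntaxhighlight lang="python">
import urllib.request

user_agent = 'CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org)'

# Attach the header to a single request (placeholder URL, not fetched here):
req = urllib.request.Request(
    'https://example.org/coolbot/',
    headers={'User-Agent': user_agent},
)

# Or install it on an opener so every request made through it sends the header:
opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent', user_agent)]
</syntaxhighlight>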

<translate>

<!--T:28-->
Or, if you want to use [<tvar name="url">https://sparqlwrapper.readthedocs.io</tvar> SPARQLWrapper] like in <tvar name="url2">https://people.wikimedia.org/~bearloga/notes/wdqs-python.html</tvar>:
</translate>

<syntaxhighlight lang="python">
from SPARQLWrapper import SPARQLWrapper, JSON

url = 'https://example/...'
user_agent = 'CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org)'

sparql = SPARQLWrapper(url, agent=user_agent)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
</syntaxhighlight>

<translate>

== Notes == <!--T:12-->
</translate>

<references />

<translate>

== See also == <!--T:15-->
</translate>

* <translate><!--T:16--> [[<tvar name="policy">wikitech:Robot policy</tvar>|Policy for crawlers and bots]] that wish to operate on Wikimedia websites</translate>

[[Category:Global policies{{#translation:}}]]
[[Category:Bots{{#translation:}}]]
[[Category:Policies maintained by the Wikimedia Foundation{{#translation:}}]]