Pywikibot/Use on non-WMF wikis

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by 82.236.150.2 (talk) at 20:27, 5 September 2008 (→‎user-config.py file). It may differ significantly from the current version.

The pywikipedia bot may be used to do all kinds of things that are important for the maintenance of a MediaWiki project. When this software is to be used outside of the Wikimedia projects, some configuration needs to be done.

Some non-Wikimedia projects, or families, are already supported. Their definitions can be found in the families folder, which can be downloaded.

Using the existing files as examples, it should be easy to adapt the bot to your own project.

Instructions

user-config.py file

Open a new file in a text editor (for example, Notepad on Windows).

Save the text file as user-config.py, in the main pywikipedia folder.

Add the following three required lines to user-config.py:

Code Explanation
mylang = 'xx'

xx is the code of the language you are working on; "en" is English.[1]

usernames['sitename']['en'] = u'ExampleBot'

Your user-config.py file needs to specify the bot's username.[2]

In this example, the user is working on the English-language wiki of the family sitename, and has created a bot account with the username "ExampleBot".

family = 'sitename'

"Sitename" is the name of the site you're working on.

Sitename is the same name as the sitename in the username line above (usernames['sitename']['en'] = u'ExampleBot').

Now save user-config.py again.
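
The three settings have to agree with one another: the family and language used in the usernames line must match family and mylang. A minimal sketch of that check follows. check_user_config is a hypothetical helper, not part of pywikipedia, and it assumes that the bot framework supplies the usernames dictionary before user-config.py is read (a defaultdict stands in for it here).

```python
from collections import defaultdict

SAMPLE_CONFIG = """
mylang = 'en'
family = 'sitename'
usernames['sitename']['en'] = u'ExampleBot'
"""


def check_user_config(source):
    """Run user-config.py-style text and return the bot username configured
    for the chosen family and language; raise ValueError if settings disagree."""
    # Stand-in for the usernames dictionary the framework is assumed to provide.
    settings = {'usernames': defaultdict(dict)}
    exec(source, settings)
    for required in ('mylang', 'family'):
        if required not in settings:
            raise ValueError('missing setting: %s' % required)
    family, lang = settings['family'], settings['mylang']
    try:
        return settings['usernames'][family][lang]
    except KeyError:
        raise ValueError('no username configured for %s:%s' % (family, lang))


print(check_user_config(SAMPLE_CONFIG))
```

If family were set to a different name than the one in the usernames line, the helper would raise ValueError instead of returning a username.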

family.py file

Modify one of the existing family files, or create a new file in a text editor.

Save the file in the pywikipedia/families folder, with a name of the form name_family.py, such as mozilla_family.py.

Refer to "Using the python wikipediabot" for how to run the bot.

Examples

Example: Mozilla wiki

The Mozilla Foundation's wiki, wiki.mozilla.org, is a very simple example because it is only available in one language.

This is the content of families/mozilla_family.py. Hints for writing your own family specification are given in the comments.

# -*- coding: utf-8  -*-

import family

# The official Mozilla Wiki. (Put a short project description here.)

class Family(family.Family):

    def __init__(self):
        family.Family.__init__(self)
        self.name = 'mozilla'  # Set the family name; this should be the same as in the filename.
        self.langs = {
            'en': 'wiki.mozilla.org',  # Put the hostname here.
        }
        # Specify the project namespace here. Other namespaces will be set
        # to the MediaWiki default.
        self.namespaces[4] = {
            '_default': u'MozillaWiki',
        }
        self.namespaces[5] = {
            '_default': u'MozillaWiki talk',
        }

    def version(self, code):
        return "1.4.2"  # The MediaWiki version used. Not very important in most cases.

    def scriptpath(self, code):
        # The relative path of index.php and api.php: look at your wiki address.
        # This may need to be changed to '/wiki' or '/w', depending on the
        # folder where your MediaWiki installation is located.
        return ''
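
To see how the hostname and scriptpath() fit together, the sketch below builds the addresses of index.php and api.php from them. It is illustrative only: MiniFamily is a hypothetical stand-in for the family class (so the example runs without pywikipedia installed), script_url is a hypothetical helper, and the http:// scheme is an assumption.

```python
class MiniFamily(object):
    """Hypothetical stand-in holding the same data as the Mozilla family file."""

    def __init__(self):
        self.name = 'mozilla'
        self.langs = {'en': 'wiki.mozilla.org'}

    def hostname(self, code):
        return self.langs[code]

    def scriptpath(self, code):
        return ''  # index.php sits at the web root in this example


def script_url(fam, code, script='index.php', scheme='http'):
    """Join scheme, hostname, script path and script name into one URL."""
    return '%s://%s%s/%s' % (scheme, fam.hostname(code), fam.scriptpath(code), script)


fam = MiniFamily()
print(script_url(fam, 'en'))             # address of index.php
print(script_url(fam, 'en', 'api.php'))  # address of api.php
```

If the printed address does not open in a browser, the value returned by scriptpath() is the first thing to check.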

Example: Memory Alpha

memoryalpha_family.py is the "family" definition of Memory Alpha, www.memory-alpha.org, a Star Trek wiki. This specification is a little bit more difficult because it has several languages.

 # -*- coding: utf-8  -*-
 import family
 
 # The Memory Alpha family, a set of Star Trek wikis.
 
 class Family(family.Family):
     def __init__(self):
         family.Family.__init__(self)
         self.name = 'memoryalpha'
 
         self.langs = {  # All available languages are listed here.
             'de': None, # Because the hostname is the same for all languages,
             'en': None, # we don't specify it here, but below in the hostname()
             'nl': None, # function.
             'sv': None,
         }
 
         # Most namespaces are inherited from family.Family.
         self.namespaces[4] = {
             '_default': u'Memory Alpha', # All languages use the same project namespace name.
         }
         self.namespaces[5] = {
             '_default': u'Memory Alpha talk',
             'de': u'Memory Alpha Diskussion',
             'nl': u'Overleg Memory Alpha',
             'sv': u'Memory Alphadiskussion',
         }
 
         # A few selected big languages for things that we do not want to loop over
         # all languages. This is only needed by the titletranslate.py module, so
         # if you carefully avoid the options, you could get away without these
         # for another wiki family.
         self.biglangs = ['en', 'de'] # Not very important
 
     def hostname(self,code):
         return 'www.memory-alpha.org' # The same for all languages
 
     def scriptpath(self, code):
         return '/%s' % code # The language code is included in the path
 
     def version(self, code):
         return "1.4"
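
The '_default' entries act as a fallback for languages that have no entry of their own. The lookup can be sketched as follows; namespace_name is a hypothetical helper written for this illustration, and the actual lookup inside pywikipedia may differ in detail.

```python
# Namespace 5 as defined in the Memory Alpha family file above.
NAMESPACES = {
    5: {
        '_default': u'Memory Alpha talk',
        'de': u'Memory Alpha Diskussion',
        'nl': u'Overleg Memory Alpha',
        'sv': u'Memory Alphadiskussion',
    },
}


def namespace_name(namespaces, number, code):
    """Return the language-specific name, or the '_default' one if none is set."""
    names = namespaces[number]
    return names.get(code, names['_default'])


print(namespace_name(NAMESPACES, 5, 'de'))  # has its own entry
print(namespace_name(NAMESPACES, 5, 'en'))  # falls back to '_default'
```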

Example: Uncyclopedia

The various Uncyclopedias are slightly more awkward, as not all are hosted at the same domain or under the same name. Domain names and paths must be specified individually. Just over half are Wikia-hosted; exceptions include fi:, hu:, ja:, ko:, no:, pt:, sv: and zh-tw:. Many have their own registered domain names and many use custom namespaces.

The approaches that work for an Uncyclopedia or a Memory Alpha project can typically be adapted to other Wikia wikis.

Note: There have been subsequent updates and changes; see botwiki:python:uncyclopedia_family.py or uncyclopedia:es:usuario:Chixpy/uncyclopedia_family.py for more current versions of the Uncyclopedia interwiki bot configuration. There are also unresolved issues in which some interwiki languages are not available from all Uncyclopedia projects or point to incorrect or inconsistent destinations; proceed with caution.

# -*- coding: utf-8  -*-
import family
    
# The Uncyclopaedia family, a satirical set of encyclopaedia wikis. (May 2006)
#        
# Save this file to families/uncyclopedia_family.py in your pywikibot installation       
# The pywikipediabot itself is available for free download from sourceforge.net          
#

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'uncyclopedia'
    
        self.langs = {
            'ar': 'beidipedia.wikia.com',
            'ca': 'valenciclopedia.wikia.com',
            'da': 'da.uncyclopedia.wikia.com',
            'de': 'de.uncyclopedia.wikia.com',
            'el': 'anegkyklopaideia.wikia.com',
            'en': 'uncyclopedia.org',
            'es': 'inciclopedia.wikia.com',
            'fi': 'peelonet.zapto.org',
            'fr': 'desencyclopedie.com',
            'he': 'eincyclopedia.wikia.com',
            'hu': 'hu.uncyclopedia.info',
            'it': 'nonciclopedia.wikia.com',
            'ja': 'ja.uncyclopedia.info',
            'la': 'uncapaedia.wikia.com',
            'no': 'ikkepedia.net',
            'pl': 'nonsensopedia.wikia.com',
            'pt': 'pt.uncyclopedia.info',
            'ru': 'absurdopedia.wikia.com',
            'sv': 'psyklopedin.hehu.se',
            'zh': 'zh.uncyclopedia.wikia.com',
            'zh-tw': 'zh.uncyclopedia.info',
        }
    
        # Most namespaces are inherited from family.Family.
        self.namespaces[1] = {
            '_default': u'Talk',
            'ar': u'نقاش',
            'ca': u'Discussió',
            'da': u'Diskussion',
            'de': u'Diskussion',
            'el': u'Συζήτηση',
            'en': u'Talk',
            'es': u'Discusión',
            'fi': u'Keskustelu',
            'fr': u'Discuter',
            'he': u'שיחה',
            'it': u'Discussione',
            'la': u'Disputatio',
            'no': u'Diskusjon',
            'pl': u'Dyskusja',
            'pt': u'Discussão',
            'ru': u'Обсуждение',
            'sv': u'Diskussion',
            'zh-tw': u'討論',
        }

        self.namespaces[2] = {
            '_default': u'User',
            'ar': u'مستخدم',
            'ca': u'Usuari',
            'da': u'Bruger',
            'de': u'Benutzer',
            'el': u'Χρήστης',
            'en': u'User',
            'es': u'Usuario',
            'fi': u'Käyttäjä',
            'fr': u'Utilisateur',
            'he': u'משתמש',
            'it': u'Utente',
            'la': u'Usor',
            'no': u'Bruker',
            'pl': u'Użytkownik',
            'pt': u'Usuário',
            'ru': u'Участник',
            'sv': u'Användare',
            'zh-tw': u'用戶',
        }

        self.namespaces[3] = {
            '_default': u'User talk',
            'ar': u'نقاش المستخدم',
            'ca': u'Usuari Discussió',
            'da': u'Bruger diskussion',
            'de': u'Benutzer Diskussion',
            'el': u'Συζήτηση χρήστη',
            'en': u'User talk',
            'es': u'Usuario Discusión',
            'fi': u'Keskustelu käyttäjästä',
            'fr': u'Discussion Utilisateur',
            'he': u'שיחת משתמש',
            'it': u'Discussioni utente',
            'la': u'Disputatio Usoris',
            'no': u'Brukerdiskusjon',
            'pl': u'Dyskusja użytkownika',
            'pt': u'Usuário Discussão',
            'ru': u'Обсуждение участника',
            'sv': u'Användardiskussion',
            'zh-tw': u'用戶討論',
        }

        self.namespaces[4] = {
            '_default': u'Uncyclopedia',
            'ar': u'ويكيبيديا',
            'ca': u'Valenciclopèdia',
            'da': u'Spademanns Leksikon',
            'de': u'Uncyclopedia',
            'el': u'Ανεγκυκλοπαίδεια',
            'en': u'Uncyclopedia',
            'es': u'Inciclopedia',
            'fi': u'Hikipedia',
            'fr': u'Désencyclopédie',
            'he': u'איןציקלופדיה',
            'it': u'Nonciclopedia',
            'la': u'Uncapaedia',
            'no': u'Wikipedia',
            'pl': u'Nonsensopedia',
            'pt': u'Desciclopédia',
            'ru': u'Абсурдопедия',
            'sv': u'Psykelopedia',
            'zh': u'伪基百科',
            'zh-tw': u'偽基百科',
        }
        self.namespaces[5] = {
            '_default': u'Uncyclopedia talk',
            'ar': u'نقاش ويكيبيديا',
            'ca': u'Valenciclopèdia Discussió',
            'da': u'Spademanns Leksikon diskussion',
            'de': u'Uncyclopedia Diskussion',
            'el': u'Ανεγκυκλοπαίδεια συζήτηση',
            'en': u'Uncyclopedia talk',
            'es': u'Inciclopedia Discusión',
            'fi': u'Keskustelu Hikipediasta',
            'fr': u'Discussion Désencyclopédie',
            'he': u'שיחת איןציקלופדיה',
            'it': u'Discussioni Nonciclopedia',
            'la': u'Disputatio Uncapaediae',
            'no': u'Wikipedia-diskusjon',
            'pl': u'Dyskusja Nonsensopedia',
            'pt': u'Desciclopédia Discussão',
            'ru': u'Обсуждение Абсурдопедии',
            'sv': u'Psykelopediadiskussion',
            'zh': u'伪基百科 talk',
            'zh-tw': u'偽基百科討論',
        }

        self.namespaces[6] = {
            '_default': u'Image',
            'ar': u'صورة',
            'ca': u'Imatge',
            'da': u'Billede',
            'de': u'Bild',
            'el': u'Εικόνα',
            'es': u'Imagen',
            'fi': u'Kuva',
            'he': u'תמונה',
            'it': u'Immagine',
            'la': u'Imago',
            'no': u'Bilde',
            'pl': u'Grafika',
            'pt': u'Imagem',
            'ru': u'Изображение',
            'sv': u'Bild',
            'zh-tw': u'圖像',
        }

        self.namespaces[7] = {
            '_default': u'Image talk',
            'ar': u'نقاش الصورة',
            'ca': u'Imatge Discussió',
            'da': u'Billede diskussion',
            'de': u'Bild Diskussion',
            'el': u'Συζήτηση εικόνας',
            'es': u'Imagen Discusión',
            'fi': u'Keskustelu kuvasta',
            'fr': u'Discussion Image',
            'he': u'שיחת תמונה',
            'it': u'Discussioni immagine',
            'la': u'Disputatio Imaginis',
            'no': u'Bildediskusjon',
            'pl': u'Dyskusja grafiki',
            'pt': u'Imagem Discussão',
            'ru': u'Обсуждение изображения',
            'sv': u'Bilddiskussion',
            'zh-tw': u'圖像討論',
        }

        self.namespaces[8] = {
            '_default': u'MediaWiki',
            'ar': u'ميدياويكي',
            'he': u'מדיה ויקי',
            'zh-tw': u'媒體維基',
        }

        self.namespaces[9] = {
            '_default': u'MediaWiki talk',
            'ar': u'نقاش ميدياويكي',
            'ca': u'MediaWiki Discussió',
            'da': u'MediaWiki diskussion',
            'de': u'MediaWiki Diskussion',
            'es': u'MediaWiki Discusión',
            'fr': u'Discussion MediaWiki',
            'he': u'שיחת מדיה ויקי',
            'it': u'Discussioni MediaWiki',
            'la': u'Disputatio MediaWiki',
            'no': u'MediaWiki-diskusjon',
            'pl': u'Dyskusja MediaWiki',
            'pt': u'MediaWiki Discussão',
            'ru': u'Обсуждение MediaWiki',
            'sv': u'MediaWiki diskussion',
            'zh-tw': u'媒體維基討論',
        }

        #
        # Custom namespace list for en: (and fi:)
        #
        self.namespaces[100] = {
            '_default': u'Wilde',
            'en': u'Wilde',
            'fi': u'Hikiquote',
            'pl': u'Cytaty',
        }
        self.namespaces[101] = {
            '_default': u'Wilde talk',
            'en': u'Wilde talk',
            'fi': u'Hiktionary',
            'pl': u'Dyskucja cytatów',
        }
        self.namespaces[102] = {
            '_default': u'UnNews',
            'en': u'UnNews',
            'fi': u'Hikikirjasto',
            'pl': u'NonNews',
        }
        self.namespaces[103] = {'_default':u'UnNews talk'}
        self.namespaces[104] = {'_default':u'Undictionary'}
        self.namespaces[105] = {'_default':u'Undictionary talk'}
        self.namespaces[106] = {'_default':u'Game'}
        self.namespaces[107] = {'_default':u'Game talk'}
        self.namespaces[108] = {'_default':u'Babel'}
        self.namespaces[109] = {'_default':u'Babel talk'}
        self.namespaces[110] = {'_default':u'Forum'}
        self.namespaces[111] = {'_default':u'Forum talk'}

        # A few selected big languages for things that we do not want to loop over
        # all languages. This is only needed by the titletranslate.py module, so
        # if you carefully avoid the options, you could get away without these
        # for another wiki family.
        self.languages_by_size = ['en', 'pl', 'de', 'es', 'ru', 'fr']

    def hostname(self, code):
        return self.langs[code]

    def scriptpath(self, code):
        if code == 'fi':
            return '/hikipedia'
        if code in ['hu', 'ja', 'pt', 'sv', 'zh-tw']:
            return '/w'
        if code == 'no':
            return ''
        return '/wiki'

    def version(self, code):
        return "1.7"
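
With this many hand-maintained entries, it is easy to add a namespace name for a language code that has no hostname. The sketch below checks a family definition for such stray keys. unknown_language_keys is a hypothetical helper, and the data is a small excerpt of the file above with one deliberate mistake ('xx') added for the demonstration.

```python
LANGS = {
    'en': 'uncyclopedia.org',
    'fi': 'peelonet.zapto.org',
    'pl': 'nonsensopedia.wikia.com',
}

NAMESPACES = {
    100: {'_default': u'Wilde', 'en': u'Wilde', 'fi': u'Hikiquote', 'pl': u'Cytaty'},
    # 'xx' is a deliberate error: there is no such code in LANGS.
    102: {'_default': u'UnNews', 'en': u'UnNews', 'xx': u'Typo'},
}


def unknown_language_keys(langs, namespaces):
    """Return sorted (namespace number, code) pairs whose code is not in langs."""
    problems = []
    for number, names in namespaces.items():
        for code in names:
            if code != '_default' and code not in langs:
                problems.append((number, code))
    return sorted(problems)


print(unknown_language_keys(LANGS, NAMESPACES))
```

An empty list means every namespace entry belongs to a language that is listed in langs.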

Notes

Language

For a single-language site, the language specified does not matter as long as it is consistent between user-config.py and families/foo_family.py.

Login failed. Wrong password?

Pywikipedia does not report anything more useful than success, failure, or host connection failure. If possible, check the web server logs (Apache uses access_log by default) and take a look at the requested URLs.

Make sure your scriptpath, the relative path to your api.php and index.php files, is defined appropriately for your wiki in your family file:

def scriptpath(self, code):
    return '/wiki'

See the Mozilla example above for clues.
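
When the script path is unknown, one way to find it is to list the likely api.php addresses and open each one in a browser. A minimal sketch, assuming the common values '', '/w' and '/wiki' mentioned in the Mozilla example; candidate_api_urls is a hypothetical helper, wiki.example.org is a placeholder host, and the http:// scheme is an assumption.

```python
COMMON_SCRIPT_PATHS = ['', '/w', '/wiki']


def candidate_api_urls(hostname, paths=COMMON_SCRIPT_PATHS, scheme='http'):
    """Return one api.php URL per candidate script path, in the given order."""
    return ['%s://%s%s/api.php' % (scheme, hostname, path) for path in paths]


for url in candidate_api_urls('wiki.example.org'):
    print(url)
```

The path whose address answers with a MediaWiki API response rather than an error is the value that scriptpath() should return.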

Mismatched interwiki configuration

In some projects (such as Uncyclopedia), each language operates as an independent wiki. This may mean that interwiki tables differ from one individual wiki to another within the same project. Interwiki.py is built on the assumption that, if outbound interlanguage links are available at all from a language, the list of available link-destination languages and the destination URL for each will match perfectly across all wikis in the project.

This leads to some potential pitfalls:

  • If one language is missing outbound language interwiki support entirely, one must avoid giving pywikipediabot an account on that wiki (in user-config.py) in order to ensure that interwiki.py leaves that one language wiki untouched.
  • If one language is using a valid but incomplete interwiki table, running interwiki.py on that language wiki will create broken links. Unlike the case where one language is missing project-wide, there is no clean and easy workaround.
  • If a language in a project has been forked (not just mirrored), the interwiki for each individual language pair will point to only one of the multiple forks. Verify that the wiki your bot is looking at is the same one that is linked from the wiki you are editing; otherwise the bot will delete some valid links as "page does not exist".
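
Before running interwiki.py on such a project, it can help to compare the interwiki tables of the individual wikis. The sketch below does this for tables that have already been collected by hand (for example from Special:Interwiki, where available). compare_interwiki_tables is a hypothetical helper and the tables are made-up placeholder data, not the configuration of any real project.

```python
def compare_interwiki_tables(tables):
    """tables maps a language code to that wiki's {prefix: url} interwiki table.

    Returns (missing, conflicting): missing maps each language to the prefixes
    it lacks (its own code excluded); conflicting lists the prefixes that point
    to different URLs on different wikis.
    """
    all_prefixes = set()
    for table in tables.values():
        all_prefixes.update(table)

    missing = {}
    for lang, table in tables.items():
        absent = sorted(all_prefixes - set(table) - {lang})
        if absent:
            missing[lang] = absent

    conflicting = sorted(
        prefix for prefix in all_prefixes
        if len(set(t[prefix] for t in tables.values() if prefix in t)) > 1
    )
    return missing, conflicting


# Placeholder data: 'fi' has an incomplete table, and 'de' links to a fork of 'fi'.
TABLES = {
    'en': {'de': 'http://de.example.org/wiki/$1', 'fi': 'http://fi.example.org/wiki/$1'},
    'de': {'en': 'http://en.example.org/wiki/$1', 'fi': 'http://fi.example.net/wiki/$1'},
    'fi': {'en': 'http://en.example.org/wiki/$1'},
}

missing, conflicting = compare_interwiki_tables(TABLES)
print(missing)
print(conflicting)
```

A non-empty result in either value corresponds to one of the pitfalls listed above and is worth resolving before the bot edits anything.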

Customisation of namespaces

Some projects use non-standard extensions to provide Special:Interwiki and Special:Namespaces lists; where available, these lists should be checked against the configuration files to detect any additional namespace customisations.

Short URL rewrites

If your site uses short URL rewrites, you may have to add "/api.php" to the blacklist of paths that must not be rewritten. Otherwise, your bot scripts will not be able to access api.php.

Check the rewrite conditions in your Apache configuration file, and make an appropriate addition.

Bot & private wikis

Some wikis require logging in to MediaWiki before any wiki page can be viewed. If you have such a site, add the following to your custom family file:

def isPublic(self):
    return False

Bot & HTTP auth

Some sites require password authentication to access the HTML pages at the site. If you have such a site, add lines of the following form to your user-config.py:

authenticate['en.wikipedia.org'] = ('John','XXXXX') # where John is your login name, and XXXXX your password.
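
For orientation, the sketch below shows what such a host-to-credentials table amounts to under HTTP Basic authentication, where the client sends the user name and password Base64-encoded in an Authorization header. It is an illustration only: whether the bot uses Basic authentication is an assumption, basic_auth_header is a hypothetical helper, wiki.example.org is a placeholder host, and a plain dictionary stands in for the authenticate setting.

```python
import base64

# Stand-in for the authenticate setting in user-config.py.
authenticate = {}
authenticate['wiki.example.org'] = ('John', 'XXXXX')


def basic_auth_header(host, table):
    """Build the value of an HTTP Basic Authorization header for the given host."""
    user, password = table[host]
    token = base64.b64encode(('%s:%s' % (user, password)).encode('utf-8'))
    return 'Basic ' + token.decode('ascii')


print(basic_auth_header('wiki.example.org', authenticate))
```

Because the password is stored in plain text, user-config.py should be readable only by the account that runs the bot.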

References

  1. If you want to work with more than one language, choose the most common one; you can override the configured value on the command line with the -lang parameter.
  2. The 'u' in front of the username stands for Unicode. The 'u' is important if your username contains non-ASCII characters. If you are using ASCII characters only, you can leave the 'u' as is, or remove it if you have trouble logging in with your bot.