Wikipedia:Wikipedia Signpost/2023-01-01/Technology report

This is an old revision of this page, as edited by HaeB (talk | contribs) at 11:05, 31 December 2022. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Technology report

Could Abstract Wikipedia fail?

Foundation's "Abstract Wikipedia" project "at substantial risk of failure" according to evaluation

Members of the Foundation's Abstract Wikipedia team with the Google.org fellows and others at an offsite in Switzerland this August. Left hand side of the table, from front to back: Ariel Gutman, Ori Livneh, Maria Keet, Sandy Woodruff, Mary Yang, Eunice Moon. At head of table: Rebecca Wambua. Right hand side of the table, front to back: Olivia Zhang, Denny Vrandečić, Edmund Wright, Dani de Waal, Ali Assaf, James Forrester

In 2020, the Wikimedia Foundation began working on Abstract Wikipedia, which is envisaged to become the first new Wikimedia project since Wikidata's launch in 2012, accompanied and supported by the separate Wikifunctions project. Abstract Wikipedia is "a conceptual extension of Wikidata", where language-independent structured information is rendered in an automated way as human-readable text in a multitude of languages, with the hope that this will vastly increase access to Wikipedia information in hitherto underserved languages. Both Abstract Wikipedia and Wikifunctions are the brainchild of longtime Wikimedian Denny Vrandečić, who also started and led the Wikidata project at Wikimedia Deutschland before becoming a Google employee in 2013, where he began to develop these ideas before joining the Wikimedia Foundation staff in 2020 to lead their implementation.

An evaluation published earlier this month calls the project's future into question:

"This is a sympathetic critique of the technical plan for Abstract Wikipedia. We (the authors) are writing this at the conclusion of a six-month Google.org Fellowship, during which we were embedded with the Abstract Wikipedia team, and assisted with the development of the project. While we firmly believe in the vision of Abstract Wikipedia, we have serious concerns about the design and approach of the project, and think that the project faces a substantial risk."


"We find [Abstract Wikipedia's] vision strongly compelling, and we believe that the project, while ambitious, is achievable. However, we think that the current effort (2020–present) to develop Abstract Wikipedia at the Wikimedia Foundation is at substantial risk of failure, because we have major concerns about the soundness of the technical plan. The core problem is the decision to make Abstract Wikipedia depend on Wikifunctions, a new programming language and runtime environment, invented by the Abstract Wikipedia team, with design goals that exceed the scope of Abstract Wikipedia itself, and architectural issues that are incompatible with the standards of correctness, performance, and usability that Abstract Wikipedia requires."


The fellowship was part of a program by Google.org (the philanthropy organization of the for-profit company Google) that enables Google employees to do pro-bono work in support of non-profit causes. The fellow team's tech lead was Ori Livneh, himself a longtime Wikipedian and former software engineer at the Wikimedia Foundation (2012-2016), where he founded and led the Performance Team before joining Google. The other three Google fellows who authored the evaluation are Ariel Gutman (holder of a PhD in linguistics and author of a book titled "Attributive constructions in North-Eastern Neo-Aramaic", who also published a separate "goodbye letter" summarizing his work during the fellowship), Ali Assaf, and Mary Yang.

The evaluation examines a long list of issues in detail, and ends with a set of recommendations centered around the conclusion that

"Abstract Wikipedia should be decoupled from Wikifunctions. The current tight coupling of the two projects together has a multiplicative effect on risk and substantially increases the risk of failure."

Among other things, the fellows caution the Foundation to not "invent a new programming language. The cost of developing the function composition language to the required standard of stability, performance, and correctness is large ..." They propose that

  • "Wikifunctions should extend, augment, and refine the existing programming facilities in MediaWiki. The initial version should be a central wiki for common Lua code", Lua being "an easy-to-learn and general purpose programming language, originally developed in Brazil" that is already widely used on Wikimedia projects, with the added benefit of satisfying "a long-standing community request" (for a "Central repository for gadgets, templates and Lua modules", which had been the third most popular proposal in the 2015 Community Wishlist Survey).

Regarding Abstract Wikipedia, the recommendations likewise center on limiting complexity and aiming to build on existing open-source solutions if possible, in particular for the NLG (natural language generation) part responsible for converting the information expressed in the project's language-independent formalism into a human-readable statement in a particular language:

  • "Rather than present to users a general-purpose computation system and programming environment, provide an environment specifically dedicated to authoring abstract content, grammars, and NLG renderers in a constrained formalism.
  • Converge on a single, coherent approach to NLG.
  • If possible, adopt an extant NLG system and build on it."
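
To give a rough sense of what "abstract content" and "NLG renderers in a constrained formalism" could mean in practice, here is a minimal sketch in Python. It is purely illustrative: the `BirthStatement` type, the `LABELS` table, the `RENDERERS` templates and the `render` function are hypothetical names invented for this example, and none of it is taken from the Abstract Wikipedia or Wikifunctions codebase or from the fellows' evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BirthStatement:
    """Hypothetical language-independent content: identifiers plus a year."""
    person: str  # identifier, not a display name
    place: str   # identifier, not a display name
    year: int


# Hypothetical per-language labels for the identifiers used above.
LABELS = {
    "en": {"marie_curie": "Marie Curie", "warsaw": "Warsaw"},
    "de": {"marie_curie": "Marie Curie", "warsaw": "Warschau"},
    "es": {"marie_curie": "Marie Curie", "warsaw": "Varsovia"},
}

# One renderer per language. Each one is a fixed sentence template here;
# this is an assumption made to keep the sketch small.
RENDERERS = {
    "en": lambda s, l: f"{l[s.person]} was born in {l[s.place]} in {s.year}.",
    "de": lambda s, l: f"{l[s.person]} wurde {s.year} in {l[s.place]} geboren.",
    "es": lambda s, l: f"{l[s.person]} nació en {l[s.place]} en {s.year}.",
}


def render(statement: BirthStatement, lang: str) -> str:
    """Turn one abstract statement into a sentence in the requested language."""
    if lang not in RENDERERS:
        raise ValueError(f"no renderer for language {lang!r}")
    return RENDERERS[lang](statement, LABELS[lang])


if __name__ == "__main__":
    statement = BirthStatement(person="marie_curie", place="warsaw", year=1867)
    for lang in sorted(RENDERERS):
        print(lang, render(statement, lang))
```

Fixed templates like these cannot handle grammatical agreement, inflection or word-order variation, which is part of why the choice of NLG approach matters so much in the debate summarized here.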

The Foundation's answer

A response authored by eight Foundation staff members from the Abstract Wikipedia team (published simultaneously with the fellows' evaluation) rejects these recommendations. They begin by acknowledging that although "Wikidata went through a number of very public iterations, and faced literally years of criticism from Wikimedia communities and from academic researchers [, the] plan for Abstract Wikipedia had not faced the same level of public development and discussion. [...] Barely anyone outside of the development team itself has dived into the Abstract Wikipedia and Wikifunctions proposal as deeply as the authors of this evaluation."

However, Vrandečić's team then goes on to reject the evaluation's core recommendations, presenting the expansive scope of Wikifunctions (as a universal repository of general-purpose functions) as a done deal mandated by the Board (the Wikimedia Foundation's top decision-making authority), and accusing the Google fellows of "fallacies" rooted in "misconception":

"The Foundation’s Board mandate they issued to us in May 2020 was to build the Wikifunctions new wiki platform (then provisionally called Wikilambda) and the Abstract Wikipedia project. This was based on the presentation given to them at that meeting (and pre-reading), and publicly documented on Meta. That documentation at the time very explicitly called out as “a new Wikimedia project that allows to create and maintain code” and that the contents would be “a catalog of all kind[s] of functions”, on top of which there would “also” (our emphasis) be code for supporting Abstract Wikipedia.

The evaluation document starts out from this claim – that Wikifunctions is incidental to Abstract Wikipedia, and a mere implementation detail. The idea that Wikifunctions will operate as a general platform was always part of the plan by the Abstract Wikipedia team.

This key point of divergence sets up much of the rest of this document [i.e. the evaluation] for fallacies and false comparisons, as they are firmly rooted in, and indeed make a lot of sense within, the reality posed by this initial framing misconception."

(The team doesn't elaborate on why the Foundation's trustees shouldn't be able to amend that May 2020 mandate if, two and a half years later, its expansive scope does indeed risk causing the entire project to fail.)

The evaluation report and the WMF's response are both lengthy (at over 6,000 and over 10,000 words, respectively), replete with technical and linguistic arguments and examples that are difficult to summarize here in full. Interested readers are encouraged to read both documents in their entirety. Nevertheless, below we attempt to highlight and explain a few key points made by each side, and to illuminate the underlying principal tensions about decisions that are likely to shape this important effort of the Wikimedia movement for decades to come.

TKTK

In brief

New user scripts to customise your Wikipedia experience

Add using {{userscript |code= [.js] |name= [script name] |doc= [doc page] }}

  • ...

Bot tasks

(SUBST: THE TRANSCLUSIONS, AND TRIM/REFORMAT, PRIOR TO PUBLICATION)

Recently approved tasks

{{Wikipedia:Bots/Requests_for_approval/Approved}}

Current requests for approval

{{Wikipedia:BAG/Status}}

Latest tech news

Latest tech news from the Wikimedia technical community: 2022 #52, #52, & #52 (FIX WEEK NUMBERS). Please tell other users about these changes. Not all changes will affect you. Translations are available on Meta.

Meetings

Installation code