Wikipedia:Wikipedia Signpost/2023-12-24/Recent research

"LLMs Know More, Hallucinate Less" with Wikidata

Last revised 22:17, 23 December 2023 (UTC) by HaeB

24 December 2023

Recent research

"LLMs Know More, Hallucinate Less" with Wikidata

By Tilman Bayer

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata"

This paper^[1] (by five graduate students at Stanford University's computer science department and Monica S. Lam as last author) sets out to show that:

"While large language models (LLMs) can answer many questions correctly, they can also hallucinate and give wrong answers. Wikidata, with its over 12 billion facts, can be used to ground LLMs to improve their factuality."

To do this, the paper "presents WikiSP, a few-shot sequence-to-sequence semantic parser for Wikidata that translates a user query, along with results from an entity linker, directly into SPARQL queries [to retrieve information from Wikidata]." It is obtained by fine-tuning the LLaMA large language model.

For example, the user question "What year did giants win the world series?" is supposed to be converted into the query SELECT DISTINCT ?x WHERE {?y wdt:sports_season_of_league_or_competition wd:Q265538; wdt:winner wd:Q308966; wdt:point_in_time ?x. }. The paper uses a modified SPARQL syntax that replaces numerical property IDs (here, P3450) with their English-language label (here, "sports season of league or competition"). The authors motivate this choice by observing that "While zero-shot LLMs [e.g. ChatGPT] can generate SPARQL queries for the easiest and most common questions, they do not know all the PIDs and QIDs [property and item IDs in Wikidata], and nor is it possible to include them in a prompt."
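To illustrate the relationship between the paper's label-based syntax and standard SPARQL, here is a minimal Python sketch that maps the property labels in the example query back to numerical PIDs. It is not the authors' code: the function name `labels_to_pids` and the three-entry lookup table are assumptions made for illustration. Only P3450 is given in the text above; the IDs used for "winner" (P1346) and "point in time" (P585) are filled in from Wikidata.

```python
import re

# Hypothetical lookup table covering only the properties in the example query.
# A real converter would need a mapping for every Wikidata property.
LABEL_TO_PID = {
    "sports_season_of_league_or_competition": "P3450",  # given in the article
    "winner": "P1346",         # assumption: filled in from Wikidata
    "point_in_time": "P585",   # assumption: filled in from Wikidata
}


def labels_to_pids(query: str, table: dict[str, str] = LABEL_TO_PID) -> str:
    """Replace wdt:<label> tokens with wdt:<PID>; unknown tokens are left as-is."""

    def swap(match: re.Match) -> str:
        label = match.group(1)
        return "wdt:" + table.get(label, label)

    return re.sub(r"\bwdt:([A-Za-z_][A-Za-z0-9_]*)", swap, query)


# The modified-syntax query quoted in the article.
modified = (
    "SELECT DISTINCT ?x WHERE {?y wdt:sports_season_of_league_or_competition "
    "wd:Q265538; wdt:winner wd:Q308966; wdt:point_in_time ?x. }"
)
standard = labels_to_pids(modified)
print(standard)
```

The item IDs (the `wd:Q...` tokens) are untouched here; in the paper's setup they are supplied by the entity linker rather than generated from memory by the model.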

To evaluate the performance of "WikiSP", and as a second contribution of the paper, the authors present:

"[...] WikiWebQuestions, a high-quality question answering benchmark for Wikidata. Ported over from WebQuestions for Freebase, it consists of real-world data with SPARQL annotation. [...]
Despite being the most popular large knowledge base for a long time, existing benchmarks on Wikidata with labeled SPARQL queries are unfortunately either small or of low quality. On the other hand, benchmarks over the deprecated Freebase still dominate the KBQA research with better-quality data."

Using this new benchmark, the authors report: "Our experimental results demonstrate the effectiveness of [WikiSP], establishing a strong baseline of 76% and 65% answer accuracy in the dev and test sets of WikiWebQuestions, respectively." However, the paper's "Limitations" section hints that despite the impressive "12 billion facts" factoid that the paper opens with, Wikidata's coverage may be too limited to answer most user questions in a satisfying manner:

"Even though knowledge bases are an important source of facts, a large portion of the knowledge available in digital form (e.g. Wikipedia, news articles, etc.), is not organized into knowledge bases. As such, the results of this paper can be considered complementary to the larger body of fact-checking research based on free text."

To address this weakness, the authors combine this Wikidata-based setup with a standard LLM that provides the answer if the Wikidata query fails to return a result. They state:

"By pairing our semantic parser with GPT-3, we combine verifiable results with qualified GPT-3 guesses to provide useful answers to 96% of the questions in dev."
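The fallback arrangement described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: `answer_with_fallback`, the two stub back ends and the sample question are all hypothetical, and the real system calls the fine-tuned semantic parser and GPT-3 instead.

```python
from typing import Callable, Optional


def answer_with_fallback(
    question: str,
    query_wikidata: Callable[[str], Optional[list[str]]],
    ask_llm: Callable[[str], str],
) -> dict:
    """Prefer a Wikidata-backed answer; otherwise return a flagged LLM guess."""
    results = query_wikidata(question)
    if results:  # a non-empty result list came back from the knowledge base
        return {"answer": results, "source": "wikidata", "verifiable": True}
    # No result: fall back to the language model and mark the answer as a guess.
    return {"answer": [ask_llm(question)], "source": "llm_guess", "verifiable": False}


# Stub back ends (made up for this sketch) standing in for the parser and GPT-3.
def fake_wikidata(question: str) -> Optional[list[str]]:
    known = {"capital of France?": ["Paris"]}
    return known.get(question)


def fake_llm(question: str) -> str:
    return "unverified guess for: " + question


grounded = answer_with_fallback("capital of France?", fake_wikidata, fake_llm)
guessed = answer_with_fallback("an unanswerable question", fake_wikidata, fake_llm)
print(grounded["source"], guessed["source"])
```

Keeping the `source` and `verifiable` fields separate from the answer mirrors the distinction the authors draw between "verifiable results" and "qualified GPT-3 guesses".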

Data and evaluation code from the paper have been released in a GitHub repo, where the authors state that "We are now working on releasing fine-tuned models."

The paper's endeavour bears some similarity to a paper authored by a different team of Stanford graduate students with professor Lam that sought to use Wikipedia (rather than Wikidata) to reduce LLM hallucinations. See the review in our July issue: "Wikipedia-based LLM chatbot 'outperforms all baselines' regarding factual accuracy".

Briefly

See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Using Large Language Models for Knowledge Engineering (LLMKE): A Case Study on Wikidata"

From the abstract:^[2]

"In this work, we explore the use of Large Language Models (LLMs) for knowledge engineering tasks in the context of the ISWC 2023 LM-KBC Challenge. For this task, given subject and relation pairs sourced from Wikidata, we utilize pre-trained LLMs to produce the relevant objects in string format and link them to their respective Wikidata QIDs. [...] The method achieved a macro-averaged F1-score of 0.701 across the properties, with the scores varying from 1.00 to 0.328. These results demonstrate that the knowledge of LLMs varies significantly depending on the domain and that further experimentation is required to determine the circumstances under which LLMs can be used for automatic Knowledge Base (e.g., Wikidata) completion and correction. The investigation of the results also suggests the promising contribution of LLMs in collaborative knowledge engineering. LLMKE won Track 2 of the challenge."
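For readers unfamiliar with the metric, a macro-averaged F1 score gives every property equal weight regardless of how many test items it has. The following Python sketch shows the general computation only; the relation names and (precision, recall) pairs are made up and do not come from the challenge data.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_property: dict[str, tuple[float, float]]) -> float:
    """Unweighted mean of the per-property F1 scores."""
    scores = [f1(p, r) for p, r in per_property.values()]
    return sum(scores) / len(scores)


# Made-up (precision, recall) pairs for three hypothetical relations.
example = {
    "relation_a": (1.0, 1.0),
    "relation_b": (0.5, 0.5),
    "relation_c": (0.0, 0.0),
}
score = macro_f1(example)
print(score)
```

Because the average is unweighted, a single poorly handled property pulls the overall score down as much as a well handled one pulls it up, which is consistent with the wide per-property range the abstract reports.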

"Large language models learn to organize concepts in ways that are strikingly similar to how concepts are organized in [Wikidata]"

From the abstract:^[3]

"Knowledge bases such as WikiData provide large-scale, high-quality representations of inferential semantics and world knowledge. We show that large language models learn to organize concepts in ways that are strikingly similar to how concepts are organized in such knowledge bases. Knowledge bases model collective, institutional knowledge, and large language models seem to induce such knowledge from raw text. We show that bigger and better models exhibit more human-like concept organization, across four families of language models and three knowledge graph embeddings."

"KGConv, a Conversational Corpus grounded in Wikidata"

From the abstract:^[4]

"We present KGConv, a large, conversational corpus of 71k conversations where each question-answer pair is grounded in a Wikidata fact. Conversations contain on average 8.6 questions and for each Wikidata fact, we provide multiple variants (12 on average) of the corresponding question using templates, human annotations, hand-crafted rules and a question rewriting neural model. We provide baselines for the task of Knowledge-Based, Conversational Question Generation. [...]"

"WikiDialog" dataset: "Dialog inpainting" using Wikipedia

From the abstract:^[5]

"[...] conversational question answering (ConvQA) systems have long been stymied by scarce training data that is expensive to collect. To address this problem, we propose a new technique for synthetically generating diverse and high-quality dialog data: dialog inpainting. Our approach takes the text of any document and transforms it into a two-person dialog between the writer and an imagined reader: we treat sentences from the article as utterances spoken by the writer, and then use a dialog inpainter to predict what the imagined reader asked or said in between each of the writer's utterances. By applying this approach to passages from Wikipedia and the web, we produce WikiDialog and WebDialog, two datasets totalling 19 million diverse information-seeking dialogs -- 1,000x larger than the largest existing ConvQA dataset. Furthermore, human raters judge the answer adequacy and conversationality of WikiDialog to be as good or better than existing manually-collected datasets."
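The control flow of dialog inpainting, as the abstract describes it, can be sketched in a few lines of Python. This is a simplified illustration, not the authors' method: `inpaint_dialog` and `stub_inpainter` are hypothetical names, the stub returns a fixed string where the paper uses a trained model, and placing a reader turn before the first sentence is an assumption of this sketch.

```python
from typing import Callable

Turn = tuple[str, str]  # (speaker, utterance)


def inpaint_dialog(
    sentences: list[str],
    predict_reader_turn: Callable[[list[Turn], str], str],
) -> list[Turn]:
    """Interleave predicted reader turns with the writer's sentences."""
    dialog: list[Turn] = []
    for sentence in sentences:
        # Ask the (hypothetical) inpainter what the imagined reader said
        # just before the writer's next sentence.
        reader_turn = predict_reader_turn(dialog, sentence)
        dialog.append(("reader", reader_turn))
        dialog.append(("writer", sentence))
    return dialog


def stub_inpainter(history: list[Turn], next_sentence: str) -> str:
    # Placeholder for a trained model; always returns the same question.
    return "Can you tell me more?"


dialog = inpaint_dialog(["Sentence one.", "Sentence two."], stub_inpainter)
for speaker, utterance in dialog:
    print(speaker + ": " + utterance)
```

The writer's turns are copied verbatim from the source document, so only the reader's side of the conversation is synthetic.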

References

^ Xu, Silei; Liu, Shicheng; Culhane, Theo; Pertseva, Elizaveta; Wu, Meng-Hsi; Semnani, Sina; Lam, Monica (December 2023). "Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata". Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. EMNLP 2023. Singapore: Association for Computational Linguistics. pp. 5778–5791. doi:10.18653/v1/2023.emnlp-main.353. Data and evaluation code
^ Zhang, Bohui; Reklos, Ioannis; Jain, Nitisha; Peñuela, Albert Meroño; Simperl, Elena (2023-09-15), Using Large Language Models for Knowledge Engineering (LLMKE): A Case Study on Wikidata, arXiv code
^ Gammelgaard, Mathias Lykke; Christiansen, Jonathan Gabel; Søgaard, Anders (2023-08-29), Large language models converge toward human-like concept organization, arXiv
^ Brabant, Quentin; Lecorve, Gwenole; Rojas-Barahona, Lina M.; Gardent, Claire (2023-08-29), KGConv, a Conversational Corpus grounded in Wikidata, arXiv
^ Dai, Zhuyun; Chaganty, Arun Tejasvi; Zhao, Vincent; Amini, Aida; Rashid, Qazi Mamunur; Green, Mike; Guu, Kelvin (2022-05-31), Dialog Inpainting: Turning Documents into Dialogs, arXiv, doi:10.48550/arXiv.2205.09073 Dataset, poster presentation

Discuss this story

The value of Wikidata becoming clearer in a world with AI is an unsurprising development. It'll be similar with Abstract Wikipedia. Sdkb (talk) 15:52, 24 December 2023 (UTC)

I agree with you, Wikidata is becoming more valuable, especially for AI language models like good old ChatGPT or Bing Chat. - The Master of Hedgehogs (always up for a conversation!) 17:09, 28 December 2023 (UTC)

I personally do not support AI given their stratospheric impact towards politics and copyright. I cannot believe what I'm seeing here. MarioJump83 (talk) 00:43, 3 January 2024 (UTC)