Wikipedia:Wikipedia Signpost/2023-07-17/Recent research: Difference between revisions
formatting, copyedits |
|||
Line 22: | Line 22: | ||
=== Wikipedia and open access === |
=== Wikipedia and open access === |
||
:''Reviewed by [[User:Jullienn|Nicolas Jullien]]'' |
|||
⚫ | From the abstract:<ref>{{Cite| publisher = arXiv| doi = 10.48550/arXiv.2305.13945| last1 = Yang| first1 = Puyu| last2 = Shoaib| first2 = Ahad| last3 = West| first3 = Robert| last4 = Colavizza| first4 = Giovanni| title = Wikipedia and open access| date = 2023-05-23| url = https://arxiv.org/abs/2305.13945}}</ref>: |
||
<blockquote style="padding-left:1.0em; padding-right:1.0em; background-color:#eaf8f4;"> |
|||
⚫ | "we analyze a large dataset of citations from Wikipedia and model the role of open access in Wikipedia's citation patterns. We find that open-access articles are extensively and increasingly more cited in Wikipedia. What is more, they show a 15% higher likelihood of being cited in Wikipedia when compared to closed-access articles, after controlling for confounding factors. This open-access citation effect is particularly strong for articles with low citation counts, including recently published ones. Our results show that open access plays a key role in the dissemination of scientific knowledge, including by providing Wikipedia editors timely access to novel results." |
||
</blockquote> |
|||
⚫ | {{Cite| publisher = arXiv| doi = 10.48550/arXiv.2305.13945| last1 = Yang| first1 = Puyu| last2 = Shoaib| first2 = Ahad| last3 = West| first3 = Robert| last4 = Colavizza| first4 = Giovanni| title = Wikipedia and open access |
||
⚫ | |||
==== Why does it matter for the Wikipedia community? ==== |
==== Why does it matter for the Wikipedia community? ==== |
||
This article is a first draft |
This article is a first draft of an analysis of the relationship between the availability of a scientific journal as [[open access]] and the fact that it is cited in the English Wikipedia (note: although it speaks of "Wikipedia", the article looks only at the English pages). It is a preprint and has not been peer-reviewed, so its results should be read with caution, especially since I am not sure about the robustness of the model and the results derived from it (see below). It is of course a very important issue, as access to scientific sources is key to the diffusion of scientific knowledge, but also, as the authors mention, because Wikipedia is seen as central to the diffusion of scientific facts (and is sometimes used by scientists to push their ideas). |
||
==== Review ==== |
==== Review ==== |
||
The results presented in the article (and its abstract) highlight two important issues for Wikipedia that will likely be addressed in a more complete version of the paper: |
The results presented in the article (and its abstract) highlight two important issues for Wikipedia that will likely be addressed in a more complete version of the paper: |
||
* The question of the reliability of the sources used by Wikipedians |
* The question of the reliability of the sources used by Wikipedians |
||
=> the regressions seem to indicate that the reputation of the journal is not important to be cited in Wikipedia. |
::=> the regressions seem to indicate that the reputation of the journal is not important to be cited in Wikipedia. |
||
=> Predatory journals are known to be more often |
::=> [[Predatory journals]] are known to be more often open access than classical journals, which means that this result potentially indicates that the phenomenon of open access reduces the seriousness of Wikipedia sources. |
||
The authors say on p. 4 that they |
The authors say on p. 4 that they provided "each journal with an [[SCImago Journal Rank|SJR score]], [[H-index]], and other relevant information." |
||
Why did they not use this as a control variable? |
Why did they not use this as a control variable? |
||
(this echoes a debate on the role of Wikipedia: to disseminate verified knowledge, |
(this echoes a debate on the role of Wikipedia: is it to disseminate verified knowledge, or to serve as a platform for the dissemination of new theories? The authors seem to lean towards the second view: p. 2: "With the rapid development of the Internet, traditional peer review and journal publication can no longer meet the need for the development of new ideas".) |
||
* The solidity of the results |
* The solidity of the results |
||
⚫ | |||
The authors said: |
|||
⚫ | |||
⚫ | |||
⚫ | |||
"General science, technology, and biomedical research have relatively higher OA rates." |
|||
⚫ | |||
⚫ | |||
More problematic (and acknowledged by the authors, so probably in the |
More problematic (and acknowledged by the authors, so probably in the process of being addressed), the authors said, on p.7, that they built their model with the assumption that the age of a research article and the number of citations it has both influence the probability of an article being cited in Wikipedia. |
||
Of course, for this causal effect to hold, the age and the number of citations must be taken into account at the moment the article is cited in Wikipedia (if some of the citations are made after the citation in Wikipedia, one could argue that the causal effect could be in the other direction). |
Of course, for this causal effect to hold, the age and the number of citations must be taken into account at the moment the article is cited in Wikipedia (if some of the citations are made after the citation in Wikipedia, one could argue that the causal effect could be in the other direction). |
||
For example, many articles are open access after an embargo period, and are therefore considered open access in the analysis, whereas they may have been cited in Wikipedia when they were under embargo. |
For example, many articles are open access after an embargo period, and are therefore considered open access in the analysis, whereas they may have been cited in Wikipedia when they were under embargo. |
||
The authors did not check |
The authors did not check for this, as acknowledged in the last sentence of the article => would the result be as robust if they do their model taking the first citation in the English Wikipedia, for example, and the age of the article, its open access status, etc. at that moment)? |
||
==== In short ==== |
==== In short ==== |
||
Although this first draft is probably not solid enough to be cited in Wikipedia, it signals important research in progress, and I am sure that the richness of the data and the quality of the team will quickly lead to very interesting insights for the Wikipedia community. |
Although this first draft is probably not solid enough to be cited in Wikipedia, it signals important research in progress, and I am sure that the richness of the data and the quality of the team will quickly lead to very interesting insights for the Wikipedia community. |
||
:''Reviewed by --[[User:Jullienn|Jullienn]] ([[User talk:Jullienn|talk]]) 13:54, 13 July 2023 (UTC)...'' |
|||
=== ... === |
=== ... === |
Revision as of 21:03, 15 July 2023
Article display preview: | This is a draft of a potential Signpost article, and should not be interpreted as a finished piece. Its content is subject to review by the editorial team and ultimately by JPxG, the editor in chief. Please do not link to this draft as it is unfinished and the URL will change upon publication. If you would like to contribute and are familiar with the requirements of a Signpost article, feel free to be bold in making improvements!
|
YOUR ARTICLE'S DESCRIPTIVE TITLE HERE
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
Wikipedia and open access
- Reviewed by Nicolas Jullien
From the abstract:[1]
"we analyze a large dataset of citations from Wikipedia and model the role of open access in Wikipedia's citation patterns. We find that open-access articles are extensively and increasingly more cited in Wikipedia. What is more, they show a 15% higher likelihood of being cited in Wikipedia when compared to closed-access articles, after controlling for confounding factors. This open-access citation effect is particularly strong for articles with low citation counts, including recently published ones. Our results show that open access plays a key role in the dissemination of scientific knowledge, including by providing Wikipedia editors timely access to novel results."
Why does it matter for the Wikipedia community?
This article is a first draft of an analysis of the relationship between a scientific article's availability as open access and its likelihood of being cited in the English Wikipedia (note: although it speaks of "Wikipedia", the article looks only at the English pages). It is a preprint and has not been peer-reviewed, so its results should be read with caution, especially since I am not sure about the robustness of the model and of the results derived from it (see below). The issue is of course very important: access to scientific sources is key to the diffusion of scientific knowledge and, as the authors mention, Wikipedia is seen as central to the diffusion of scientific facts (and is sometimes used by scientists to push their ideas).
Review
The results presented in the article (and its abstract) highlight two important issues for Wikipedia that will likely be addressed in a more complete version of the paper:
- The question of the reliability of the sources used by Wikipedians
- => The regressions seem to indicate that a journal's reputation has little bearing on whether its articles are cited in Wikipedia.
- => Predatory journals are known to be open access more often than classical journals, so this result may indicate that the open-access effect lowers the reliability of Wikipedia's sources.
The authors say on p. 4 that they provided "each journal with an SJR score, H-index, and other relevant information." Why did they not use this as a control variable? (This echoes a debate on the role of Wikipedia: is it to disseminate verified knowledge, or to serve as a platform for the dissemination of new theories? The authors seem to lean towards the second view; see p. 2: "With the rapid development of the Internet, traditional peer review and journal publication can no longer meet the need for the development of new ideas".)
- The solidity of the results
- The authors state: "STEM fields, especially biology and medicine, comprise the most prominent scientific topics in Wikipedia [17]." "General science, technology, and biomedical research have relatively higher OA rates."
- => So it is to be expected that, on average, open-access articles are over-represented among Wikipedia citations relative to the entire available research corpus, and this alone could explain why open-access articles are cited more.
- => Why not control for discipline in the models?
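The concern about discipline can be illustrated with a small Python sketch. All counts below are invented for illustration and are not taken from the paper; the names `counts` and `citation_rate` are hypothetical. In this made-up corpus, open access has no effect at all within either discipline, yet the pooled comparison shows open-access articles being cited more than twice as often, simply because the heavily cited discipline also has the higher open-access rate.

```python
# Hypothetical counts, invented purely for illustration (NOT from the paper).
# Each row: (discipline, is_open_access, n_articles, n_cited_in_wikipedia)
counts = [
    ("biomedicine", True, 800, 80),
    ("biomedicine", False, 200, 20),
    ("humanities", True, 200, 4),
    ("humanities", False, 800, 16),
]


def citation_rate(rows):
    """Share of articles in `rows` that are cited in Wikipedia."""
    articles = sum(row[2] for row in rows)
    cited = sum(row[3] for row in rows)
    return cited / articles


# Pooled comparison, ignoring discipline.
pooled_oa = citation_rate([r for r in counts if r[1]])
pooled_closed = citation_rate([r for r in counts if not r[1]])

# Comparison within each discipline (i.e. controlling for discipline).
within = {}
for discipline in ("biomedicine", "humanities"):
    rows = [r for r in counts if r[0] == discipline]
    within[discipline] = (
        citation_rate([r for r in rows if r[1]]),
        citation_rate([r for r in rows if not r[1]]),
    )

print("pooled:", pooled_oa, pooled_closed)
for discipline, (oa, closed) in within.items():
    print(discipline, oa, closed)
```

Whether anything like this happens in the authors' data is unknown; the sketch only shows why a discipline control would help rule it out.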
More problematic (and acknowledged by the authors, so probably in the process of being addressed), the authors state on p. 7 that they built their model on the assumption that the age of a research article and the number of citations it has both influence the probability of an article being cited in Wikipedia.
Of course, for this causal effect to hold, the age and the number of citations must be measured at the moment the article is cited in Wikipedia (if some of the citations were made after the citation in Wikipedia, one could argue that the causal effect runs in the other direction). For example, many articles become open access after an embargo period and are therefore counted as open access in the analysis, whereas they may have been cited in Wikipedia while still under embargo. The authors did not check for this, as acknowledged in the last sentence of the article. Would the result be as robust if the model used, for example, the first citation in the English Wikipedia, together with the age of the article, its open access status, etc. at that moment?
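To make the embargo point concrete, here is a minimal Python sketch of how open access status could be evaluated at the moment of the first Wikipedia citation rather than at data collection time. It is not the authors' method: the function names, the representation of an embargo as a number of whole months, and the example dates are all assumptions made for illustration.

```python
from datetime import date


def months_between(start, end):
    """Whole months elapsed from `start` to `end`."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the current month has not fully elapsed yet
    return months


def was_open_access_when_cited(published, embargo_months, first_cited):
    """Hypothetical helper: OA status at the first Wikipedia citation.

    embargo_months: 0 if open access from publication,
                    None if the article never becomes open access.
    """
    if embargo_months is None:
        return False
    return months_between(published, first_cited) >= embargo_months


# Invented example: 12-month embargo, cited in Wikipedia after 4 months.
published = date(2020, 1, 15)
cited_early = was_open_access_when_cited(published, 12, date(2020, 6, 1))
cited_late = was_open_access_when_cited(published, 12, date(2021, 2, 1))
print(cited_early, cited_late)
```

An article like the first one would count as open access in a snapshot taken today, although the editor who cited it could not have relied on an open-access copy from the publisher at the time.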
In short
Although this first draft is probably not solid enough to be cited in Wikipedia, it signals important research in progress, and I am sure that the richness of the data and the quality of the team will quickly lead to very interesting insights for the Wikipedia community.
...
- Reviewed by ...
...
- Reviewed by ....
Briefly
- See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
- ...
Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
- Compiled by ...
"..."
From the abstract:
...
"..."
From the abstract:
...
"..."
From the abstract:
...
References
- ^ Yang, Puyu; Shoaib, Ahad; West, Robert; Colavizza, Giovanni (2023-05-23), Wikipedia and open access, arXiv, doi:10.48550/arXiv.2305.13945
- Supplementary references and notes:
Discuss this story
Presumably the preprint about WiCE, after giving the example quoted above, goes on to discuss the problems with both the sentence from the article Santa Maria della Pietà, Prato ("13th-century icon" is not supported by the source) and the "sub-claims" GPT-3 generated from it (clearly the "icon" can't be both 13th-century and from 1638)? If so, what does it say? I think the original source has misunderstood that the 14th-century image itself (attributed to Giovanni Bonsi), as opposed to a "depiction of the miraculous event" (unspecified, but it occurred in the 17th century), is the fresco at the centre of the later altarpiece (painted by Mario Balassi in 1638, and on canvas rather than in fresco according to the Italian Wikipedia article), so that doesn't help. Ham II (talk) 11:23, 17 July 2023 (UTC)[reply]
Thanks and small correction on Wikipedia ChatGPT plugin
Thanks for covering this work! One small correction RE:
This was true of the earliest version of the plugin, but for production we've switched to leveraging the Wikimedia Search API to find articles matching the user's query. We'll update the docs/README to reflect this (our quick R&D on this outpaced our technical documentation, but catching up now)! Maryana (WMF) (talk) 22:10, 17 July 2023 (UTC)[reply]
"google_search_is_enabled"
), but that the latter is selected as the preferred search provider right now in the settings. (Feel free to correct me as I may have misread the code.)