Talk:Abstract Wikipedia


Info-Box (German)

Who takes care of the entries in the Info-Box? At least in the German Info-Box, the items differ from the actual translations. —The preceding unsigned comment was added by Wolfdietmann (talk) 09:32, 15 April 2021 (UTC)

@Wolfdietmann: Hi. In the navigation-box, there's an icon in the bottom corner that leads to the translation-interface (or here's a direct link). However, that page doesn't list any entries as being "untranslated" or "outdated". Perhaps you mean some of the entries are mis-translated, in which case please do help to correct them! (Past contributions are most easily seen via the history page for Template:Abstract_Wikipedia_navbox/de). I hope that helps. Quiddity (WMF) (talk) 18:33, 15 April 2021 (UTC)
@Quiddity (WMF): Thanks a lot. Your hint was exactly what I needed.--Wolfdietmann (talk) 09:29, 16 April 2021 (UTC)

Effects of Abstract Text on the Job Market

Hello,

I have thought about what could happen if abstract text works well and is used in many contexts, also outside the Wikimedia projects. What does that mean for jobs? I asked myself whether abstract text is an innovation that reduces the need for personnel, for example for writing technical instructions. Changes happen, and they are not bad in themselves. I think it is important that there is support for employees who are affected by such a change, and to make sure that they have another job afterwards. From my point of view, Wikifunctions is an important part here, because it can help people learn the skills they need to do other jobs. From my point of view, programming is interesting and can help in many areas. I suggest creating a page with recommendations for potential users of abstract text, to make sure they are aware of the changes this can bring for their employees and that those employees get the knowledge they need. What do you think about that? And what is my responsibility as a volunteer who is interested in this project and plans to participate once it is live? Until now I have read through most of the pages here and tried to understand how it works, to make sure that I do not contribute to increasing unemployment through optimization. --Hogü-456 (talk) 19:55, 17 April 2021 (UTC)

@Hogü-456: Thank you for this thoughtful comment. In a scenario where Abstract Wikipedia and abstract content are so successful as to make a noticeable dent in the market for technical and other translators, they would also create enormous amounts of value for many people, by expanding potential markets and by making more knowledge available to more people. In such a world, creating and maintaining abstract content (or translating natural-language texts into it) would become a new, very valuable skill that creates new job opportunities. Imagine people who create a chemistry or economics textbook in abstract content! How awesome would that be, and the potential that this could unlock!
I actually had an interview with the Tool Box Journal: A Computer Journal For Translation Professionals earlier this year (it is in issue 322, but I cannot find a link to that issue). The interesting part was that, unlike with machine-learning tools for translation, the translator who interviewed me was genuinely interested, because he, as the translator, could really control the content and the presentation, unlike with machine-learned systems. He sounded rather eager and interested in what we are building, and not so much worried about it. I think he also thought that the opportunities, as outlined above, are very big if all of this works at all. --DVrandecic (WMF) (talk) 21:06, 21 April 2021 (UTC)

Recent spam on https://annotation.wmcloud.org/

A few suspicious accounts have been created today, and started creating spam pages on the wiki (see the recent changes log). TomT0m (talk) 18:20, 18 April 2021 (UTC)

Thanks for the heads-up. Quiddity (WMF) (talk) 03:53, 21 April 2021 (UTC)

Regular inflection

Hello,

in the last weeks I have tried to add the inflections (Beugungen) of German nouns that consist of more than one noun as lexemes in Wikidata. In the overview of the phases I have seen that it is part of the second phase to make it possible to automatically create regular inflections. I created a template as a CSV file that helps me extract the possible component words out of a longer word. In the template I extract all possible substrings with a length between 3 and 10 characters. After that I check which of those substrings match a download of the German noun lexemes that exist so far, together with their forms, and I check whether the match reaches to the end of the word. For those words I then extract the part before the last component word (plus the first character of that last word) and additionally match the existing forms from the lexemes against it. For that I have a script in R and a spreadsheet. At the Wikimedia Hackathon I want to work on a script for creating the forms of a noun, starting with German. Has someone created something similar, or have you thought about this also for other languages and what the rules are in those languages?
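For illustration, here is a minimal sketch in Python of the substring-matching step described above (the small word list and the example compounds are made up; a real run would use the downloaded German noun lexemes and their forms):

known_forms = {"Haus", "Tür", "Schloss", "Garten"}  # stand-in for real lexeme forms

def split_compound(word, min_len=3, max_len=10):
    # Return (head, tail) if a known noun form of length 3..10 ends the word.
    for length in range(max_len, min_len - 1, -1):   # prefer the longest match
        tail = word[-length:]
        if len(tail) == length and tail.capitalize() in known_forms:
            return word[:-length], tail
    return None

print(split_compound("Gartentür"))       # ('Garten', 'tür')
print(split_compound("Haustürschloss"))  # ('Haustür', 'schloss')

Once the last component word is identified this way, its lexeme forms can be reused to generate the forms of the whole compound.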

At the Wikimedia Remote Hackathon 2021 I proposed a session, a conversation about how to enable more people to learn coding, also with a focus on the Wikimedia projects. In the Phabricator ticket for it I added a link to the mission statement of Wikifunctions, because I think this is a project that has the goal of making functions accessible to more people. I have several ideas for functions that I think would be helpful, and I plan to publish them in Wikifunctions if I am able to write them. If you are interested, you can attend the conversation.--Hogü-456 (talk) 21:26, 12 May 2021 (UTC)

@Hogü-456: Thanks! These are exactly the kind of functions I hope to see in Wikifunctions. Alas, it is still a bit too early for this year's Hackathon for us to participate properly, i.e. for Wikifunctions to be a target where such functions land. But any such work will be great preparation for what we want to be able to represent in Wikifunctions. We hope to get there later this year, and definitely by next year's Hackathon.
If you were to write your code in Python or JavaScript, then Wikifunctions will soon be ready to accept and execute such functions. I also hope that we will cover R at some point, but currently there is no timeline for that.
If you are looking for another resource that has already implemented inflections for German, there seem to be some code packages that do so. The one that I look to for inspiration is usually Grammatical Framework. The German dictionary is here: http://www.grammaticalframework.org/~john/rgl-browser/#!german/DictGer - you can find more through their Resource Grammar Library: http://www.grammaticalframework.org/~john/rgl-browser/
One way or the other, that's exciting, and I hope that we will be able to incorporate your work once Wikifunctions has reached the point where we can host these functions. Thank you! --DVrandecic (WMF) (talk) 00:43, 14 May 2021 (UTC)
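To make the idea concrete, here is a toy sketch in Python of the kind of inflection function that could eventually be contributed, restricted to one simple pattern only (feminine nouns ending in "-ung", which do not change in the singular and take "-en" in the plural); it is an illustration, not a general German inflector:

def inflect_ung_noun(lemma: str) -> dict:
    # Only handles the regular "-ung" pattern; anything else is out of scope here.
    if not lemma.endswith("ung"):
        raise ValueError("this sketch only handles nouns ending in -ung")
    plural = lemma + "en"
    cases = ["nominative", "genitive", "dative", "accusative"]
    return {
        "singular": {case: lemma for case in cases},
        "plural": {case: plural for case in cases},
    }

print(inflect_ung_noun("Zeitung"))
# all singular forms are "Zeitung", all plural forms are "Zeitungen"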

Idea: Fact-id for each fact?

Can each fact statement get its own ID? That is, make a Wiki of Facts alongside Abstract Wikipedia.

  • Each fact should get its own fact-ID, so that people can share the ID to support claims made in discussions elsewhere, similar to the Z-IDs or Q-IDs of Wikidata. This proposal requests a fact-ID for each statement or fact.
  • It would create a structured facts wiki, in which each page about a topic lists a bulleted list of facts, with each fact having its own ID. References are added to support the claims.
  • It presents facts directly without hiding them in verbal prose. Cut to the chase!
  • Example 1: Each fact statement of a list like Abstract Wikipedia/Examples/Jupiter would get its own ID. Example 2: A page would list facts like w:List of common misconceptions, with each fact in it getting its own ID.
  • This wiki would become the go-to site to learn, find, link, support and verify facts/statements/claims. And over time, the Wiki of Facts would have more reliability and credibility than any other format of knowledge. (elaborated at WikiFacts) -Vis M (talk) 04:54, 27 May 2021 (UTC)
@Vis M: I see you have withdrawn the WikiFacts proposal. The Abstract Wikipedia "Wiki of Facts" would be Wikidata, with each Wikidata statement or claim corresponding to a "fact". There are occasions when an explicit statement ID would be helpful in Wikidata, notably when a different Property is used to represent the same information. However, the Item/Property combination used in Wikidata is aligned to the classical subject–predicate form, which is probably a better starting point for natural-language representations. In particular, it is likely that similar predicates (represented using identical Property IDs) would be rendered in similar language for identifiable classes of subject. Allocating individual subjects (Wikidata Items) to appropriate classes for this purpose is also likely to be achieved within Wikidata, either with an additional Property or within lexical data (according to the factors driving the different linguistic representations).--GrounderUK (talk) 09:17, 19 June 2021 (UTC)
Ok, thanks! Vis M (talk) 14:46, 22 June 2021 (UTC)

Why was this approved so quickly?

Wikispore, Wikijournal, and Wikigenealogy all have potential. Wikifunctions is just another project that will sit near-empty for years like Wikispecies has, and should be part of Wikidata anyway. 2001:569:BD7D:6E00:F9E8:8F6F:25D1:B825 01:38, 19 June 2021 (UTC)

I kind of agree. If lexemes were just added as a separate namespace to Wikidata, why weren't functions? 1234qwer1234qwer4 (talk) 11:41, 19 June 2021 (UTC)
From my point of view, one difference between Wikifunctions and, for example, lexemes is that Wikifunctions will offer computation resources, so that some calculations can be made on the platform directly and it is not necessary to run a function locally. As far as I understand, it can also be used to centralize templates that are currently defined locally in the different language versions of Wikipedia and the other Wikimedia projects. So there may be a technical reason why Wikifunctions is its own sister project.
Wikidata has a lot of content, so I think it can happen that it is not so easy for a user to find something. I sometimes have problems finding lexemes, because I need to change the search so that it also covers the Lexeme namespace. So I hope that it will be easier, also for external users, to find the content if Wikifunctions is its own project. Making sure that it will not sit near-empty for years is, from my point of view, a big challenge. As far as I understand, the project was approved because of the hope that it can help make knowledge accessible in more languages. I do not know whether this will also work for small languages, but I hope that it will, and I think it is important to work on it over the next years so that it can become reality. How to make Wikifunctions accessible to many people is an important question, and I hope that there are more discussions about it in the next weeks. This year's Wikimania is a chance to talk about Wikifunctions also with people who speak small languages.--Hogü-456 (talk) 19:54, 20 June 2021 (UTC)
Yes, as Hogü-456 describes, those are among the reasons why the project is better as a separate wiki, and some of the goals of the project. They require very different types of software back-ends and front-ends to Wikidata, and a significantly different skillset among some of the primary contributors. I might describe it as: Wikidata is focused on storing and serving certain types of structured data, whereas Wikifunctions is focused on running calculations on structured data. There are more overview details in Abstract Wikipedia/Overview that you might find helpful. Quiddity (WMF) (talk) 22:19, 21 June 2021 (UTC)
@Hogü-456: Re: problems searching for Lexemes on Wikidata - If you prefix your searches there with L: then that will search only the Lexeme namespace. E.g. L:apple. :) Quiddity (WMF) (talk) 22:23, 21 June 2021 (UTC)

Decompilation

I am still working on spreadsheet functions, trying to understand them and to write functions that bring them into another programming language. In the last days I have thought about how far it is possible to get at the specific program that is generated for the input I give by entering functions in a spreadsheet, so that the output is generated and printed out in a cell. I want the program as byte code or assembler or in another notation from which it is possible to bring it to other programming languages in an automatic way. Do you think it is possible to turn spreadsheet functions into a program in another programming language by trying to get the binary code and then trying to decompile it? I do not know much about programming, so I do not know whether it is possible. Does one of you have experience with decompiling, and how is it possible to get the byte code that is executed when I enter my own composition of functions in my spreadsheet program?--Hogü-456 (talk) 18:01, 8 July 2021 (UTC)

@Hogü-456: There is a lot of "it depends" in this answer :) If you search the Web for "turn spreadsheet into code" you can find plenty of tools and descriptions helping with that. I have my doubts about these, to be honest. I think turning them into binary code or assembler would probably be an unnecessarily difficult path.
My suggestion would be to turn each individual formula into a function, and then wire the inputs and outputs of the functions together according to the cells. That's probably what these tools do. But I would try to stay at least at the abstraction level of the spreadsheet, and not dive into the binary.
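A minimal sketch of that suggestion in Python (the cell names and formulas are made-up examples, not taken from any particular spreadsheet):

def cell_b1(a1):          # was: =A1*2
    return a1 * 2

def cell_b2(a2):          # was: =A2+10
    return a2 + 10

def cell_c1(b1, b2):      # was: =B1+B2
    return b1 + b2

def recalculate(a1, a2):
    # Evaluate the cells in dependency order, like a spreadsheet recalculation.
    b1 = cell_b1(a1)
    b2 = cell_b2(a2)
    return cell_c1(b1, b2)

print(recalculate(3, 4))  # 20

Each formula stays readable on its own, and the wiring between the functions mirrors the cell references, without ever touching byte code.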
It will be interesting to see how this will play together with Wikifunctions. I was thinking about using functions from Wikifunctions in a spreadsheet - but not the other way around, using a spreadsheet to implement functions in Wikifunctions. That's an interesting idea that could potentially open a path for some people to contribute implementations.
I filed a task to keep your idea in the tracker! --DVrandecic (WMF) (talk) 21:13, 3 September 2021 (UTC)

you are creating just a new language, like any of the existing natural languages. this is wrong.

https://meta.wikimedia.org/wiki/Abstract_Wikipedia :

In Abstract Wikipedia, people can create and maintain Wikipedia articles in a language-independent way. A particular language Wikipedia can translate this language-independent article into its language.

https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Examples :

  Article(
    content: [
      Instantiation(
        instance: San Francisco (Q62),
        class: Object_with_modifier_and_of(
          object: center,
          modifier: And_modifier(
            conjuncts: [cultural, commercial, financial]
          ),
          of: Northern California (Q1066807)
        )
      ),
      Ranking(
        subject: San Francisco (Q62),
        rank: 4,
        object: city (Q515),
        by: population (Q1613416),
        local_constraint: California (Q99),
        after: [Los Angeles (Q65), San Diego (Q16552), San Jose (Q16553)]
      )
    ]
  )

* English : San Francisco is the cultural, commercial, and financial center of Northern California. It is the fourth-most populous city in California, after Los Angeles, San Diego and San Jose.


you are creating a new language just like the other natural languages. and it is worse than the existing natural languages, because it is going to have many functions like "Object_with_modifier_and_of", "Instantiation", "Ranking".

an advantageous feature, compared to natural languages, shown there, is that you link concepts to wikidata, but that can be done also with natural languages. another good thing here is the structure shown with parentheses, but that also can be done with natural languages. so, there is nothing better in this proposed (artificial) language, compared to natural languages.

i think that, probably, any sentence of any natural language is semantically a binary tree like this:

(
	(
		(San Francisco)
		(
			be
			(
				the
				(
					(
						(
							(
								(culture al)
								(
									,
									(commerce ial)
								)
							)
							(
								,
								(
									and
									(finance ial)
								)
							)
						)
						center
					)
					(
						of
						(
							(North ern)
							California
						)
					)
				)
			)
		)
	)
	s
)
.

(
	(
		It 
		(
			be
			(
				(
					the
					(
						(four th)
						(	
							(
								(much est)
								(populous city)
							)
							(in California)
						)
					)
				)
				(
					,
					(
						after
						(
							(
								(Los Angeles)
								(
									,
									(San Diego)
								)
							)
							(
								and
								(San Jose)
							)
						)
					)
				)
			)
		)
	)
	s
)
.

some parts of text can be shown in several ways as binary trees. for example:

((San Francisco) ((be X) s))

(((San Francisco) be X) s)

fourth
(	
	(
		most
		(populous city)
	)
	(in California)
)

(
fourth
most
)
(	
	
	(populous city)
	(in California)
)

(
	fourth
	(	
		(
			most
			(populous city)
		)
	)
)
(in California)

fourth
(	
	(
		(
			most
			populous
		)
		city
	)
	(in California)
)


--QDinar (talk) 12:31, 17 August 2021 (UTC)

creating a new language is a huge effort, and only few people are going to know it. you have to discuss the different limits of every word in it to come to some consensus... and all that work is just to create yet another language that is no better, by its structure, than the existing thousands of natural languages (only the lexicon can be bigger than that of some languages). what you should do instead is just use a format like this for every language and use functions to transform it into the usual form of that language. also, speech synthesis can be done better using the parentheses. and you can transform these formats from language to language. --QDinar (talk) 12:55, 17 August 2021 (UTC)
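For illustration, a minimal sketch in Python of the "use functions to transform it into the usual form of that language" idea (the tree and the rendering rule here are toy simplifications; a real renderer would apply per-language morphology and word-order rules):

def render(node):
    # A leaf is a word or morpheme; an inner node has exactly two children.
    if isinstance(node, str):
        return [node]
    left, right = node
    return render(left) + render(right)

tree = [["San Francisco",
         ["be", ["the", [["cultural", [",", "commercial"]],
                         ["center", ["of", ["Northern", "California"]]]]]]],
        "s"]

print(" ".join(render(tree)))
# San Francisco be the cultural , commercial center of Northern California s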

i think any paragraph, probably, can also be structured into a binary tree, like this, and i make a tree of a mediawiki discussion signature, for the purpose of demonstrating the binary tree concept:

(
	(
		(
			creating a new language is a huge effort, and only few people are going to know it.
			you have to discuss different limits of every word in it to come to some consensus...
		)
		(
			and all that work is just to create just another language no better, by its structure, than existing thousands natural languages.
			(lexicon can be bigger than of some languages).
		)
	)
	(
		(
			(
				what you should do instead is just use a format like this for every language and use functions to transform it to usual form of that language.
				also, speach synthesis can be done better using the parentheses.
			)
			also you can transform this formats from language to language.
		)
	)
)
(
	--
	(
		(
			QDinar
			("()" talk)
		)
		(
			(
				12
				(: 55)
			)
			(
				,
				(
					(
						(17 August)
						2021
					)
					(
						"()"
						(
							(U T)
							C
						)
					)
				)
			)
		)
	)
)

(the regular sentences are intentionally not structured into binary trees in this example). this structure can be useful to better connect sentences via pronouns. and different languages may have different limits and preferences in using one sentence vs several sentences with pronouns. these parentheses may help to (properly) translate those places into other languages. --QDinar (talk) 13:22, 17 August 2021 (UTC)

since this is somewhat out of the scope of the Abstract Wikipedia project, i have submitted it as a project proposal: Structured text. --QDinar (talk) 19:20, 18 August 2021 (UTC)

@Qdinar: Yes, you're right, in a way we are creating a new language. With the difference that we are creating it together and that we are creating tools to work with that language. But no, it is not a natural language, it is a formal language. Natural languages are very hard to parse (the structures that you put into your examples, all these parentheses, were done with a lot of intelligence on your part). The other thing is that a lot of words in natural languages are ambiguous, which makes them hard to translate. The "sprengen" in "Ich sprenge den Rasen" ("I water the lawn") is a very different "sprengen" than the one in "Ich sprenge die Party" ("I break up the party"). That's why we think we need to work with a formal language, in order to avoid these issues. I don't think that you could use a natural language as the starting point for this (although I have a grammar of Ithkuil on my table right now, and that might be an interesting candidate. One could argue whether that's natural, though). --DVrandecic (WMF) (talk) 20:59, 3 September 2021 (UTC)

"Natural languages are very hard to parse (the structures that you put into your examples, all these parentheses, were done with a lot of intelligence on your part)." - i do not agree with this. to write this i just need to know this language and write, that is english and i already know it. binary tree editor could help to make this tree faster. i have also added request for binary tree tool in the Structured text, and additional explanations. in comparison, to write in your language, i need to learn that your language. and this binary tree structure is easier to parse than your complicated language. if you say about parsing from traditional text, then, it is possible to do it, and there is almost zero texts in your language yet. --QDinar (talk) 08:49, 4 September 2021 (UTC)Reply
"The other thing is that a lot of words in natural languages are ambiguous, which makes them hard to translate." - probably, your artificial language is also going to have some ambiguities. because get every meaning and you can divide that meaning into several cases. "The "sprengen" in "Ich sprenge den Rasen" is a very different "sprengen" than the one in "Ich sprenge die Party"." - this example has not been useful example to prove to me. in both cases it is about scattering something, is not it? if somebody causes a party to cancel before even 10% of its people have known about it is going to be held, is this word used? i suspect, it is not used in that case. " although I have a grammar of Ithkuil on my table right now, and that might be an interesting candidate. One could argue whether that's natural, though " - according to wikipedia, it is a constructed language with no users. (ie artificial language). that artificial languages have few users. probably ithkuil have some problems, like limits of meaning are not established. since it has 0 users, when new people start to use it, they are going to change that limits. --QDinar (talk) 09:11, 4 September 2021 (UTC)Reply

Wikifunctions

What will be the URL for the upcoming Wikifunctions website? When is it expected to be completed? 54nd60x (talk) 12:55, 17 August 2021 (UTC)

@54nd60x: The URL will be wikifunctions.org (similar to wikidata.org).
Timeline: Overall, "we'll launch when we're ready". We're hoping to have a version on the mw:Beta Cluster sometime during the next few months, and hoping the "production" version will be ready for initial launch sometime early in the next calendar year. It won't be "feature-complete" at launch, but it will be: stable for adding the initial functions content; locally usable for writing and running functions; and ready for further development. The design and feature-set will steadily grow and adapt based on planned work and on feedback over the coming years. Broader details on the development are at Abstract Wikipedia/Phases, and finer details at the links within. I hope that helps! Quiddity (WMF) (talk) 19:51, 18 August 2021 (UTC)

Logo of Wikifunctions and License

Hello, when will the logo for Wikifunctions be finalized and then published? There was a vote on the favourite logo, and since then I have not heard anything about it. Something I am also interested in is understanding what the license of the published functions will be. Do you know what license will be used for the content in Wikifunctions? Have you talked with the legal team of the Wikimedia Foundation about what they think about it? Please try to give an answer to these questions, as it has been a while since the discussions about these topics.--Hogü-456 (talk) 19:49, 23 August 2021 (UTC)

@Hogü-456: We are working on both, sorry for the delay! We had some meetings with the legal team, and we are still working on the possible options regarding the license. That is definitely a conversation or decision (depending on how the discussion with legal progresses) we still owe to the community. This will likely take two or three months before we get to the next public step.
Regarding the logo, we ran into some complications. It is currently moving forward, but will still take a few weeks (but not months, this should be there sooner).
Thanks for inquiring! We'll keep you posted on both as soon as we have updates! --DVrandecic (WMF) (talk) 20:45, 3 September 2021 (UTC)