Machine translation

From Meta, a Wikimedia project coordination wiki
Revision as of 21:19, 16 October 2005


Translate this page!

Help us develop tools for translating Wikipedia.

The purpose of the Wikipedia Machine Translation Project is to develop ideas, methods and tools that can help translate Wikipedia articles from one language to another (particularly out of English and into languages with small numbers of fluent speakers).

==Motivation==
Small languages cannot produce articles as fast as the English Wikipedia because their number of Wikipedians is too low. One solution to this problem is translating the English Wikipedia, but some languages will not have enough translators either. Machine translation can improve the productivity of the community.

Manual translation can still be added later, to produce a more accurate text.

==TradWiki/WikiTran==

TradWiki/WikiTran (other proposed names: WikipediaTranslator, WikiTranslator, BabelWiki) is a wiki that will be built to help Wikipedians translate articles from English into other languages.

*I rather like '''WikiTran''' myself. --[[user:Stephen Gilbert|Stephen Gilbert]]
::I prefer Wikibabel, in a similar way to WIKIpedia, WIKIspecies and so on.
::How about Wikitongues?

===License===

All code and data should be released under a free licence ([[GPL]] for code, [[GFDL]] for text).

===Advantages===

*faster translation of Wikipedia
*generation of large amounts of useful data (corpora)
*creation of a useful tool

==TradWiki/WikiTran - Translation memory approach==

A translation memory is a computer program that uses a database of previous translations to help a human translator. If this approach is followed, WikipediaTranslator will need the following features:

*side-by-side display of the original and translated versions
*splitting of the original text into segments that can be translated individually
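The translation-memory approach could look roughly like this (a minimal sketch; the segmentation rule and in-memory dictionary are illustrative assumptions, not the project's actual design):

```python
# Minimal translation-memory sketch: split the original text into
# segments, reuse stored translations where available, and leave the
# rest for a human translator. A real system would use a persistent,
# community-edited database instead of a dict.

import re

# Previously translated segments (illustrative sample data).
memory = {
    "Welcome to Wikipedia.": "Bienvenido a Wikipedia.",
    "Edit this page.": "Edita esta página.",
}

def split_segments(text):
    """Split the original text into sentence-sized segments."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def translate_with_memory(text):
    """Return (original, translation) pairs for side-by-side display."""
    return [
        (segment, memory.get(segment, "[needs human translation]"))
        for segment in split_segments(text)
    ]

for original, translated in translate_with_memory(
    "Welcome to Wikipedia. This sentence is new."
):
    print(original, "->", translated)
```

Segments with no exact match fall through to the human translator; a fuller sketch would also offer fuzzy matches from similar past segments.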

==Lexical, syntactic and semantic analysis of Wikipedia content==

The first step toward Wikipedia translation is an analysis of Wikipedia's content. This analysis will determine:

*the number of words and sentences
*the distribution of words
*the frequency of the most common sentences and expressions
*semantic relations between words and between sentences
*a syntactic analysis of all sentences
:It would be interesting if the user could click on any word in an article to jump to its Wiktionary definition when there is no corresponding Wikipedia article, and could tell the software (for example, with a right mouse click) to translate the word into another language.

Information about the most common sentences and expressions can be used to create a translation database of such expressions, so translators do not need to repeat a translation.

:Yes, a database of idioms.
::You mean like a [[w:translation memory]] system?
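The counting and frequency steps listed above can be sketched as follows (a toy illustration; analysing the real Wikipedia would require dump parsing, proper tokenization, and parsing tools for the syntactic and semantic steps):

```python
# Sketch of the first analysis step: count words and sentences and
# rank the most frequent words and sentences in a piece of text.
# The sample text below is an illustrative assumption.

import re
from collections import Counter

def analyse(text):
    """Return basic word and sentence statistics for a text."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "word_frequency": Counter(words),
        "sentence_frequency": Counter(sentences),
    }

stats = analyse("The cat sat. The cat ran. The dog sat.")
print(stats["word_count"])       # 9
print(stats["sentence_count"])   # 3
```

The frequency tables produced this way are exactly what the expression-translation database described above would be seeded from.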

Resources:

*Translation rules
*Code
:Unfortunately, none of these projects seems to have been updated since around 2003.
:I like the idea of using traduki. One can use traduki keys to establish relations between words in different languages; e.g. ''hundo'' is the key for en:dog, es:perro and so on. So, going to ''hundo'', you can add a translation in another language without adding language links in the es:perro article, for example.
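The key-based idea in the comment above could be modelled like this (a toy sketch; the data layout is an illustrative assumption, not how traduki actually stores its data):

```python
# Language-independent translation keys: one key (here the Esperanto
# word "hundo") links the same concept across languages, so adding a
# new translation means adding one entry under the key rather than
# editing the interlanguage links of every article.

translations = {
    "hundo": {"en": "dog", "es": "perro"},
}

def add_translation(key, lang, word):
    """Register a word for a language under a concept key."""
    translations.setdefault(key, {})[lang] = word

def lookup(key, lang):
    """Return the word for a language, or None if not yet translated."""
    return translations.get(key, {}).get(lang)

# Extend the key without touching the en:dog or es:perro entries.
add_translation("hundo", "fr", "chien")
print(lookup("hundo", "fr"))   # chien
```

The design choice is the same one the comment describes: translations attach to the shared key, so no per-article link lists need editing.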


==Links==

References:

==Discussion==

See the [[talk:Wikipedia Machine Translation Project]] page.