
Pinyin: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Stevertigo (talk | contribs)
No edit summary
Stevertigo (talk | contribs)
No edit summary
Line 1: Line 1:
Pinyin ([[w:Pinyin]]) is a Roman-alphabet phonetic rendering of Chinese pronunciation. It uses four diacritical marks over the vowels to represent the four tones (the neutral "5th" tone is left unmarked); the tones distinguish word meaning.
:::<big>āáǎà</big>
A good idea for functionality to add to [[Mediawiki]]:
[http://www.foolsworkshop.com/downloads/pinyintounicode.txt PinyintoUnicode Source (GNU GPL)]: it takes a word like 'Feng1shui3' and converts it to 'Fēngshuǐ'. The input has to be wrapped in delimiting tags, like <nowiki><pinyin>Feng1shui3</pinyin></nowiki>, so that the conversion applies only there.


Correcting improper character sets used to display pinyin should not be an issue: pinyin is not so much a character set as a very limited array of marks over vowels, and Unicode supports it as a standard feature, well incorporated into the standard sets. Still, if pinyin-to-IPA conversion becomes useful at some point, that conversion might require some correction of misused characters. Most problematic is the third tone mark, as in "ě" (a caron), for which a similar but rounder (not sharp) diacritic, the breve as in "ĕ", is sometimes substituted.
[[Pinyinacc.png]]


*See [http://www.foolsworkshop.com/ptou/ Pinyin to Unicode converter] ''This page converts text written in pinyin, with syllable-final tone numbers, into Unicode. Simply enter or paste in the pinyin and convert.''
From [http://www.math.nus.edu.sg/aslaksen/read.shtml Helmer Aslaksen's page on Reading and Writing Pinyin in Unicode]

:'''Warning:''' ''Some older browsers have trouble with hexadecimal numeric character references, so it may be safest to use decimal.''


Latin-1 Supplement - Unicode U+0080 - U+00FF - (128-255)
Line 42: Line 43:
subtract 1 for upper case



[http://www.foolsworkshop.com/ptou/ Pinyin to Unicode converter] [http://www.foolsworkshop.com/downloads/pinyintounicode.txt Source GNU GPL]
From [http://www.math.nus.edu.sg/aslaksen/read.shtml Helmer Aslaksen's page on Reading and Writing Pinyin in Unicode]
:"This page performs a simple function. It converts text written in pinyin, with syllable-final tone numbers, into unicode. The result is displayed both as plain unicode text and as the HTML code necessary to display the unicode in a web page. Simply enter or paste in the pinyin and convert."

:'''Warning:''' ''Some older browsers have trouble with hexadecimal numeric character references, so it may be safest to use decimal.''

Revision as of 20:43, 10 September 2003

Pinyin (w:Pinyin) is a Roman-alphabet phonetic rendering of Chinese pronunciation. It uses four diacritical marks over the vowels to represent the four tones (the neutral "5th" tone is left unmarked); the tones distinguish word meaning.

āáǎà

A good idea for functionality to add to MediaWiki: PinyintoUnicode Source (GNU GPL). It takes a word like 'Feng1shui3' and converts it to 'Fēngshuǐ'. The input has to be wrapped in delimiting tags, like <pinyin>Feng1shui3</pinyin>, so that the conversion applies only there.
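The conversion described above can be sketched in a few lines of Python. This is only an illustration, not the linked PinyintoUnicode source: the function names are hypothetical, the `<pinyin>` tag handling is a plain regular expression rather than a real MediaWiki extension, and accepting "v" as a stand-in for "ü" is an assumption. Tone marks are placed by the usual orthographic rule: "a" or "e" takes the mark if present, "o" takes it in "ou", and otherwise the last vowel does.

```python
import re
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030C", "4": "\u0300"}
VOWELS = "aeiouü"

# One syllable: a run of letters followed by a tone digit (assumed input format).
SYLLABLE = re.compile(r"([A-Za-zÜü]+)([0-5])")
# Hypothetical delimiting tags that isolate the text to convert.
PINYIN_TAG = re.compile(r"<pinyin>(.*?)</pinyin>", re.DOTALL)


def mark_syllable(letters: str, tone: str) -> str:
    """Return the syllable with its tone mark placed on the right vowel."""
    letters = letters.replace("v", "ü").replace("V", "Ü")  # assumption: v means ü
    if tone not in TONE_MARKS:  # 5 (or 0) is the neutral tone: no mark
        return letters
    lower = letters.lower()
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("o")
    else:
        vowel_positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not vowel_positions:
            return letters  # nothing to mark; drop the digit
        pos = vowel_positions[-1]
    marked = letters[:pos + 1] + TONE_MARKS[tone] + letters[pos + 1:]
    # NFC folds letter + combining mark into the precomposed character.
    return unicodedata.normalize("NFC", marked)


def convert_numbered(text: str) -> str:
    """Convert every tone-numbered syllable in text."""
    return SYLLABLE.sub(lambda m: mark_syllable(m.group(1), m.group(2)), text)


def convert_tagged(text: str) -> str:
    """Convert only what sits inside <pinyin>...</pinyin>, dropping the tags."""
    return PINYIN_TAG.sub(lambda m: convert_numbered(m.group(1)), text)
```

Restricting the conversion to the tagged span is what keeps ordinary text containing a letter followed by a digit (a formula such as H2O, say) from being rewritten.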

Correcting improper character sets used to display pinyin should not be an issue: pinyin is not so much a character set as a very limited array of marks over vowels, and Unicode supports it as a standard feature, well incorporated into the standard sets. Still, if pinyin-to-IPA conversion becomes useful at some point, that conversion might require some correction of misused characters. Most problematic is the third tone mark, as in "ě" (a caron), for which a similar but rounder (not sharp) diacritic, the breve as in "ĕ", is sometimes substituted.
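One way such a correction of misused third-tone marks might look is sketched below, assuming the text is known to be pinyin. The function name is hypothetical. The sketch decomposes each character, swaps a breve for a caron when it sits on one of the pinyin vowels, and recomposes; it would also alter legitimate breves in other languages, so it should only be applied to text already identified as pinyin.

```python
import unicodedata

BREVE = "\u0306"   # rounded mark, as in ĕ (not a pinyin tone mark)
CARON = "\u030C"   # pointed mark, as in ě (the third tone)
PINYIN_VOWELS = set("aeiouAEIOU")


def breve_to_caron(text: str) -> str:
    """Replace a breve on a pinyin vowel with the caron of the third tone."""
    out = []
    last_base = ""
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) == 0:
            last_base = ch                 # remember the letter carrying the marks
        elif ch == BREVE and last_base in PINYIN_VOWELS:
            ch = CARON                     # swap the misused mark
        out.append(ch)
    # Recompose so the result uses precomposed characters where they exist.
    return unicodedata.normalize("NFC", "".join(out))
```

Note that the result is NFC-normalized as a whole, so other characters in the text that were stored in decomposed form come back precomposed as well.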

  • See Pinyin to Unicode converter. This page converts text written in pinyin, with syllable-final tone numbers, into Unicode. Simply enter or paste in the pinyin and convert.

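Latin-1 Supplement - Unicode U+0080 - U+00FF - (128-255)
  á = &aacute; = &#225; = &#xE1;
  à = &agrave; = &#224; = &#xE0;
  é = &eacute; = &#233; = &#xE9;
  è = &egrave; = &#232; = &#xE8;
  í = &iacute; = &#237; = &#xED;
  ì = &igrave; = &#236; = &#xEC;
  ó = &oacute; = &#243; = &#xF3;
  ò = &ograve; = &#242; = &#xF2;
  ú = &uacute; = &#250; = &#xFA;
  ù = &ugrave; = &#249; = &#xF9;
  ü = &uuml; = &#252; = &#xFC;
subtract 32 for upper case

Latin Extended-A - Unicode U+0100 - U+017F - (256-383)
  ā = &#257; = &#x101;
  ē = &#275; = &#x113;
  ě = &#283; = &#x11B;
  ī = &#299; = &#x12B;
  ō = &#333; = &#x14D;
  ū = &#363; = &#x16B;
subtract 1 for upper case

Latin Extended-B - Unicode U+0180 - U+024F - (384-591)
  ǎ = &#462; = &#x1CE;
  ǐ = &#464; = &#x1D0;
  ǒ = &#466; = &#x1D2;
  ǔ = &#468; = &#x1D4;
  ǖ = &#470; = &#x1D6;
  ǘ = &#472; = &#x1D8;
  ǚ = &#474; = &#x1DA;
  ǜ = &#476; = &#x1DC;
subtract 1 for upper case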


From Helmer Aslaksen's page on Reading and Writing Pinyin in Unicode

Warning: Some older browsers have trouble with hexadecimal numeric character references, so it may be safest to use decimal.
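The relationship between a character, its decimal and hexadecimal numeric character references, and the upper-case offsets quoted for each block can be checked with a short Python sketch. The helper names are hypothetical; the only facts relied on are the Unicode code points themselves.

```python
def decimal_ncr(ch: str) -> str:
    """Decimal numeric character reference, e.g. the form &#257;."""
    return f"&#{ord(ch)};"


def hex_ncr(ch: str) -> str:
    """Hexadecimal numeric character reference, e.g. the form &#x101;."""
    return f"&#x{ord(ch):X};"


def encode_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with its decimal reference."""
    return "".join(ch if ord(ch) < 128 else decimal_ncr(ch) for ch in text)


def upper_offset(ch: str) -> int:
    """How far below the lower-case code point the upper-case one sits."""
    return ord(ch) - ord(ch.upper())


LATIN_1 = "áàéèíìóòúùü"          # offset 32 in the Latin-1 Supplement
EXTENDED = "āēěīōūǎǐǒǔǖǘǚǜ"      # offset 1 in Latin Extended-A and -B
```

Following the warning above, `encode_non_ascii` emits the decimal form; `hex_ncr` is there only to show the equivalent hexadecimal spelling.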