Joined: June 2006
Posted: Mar. 13 2008,20:14
Quote (stupid_idiot @ Mar. 13 2008,13:06)
I don't think the slight added convenience of tone marks (Chinese pinyin) and diacritics (Indic languages) justifies the major trouble of a new charset (research, testing, everyday use).
If 2 people want to communicate using phonetic representation, I think there is a much simpler solution:
In the case of Chinese, the user can just use numbers (as in "1, 2, 3, 4") to represent tone.
This way, 2 people can communicate in Chinese using pinyin, without the aid of a Chinese font.
This is easier than using tone marks. When using tone marks, the user must know where (on which vowel) to place the tone mark.
This is not required when using numbers:
Wo3 shi4 zhong1 wen2 xue2 sheng1.
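To illustrate the placement rule mentioned above, here is a minimal Python sketch that turns numbered pinyin into tone-marked pinyin. It assumes the usual convention (mark the "a" or "e" if there is one, otherwise the "o" in "ou", otherwise the last vowel). The function names are made up for this example, and a lower-case "v" is treated as an ASCII stand-in for "ü".

```python
import re

# Tone-marked forms of each vowel, indexed by tone number 1-4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable: str, tone: int) -> str:
    """Place the tone mark for `tone` on the correct vowel of one syllable."""
    syllable = syllable.replace("v", "ü")  # assumption: "v" stands in for "ü"
    if tone not in (1, 2, 3, 4):
        return syllable  # tone 5 (neutral) carries no mark
    lower = syllable.lower()
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("o")
    else:
        positions = [i for i, ch in enumerate(lower) if ch in TONE_MARKS]
        if not positions:
            return syllable  # no vowel found, leave the text alone
        idx = positions[-1]  # otherwise the last vowel takes the mark
    marked = TONE_MARKS[lower[idx]][tone - 1]
    if syllable[idx].isupper():
        marked = marked.upper()
    return syllable[:idx] + marked + syllable[idx + 1:]


def numbered_to_marked(text: str) -> str:
    """Convert every 'letters + digit 1-5' run in `text` to tone-marked pinyin."""
    return re.sub(
        r"([A-Za-züÜ]+)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )


print(numbered_to_marked("Wo3 shi4 zhong1 wen2 xue2 sheng1."))
```

With the example sentence above this prints "Wǒ shì zhōng wén xué shēng.", so the numbered form can be typed in plain ASCII and converted later if tone marks are wanted.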
Similarly, 2 people can communicate in Japanese using Romaji, without the aid of a Japanese font.
Indic languages might be harder to transliterate/romanize:
Idea: There are ASCII-only transliteration schemes for Bengali/Hindi/Tamil. But because I don't know any of these languages at all, I don't know whether Bengali/Hindi/Tamil speakers prefer the official schemes (with diacritics), or whether they prefer the ASCII-only schemes (more user-friendly??).
The Indic languages (Bengali, Hindi, Tamil, et cetera) present the greatest difficulty due to their great number of diacritical and other marks.
If I understand correctly, your goal is:
To input [any-language] characters using phonetic representation -- i.e. actual input, rather than transliteration/romanization.
If so, I think the existing combination of pinyin input methods (e.g. SCIM, and others) + Unicode/locale-specific font(s) is the only solution. (Your proposed new charset/font can be used to input other languages, but it cannot display them.)
For romanization: I think we should use ASCII-only methods, rather than create a new charset. In the case of Indic languages (Bengali, Hindi, Tamil, etc), I don't know those languages at all; but in my uninformed opinion, I think the proposed new charset will be of very little use.
If 2 people want to communicate in Tamil:
-- If no Tamil font is installed, then they would use the most compatible way of transliterating Tamil (ASCII??). If so, they would not want to use an obscure charset...?
-- If a Tamil font is installed, then the proposed new font would be redundant?
I put your login in quotes, since it is obviously intended to be ironic, coming from one who responds so intelligently.
Yes, numbers can be used to indicate tones, that being Jyutping, Yale and Cantonese-Pinyin practice. I suspect that many find the 4 tone marks in Beijinghua to be more intuitive, since they are pictures of how the tones rise and fall. Personally, I would struggle through deciphering accented Pinyin, but would pass on the numeric notation unless I had a very strong motivation otherwise.
As for my intention, I'm focused on the first impression, à la the first impression one receives when meeting another person. The goal is to make DSL more universally useful, without bloat.
"Universally useful" is similar to my standard test for new software: boot it up and see if I can use it. If it is intuitive enough that I can do something useful, then I will read the "friendly" manual. I'd like to see a majority of people be able to boot DSL and use it well enough to seek further help and receive it.
"Without bloat", to me, means an initial, small-footprint means of access that doesn't unduly burden the 50MB footprint. One 8-bit font, compared to N Unicode fonts, seems to be more in keeping with the DSL style.
Perhaps just including the Indic glyphs in the 128 - 255 range would be enough. I'm still looking at it.
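As a rough sketch of what "glyphs in the 128 - 255 range" could look like, the Python below defines a purely hypothetical 8-bit charset: bytes 0 - 127 stay ASCII, and bytes 128 - 255 are mapped one-to-one onto the Unicode Devanagari block (U+0900 to U+097F, which happens to be exactly 128 code points). This layout is an assumption made for illustration only; it is not an existing standard and not something DSL ships.

```python
DEVANAGARI_BASE = 0x0900  # first code point of the Unicode Devanagari block
UPPER_HALF = 0x80         # first byte value of the 128 - 255 range


def encode_8bit(text: str) -> bytes:
    """Encode text in the hypothetical charset (ASCII plus Devanagari)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < UPPER_HALF:
            out.append(cp)  # plain ASCII passes through unchanged
        elif DEVANAGARI_BASE <= cp <= DEVANAGARI_BASE + 0x7F:
            out.append(UPPER_HALF + (cp - DEVANAGARI_BASE))
        else:
            raise ValueError(f"U+{cp:04X} does not fit in this 8-bit charset")
    return bytes(out)


def decode_8bit(data: bytes) -> str:
    """Decode bytes from the hypothetical charset back to a Unicode string."""
    return "".join(
        chr(b) if b < UPPER_HALF else chr(DEVANAGARI_BASE + (b - UPPER_HALF))
        for b in data
    )


sample = "DSL \u0939\u093f\u0928\u094d\u0926\u0940"  # "DSL" plus the word "Hindi" in Devanagari
encoded = encode_8bit(sample)
print(len(encoded), decode_8bit(encoded) == sample)
```

In this sketch one script fills the whole upper half, so a character from another script (a Tamil letter, say) raises ValueError. Covering several Indic scripts this way would take either one such table per script or a shared layout in the style of ISCII.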
What would be great is a system that allows someone to change languages and keyboards with a click of a button, as another poster mentioned. The $64,000 question is, "Is that feasible within a 50MB footprint?" John and Robert are the ones to answer that question, and will wisely ignore it until there is something real to assess.