Dynamically generate Ge'ez unicodes

Question


[image]

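Hello. If you look at the image above, you will see a set of rather strange characters alongside some Latin ones. The strange ones are Eritrean characters; they are the characters we use in my country. To state the problem: I want to create the simplest possible software, or even a batch file if that is possible, that would help me make these characters usable on the web and let a PC understand, display, and type them, just as it does with Arabic, Indic, Chinese... characters. I think that because questions about "creating a language" are rare, or because I may not know the right terminology, when I search the internet for any tutorial, or even a freelancer or anything at all, all I get is... nothing. So I hope someone can give me step-by-step guidance.

Thanks.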

Answer 1

Bri*_*ell 27

Your question asks "how to create a language", so I will describe all the pieces that need to be in place for a new language (or more accurately, writing system). You ask specifically about the Eritrean alphabet, so I will provide specific examples of how that is supported on modern systems, and try to provide you pointers for the pieces you are missing. The answer is long, and provides lots of links, to support the two explanations.

To work with a script like Ge'ez (also known as Ethiopic, the script used to write Amharic in Ethiopia and Tigrinya in Eritrea) you need a few things. The first is a way to encode the characters: a set of numbers, one for each character, that the computer can use to represent the text. Luckily, Unicode has become widespread, and Unicode is designed to be a universal character set that includes all of the world's writing systems. Unicode 3.0 introduced Ethiopic in the range U+1200-U+137F, and later versions added supplements of more obscure characters in the ranges U+1380-U+139F, U+2D80-U+2DDF and U+AB00-U+AB2F. If you wanted to support a script that Unicode didn't yet support, you would either need to use the private use area and define your own mapping of characters to code points, or submit a proposal to have your script added to Unicode; for example, see the proposal for Ethiopic.
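As a rough illustration of what "a number for each character" means in practice, here is a minimal Python sketch that reports the code point, the Unicode name, and the Ethiopic block of each character in a string. The function names are made up for this example, and the block boundaries are taken from the Unicode code charts rather than from any library.

```python
import unicodedata

# Block boundaries as published in the Unicode code charts (inclusive).
ETHIOPIC_BLOCKS = [
    (0x1200, 0x137F, "Ethiopic"),
    (0x1380, 0x139F, "Ethiopic Supplement"),
    (0x2D80, 0x2DDF, "Ethiopic Extended"),
    (0xAB00, 0xAB2F, "Ethiopic Extended-A"),
]


def ethiopic_block(ch):
    """Return the name of the Ethiopic block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ETHIOPIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


def describe(text):
    """List (code point, Unicode name, Ethiopic block) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), ethiopic_block(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for row in describe("\u1200\u1208A"):
        print(row)
```

Latin letters such as "A" come back with a block of None, which is one simple way to tell Ethiopic text apart from the surrounding Latin text.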

Now, Unicode is just a character set; an abstract mapping between characters and numbers. To actually transmit these characters as a sequence of bytes, you use a character encoding. There are many encodings; some of them, like ASCII and ISO-8859-1 only cover a subset of the full Unicode character set, while others, like UTF-8 and UTF-16, cover the full range. For documents on the web, UTF-8 is the recommended character encoding; you should never use anything else if you can help it. In UTF-8, you can write Ge'ez directly in the document, for example: ትግርኛ. One thing to watch out for is that some programs (especially on Windows) will offer you "Unicode" as an encoding, when they mean UTF-16; you want to make sure to choose UTF-8, as it's more efficient and more compatible with a wider variety of software.
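To make the difference between a character set and an encoding concrete, the following Python sketch encodes the same text with several codecs and reports how many bytes each produces. The helper name `encoded_sizes` is invented for this example; the codec names are the standard Python ones.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes per codec, or None if the codec
    cannot represent the text at all."""
    sizes = {}
    for codec in ("utf-8", "utf-16-le", "ascii", "iso-8859-1"):
        try:
            sizes[codec] = len(text.encode(codec))
        except UnicodeEncodeError:
            # ASCII and ISO-8859-1 have no way to represent Ethiopic.
            sizes[codec] = None
    return sizes


if __name__ == "__main__":
    print("\u1200".encode("utf-8"))          # the three UTF-8 bytes of U+1200
    print(encoded_sizes("\u1200"))           # a single Ethiopic syllable
    print(encoded_sizes("<p>\u1200</p>"))    # the same syllable inside markup
```

A lone Ethiopic character takes three bytes in UTF-8 and two in UTF-16, but once it is wrapped in HTML markup, which is all ASCII, the UTF-8 form is usually the smaller one.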

If you are using encodings that don't cover the full range of Unicode, or you don't have a good way to type those characters, and you are writing HTML or XML, you can use numeric character references instead. To do this, you write the Unicode code point of the character you want to refer to between &# and ;. You can write the number in decimal, or in hexadecimal prefixed with an x. For example, ሀ (U+1200) can be written &#4608; or &#x1200; (the semicolon at the end is important; it wasn't working for you in the comments because you were missing it).
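If you need to produce such references for a whole string, a short script can do it. This is only a sketch: `to_ncr` is a name invented here, and it deliberately leaves ASCII untouched, so it does not escape characters like & or < for you. `html.unescape` from the Python standard library is used to convert the references back.

```python
import html


def to_ncr(text, hexadecimal=False):
    """Replace every non-ASCII character with an HTML numeric character reference."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            parts.append(ch)              # plain ASCII is passed through unchanged
        elif hexadecimal:
            parts.append(f"&#x{cp:X};")   # hexadecimal form, e.g. &#x1200;
        else:
            parts.append(f"&#{cp};")      # decimal form, e.g. &#4608;
    return "".join(parts)


if __name__ == "__main__":
    print(to_ncr("\u1200"))                     # decimal reference
    print(to_ncr("\u1200", hexadecimal=True))   # hexadecimal reference
    print(html.unescape("&#4608;"))             # back to the character
```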

Now that you have a character set, and a way of encoding it, you need a way to display it. Some scripts are easier to display than others. For all scripts, you need a font; a file defining how each character looks. A font contains a collection of glyphs, or drawings of each character. Some scripts, like the Latin alphabet (the alphabet used for English and most European languages) are relatively simple; each character is a separate glyph, and how they are drawn doesn't depend on what characters come before or after (though diacritics and ligatures can make it a little more complicated). Others, like Arabic and Indic scripts, are written in cursive, where letters join to each other, so how they are drawn can depend on the characters near them. These scripts require special rendering support like Uniscribe or DirectWrite on Windows, Pango on Linux, or advanced font technology like Apple Advanced Typography or Graphite.

Luckily, Ge'ez is a fairly simple writing system that doesn't require any specialized rendering support or advanced font systems. Each of the characters is a separate glyph, and it doesn't require any reordering. So a normal OpenType font, displayed with the rendering systems already available on most computers, will do the job. But you still need the font in order to be able to display the characters. To create your own font, you can use FontForge (a free/open source tool), Fontographer, FontLab Studio, or other similar software.

For Ethiopic, you don't need to create your own. There are numerous fonts available that include the Ethiopic characters, but one that I would recommend is Abyssinica SIL from SIL (the Summer Institute of Linguistics), which does a lot of great work for minority languages and writing systems. Their fonts are available under a free license that allows you to use, redistribute, and modify the font, so they are quite flexible and can be used in a wide variety of situations. Windows has shipped with Nyala, which includes Ethiopic characters, since Windows Vista, and with Ebrima, which added support for Ethiopic characters in Windows 8; so users of Windows Vista or later should be able to view Ethiopic characters. As of 10.6, Mac OS X ships with Kefa.

Once you have the font, you will be able to view Ethiopic characters. But other people reading your documents might not have those fonts (if they are using an older version of Windows or Mac OS X, if they didn't install all of the fonts that came with Windows, or the like), in which case the characters will probably show up as boxes or question marks on their machine. You could give those people a redistributable font like Abyssinica SIL, or they could buy a font that includes Ethiopic characters, but that can be inconvenient. For working with word processor documents or plain text, that's probably the best you can do; they will need the font installed on their computer to be able to display the text. If you create a PDF on your computer, it should embed the fonts that it needs to display the text, so creating a PDF can be a convenient way to include uncommon fonts with your document.

On a web page, you can use web fonts to link to a font from your stylesheet, allowing the user's web browser to load that font for that web page. Web fonts are supported all the way back to IE 6, and in recent versions of most other web browsers, so they are actually quite widely supported. Different web browsers support different font file formats (EOT, TTF, OpenType, SVG, and WOFF), and slightly different syntaxes for the CSS (older versions of IE are based on an older draft), so it can be a bit tricky to make a page that is compatible with all browsers. Luckily, people have automated that process. Some web fonts are available online from Google Web Fonts or FontSquirrel, but sadly, I couldn't find any Ethiopic fonts already hosted. However, you can upload a font to FontSquirrel, and it will convert it into all of the major formats, and provide example CSS that will work on all modern browsers. Note that you should only do this with fonts that allow web embedding; not all fonts do. Since Abyssinica SIL is available under the Open Font License, you can use it, and I've run it through FontSquirrel for you; you can see how it works (check out the Glyphs & Languages tab), or download the kit. To use it, just put the font files (.ttf, .eot, .svg, and .woff) on your server in the same directory as your CSS, and include the following in your CSS:

@font-face {
    font-family: 'abyssinica_silregular';
    src: url('abyssinicasil-r.eot');
    src: url('abyssinicasil-r.eot?#iefix') format('embedded-opentype'),
         url('abyssinicasil-r.woff') format('woff'),
         url('abyssinicasil-r.ttf') format('truetype'),
         url('abyssinicasil-r.svg#abyssinica_silregular') format('svg');
    font-weight: normal;
    font-style: normal;
}


Now that you know how to encode Ethiopic, view Ethiopic characters, and share documents containing Ethiopic characters, you are probably going to want to type them into documents. If you are using HTML, you could just type the numeric character reference described above. In other documents, you could just copy and paste the characters from a chart of all of them, like the Wikipedia page. But that would become pretty cumbersome. Depending on your system and settings, you can also use Unicode Hex Input to enter arbitrary Unicode characters, but that is also cumbersome.

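To fully support typing a script on a computer, you need a keyboard layout or an input method. Some scripts can be typed using a simple keyboard layout, which specifies which keys correspond to which characters. If the script has more characters than there are keys on the keyboard, Shift and Alt (or Option on the Mac) can be used to map to more characters. Dead keys can also be used to extend the range of characters you can type; a dead-key sequence is two or more keystrokes that produce a single glyph; for example, on Mac OS X, to type "á" you type Option-E A. To create a keyboard layout on Windows, you can use the Microsoft Keyboard Layout Creator. Mac OS X uses an XML format for keyboard layouts; you can write one directly, or create one more easily with SIL's Ukelele. On systems that use X11 (like Linux), you can create your own XKB layout.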
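The dead-key idea can be sketched as a tiny state machine in Python: a dead key produces nothing on its own and instead changes what the next keystroke produces. The table and function names below are hypothetical, and the "´" symbol simply stands in for whatever keystroke a real layout designates as the acute-accent dead key (Option-E in the Mac example).

```python
# Hypothetical dead-key table: (dead key, following key) -> composed character.
DEAD_KEY_TABLE = {
    ("´", "a"): "á",
    ("´", "e"): "é",
    ("´", "o"): "ó",
}
DEAD_KEYS = {dead for dead, _ in DEAD_KEY_TABLE}


def type_keys(keystrokes):
    """Turn a sequence of keystrokes into text, resolving dead-key pairs."""
    output = []
    pending = None  # a dead key waiting for the keystroke that follows it
    for key in keystrokes:
        if pending is not None:
            # Compose if the pair is known, otherwise emit both keys as typed.
            output.append(DEAD_KEY_TABLE.get((pending, key), pending + key))
            pending = None
        elif key in DEAD_KEYS:
            pending = key
        else:
            output.append(key)
    if pending is not None:
        output.append(pending)  # a trailing dead key is emitted on its own
    return "".join(output)


if __name__ == "__main__":
    print(type_keys(["c", "a", "f", "´", "e"]))
```

A real layout file (MSKLC, Ukelele, or XKB) expresses the same table declaratively; the operating system supplies the state machine.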

If you need more characters than can be supported with modifiers and dead keys, like typing Chinese or Japanese, then you need a full-fledged input method. An input method allows you to run arbitrary code to map what someone types into the text it produces; for example, in a Japanese input method, you may type a phonetic representation of what you are writing, and it will show you a drop-down list of possible characters that match that representation, allowing you to choose the appropriate ones. Windows provides the Input Method Manager for writing input methods, Mac OS X the Input Method Kit, and X11 has a few ways to do it, such as SCIM and iBus.

The standard input method for Ethiopic makes extensive use of dead keys. It looks like the most popular existing input method for Ethiopic is Keyman, which is a commercial input method that works on Mac and Windows, and in addition there's a free variant, KMFL, that works on Linux. SIL has keyboard downloads for this input method; they also have a keyboard layout for Mac OS X which uses dead keys to achieve the same thing. Mac OS X has more extensive dead key support, so it doesn't require an input method to support this form of input, while on Windows you need to use an input method like Keyman to be able to enter input this way. Google has a free input method for Windows, Google Input Tools for Windows, which supports Amharic, and allows you to customize its input schemes; you could try adapting their Amharic support for Tigrinya.

If you just need to support input on a web site, you could do this in JavaScript, by writing an input method in JavaScript that transliterates from what someone types into Ethiopic. I do not know of any existing frameworks for doing this; however, I have found Korean and Japanese input methods implemented in JavaScript. You could take a look at how those are implemented. Upon looking further, I've found that Tavultesoft, who make Keyman, also have KeymanWeb, a JavaScript based input method that you can buy and embed in your site. MediaWiki also has an input method extension Narayam, that includes a JavaScript based input method for MediaWiki based sites like Wikipedia, which includes an experimental Amharic input method. There is also a draft W3C IME API, which helps provide an interface between web apps and native IMEs, as well as JavaScript based IMEs. Given that it's still a draft, I don't know if it is yet supported anywhere.
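The core of such a transliterating input method is small. The sketch below is in Python for brevity; on a web page the same logic would be ported to JavaScript and hooked to key events. It relies on the way Ethiopic is laid out in Unicode: each consonant row starts at a base code point, and the vowel forms ("orders") follow at fixed offsets. The Latin key scheme here is a simplified, hypothetical one made up for this example; it is not the Keyman or SIL layout, and it covers only a handful of consonants.

```python
# Base code point (first order) of a few consonant rows in the Ethiopic block.
CONSONANT_BASE = {
    "h": 0x1200, "l": 0x1208, "m": 0x1218, "r": 0x1228, "s": 0x1230,
    "b": 0x1260, "t": 0x1270, "n": 0x1290, "g": 0x1308,
}
# Hypothetical vowel keys -> offset of the vowel order within a row.
VOWEL_OFFSET = {"e": 0, "u": 1, "i": 2, "a": 3, "E": 4, "o": 6}
SIXTH_ORDER = 5  # used when a consonant is typed with no vowel after it


def transliterate(latin):
    """Convert Latin keystrokes to Ethiopic syllables (simplified sketch)."""
    out = []
    i = 0
    while i < len(latin):
        base = CONSONANT_BASE.get(latin[i])
        if base is None:
            out.append(latin[i])  # pass through anything we do not know
            i += 1
            continue
        following = latin[i + 1] if i + 1 < len(latin) else ""
        if following in VOWEL_OFFSET:
            out.append(chr(base + VOWEL_OFFSET[following]))
            i += 2
        else:
            out.append(chr(base + SIXTH_ORDER))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("selam"))
```

A complete scheme would also need the remaining consonant rows, labialized forms, punctuation, and numerals, which is where an existing keyboard definition saves a lot of work.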

With all the above (a character set, encoding, fonts, rendering support, and an input method), you will be able to create, share, and view documents in your script. If that's all you need, great; the above will allow you to work with documents in a given script. But for full support for a language on your computer, not just its script or writing system, there are two more pieces that you need: a locale, and your software to be localized (translated and adapted) for your language.

A locale specifies how programs should manipulate text in a given script, language, culture, and/or encoding. There are many common text processing operations that programs do: displaying numbers, displaying dates and times, sorting strings or names, and so on. How these should work can differ based on the language, script, and culture of the person using the program; for instance, in Swedish "ü" is sorted along with "y", while in English and German it's sorted along with "u". Differences may not be based on language: both Mexico and Spain use Spanish, but in Mexico numbers are displayed with . as the decimal separator (1½ is written "1.5"), while in Spain , is used as the decimal separator (1½ is written "1,5"). A locale specifies all of these rules. Because the locale can vary based on language, culture, and sometimes other factors, the language and country are usually used to specify the locale, and other information can be used as well.

The most widely used standard for naming locales is BCP 47 (RFC 4646, since superseded by RFC 5646). Locales are usually specified as "ln-CC" with the language code ln and country code CC: US English is en-US, British English is en-GB, and French in France is fr-FR. If more information needs to be specified, it can be included. For instance, Serbian can be written with either Latin or Cyrillic, and so Serbian in Serbia can be either sr-Latn-RS or sr-Cyrl-RS. Tigrinya in Eritrea is written ti-ER.
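To show how these tags break down, here is a deliberately simplified Python sketch that splits a tag into language, script, and region by the length of each subtag. The function name is invented for this example, and the parser ignores the rest of the BCP 47 grammar (variants, extensions, private-use subtags), so treat it as an illustration rather than a conforming implementation.

```python
def parse_locale_tag(tag):
    """Split a simple locale tag into language, script, and region subtags."""
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "script": None, "region": None}
    for sub in subtags[1:]:
        is_script = len(sub) == 4 and sub.isalpha()
        is_region = (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit())
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = sub.title()    # scripts are written like "Latn"
        elif is_region and result["region"] is None:
            result["region"] = sub.upper()    # regions are written like "ER"
        # Anything else (variants, extensions) is ignored in this sketch.
    return result


if __name__ == "__main__":
    print(parse_locale_tag("ti-ER"))
    print(parse_locale_tag("zh-Hant-TW"))
```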

There are a variety of different formats for defining the rules that a particular locale has. Windows uses NLP files, a custom format that can be created with Microsoft Locale Builder. POSIX (

@Eritrea Glad to help; enjoy reading it in the morning, and let me know if you have any questions. (2 upvotes)

Archived:	13 years ago
Views:	6482
Last activity:	6 years, 6 months ago