<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <title>Unicode character categories: HarfBuzz Manual</title> <meta name="generator" content="DocBook XSL Stylesheets V1.79.1"> <link rel="home" href="index.html" title="HarfBuzz Manual"> <link rel="up" href="shaping-concepts.html" title="Shaping concepts"> <link rel="prev" href="shaping-operations.html" title="Shaping operations"> <link rel="next" href="text-runs.html" title="Text runs"> <meta name="generator" content="GTK-Doc V1.32 (XML mode)"> <link rel="stylesheet" href="style.css" type="text/css"> </head> <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> <table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="5"><tr valign="middle"> <td width="100%" align="left" class="shortcuts"></td> <td><a accesskey="h" href="index.html"><img src="home.png" width="16" height="16" border="0" alt="Home"></a></td> <td><a accesskey="u" href="shaping-concepts.html"><img src="up.png" width="16" height="16" border="0" alt="Up"></a></td> <td><a accesskey="p" href="shaping-operations.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td> <td><a accesskey="n" href="text-runs.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td> </tr></table> <div class="section"> <div class="titlepage"><div><div><h2 class="title" style="clear: both"> <a name="unicode-character-categories"></a>Unicode character categories</h2></div></div></div> <p> Shaping models are typically specified with respect to how scripts are defined in the Unicode standard. 
</p> <p> Every codepoint in the Unicode Character Database (UCD) is assigned a <span class="emphasis"><em>Unicode General Category</em></span> (UGC), which provides the most fundamental information about the codepoint: whether the codepoint represents a <span class="emphasis"><em>Letter</em></span>, a <span class="emphasis"><em>Mark</em></span>, a <span class="emphasis"><em>Number</em></span>, <span class="emphasis"><em>Punctuation</em></span>, a <span class="emphasis"><em>Symbol</em></span>, a <span class="emphasis"><em>Separator</em></span>, or something else (<span class="emphasis"><em>Other</em></span>). </p> <p> These seven values are the "Major" categories. Each codepoint is further assigned to a "minor" category within its Major category, such as "Letter, uppercase" (<code class="literal">Lu</code>) or "Letter, modifier" (<code class="literal">Lm</code>). </p> <p> Shaping models are concerned primarily with Letter and Mark codepoints. The minor categories of Mark codepoints are particularly important for shaping. Marks can be nonspacing (<code class="literal">Mn</code>), spacing combining (<code class="literal">Mc</code>), or enclosing (<code class="literal">Me</code>). </p> <p> In addition to the UGC property, codepoints in the Indic and Southeast Asian scripts are assigned <span class="emphasis"><em>Unicode Indic Syllabic Category</em></span> (UISC) and <span class="emphasis"><em>Unicode Indic Positional Category</em></span> (UIPC) properties that provide more detailed information needed for shaping. </p> <p> The UISC property sub-categorizes Letters and Marks according to common script-shaping behaviors. For example, UISC distinguishes between consonant letters, vowel letters, and vowel marks. The UIPC property sub-categorizes Mark codepoints by the relative visual position that they occupy (above, below, right, left, or in multiple positions). </p> <p> Some complex scripts require that the text run be split into syllables. 
What constitutes a valid syllable in these scripts is specified by regular expressions that are formed from the Letter and Mark codepoints and take the UISC and UIPC properties into account. </p> </div> <div class="footer"> <hr>Generated by GTK-Doc V1.32</div> </body> </html>
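<p> The Major and minor General Category values described in this section can be inspected with a short script. The following is a minimal sketch, not part of HarfBuzz: it assumes Python's standard <code class="literal">unicodedata</code> module, and the names <code class="literal">MAJOR_NAMES</code>, <code class="literal">general_category</code>, and <code class="literal">marks_in</code> are hypothetical helpers introduced only for illustration. </p>

```python
import unicodedata

# Major category names keyed by the first letter of the two-letter code.
# (Hypothetical lookup table, introduced for this illustration only.)
MAJOR_NAMES = {
    "L": "Letter", "M": "Mark", "N": "Number", "P": "Punctuation",
    "S": "Symbol", "Z": "Separator", "C": "Other",
}


def general_category(ch):
    """Return (Major category name, two-letter minor code) for one codepoint."""
    minor = unicodedata.category(ch)
    return MAJOR_NAMES[minor[0]], minor


def marks_in(text):
    """Return the Mark codepoints in text, each paired with its minor code."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch))
        for ch in text
        if unicodedata.category(ch).startswith("M")
    ]


if __name__ == "__main__":
    # LATIN CAPITAL LETTER A, MODIFIER LETTER SMALL H, DEVANAGARI LETTER KA,
    # DEVANAGARI VOWEL SIGN I, DEVANAGARI SIGN VIRAMA, COMBINING ENCLOSING CIRCLE
    for ch in "A\u02B0\u0915\u093F\u094D\u20DD":
        print(f"U+{ord(ch):04X}", *general_category(ch))
```

<p> The standard <code class="literal">unicodedata</code> module exposes only the General Category. The UISC and UIPC values are published in separate UCD data files (<code class="literal">IndicSyllabicCategory.txt</code> and <code class="literal">IndicPositionalCategory.txt</code>), so a sketch covering them would need to parse those files or rely on another library. </p>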