Hexadecimal string representation of a code point
Same as toHexString
but without the U+
prefix.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Hexadecimal string representation of a code point
Prefix U+
followed by at least four uppercase hexadecimal digits
(e.g. U+0123
and U+4B0A1
but neither U+123
nor U+4b0a1
).
Equations
- Unicode.toHexString code = "U+" ++ Unicode.toHexStringAux code
Instances For
Get code point from hexadecimal string representation
For convenience, the U+
prefix may be omitted and lowercase hexadecimal
digits are accepted.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get value of hex digit
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get code point from hexadecimal string representation
For convenience, the U+
prefix may be omitted and lowercase hexadecimal
digits are accepted.
Equations
- One or more equations did not get rendered due to their size.
Instances For
General Category #
Major general category (L
, M
, N
, P
, S
, Z
, C
)
Unicode property: General_Category
- letter : MajorGeneralCategory
(
L
) Letter - mark : MajorGeneralCategory
(
M
) Mark - number : MajorGeneralCategory
(
N
) Number - punctuation : MajorGeneralCategory
(
P
) Punctuation - symbol : MajorGeneralCategory
(
S
) Symbol - separator : MajorGeneralCategory
(
Z
) Separator - other : MajorGeneralCategory
(
C
) Other
Instances For
Equations
- Unicode.instDecidableEqMajorGeneralCategory x✝ y✝ = if h : x✝.toCtorIdx = y✝.toCtorIdx then isTrue ⋯ else isFalse ⋯
String abbreviation for major general category
Equations
- Unicode.MajorGeneralCategory.letter.toAbbrev = "L"
- Unicode.MajorGeneralCategory.mark.toAbbrev = "M"
- Unicode.MajorGeneralCategory.number.toAbbrev = "N"
- Unicode.MajorGeneralCategory.punctuation.toAbbrev = "P"
- Unicode.MajorGeneralCategory.symbol.toAbbrev = "S"
- Unicode.MajorGeneralCategory.separator.toAbbrev = "Z"
- Unicode.MajorGeneralCategory.other.toAbbrev = "C"
Instances For
Minor general category
Unicode property: General_Category
- casedLetter : MinorGeneralCategory MajorGeneralCategory.letter
- uppercaseLetter : MinorGeneralCategory MajorGeneralCategory.letter
(
Lu
) uppercase letter - lowercaseLetter : MinorGeneralCategory MajorGeneralCategory.letter
(
Ll
) lowercase letter - titlecaseLetter : MinorGeneralCategory MajorGeneralCategory.letter
(
Lt
) titlecase letter: digraphic character, with first part uppercase - modifierLetter : MinorGeneralCategory MajorGeneralCategory.letter
(
Lm
) modifier letter - otherLetter : MinorGeneralCategory MajorGeneralCategory.letter
(
Lo
) other letters, including syllables and ideographs - nonspacingMark : MinorGeneralCategory MajorGeneralCategory.mark
(
Mn
) nonspacing combining mark (zero advance width) - spacingMark : MinorGeneralCategory MajorGeneralCategory.mark
(
Mc
) spacing combining mark (positive advance width) - enclosingMark : MinorGeneralCategory MajorGeneralCategory.mark
(
Me
) enclosing combining mark - decimalNumber : MinorGeneralCategory MajorGeneralCategory.number
(
Nd
) decimal digit - letterNumber : MinorGeneralCategory MajorGeneralCategory.number
(
Nl
) letter number: a letterlike numeric character - otherNumber : MinorGeneralCategory MajorGeneralCategory.number
(
No
) numeric character of other type - connectorPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Pc
) connecting punctuation mark, like a tie - dashPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Pd
) dash or hyphen punctuation mark - groupPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
- openPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Ps
) opening punctuation mark (of a pair) - closePunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Pe
) closing punctuation mark (of a pair) - quotePunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
- initialPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Pi
) initial quotation mark - finalPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Pf
) final quotation mark - otherPunctuation : MinorGeneralCategory MajorGeneralCategory.punctuation
(
Po
) punctuation mark of other type - mathSymbol : MinorGeneralCategory MajorGeneralCategory.symbol
(
Sm
) symbol of mathematical use - currencySymbol : MinorGeneralCategory MajorGeneralCategory.symbol
(
Sc
) currency sign - modifierSymbol : MinorGeneralCategory MajorGeneralCategory.symbol
(
Sk
) non-letterlike modifier symbol - otherSymbol : MinorGeneralCategory MajorGeneralCategory.symbol
(
So
) symbol of other type - spaceSeparator : MinorGeneralCategory MajorGeneralCategory.separator
(
Zs
) space character (of various non-zero widths) - lineSeparator : MinorGeneralCategory MajorGeneralCategory.separator
(
Zl
) line separator (U+2028 LINE SEPARATOR only) - paragraphSeparator : MinorGeneralCategory MajorGeneralCategory.separator
(
Zp
) paragraph separator (U+2029 PARAGRAPH SEPARATOR only) - control : MinorGeneralCategory MajorGeneralCategory.other
(
Cc
) C0 or C1 control code - format : MinorGeneralCategory MajorGeneralCategory.other
(
Cf
) format control character - surrogate : MinorGeneralCategory MajorGeneralCategory.other
(
Cs
) surrogate code point - privateUse : MinorGeneralCategory MajorGeneralCategory.other
(
Co
) private-use character - unassigned : MinorGeneralCategory MajorGeneralCategory.other
(
Cn
) reserved unassigned code point or a noncharacter
Instances For
General category of a code point
Unicode property: General_Category
- major : MajorGeneralCategory
Major general category of a code point
- minor : Option (MinorGeneralCategory self.major)
Minor general category of a code point
Instances For
Equations
- Unicode.instInhabitedGeneralCategory = { default := { major := default, minor := default } }
General category: letter (L
)
Equations
- Unicode.GeneralCategory.L = { major := Unicode.MajorGeneralCategory.letter, minor := none }
Instances For
General category: cased letter (LC
)
Equations
- Unicode.GeneralCategory.LC = { major := Unicode.MajorGeneralCategory.letter, minor := some Unicode.MinorGeneralCategory.casedLetter }
Instances For
General category: other letter (Lo
)
Equations
- Unicode.GeneralCategory.Lo = { major := Unicode.MajorGeneralCategory.letter, minor := some Unicode.MinorGeneralCategory.otherLetter }
Instances For
General category mark (M
)
Equations
- Unicode.GeneralCategory.M = { major := Unicode.MajorGeneralCategory.mark, minor := none }
Instances For
General category: nonspacing combining mark (Mn
)
Equations
- Unicode.GeneralCategory.Mn = { major := Unicode.MajorGeneralCategory.mark, minor := some Unicode.MinorGeneralCategory.nonspacingMark }
Instances For
General category: spacing combining mark (Mc
)
Equations
- Unicode.GeneralCategory.Mc = { major := Unicode.MajorGeneralCategory.mark, minor := some Unicode.MinorGeneralCategory.spacingMark }
Instances For
General category: enclosing combining mark (Me
)
Equations
- Unicode.GeneralCategory.Me = { major := Unicode.MajorGeneralCategory.mark, minor := some Unicode.MinorGeneralCategory.enclosingMark }
Instances For
General category: number (N
)
Equations
- Unicode.GeneralCategory.N = { major := Unicode.MajorGeneralCategory.number, minor := none }
Instances For
General category: decimal digit (Nd
)
Equations
- Unicode.GeneralCategory.Nd = { major := Unicode.MajorGeneralCategory.number, minor := some Unicode.MinorGeneralCategory.decimalNumber }
Instances For
General category: letter number (Nl
)
Equations
- Unicode.GeneralCategory.Nl = { major := Unicode.MajorGeneralCategory.number, minor := some Unicode.MinorGeneralCategory.letterNumber }
Instances For
General category: other number (No
)
Equations
- Unicode.GeneralCategory.No = { major := Unicode.MajorGeneralCategory.number, minor := some Unicode.MinorGeneralCategory.otherNumber }
Instances For
General category: punctuation (P
)
Equations
- Unicode.GeneralCategory.P = { major := Unicode.MajorGeneralCategory.punctuation, minor := none }
Instances For
General category: symbol (S
)
Equations
- Unicode.GeneralCategory.S = { major := Unicode.MajorGeneralCategory.symbol, minor := none }
Instances For
General category: math symbol (Sm
)
Equations
- Unicode.GeneralCategory.Sm = { major := Unicode.MajorGeneralCategory.symbol, minor := some Unicode.MinorGeneralCategory.mathSymbol }
Instances For
General category: other symbol (So
)
Equations
- Unicode.GeneralCategory.So = { major := Unicode.MajorGeneralCategory.symbol, minor := some Unicode.MinorGeneralCategory.otherSymbol }
Instances For
General category: separator (Z
)
Equations
- Unicode.GeneralCategory.Z = { major := Unicode.MajorGeneralCategory.separator, minor := none }
Instances For
General category: other (C
)
Equations
- Unicode.GeneralCategory.C = { major := Unicode.MajorGeneralCategory.other, minor := none }
Instances For
General category: control (Cc
)
Equations
- Unicode.GeneralCategory.Cc = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.control }
Instances For
General category: format (Cf
)
Equations
- Unicode.GeneralCategory.Cf = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.format }
Instances For
General category: surrogate (Cs
)
Equations
- Unicode.GeneralCategory.Cs = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.surrogate }
Instances For
General category: private use (Co
)
Equations
- Unicode.GeneralCategory.Co = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.privateUse }
Instances For
General category: unassigned (Cn
)
Equations
- Unicode.GeneralCategory.Cn = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.unassigned }
Instances For
String abbreviation for general category
Instances For
Get general category from string abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get general category from string abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Numeric Type and Value #
Unicode numeric type
Unicode properties: Numeric_Type
, Numeric_Value
- decimal
(value : Fin 10)
: NumericType
Decimal digit type and value
- digit
(value : Fin 10)
: NumericType
Digit type and value
- numeric
(num : Int)
(den : Option Nat)
: NumericType
Numeric type and value
Instances For
Equations
- Unicode.instInhabitedNumericType = { default := Unicode.NumericType.decimal default }
Equations
- Unicode.instReprNumericType = { reprPrec := Unicode.reprNumericType✝ }
Decimal digit type
The character is part of a sequence of contiguous code points representing decimal digits 0 through 9.
Unicode property: Numeric_Type
Equations
- (Unicode.NumericType.decimal value).isDecimal = true
- x✝.isDecimal = false
Instances For
Digit type
The character represents a decimal digit 0 through 9.
Unicode property: Numeric_Type
Equations
- (Unicode.NumericType.decimal value).isDigit = true
- (Unicode.NumericType.digit value).isDigit = true
- x✝.isDigit = false
Instances For
Get the value of a numeric type
Returns either an integer value or a numerator-denominator pair representing a rational value.
Unicode property: Numeric_Value
Equations
- (Unicode.NumericType.decimal n).value = Sum.inl ↑↑n
- (Unicode.NumericType.digit n).value = Sum.inl ↑↑n
- (Unicode.NumericType.numeric n none).value = Sum.inl n
- (Unicode.NumericType.numeric n (some d)).value = Sum.inr (n, d)
Instances For
Decomposition Mapping #
Compatibility format tag
Unicode properties: Decomposition_Type
, Decomposition_Mapping
- font : CompatibilityTag
Font variant
- noBreak : CompatibilityTag
No-break version of a space or hyphen
- initial : CompatibilityTag
Initial presentation form (Arabic)
- medial : CompatibilityTag
Medial presentation form (Arabic)
- final : CompatibilityTag
Final presentation form (Arabic)
- isolated : CompatibilityTag
Isolated presentation form (Arabic)
- circle : CompatibilityTag
Encircled form
- super : CompatibilityTag
Superscript form
- sub : CompatibilityTag
Subscript form
- vertical : CompatibilityTag
Vertical layout presentation form
- wide : CompatibilityTag
Wide (or zenkaku) compatibility character
- narrow : CompatibilityTag
Narrow (or hankaku) compatibility character
- small : CompatibilityTag
Small variant form (CNS compatibility)
- square : CompatibilityTag
CJK squared font variant
- fraction : CompatibilityTag
Vulgar fraction form
- compat : CompatibilityTag
Otherwise unspecified compatibility character
Instances For
Equations
- Unicode.instInhabitedCompatibilityTag = { default := Unicode.CompatibilityTag.font }
Equations
- Unicode.instDecidableEqCompatibilityTag x✝ y✝ = if h : x✝.toCtorIdx = y✝.toCtorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Unicode.instReprCompatibilityTag = { reprPrec := Unicode.reprCompatibilityTag✝ }
Equations
- One or more equations did not get rendered due to their size.
Decomposition maping
Unicode properties: Decomposition_Type
, Decomposition_Mapping
- tag : Option CompatibilityTag
Compatibility format tag
Decomposition mapping
Instances For
Equations
- Unicode.instInhabitedDecompositionMapping = { default := { tag := default, mapping := default } }
Equations
- Unicode.instReprDecompositionMapping = { reprPrec := Unicode.reprDecompositionMapping✝ }
Bidirectional Class #
Bidirectional class
Unicode property: Bidi_Class
- leftToRight : BidiClass
(
L
) strong left-to-right character - rightToLeft : BidiClass
(
R
) strong right-to-left (non-Arabic-type) character - arabicLetter : BidiClass
(
AL
) strong right-to-left (Arabic-type) character - europeanNumber : BidiClass
(
EN
) ASCII digit or Eastern Arabic-Indic digit - europeanSeparator : BidiClass
(
ES
) European separator: plus and - europeanTerminator : BidiClass
(
ET
) European terminator in a numeric format context, includes currency signs - arabicNumber : BidiClass
(
AN
) Arabic-Indic digit - commonSeparator : BidiClass
(
CS
) common separator: commas, colons, and slashes - nonspacingMark : BidiClass
(
NSM
) nonspacing mark - boundaryNeutral : BidiClass
(
BN
) boundary neutral: most format characters, control codes, or noncharacters - paragraphSeparator : BidiClass
(
B
) paragraph separator, various newline characters - segmentSeparator : BidiClass
(
S
) segment separator, various segment-related control codes - whiteSpace : BidiClass
(
WS
) white spaces - otherNeutral : BidiClass
(
ON
) other neutral: most other symbols and punctuation marks - leftToRightEmbedding : BidiClass
(
LRE
) left to right embedding (U+202A: the LR embedding control) - leftToRightOverride : BidiClass
(
LRO
) Left_To_Right_Override (U+202D: the LR override control) - rightToLeftEmbeding : BidiClass
(
RLE
) right-to-left embedding (U+202B: the RL embedding control) - rightToLeftOverride : BidiClass
(
RLO
) right-to-left override (U+202E: the RL override control) - popDirectionalFormat : BidiClass
(
PDF
) pop directional format (U+202C: terminates an embedding or override control) - leftToRightIsolate : BidiClass
(
LRI
) left-to-right isolate (U+2066: the LR isolate control) - rightToLeftIsolate : BidiClass
(
RLI
) right-toleft isolate (U+2067: the RL isolate control) - firstStrongIsolate : BidiClass
(
FSI
) first strong isolate (U+2068: the first strong isolate control) - popDirectionalIsolate : BidiClass
(
PDI
) pop directional isolate (U+2069: terminates an isolate control)
Instances For
Equations
- Unicode.instInhabitedBidiClass = { default := Unicode.BidiClass.leftToRight }
Equations
- Unicode.instDecidableEqBidiClass x✝ y✝ = if h : x✝.toCtorIdx = y✝.toCtorIdx then isTrue ⋯ else isFalse ⋯
Bidi class: strong left-to-right (L
)
Instances For
Bidi class: strong right-to-left (R
)
Instances For
Bidi class: arabic letter (AL
)
Instances For
Bidi class: european number (EN
)
Instances For
Bidi class: european separator (ES
)
Instances For
Bidi class: european terminator (ET
)
Instances For
Bidi class: arabic number (AN
)
Instances For
Bidi class: common separator (CS
)
Instances For
Bidi class: nonspacing mark (NSM
)
Instances For
Bidi class: boundary neutral (BN
)
Instances For
Bidi class: paragraph separator (B
)
Instances For
Bidi class: segment separator (S
)
Instances For
Bidi class: white space (WS
)
Instances For
Bidi class: other neutral (ON
)
Instances For
Bidi class: left-to-right embedding (LRE
)
Instances For
Bidi class: left-to-right override (LRO
)
Instances For
Bidi class: right-to-left embedding (RLE
)
Instances For
Bidi class: right-to-left override (RLO
)
Instances For
Bidi class: pop directional format (PDF
)
Instances For
Bidi class: left-to-right isolate (LRI
)
Instances For
Bidi class: right-to-left isolate (RLI
)
Instances For
Bidi class: first strong isolate (FSI
)
Instances For
Bidi class: pop directional isolate (PDI
)
Instances For
String abbreviation for bidi class
Equations
- Unicode.BidiClass.leftToRight.toAbbrev = "L"
- Unicode.BidiClass.rightToLeft.toAbbrev = "R"
- Unicode.BidiClass.arabicLetter.toAbbrev = "AL"
- Unicode.BidiClass.europeanNumber.toAbbrev = "EN"
- Unicode.BidiClass.europeanSeparator.toAbbrev = "ES"
- Unicode.BidiClass.europeanTerminator.toAbbrev = "ET"
- Unicode.BidiClass.arabicNumber.toAbbrev = "AN"
- Unicode.BidiClass.commonSeparator.toAbbrev = "CS"
- Unicode.BidiClass.nonspacingMark.toAbbrev = "NSM"
- Unicode.BidiClass.boundaryNeutral.toAbbrev = "BN"
- Unicode.BidiClass.paragraphSeparator.toAbbrev = "B"
- Unicode.BidiClass.segmentSeparator.toAbbrev = "S"
- Unicode.BidiClass.whiteSpace.toAbbrev = "WS"
- Unicode.BidiClass.otherNeutral.toAbbrev = "ON"
- Unicode.BidiClass.leftToRightEmbedding.toAbbrev = "LRE"
- Unicode.BidiClass.leftToRightOverride.toAbbrev = "LRO"
- Unicode.BidiClass.rightToLeftEmbeding.toAbbrev = "RLE"
- Unicode.BidiClass.rightToLeftOverride.toAbbrev = "RLO"
- Unicode.BidiClass.popDirectionalFormat.toAbbrev = "PDF"
- Unicode.BidiClass.leftToRightIsolate.toAbbrev = "LRI"
- Unicode.BidiClass.rightToLeftIsolate.toAbbrev = "RLI"
- Unicode.BidiClass.firstStrongIsolate.toAbbrev = "FSI"
- Unicode.BidiClass.popDirectionalIsolate.toAbbrev = "PDI"
Instances For
Get bidi class from abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get bidi class from abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Unicode.instReprBidiClass = { reprPrec := fun (bc : Unicode.BidiClass) (x : Nat) => Std.Format.text (toString "Unicode.BidiClass." ++ toString bc.toAbbrev ++ toString "") }
Structure for data from UnicodeData.txt
- codeValue : UInt32
Code Value
- characterName : Substring
Character Name
- generalCategory : GeneralCategory
General Category
- canonicalCombiningClass : Nat
Canonical Combining Class
- bidiClass : BidiClass
Bidirectional Class
- bidiMirrored : Bool
Bidirectional Mirrored
- decompositionMapping : Option DecompositionMapping
Character Decomposition Mapping
- numeric : Option NumericType
Numeric Value
Uppercase Mapping
Lowercase Mapping
Titlecase Mapping
Instances For
Equations
- Unicode.instBEqUnicodeData = { beq := Unicode.beqUnicodeData✝ }
Equations
- Unicode.instReprUnicodeData = { reprPrec := Unicode.reprUnicodeData✝ }
Equations
- One or more equations did not get rendered due to their size.