Hexadecimal string representation of a code point
Same as toHexString
but without the U+
prefix.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Hexadecimal string representation of a code point
Prefix U+
followed by at least four uppercase hexadecimal digits
(e.g. U+0123
and U+4B0A1
but neither U+123
nor U+4b0a1
).
Equations
- Unicode.toHexString code = "U+" ++ Unicode.toHexStringAux code
Instances For
Get code point from hexadecimal string representation
For convenience, the U+
prefix may be omitted and lowercase hexadecimal
digits are accepted.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get value of hex digit
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get code point from hexadecimal string representation
For convenience, the U+
prefix may be omitted and lowercase hexadecimal
digits are accepted.
Equations
- One or more equations did not get rendered due to their size.
Instances For
General Category #
Major general category (L
, M
, N
, P
, S
, Z
, C
)
Unicode property: General_Category
- letter: Unicode.MajorGeneralCategory
(
L
) Letter - mark: Unicode.MajorGeneralCategory
(
M
) Mark - number: Unicode.MajorGeneralCategory
(
N
) Number - punctuation: Unicode.MajorGeneralCategory
(
P
) Punctuation - symbol: Unicode.MajorGeneralCategory
(
S
) Symbol - separator: Unicode.MajorGeneralCategory
(
Z
) Separator - other: Unicode.MajorGeneralCategory
(
C
) Other
Instances For
Equations
- Unicode.instDecidableEqMajorGeneralCategory x y = if h : x.toCtorIdx = y.toCtorIdx then isTrue ⋯ else isFalse ⋯
String abbreviation for major general category
Equations
- Unicode.MajorGeneralCategory.letter.toAbbrev = "L"
- Unicode.MajorGeneralCategory.mark.toAbbrev = "M"
- Unicode.MajorGeneralCategory.number.toAbbrev = "N"
- Unicode.MajorGeneralCategory.punctuation.toAbbrev = "P"
- Unicode.MajorGeneralCategory.symbol.toAbbrev = "S"
- Unicode.MajorGeneralCategory.separator.toAbbrev = "Z"
- Unicode.MajorGeneralCategory.other.toAbbrev = "C"
Instances For
Minor general category
Unicode property: General_Category
- casedLetter: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.letter
- uppercaseLetter: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.letter
(
Lu
) uppercase letter - lowercaseLetter: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.letter
(
Ll
) lowercase letter - titlecaseLetter: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.letter
(
Lt
) titlecase letter: digraphic character, with first part uppercase - modifierLetter: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.letter
(
Lm
) modifier letter - otherLetter: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.letter
(
Lo
) other letters, including syllables and ideographs - nonspacingMark: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.mark
(
Mn
) nonspacing combining mark (zero advance width) - spacingMark: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.mark
(
Mc
) spacing combining mark (positive advance width) - enclosingMark: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.mark
(
Me
) enclosing combining mark - decimalNumber: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.number
(
Nd
) decimal digit - letterNumber: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.number
(
Nl
) letter number: a letterlike numeric character - otherNumber: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.number
(
No
) numeric character of other type - connectorPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Pc
) connecting punctuation mark, like a tie - dashPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Pd
) dash or hyphen punctuation mark - groupPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
- openPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Ps
) opening punctuation mark (of a pair) - closePunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Pe
) closing punctuation mark (of a pair) - quotePunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
- initialPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Pi
) initial quotation mark - finalPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Pf
) final quotation mark - otherPunctuation: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.punctuation
(
Po
) punctuation mark of other type - mathSymbol: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.symbol
(
Sm
) symbol of mathematical use - currencySymbol: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.symbol
(
Sc
) currency sign - modifierSymbol: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.symbol
(
Sk
) non-letterlike modifier symbol - otherSymbol: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.symbol
(
So
) symbol of other type - spaceSeparator: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.separator
(
Zs
) space character (of various non-zero widths) - lineSeparator: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.separator
(
Zl
) line separator (U+2028 LINE SEPARATOR only) - paragraphSeparator: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.separator
(
Zp
) paragraph separator (U+2029 PARAGRAPH SEPARATOR only) - control: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.other
(
Cc
) C0 or C1 control code - format: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.other
(
Cf
) format control character - surrogate: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.other
(
Cs
) surrogate code point - privateUse: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.other
(
Co
) private-use character - unassigned: Unicode.MinorGeneralCategory Unicode.MajorGeneralCategory.other
(
Cn
) reserved unassigned code point or a noncharacter
Instances For
Equations
- Unicode.instDecidableEqMinorGeneralCategory = Unicode.decEqMinorGeneralCategory✝
General category of a code point
Unicode property: General_Category
- major : Unicode.MajorGeneralCategory
Major general category of a code point
- minor : Option (Unicode.MinorGeneralCategory self.major)
Minor general category of a code point
Instances For
Equations
- Unicode.instInhabitedGeneralCategory = { default := { major := default, minor := default } }
General category: letter (L
)
Equations
- Unicode.GeneralCategory.L = { major := Unicode.MajorGeneralCategory.letter, minor := none }
Instances For
General category: cased letter (LC
)
Equations
- Unicode.GeneralCategory.LC = { major := Unicode.MajorGeneralCategory.letter, minor := some Unicode.MinorGeneralCategory.casedLetter }
Instances For
General category: other letter (Lo
)
Equations
- Unicode.GeneralCategory.Lo = { major := Unicode.MajorGeneralCategory.letter, minor := some Unicode.MinorGeneralCategory.otherLetter }
Instances For
General category mark (M
)
Equations
- Unicode.GeneralCategory.M = { major := Unicode.MajorGeneralCategory.mark, minor := none }
Instances For
General category: nonspacing combining mark (Mn
)
Equations
- Unicode.GeneralCategory.Mn = { major := Unicode.MajorGeneralCategory.mark, minor := some Unicode.MinorGeneralCategory.nonspacingMark }
Instances For
General category: spacing combining mark (Mc
)
Equations
- Unicode.GeneralCategory.Mc = { major := Unicode.MajorGeneralCategory.mark, minor := some Unicode.MinorGeneralCategory.spacingMark }
Instances For
General category: enclosing combining mark (Me
)
Equations
- Unicode.GeneralCategory.Me = { major := Unicode.MajorGeneralCategory.mark, minor := some Unicode.MinorGeneralCategory.enclosingMark }
Instances For
General category: number (N
)
Equations
- Unicode.GeneralCategory.N = { major := Unicode.MajorGeneralCategory.number, minor := none }
Instances For
General category: decimal digit (Nd
)
Equations
- Unicode.GeneralCategory.Nd = { major := Unicode.MajorGeneralCategory.number, minor := some Unicode.MinorGeneralCategory.decimalNumber }
Instances For
General category: letter number (Nl
)
Equations
- Unicode.GeneralCategory.Nl = { major := Unicode.MajorGeneralCategory.number, minor := some Unicode.MinorGeneralCategory.letterNumber }
Instances For
General category: other number (No
)
Equations
- Unicode.GeneralCategory.No = { major := Unicode.MajorGeneralCategory.number, minor := some Unicode.MinorGeneralCategory.otherNumber }
Instances For
General category: punctuation (P
)
Equations
- Unicode.GeneralCategory.P = { major := Unicode.MajorGeneralCategory.punctuation, minor := none }
Instances For
General category: symbol (S
)
Equations
- Unicode.GeneralCategory.S = { major := Unicode.MajorGeneralCategory.symbol, minor := none }
Instances For
General category: math symbol (Sm
)
Equations
- Unicode.GeneralCategory.Sm = { major := Unicode.MajorGeneralCategory.symbol, minor := some Unicode.MinorGeneralCategory.mathSymbol }
Instances For
General category: other symbol (So
)
Equations
- Unicode.GeneralCategory.So = { major := Unicode.MajorGeneralCategory.symbol, minor := some Unicode.MinorGeneralCategory.otherSymbol }
Instances For
General category: separator (Z
)
Equations
- Unicode.GeneralCategory.Z = { major := Unicode.MajorGeneralCategory.separator, minor := none }
Instances For
General category: other (C
)
Equations
- Unicode.GeneralCategory.C = { major := Unicode.MajorGeneralCategory.other, minor := none }
Instances For
General category: control (Cc
)
Equations
- Unicode.GeneralCategory.Cc = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.control }
Instances For
General category: format (Cf
)
Equations
- Unicode.GeneralCategory.Cf = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.format }
Instances For
General category: surrogate (Cs
)
Equations
- Unicode.GeneralCategory.Cs = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.surrogate }
Instances For
General category: private use (Co
)
Equations
- Unicode.GeneralCategory.Co = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.privateUse }
Instances For
General category: unassigned (Cn
)
Equations
- Unicode.GeneralCategory.Cn = { major := Unicode.MajorGeneralCategory.other, minor := some Unicode.MinorGeneralCategory.unassigned }
Instances For
String abbreviation for general category
Instances For
Get general category from string abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get general category from string abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Numeric Type and Value #
Unicode numeric type
Unicode properties: Numeric_Type
, Numeric_Value
- decimal: Fin 10 → Unicode.NumericType
Decimal digit type and value
- digit: Fin 10 → Unicode.NumericType
Digit type and value
- numeric: Int → Option Nat → Unicode.NumericType
Numeric type and value
Instances For
Equations
- Unicode.instInhabitedNumericType = { default := Unicode.NumericType.decimal default }
Equations
- Unicode.instReprNumericType = { reprPrec := Unicode.reprNumericType✝ }
Decimal digit type
The character is part of a sequence of contiguous code points representing decimal digits 0 through 9.
Unicode property: Numeric_Type
Equations
- (Unicode.NumericType.decimal value).isDecimal = true
- x.isDecimal = false
Instances For
Digit type
The character represents a decimal digit 0 through 9.
Unicode property: Numeric_Type
Equations
- (Unicode.NumericType.decimal value).isDigit = true
- (Unicode.NumericType.digit value).isDigit = true
- x.isDigit = false
Instances For
Get the value of a numeric type
Returns either an integer value or a numerator-denominator pair representing a rational value.
Unicode property: Numeric_Value
Equations
- (Unicode.NumericType.decimal n).value = Sum.inl ↑↑n
- (Unicode.NumericType.digit n).value = Sum.inl ↑↑n
- (Unicode.NumericType.numeric n none).value = Sum.inl n
- (Unicode.NumericType.numeric n (some d)).value = Sum.inr (n, d)
Instances For
Decomposition Mapping #
Compatibility format tag
Unicode properties: Decomposition_Type
, Decomposition_Mapping
- font: Unicode.CompatibilityTag
Font variant
- noBreak: Unicode.CompatibilityTag
No-break version of a space or hyphen
- initial: Unicode.CompatibilityTag
Initial presentation form (Arabic)
- medial: Unicode.CompatibilityTag
Medial presentation form (Arabic)
- final: Unicode.CompatibilityTag
Final presentation form (Arabic)
- isolated: Unicode.CompatibilityTag
Isolated presentation form (Arabic)
- circle: Unicode.CompatibilityTag
Encircled form
- super: Unicode.CompatibilityTag
Superscript form
- sub: Unicode.CompatibilityTag
Subscript form
- vertical: Unicode.CompatibilityTag
Vertical layout presentation form
- wide: Unicode.CompatibilityTag
Wide (or zenkaku) compatibility character
- narrow: Unicode.CompatibilityTag
Narrow (or hankaku) compatibility character
- small: Unicode.CompatibilityTag
Small variant form (CNS compatibility)
- square: Unicode.CompatibilityTag
CJK squared font variant
- fraction: Unicode.CompatibilityTag
Vulgar fraction form
- compat: Unicode.CompatibilityTag
Otherwise unspecified compatibility character
Instances For
Equations
- Unicode.instInhabitedCompatibilityTag = { default := Unicode.CompatibilityTag.font }
Equations
- Unicode.instDecidableEqCompatibilityTag x y = if h : x.toCtorIdx = y.toCtorIdx then isTrue ⋯ else isFalse ⋯
Equations
- Unicode.instReprCompatibilityTag = { reprPrec := Unicode.reprCompatibilityTag✝ }
Equations
- One or more equations did not get rendered due to their size.
Decomposition maping
Unicode properties: Decomposition_Type
, Decomposition_Mapping
Compatibility format tag
Decomposition mapping
Instances For
Equations
- Unicode.instInhabitedDecompositionMapping = { default := { tag := default, mapping := default } }
Equations
- Unicode.instReprDecompositionMapping = { reprPrec := Unicode.reprDecompositionMapping✝ }
Bidirectional Class #
Bidirectional class
Unicode property: Bidi_Class
- leftToRight: Unicode.BidiClass
(
L
) strong left-to-right character - rightToLeft: Unicode.BidiClass
(
R
) strong right-to-left (non-Arabic-type) character - arabicLetter: Unicode.BidiClass
(
AL
) strong right-to-left (Arabic-type) character - europeanNumber: Unicode.BidiClass
(
EN
) ASCII digit or Eastern Arabic-Indic digit - europeanSeparator: Unicode.BidiClass
(
ES
) European separator: plus and - europeanTerminator: Unicode.BidiClass
(
ET
) European terminator in a numeric format context, includes currency signs - arabicNumber: Unicode.BidiClass
(
AN
) Arabic-Indic digit - commonSeparator: Unicode.BidiClass
(
CS
) common separator: commas, colons, and slashes - nonspacingMark: Unicode.BidiClass
(
NSM
) nonspacing mark - boundaryNeutral: Unicode.BidiClass
(
BN
) boundary neutral: most format characters, control codes, or noncharacters - paragraphSeparator: Unicode.BidiClass
(
B
) paragraph separator, various newline characters - segmentSeparator: Unicode.BidiClass
(
S
) segment separator, various segment-related control codes - whiteSpace: Unicode.BidiClass
(
WS
) white spaces - otherNeutral: Unicode.BidiClass
(
ON
) other neutral: most other symbols and punctuation marks - leftToRightEmbedding: Unicode.BidiClass
(
LRE
) left to right embedding (U+202A: the LR embedding control) - leftToRightOverride: Unicode.BidiClass
(
LRO
) Left_To_Right_Override (U+202D: the LR override control) - rightToLeftEmbeding: Unicode.BidiClass
(
RLE
) right-to-left embedding (U+202B: the RL embedding control) - rightToLeftOverride: Unicode.BidiClass
(
RLO
) right-to-left override (U+202E: the RL override control) - popDirectionalFormat: Unicode.BidiClass
(
PDF
) pop directional format (U+202C: terminates an embedding or override control) - leftToRightIsolate: Unicode.BidiClass
(
LRI
) left-to-right isolate (U+2066: the LR isolate control) - rightToLeftIsolate: Unicode.BidiClass
(
RLI
) right-toleft isolate (U+2067: the RL isolate control) - firstStrongIsolate: Unicode.BidiClass
(
FSI
) first strong isolate (U+2068: the first strong isolate control) - popDirectionalIsolate: Unicode.BidiClass
(
PDI
) pop directional isolate (U+2069: terminates an isolate control)
Instances For
Equations
- Unicode.instInhabitedBidiClass = { default := Unicode.BidiClass.leftToRight }
Equations
- Unicode.instDecidableEqBidiClass x y = if h : x.toCtorIdx = y.toCtorIdx then isTrue ⋯ else isFalse ⋯
Bidi class: strong left-to-right (L
)
Instances For
Bidi class: strong right-to-left (R
)
Instances For
Bidi class: arabic letter (AL
)
Instances For
Bidi class: european number (EN
)
Instances For
Bidi class: european separator (ES
)
Instances For
Bidi class: european terminator (ET
)
Instances For
Bidi class: arabic number (AN
)
Instances For
Bidi class: common separator (CS
)
Instances For
Bidi class: nonspacing mark (NSM
)
Instances For
Bidi class: boundary neutral (BN
)
Instances For
Bidi class: paragraph separator (B
)
Instances For
Bidi class: segment separator (S
)
Instances For
Bidi class: white space (WS
)
Instances For
Bidi class: other neutral (ON
)
Instances For
Bidi class: left-to-right embedding (LRE
)
Instances For
Bidi class: left-to-right override (LRO
)
Instances For
Bidi class: right-to-left embedding (RLE
)
Instances For
Bidi class: right-to-left override (RLO
)
Instances For
Bidi class: pop directional format (PDF
)
Instances For
Bidi class: left-to-right isolate (LRI
)
Instances For
Bidi class: right-to-left isolate (RLI
)
Instances For
Bidi class: first strong isolate (FSI
)
Instances For
Bidi class: pop directional isolate (PDI
)
Instances For
String abbreviation for bidi class
Equations
- Unicode.BidiClass.leftToRight.toAbbrev = "L"
- Unicode.BidiClass.rightToLeft.toAbbrev = "R"
- Unicode.BidiClass.arabicLetter.toAbbrev = "AL"
- Unicode.BidiClass.europeanNumber.toAbbrev = "EN"
- Unicode.BidiClass.europeanSeparator.toAbbrev = "ES"
- Unicode.BidiClass.europeanTerminator.toAbbrev = "ET"
- Unicode.BidiClass.arabicNumber.toAbbrev = "AN"
- Unicode.BidiClass.commonSeparator.toAbbrev = "CS"
- Unicode.BidiClass.nonspacingMark.toAbbrev = "NSM"
- Unicode.BidiClass.boundaryNeutral.toAbbrev = "BN"
- Unicode.BidiClass.paragraphSeparator.toAbbrev = "B"
- Unicode.BidiClass.segmentSeparator.toAbbrev = "S"
- Unicode.BidiClass.whiteSpace.toAbbrev = "WS"
- Unicode.BidiClass.otherNeutral.toAbbrev = "ON"
- Unicode.BidiClass.leftToRightEmbedding.toAbbrev = "LRE"
- Unicode.BidiClass.leftToRightOverride.toAbbrev = "LRO"
- Unicode.BidiClass.rightToLeftEmbeding.toAbbrev = "RLE"
- Unicode.BidiClass.rightToLeftOverride.toAbbrev = "RLO"
- Unicode.BidiClass.popDirectionalFormat.toAbbrev = "PDF"
- Unicode.BidiClass.leftToRightIsolate.toAbbrev = "LRI"
- Unicode.BidiClass.rightToLeftIsolate.toAbbrev = "RLI"
- Unicode.BidiClass.firstStrongIsolate.toAbbrev = "FSI"
- Unicode.BidiClass.popDirectionalIsolate.toAbbrev = "PDI"
Instances For
Get bidi class from abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get bidi class from abbreviation
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Unicode.instReprBidiClass = { reprPrec := fun (bc : Unicode.BidiClass) (x : Nat) => Std.Format.text (toString "Unicode.BidiClass." ++ toString bc.toAbbrev ++ toString "") }
Structure for data from UnicodeData.txt
- codeValue : UInt32
Code Value
- characterName : Substring
Character Name
- generalCategory : Unicode.GeneralCategory
General Category
- canonicalCombiningClass : Nat
Canonical Combining Class
- bidiClass : Unicode.BidiClass
Bidirectional Class
- bidiMirrored : Bool
Bidirectional Mirrored
- decompositionMapping : Option Unicode.DecompositionMapping
Character Decomposition Mapping
- numeric : Option Unicode.NumericType
Numeric Value
Uppercase Mapping
Lowercase Mapping
Titlecase Mapping
Instances For
Equations
- Unicode.instBEqUnicodeData = { beq := Unicode.beqUnicodeData✝ }
Equations
- Unicode.instReprUnicodeData = { reprPrec := Unicode.reprUnicodeData✝ }
Equations
- One or more equations did not get rendered due to their size.