General API #
As a general rule, for a given a Unicode property called Unicode_Property
,
for example:
If the property is boolean valued then the implementation is called
Unicode.isUnicodeProperty
.Otherwise, the implementation is called
Unicode.getUnicodeProperty
.If the value is not of standard type (
Bool
,Char
,Nat
,Int
, etc.) or a combination thereof (e.g.Bool × Option Nat
) then the value type is defined inUnicodeBasic.Types
.If the input type needs disambiguation (e.g.
Char
vsString
) the type name may be appended to the name as inUnicode.isUnicodePropertyChar
or inUnicode.getUnicodePropertyString
.If the output type is
Option _
then the suffix?
may be appended to indicate that this is a partial function. In this case, a companion function with the suffix!
may be implemented. This function will perform the same calculation as the original but it assumes the user has checked that the input is in the domain, the function may panic if this is not the case.
There are exceptions and variations on these general rules. For example,
Unicode.isInGeneralCategory
checks whether a character belongs to a given
general category. Because of derived general categories, this makes more
sense than what the first rule above would give.
Some properties may be grouped in a namespace. The namespace
Unicode.GeneralCategory
is such. For example, Unicode.GeneralCategory.isL
checks whether a character belongs to the derived general category L
.
Name #
Get character name
When the Unicode property Name
is empty, a unique code label is returned
as described in Unicode Standard, section 4.8. These labels start with
'<'
(U+003C) and end with '>'
(U+003E) so they are distinguishable from
actual name values.
Unicode property: Name
Equations
- Unicode.getName char = (Unicode.getUnicodeData char).characterName.toString
Instances For
Bidirectional Properties #
Get character bidirectional class
Unicode property: Bidi_Class
Equations
- Unicode.getBidiClass char = (Unicode.getUnicodeData char).bidiClass
Instances For
Check if bidirectional mirrored character
Unicode property: Bidi_Mirrored
Equations
- Unicode.isBidiMirrored char = (Unicode.getUnicodeData char).bidiMirrored
Instances For
Check if bidirectional control character
Unicode property: Bidi_Control
Equations
- One or more equations did not get rendered due to their size.
Instances For
General Category #
Get character general category
Caveat: This function never returns a derived general category. Use
Unicode.isInGeneralCategory
to check whether a character belongs to a
general category (derived or not).
Unicode property: General_Category
Equations
- Unicode.getGeneralCategory char = (Unicode.getUnicodeData char).generalCategory
Instances For
Check if character belongs to the general category
Unicode property: General_Category
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if letter character (L
)
This is a derived category (L = Lu | Ll | Lt | Lm | Lo
).
Unicode Property: General_Category=L
Equations
- Unicode.GeneralCategory.isLetter char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.letter, minor := minor } => true | x => false
Instances For
Check if letter character (L
)
This is a derived category (L = Lu | Ll | Lt | Lm | Lo
).
Unicode Property: General_Category=L
Instances For
Check if lowercase letter character (Ll
)
Unicode Property: General_Category=Ll
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if lowercase letter character (Ll
)
Unicode Property: General_Category=Ll
Instances For
Check if titlecase letter character (Lt
)
Unicode Property: General_Category=Lt
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if titlecase letter character (Lt
)
Unicode Property: General_Category=Lt
Instances For
Check if uppercase letter character (Lu
)
Unicode Property: General_Category=Lu
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if uppercase letter character (Lu
)
Unicode Property: General_Category=Lu
Instances For
Check if cased letter character (LC
)
This is a derived category (L = Lu | Ll | Lt
).
Unicode Property: General_Category=LC
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if cased letter character (LC
)
This is a derived category (L = Lu | Ll | Lt
).
Unicode Property: General_Category=LC
Instances For
Check if modifier letter character (Lm
)
Unicode Property: General_Category=Lm
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if modifier letter character (Lm
)
Unicode Property: General_Category=Lm
Instances For
Check if other letter character (Lo
)
Unicode Property: General_Category=Lo
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if other letter character (Lo
)
Unicode Property: General_Category=Lo
Instances For
Check if mark character (M
)
This is a derived category (M = Mn | Mc | Me
).
Unicode Property: General_Category=M
Equations
- Unicode.GeneralCategory.isMark char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.mark, minor := minor } => true | x => false
Instances For
Check if mark character (M
)
This is a derived category (M = Mn | Mc | Me
).
Unicode Property: General_Category=M
Instances For
Check if nonspacing combining mark character (Mn
)
Unicode Property: General_Category=Mn
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if nonspacing combining mark character (Mn
)
Unicode Property: General_Category=Mn
Instances For
Check if spacing combining mark character (Mc
)
Unicode Property: General_Category=Mc
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if spacing combining mark character (Mc
)
Unicode Property: General_Category=Mc
Instances For
Check if enclosing combining mark character (Me
)
Unicode Property: General_Category=Me
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if enclosing combining mark character (Me
)
Unicode Property: General_Category=Me
Instances For
Check if number character (N
)
This is a derived category (N = Nd | Nl | No
).
Unicode Property: General_Category=N
Equations
- Unicode.GeneralCategory.isNumber char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.number, minor := minor } => true | x => false
Instances For
Check if number character (N
)
This is a derived category (N = Nd | Nl | No
).
Unicode Property: General_Category=N
Instances For
Check if decimal number character (Nd
)
Unicode Property: General_Category=Nd
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if decimal number character (Nd
)
Unicode Property: General_Category=Nd
Instances For
Check if letter number character (Nl
)
Unicode Property: General_Category=Nl
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if letter number character (Nl
)
Unicode Property: General_Category=Nl
Instances For
Check if other number character (No
)
Unicode Property: General_Category=No
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if other number character (No
)
Unicode Property: General_Category=No
Instances For
Check if punctuation character (P
)
This is a derived category (P = Pc | Pd | Ps | Pe | Pi | Pf | Po
).
Unicode Property: General_Category=P
Equations
- Unicode.GeneralCategory.isPunctuation char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.punctuation, minor := minor } => true | x => false
Instances For
Check if punctuation character (P
)
This is a derived category (P = Pc | Pd | Ps | Pe | Pi | Pf | Po
).
Unicode Property: General_Category=P
Instances For
Check if connector punctuation character (Pc
)
Unicode Property: General_Category=Pc
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if connector punctuation character (Pc
)
Unicode Property: General_Category=Pc
Instances For
Check if dash punctuation character (Pd
)
Unicode Property: General_Category=Pd
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if dash punctuation character (Pd
)
Unicode Property: General_Category=Pd
Instances For
Check if open punctuation character (Ps
)
Unicode Property: General_Category=Ps
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if open punctuation character (Ps
)
Unicode Property: General_Category=Ps
Instances For
Check if close punctuation character (Pe
)
Unicode Property: General_Category=Pe
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if close punctuation character (Pe
)
Unicode Property: General_Category=Pe
Instances For
Check if initial punctuation character (Pi
)
Unicode Property: General_Category=Pi
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if initial punctuation character (Pi
)
Unicode Property: General_Category=Pi
Instances For
Check if initial punctuation character (Pf
)
Unicode Property: General_Category=Pf
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if initial punctuation character (Pf
)
Unicode Property: General_Category=Pf
Instances For
Check if other punctuation character (Po
)
Unicode Property: General_Category=Po
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if other punctuation character (Po
)
Unicode Property: General_Category=Po
Instances For
Check if symbol character (S
)
This is a derived category (S = Sm | Sc | Sk | So
).
Unicode Property: General_Category=S
Equations
- Unicode.GeneralCategory.isSymbol char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.symbol, minor := minor } => true | x => false
Instances For
Check if symbol character (S
)
This is a derived category (S = Sm | Sc | Sk | So
).
Unicode Property: General_Category=S
Instances For
Check if math symbol character (Sm
)
Unicode Property: General_Category=Sm
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if math symbol character (Sm
)
Unicode Property: General_Category=Sm
Instances For
Check if currency symbol character (Sc
)
Unicode Property: General_Category=Sc
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if currency symbol character (Sc
)
Unicode Property: General_Category=Sc
Instances For
Check if modifier symbol character (Sk
)
Unicode Property: General_Category=Sk
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if modifier symbol character (Sk
)
Unicode Property: General_Category=Sk
Instances For
Check if other symbol character (So
)
Unicode Property: General_Category=So
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if other symbol character (So
)
Unicode Property: General_Category=So
Instances For
Check if separator character (Z
)
This is a derived property (Z = Zs | Zl | Zp
).
Unicode Property: General_Category=Z
Equations
- Unicode.GeneralCategory.isSeparator char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.separator, minor := minor } => true | x => false
Instances For
Check if separator character (Z
)
This is a derived property (Z = Zs | Zl | Zp
).
Unicode Property: General_Category=Z
Instances For
Check if space separator character (Zs
)
Unicode Property: General_Category=Zs
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if space separator character (Zs
)
Unicode Property: General_Category=Zs
Instances For
Check if line separator character (Zl
)
Unicode Property: General_Category=Zl
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if line separator character (Zl
)
Unicode Property: General_Category=Zl
Instances For
Check if paragraph separator character (Zp
)
Unicode Property: General_Category=Zp
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if paragraph separator character (Zp
)
Unicode Property: General_Category=Zp
Instances For
Check if other character (C
)
This is a derived category (C = Cc | Cf | Cs | Co | Cn
).
Unicode Property: General_Category=C
Equations
- Unicode.GeneralCategory.isOther char = match Unicode.getGeneralCategory char with | { major := Unicode.MajorGeneralCategory.other, minor := minor } => true | x => false
Instances For
Check if other character (C
)
This is a derived category (C = Cc | Cf | Cs | Co | Cn
).
Unicode Property: General_Category=C
Instances For
Check if control character (Cc
)
Unicode Property: General_Category=Cc
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if control character (Cc
)
Unicode Property: General_Category=Cc
Instances For
Check if format character (Cf
)
Unicode Property: General_Category=Cf
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if format character (Cf
)
Unicode Property: General_Category=Cf
Instances For
Check if surrogate character (Cs
)
Unicode Property: General_Category=Cs
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if surrogate character (Cs
)
Unicode Property: General_Category=Cs
Instances For
Check if private use character (Co
)
Unicode Property: General_Category=Co
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if private use character (Co
)
Unicode Property: General_Category=Co
Instances For
Check if unassigned character (Cn
)
Unicode Property: General_Category=Cn
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if unassigned character (Cn
)
Unicode Property: General_Category=Cn
Instances For
Case Type and Mapping #
Map character to lowercase
This function does not handle the case where the lowercase mapping would consist of more than one character.
Unicode property: Simple_Lowercase_Mapping
Equations
- Unicode.getLowerChar char = match (Unicode.getUnicodeData char).lowercaseMapping with | some char => char | none => char
Instances For
Map character to uppercase
This function does not handle the case where the uppercase mapping would consist of more than one character.
Unicode property: Simple_Uppercase_Mapping
Equations
- Unicode.getUpperChar char = match (Unicode.getUnicodeData char).uppercaseMapping with | some char => char | none => char
Instances For
Map character to titlecase
This function does not handle the case where the titlecase mapping would consist of more than one character.
Unicode property: Simple_Titlecase_Mapping
Equations
- Unicode.getTitleChar char = match (Unicode.getUnicodeData char).titlecaseMapping with | some char => char | none => Unicode.getUpperChar char
Instances For
Decomposition Type and Mapping #
Get canonical decomposition of character (NFD
)
Unicode property: Decomposition_Mapping
Equations
- Unicode.getCanonicalDecomposition char = { data := Unicode.getCanonicalDecomposition.loop [char] }
Instances For
Numeric Type and Value #
Check if character represents a numerical value
Unicode property: Numeric_Type=Numeric
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if character represents a digit in base 10
Unicode property: Numeric_Type=Digit
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get value of digit
Unicode properties: Numeric_Value
, Numeric_Type=Digit
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if character represents a decimal digit
For this property, a character must be part of a contiguous sequence representing the ten decimal digits in order from 0 to 9.
Unicode property: Numeric_Type=Decimal
Equations
- Unicode.isDecimal char = match (Unicode.getUnicodeData char).numeric with | some (Unicode.NumericType.decimal value) => true | x => false
Instances For
Get decimal digit range
If the character is part of a contiguous sequence representing the ten decimal digits in order from 0 to 9, this function returns the first and last characters from this range.
Unicode property: Numeric_Type=Decimal
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if character represents a digit in base 16
Unicode property: Hex_Digit
Equations
- One or more equations did not get rendered due to their size.
Instances For
Get value of digit
Unicode properties: Hex_Digit
Equations
- One or more equations did not get rendered due to their size.
Instances For
Other Properties #
Check if white space character
Unicode property: White_Space
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if lowercase letter character
Unicode property: Lowercase
Equations
- Unicode.isLowercase char = (Unicode.GeneralCategory.isLl char || Unicode.PropList.isOtherLowercase char.val)
Instances For
Check if uppercase letter character
Unicode property: Uppercase
Equations
- Unicode.isUppercase char = (Unicode.GeneralCategory.isLu char || Unicode.PropList.isOtherUppercase char.val)
Instances For
Check if cased letter character
Unicode properties: Cased
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if character is ignorable for casing purposes
Generated from general categories Lm
, Mn
, Me
, Sk
, Cf
, and word
break properties MidLetter
, MidNumLet
, Single_Quote
.
Unicode property: Case_Ignorable
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if mathematical symbol character
Unicode property: Math
Equations
- Unicode.isMath char = (Unicode.GeneralCategory.isSm char || Unicode.PropList.isOtherMath char.val)
Instances For
Check if alphabetic character
Unicode property: Alphabetic
Equations
- One or more equations did not get rendered due to their size.
Instances For
Check if alphabetic character
Unicode property: Alphabetic