$$normalize

Replaces special character forms with their simpler equivalents (removing diacritical marks by default)

  • Allows post-processing of the result of Java's normalizer algorithm

Post Operations

  • ROBUST - Tries to replace as many Latin-like letters as possible with their plain Latin equivalents, including:
    • Removing combining diacritical marks (works with NFD/NFKD, which leave the characters decomposed)
    • Replacing stroked letters (and others that do not decompose) (e.g. "ĐŁłŒ" -> "DLlOE")
    • Replacing whitespace characters with spaces and trimming the result
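
The three ROBUST steps above can be approximated with Java's standard library. The following is a minimal sketch under stated assumptions, not the library's actual implementation: the class name `RobustNormalizeSketch`, the method `robust`, and the replacement table are hypothetical, and the table covers only a small illustrative subset of non-decomposing letters. It also assumes that "replacing with space" means each whitespace character becomes a single space.

```java
import java.text.Normalizer;
import java.util.Map;

// Hypothetical sketch of a ROBUST-style post operation; names are illustrative only.
class RobustNormalizeSketch {

    // Letters that have no Unicode decomposition, so NFD/NFKD leaves them untouched.
    // Illustrative subset only (assumption): D/d with stroke, L/l with stroke,
    // OE/oe ligatures, AE/ae, O/o with stroke.
    private static final Map<Character, String> NON_DECOMPOSABLE = Map.of(
            '\u0110', "D", '\u0111', "d",
            '\u0141', "L", '\u0142', "l",
            '\u0152', "OE", '\u0153', "oe",
            '\u00C6', "AE", '\u00E6', "ae",
            '\u00D8', "O", '\u00F8', "o");

    static String robust(String input) {
        // Step 1: decompose, so accented letters become base letter + combining mark.
        String s = Normalizer.normalize(input, Normalizer.Form.NFKD);

        // Step 2: remove the combining marks left behind by the decomposition.
        s = s.replaceAll("\\p{M}+", "");

        // Step 3: replace letters that did not decompose, using the lookup table.
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(NON_DECOMPOSABLE.getOrDefault(c, String.valueOf(c)));
        }

        // Step 4: turn every whitespace character into a space, then trim the ends.
        return sb.toString().replaceAll("\\s", " ").trim();
    }

    public static void main(String[] args) {
        // Stroked letters and ligatures from the list above.
        System.out.println(robust("\u0110\u0141\u0142\u0152")); // prints DLlOE
    }
}
```

Step 2 only has an effect after a decomposing form (NFD or NFKD) has been applied, which is why the list above ties mark removal to those two forms.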

Usage

"$$normalize([form],[postOperation]):{input}"

Returns

string

Arguments

| Argument | Type | Values | Required / Default Value | Description |
|---|---|---|---|---|
| form | Enum | NFKD/NFD/NFKC/NFC | NFKD | Normalizer form (as described in Java's documentation) |
| postOperation | Enum | ROBUST/NONE | ROBUST | Post operation to run on the result to remove/replace more letters |
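
To illustrate how the four `form` values differ, here is a small sketch that calls `java.text.Normalizer` directly. The class `NormalizerFormsDemo` and its `normalize` helper are hypothetical names used only for this example. It assumes the `form` argument maps one-to-one onto `Normalizer.Form`, which the description above suggests but does not state.

```java
import java.text.Normalizer;

// Hypothetical demo of the four normalization forms; not part of the documented API.
class NormalizerFormsDemo {

    // Maps a form name (NFD, NFC, NFKD, NFKC) onto Java's Normalizer.Form enum.
    static String normalize(String input, String form) {
        return Normalizer.normalize(input, Normalizer.Form.valueOf(form));
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9";   // precomposed "e with acute"
        String fiLigature = "\uFB01"; // "fi" ligature, a compatibility character

        // NFD decomposes into base letter + combining mark: two chars.
        System.out.println(normalize(eAcute, "NFD").length());  // prints 2
        // NFC keeps (or restores) the precomposed form: one char.
        System.out.println(normalize(eAcute, "NFC").length());  // prints 1
        // Only the compatibility forms (NFKD/NFKC) expand the ligature.
        System.out.println(normalize(fiLigature, "NFKD"));      // prints fi
        System.out.println(normalize(fiLigature, "NFD").equals(fiLigature)); // prints true
    }
}
```

The decomposing forms (NFD, NFKD) leave combining marks as separate characters, which is what lets a later step strip them. The composing forms (NFC, NFKC) recombine them.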

Examples

| Input | Definition | Output |
|---|---|---|
| "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…" | "$$normalize:$" | "This is a funky String abcABC..." |
| "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC…" | "$$normalize(NFKD,NONE):$" | "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ abcABC..." |
| "ĐŁłŒœÆæǢǣǼǽ" | "$$normalize:$" | "DLlOEoeAEaeAEaeAEae" |