programming-history

Unicode support

Identifiers
  • 💯: follows UAX #31 (the Unicode annex on identifier syntax).
  • ✓: allows Unicode in some form.
  • ✗: does not allow Unicode.
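
As a quick illustration of the 💯 level, Python 3's built-in `str.isidentifier()` reports whether a string is a valid identifier under the language's XID_Start XID_Continue* rule. This is a minimal sketch. The sample strings and the `classify` helper are hypothetical names chosen for illustration, not part of any specification.

```python
# Sample strings are illustrative only.
samples = ["café", "变量", "_private", "1abc", "💯", "a-b"]

def classify(names):
    """Map each candidate name to whether Python accepts it as an identifier."""
    return {name: name.isidentifier() for name in names}

results = classify(samples)
for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

Letters from any script are accepted, while a leading digit, a symbol such as 💯, or a hyphen is rejected. Note that `str.isidentifier()` also returns True for reserved keywords, so it tests only the lexical shape.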
Char type
  • 💯: can represent all emoji (e.g. by not distinguishing between characters and strings).
  • ✓: can represent all code points.
  • ✗: cannot represent all code points.
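
The difference between these levels can be sketched by counting code points and UTF-16 code units for two emoji. This is only an illustration. The helper names `code_points` and `utf16_code_units` are hypothetical, and the two sample emoji are arbitrary choices.

```python
def code_points(text):
    """Number of Unicode code points (what Python's len() counts)."""
    return len(text)

def utf16_code_units(text):
    """Number of 16-bit units needed to encode the text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2

# 💯 is a single code point above U+FFFF.
hundred = "\U0001F4AF"
# 👨‍👩‍👧 is three emoji code points joined by two ZERO WIDTH JOINERs.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

print(code_points(hundred), utf16_code_units(hundred))  # 1 2
print(code_points(family), utf16_code_units(family))    # 5 8
```

A char type that holds one UTF-16 code unit cannot store 💯 in a single value, since it needs two units. A type that holds one code point can store 💯 but not the five-code-point family emoji, which needs a string-like value.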
Language  Identifiers                              Char type
C#        L|Nl|'_' (L|Pc|Nd|Nl|Mn|Mc|Cf)*          ✗ (UTF-16 code unit)
Haskell   Ll|Lu|Lt (Ll|Lu|Lt|Nd|'_'|''')*
Java      L|Nl|Sc|Pc (L|Sc|Pc|Nd|Nl|Mn|Mc|Cf|Cc)*  ✗ (UTF-16 code unit)
Python 3  💯 XID_Start XID_Continue*               💯
Swift     ✓ (see link)                             ✓ (can represent extended grapheme clusters as well)
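
The identifier patterns in the table are written in terms of Unicode general categories, so they can be checked with Python's `unicodedata.category()`. The sketch below follows only the C# row as written above. The function name `matches_csharp_row` is hypothetical, and the check deliberately ignores keywords, escapes, normalization, and anything else a real C# compiler applies.

```python
import unicodedata

# "L" in the table stands for all letter categories.
LETTERS = {"Lu", "Ll", "Lt", "Lm", "Lo"}
# First character: L | Nl | '_' (the underscore is handled separately below).
START = LETTERS | {"Nl"}
# Later characters: L | Pc | Nd | Nl | Mn | Mc | Cf.
CONTINUE = LETTERS | {"Pc", "Nd", "Nl", "Mn", "Mc", "Cf"}

def matches_csharp_row(name):
    """Check a name against the C# row of the table, category by category."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    if first != "_" and unicodedata.category(first) not in START:
        return False
    return all(unicodedata.category(ch) in CONTINUE for ch in rest)

for candidate in ["café", "_x1", "变量", "1x", "a💯"]:
    print(candidate, matches_csharp_row(candidate))
```

Here `1x` fails because a decimal digit (Nd) is not allowed in first position, and `a💯` fails because 💯 has category So, which appears in neither set.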