uni2ascii

Provides conversion in both directions between UTF-8 Unicode and a variety of 7-bit ASCII equivalents.

uni2ascii Ranking & Summary

  • License: GPL
  • Publisher Name: Bill Poser
  • Operating Systems: Windows All
  • File Size: 161 KB

uni2ascii Description

Such ASCII equivalents are useful when including Unicode text in program source, when entering text into web programs that can handle the Unicode character set but are not 8-bit safe, and when debugging. For example, Movable Type, the blog software, truncates posts as soon as it encounters a byte with the high bit set. However, if Unicode is entered in the form of HTML numeric character entities, Movable Type will not garble the post.

The package consists of three programs. The actual work is done by uni2ascii and ascii2uni. The third program, u2a, is a graphical interface to uni2ascii and ascii2uni.

Main features:

  • HTML hexadecimal numeric character references (e.g. &#x00E9;)
  • HTML decimal numeric character references (e.g. &#0233;)
  • HTML character entities (e.g. &eacute;)
  • SGML hexadecimal numeric character references (e.g. #x00E9;)
  • SGML decimal numeric character references (e.g. #0233;)
  • \u-escaped hexadecimal, as used in Python and Java (e.g. \u00E9)
  • \u-escaped hexadecimal within the BMP, \U-escapes beyond the BMP (e.g. \u00E9 but \U00010024), as used in Tcl and Scheme
  • \u-escaped decimal (e.g. \u0233), as used in Rich Text Format
  • U+-escaped hexadecimal (e.g. U+00E9), as in the Unicode standard
  • U-escaped hexadecimal (e.g. U00E9)
  • u-escaped hexadecimal (e.g. u00E9)
  • U-escaped hexadecimal within angle brackets (e.g. <U00E9>), as used in POSIX locale specifications
  • \x-escaped hexadecimal (e.g. \x00E9), as used in Tcl for numbers as opposed to characters
  • \x-escaped hexadecimal with braces (e.g. \x{00E9}), as used in Perl
  • Hexadecimal within single quotes with prefix X (e.g. X'00E9')
  • RFC 2396 URI format (e.g. %C3%A9)
  • RFC 2045 Quoted Printable, i.e. =-escaped hexadecimal UTF-8 (e.g. =C3=A9)
  • \-escaped octal UTF-8 (e.g. \303\251)
  • Standard hexadecimal (e.g. 0x00E9)
  • Raw hexadecimal (e.g. 00E9)
  • Common Lisp hexadecimal format (e.g. #x00E9)
  • Perl v-prefixed decimal format (e.g. v233)
  • Hexadecimal numbers preceded by "$" (e.g. $00E9)
  • Hexadecimal numbers preceded by "16#" (e.g. 16#00E9), as in PostScript
  • Hexadecimal numbers preceded by "#16r" (e.g. #16r00E9), as in Common Lisp
  • Hexadecimal numbers preceded by "16#" and followed by "#" (e.g. 16#00E9#), as in Ada
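
To make a few of the formats named in the description concrete, here is a minimal Python sketch of the same kind of conversion. It is not the uni2ascii source and does not reproduce its command-line interface. The function names to_ascii and from_html_hex and the format labels ("html-hex", "uri", and so on) are hypothetical, chosen only for this illustration. ASCII characters pass through unchanged, and every other character is replaced by an escape.

```python
import re
import urllib.parse


def to_ascii(text: str, fmt: str) -> str:
    """Replace each non-ASCII character in text with a 7-bit ASCII escape.

    fmt is a hypothetical label for this sketch, not a uni2ascii option.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # plain ASCII passes through unchanged
        elif fmt == "html-hex":
            out.append(f"&#x{cp:04X};")
        elif fmt == "html-dec":
            out.append(f"&#{cp:04d};")
        elif fmt == "u-escape":
            # \uXXXX within the BMP, \UXXXXXXXX beyond it
            out.append(f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}")
        elif fmt == "uri":
            # percent-escaped UTF-8 bytes
            out.append(urllib.parse.quote(ch))
        elif fmt == "quoted-printable":
            # =-escaped hexadecimal UTF-8 bytes
            out.append("".join(f"={b:02X}" for b in ch.encode("utf-8")))
        elif fmt == "octal":
            # backslash-escaped octal UTF-8 bytes
            out.append("".join(f"\\{b:03o}" for b in ch.encode("utf-8")))
        else:
            raise ValueError(f"unknown format: {fmt}")
    return "".join(out)


def from_html_hex(text: str) -> str:
    """Reverse direction: turn &#xHHHH; references back into characters."""
    return re.sub(
        r"&#x([0-9A-Fa-f]+);",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


sample = "caf\u00e9"  # "café", written with an escape to keep this file ASCII
print(to_ascii(sample, "html-hex"))          # caf&#x00E9;
print(to_ascii(sample, "u-escape"))          # caf\u00E9
print(to_ascii(sample, "uri"))               # caf%C3%A9
print(to_ascii(sample, "quoted-printable"))  # caf=C3=A9
print(to_ascii(sample, "octal"))             # caf\303\251
```

The reverse function covers only one format. A fuller converter would need one pattern per format, which is presumably why the package splits the two directions into separate programs.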

