unicodeblocks

Python module for Unicode blocks

unicodeblocks Ranking & Summary


  • License: ISC License
  • Price: FREE
  • Publisher Name: Simonas Kazlauskas
  • Publisher web site: https://github.com/simukis/


unicodeblocks Description

unicodeblocks is a library for working with Unicode blocks in Python.

Usage

The module contains two classes: Block and Blocks. Blocks is just a collection of Block objects. There is also a prebuilt instance of Blocks that contains all 220 Unicode 6.1.0 blocks.

    >>> import unicodeblocks
    >>> unicodeblocks.blocks
    Blocks(...220 * Block...)

Blocks

You can do quite a lot of strange things with Blocks. For example, to find out which block a character belongs to:

    >>> unicodeblocks.blocks.block_of('-')
    Block('Basic Latin', 0x0, 0x7f)
    >>> unicodeblocks.blocks.block_of('か')
    Block('Hiragana', 0x3040, 0x309f)
    >>> unicodeblocks.blocks.block_of('日')
    Block('CJK Unified Ideographs', 0x4e00, 0x9fff)

You can iterate through the blocks:

    >>> import itertools
    >>> unicodeblocks.blocks.blocks()
    <generator object __iter__ at 0x2d5df50>
    >>> len(list(itertools.chain(*unicodeblocks.blocks.blocks())))
    253440  # number of characters in all Unicode blocks

And through the names of the blocks as well:

    >>> list(unicodeblocks.blocks.names())

Getting one specific block is easy as well:

    >>> unicodeblocks.blocks
    Block('Cyrillic', 0x400, 0x4ff)

Keys are not case sensitive. As per the specification, spaces, dashes and underscores are ignored as well.

    >>> unicodeblocks.blocks
    Block('Cyrillic', 0x400, 0x4ff)

Block

Blocks are orderable, so you can sort them. Three attributes are available:

    >>> latin.name   # Full name of the block
    'Latin Extended-A'
    >>> latin.start  # Block start code point
    256
    >>> latin.end    # Block end code point
    383

You can check whether a letter belongs to a block:

    >>> 'ą' in latin
    True

Get the length of a block, or all the letters in it:

    >>> len(latin)
    128
    >>> list(latin)

You can merge two blocks to get an instance of Blocks for easy manipulation.

    >>> unicodeblocks.blocks + unicodeblocks.blocks
    Blocks(Block('Basic Latin', 0x0, 0x7f), Block('Latin Extended-A', 0x100, 0x17f))

You can also add a Block to an instance of Blocks in the same way, so addition is chainable.

Notes

This module does not check whether every code point inside a block is actually an assigned character. For example, U+038D lies in the middle of a block but is not a valid character. If you care about valid Unicode characters, you should try to obtain their names with the unicodedata module.
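The lookup rules and the note about unassigned code points can be illustrated with a small standalone sketch. It does not use unicodeblocks itself and should not be read as the library's implementation. The Block namedtuple, SAMPLE_BLOCKS, normalize_key, find_block, block_of and is_named are all hypothetical names made up for illustration. The block ranges are copied from the examples in the description, except "Greek and Coptic" (0x370 to 0x3ff), which is added here because it is the block that contains U+038D.

```python
import unicodedata
from collections import namedtuple

# Hypothetical stand-in for the library's Block class (illustration only).
Block = namedtuple("Block", "name start end")

# A handful of blocks; the real prebuilt collection holds all of them.
SAMPLE_BLOCKS = [
    Block("Basic Latin", 0x0, 0x7F),
    Block("Latin Extended-A", 0x100, 0x17F),
    Block("Greek and Coptic", 0x370, 0x3FF),
    Block("Cyrillic", 0x400, 0x4FF),
    Block("Hiragana", 0x3040, 0x309F),
    Block("CJK Unified Ideographs", 0x4E00, 0x9FFF),
]


def normalize_key(name):
    """Lower-case a block name and drop spaces, dashes and underscores."""
    return "".join(ch for ch in name.lower() if ch not in " -_")


def find_block(name, blocks=SAMPLE_BLOCKS):
    """Look a block up by name, ignoring case, spaces, dashes, underscores."""
    wanted = normalize_key(name)
    for block in blocks:
        if normalize_key(block.name) == wanted:
            return block
    raise KeyError(name)


def block_of(char, blocks=SAMPLE_BLOCKS):
    """Return the block whose range contains the character, or None."""
    codepoint = ord(char)
    for block in blocks:
        if block.start <= codepoint <= block.end:
            return block
    return None


def is_named(char):
    """True if the unicodedata module knows a name for the character."""
    return unicodedata.name(char, None) is not None


print(find_block("latin_extended-a").name)  # Latin Extended-A
print(block_of("か").name)                   # Hiragana
# U+038D sits inside a block, yet unicodedata has no name for it.
print(block_of("\u038d").name, is_named("\u038d"))  # Greek and Coptic False
```

A range check alone says only that a code point falls inside a block. Pairing it with unicodedata.name is one way to filter out code points that have no character assigned.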

