
Chinese


Audrey: Our paroqial fermament, one tide on another. (Alternate title: "Chinese Twitter users live in a density 2x to 8x their English counterparts; here's why.")

I promised Adina Levin a treatise on the information density of Chinese characters on Twitter "after all SocialCalc (née wikiCalc) performance bugs are fixed". As I fixed them yesterday, let's try coding some English... I'll begin by saying that Ken's in(ter)vention of UTF-8 (as narrated by Rob Pike) accurately reflects the relative information density between ASCII and CJK characters. After all, UCS was extended from 16 bits to 21 bits precisely because so damn many Chinese characters need to be encoded, even after the controversial decimation from the Han unification effort. Hypothetically, if Twitter had set its limit to 140 UTF-8 bytes, then our experience when tweeting Chinese would be on par with tweeting English, because each Chinese character would then take 3 bytes, and since I occasionally venture beyond the BMP, sometimes 4 bytes.
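The byte arithmetic above can be checked with a small sketch. It assumes a limit counted either in Unicode code points or in UTF-8 bytes; the helper names `utf8_bytes_per_char` and `fits_limit` are hypothetical, chosen for illustration, and do not describe how Twitter actually counts.

```python
def utf8_bytes_per_char(text):
    """Return the UTF-8 encoded length of each code point in text."""
    return [len(ch.encode("utf-8")) for ch in text]


def fits_limit(text, limit=140, unit="chars"):
    """Check text against a limit counted in code points or UTF-8 bytes.

    Both counting rules are assumptions made for this illustration.
    """
    if unit == "chars":
        size = len(text)
    elif unit == "bytes":
        size = len(text.encode("utf-8"))
    else:
        raise ValueError("unit must be 'chars' or 'bytes'")
    return size <= limit


chinese = "\u4e2d" * 140      # U+4E2D, a BMP character: 3 bytes each in UTF-8
beyond_bmp = "\U00020000"     # a CJK Extension B character: 4 bytes in UTF-8

print(utf8_bytes_per_char("a" + "\u4e2d" + beyond_bmp))  # [1, 3, 4]
print(fits_limit(chinese, unit="chars"))                 # True
print(fits_limit(chinese, unit="bytes"))                 # False (420 bytes)
print(fits_limit("\u4e2d" * 46, unit="bytes"))           # True (138 bytes)
```

Under a 140-byte limit, about 46 BMP Chinese characters would fit, against 140 ASCII letters.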

ISMB ECCB 2009. Art and Science Exhibition. ISMB/ECCB 2009 brings together scientists from a wide range of disciplines, including biology, medicine, computer science, mathematics and statistics.


In these fields people constantly deal with information in visual form: from microscope images and photographs of gels to scatter plots, network graphs and phylogenetic trees, and from structural formulae and protein models to flow diagrams. Visual aids for problem-solving are omnipresent. Some of the works shown at the first such exhibition, at ISMB 2008 in Toronto, combined outstanding beauty and aesthetics with deep insight that perfectly proves the validity of our approach or goes beyond the problem's solution. Others were surprising and inspiring through the transition from science to art, opening our eyes and minds to reflect on the work that we are undertaking. The Art & Science Exhibition@ISMB/ECCB 2009 presents artworks that have been generated as part of research projects.

«Mouse» Chinese Input Method - Write Chinese Characters.