charset

encoding

The International Phonetic Alphabet (IPA) [note 1] is an alphabetic system of phonetic notation based primarily on the Latin alphabet. It was devised by the International Phonetic Association as a standardized representation of the sounds of spoken language. [1] The IPA is used by foreign language students and teachers, linguists, speech-language pathologists, singers, actors, lexicographers, constructed language creators (conlangers), and translators. [2] [3] The IPA is designed to represent only those qualities of speech that are distinctive in spoken language: phonemes, intonation, and the separation of words and syllables. [1] To represent additional qualities of speech such as tooth gnashing, lisping, and sounds made with a cleft palate, an extended set of symbols called the Extensions to the IPA may be used. [2] IPA symbols are composed of one or more elements of two basic types, letters and diacritics.

International Phonetic Alphabet - Wikipedia, the free encyclopedia

http://en.wikipedia.org/wiki/International_Phonetic_Alphabet
I'm posting here a small document I usually keep around to help people understand I18N issues in webapps, rather than just hack everything around without a clue of what they are doing. This is all the information I sent to Yannick Q. to help him solve his problem last week. There are also a couple of statements I use in the slides of my J2EE training session.

FYI: I18N issues, long document

http://osdir.com/ml/java.ejbca.devel/2005-03/msg00002.html
http://docs.oracle.com/javase/tutorial/i18n/index.html
The lessons in this trail teach you how to internationalize Java applications. Internationalized applications are easy to tailor to the customs and languages of end users around the world. Note: This tutorial trail covers core internationalization functionality, which is the foundation required by additional features provided for desktop, enterprise, and mobile applications.

Trail: Internationalization (The Java™ Tutorials)

http://www.w3.org/International/questions/qa-css-charset
It is a good idea to always declare the encoding of external CSS style sheets if you have any non-ASCII text in your CSS file. For example, you may have non-ASCII characters in font names, in values of the content property, in selector values, etc. For style declarations embedded in a document, @charset rules are not needed and must not be used.

W3C I18N FAQ: Declaring the character encoding used in a CSS file

A tutorial on character code issues

This document tries to clarify the concepts of character repertoire, character code, and character encoding, especially in the Internet context. It specifically avoids the term character set, which is confusingly used to denote repertoire or code or encoding. ASCII, ISO 646, ISO 8859 (ISO Latin, especially ISO Latin 1), Windows character set, ISO 10646, UCS, and Unicode, UTF-8, UTF-7, MIME, and QP are used as examples. This document in itself does not contain solutions to practical problems with character codes (but see section Further reading). Rather, it gives background information needed for understanding what solutions there might be, what the different solutions do, and what's really the problem in the first place.
http://www.cs.tut.fi/~jkorpela/chars.html
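
The three concepts the tutorial separates can be illustrated with a short Python sketch. This is not taken from the tutorial itself; the function name `describe` and the choice of sample character are assumptions made for illustration. The character's name stands in for its place in the repertoire, its code point for the character code, and the byte sequences for three different encodings.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return repertoire name, code number, and several encodings of one character.

    `describe` is a hypothetical helper for illustration only.
    """
    return {
        "name": unicodedata.name(ch),          # which character it is (repertoire)
        "code": ord(ch),                       # its number in the character code
        "iso-8859-1": ch.encode("iso-8859-1"), # one octet per character
        "utf-8": ch.encode("utf-8"),           # variable-length encoding
        "utf-16-be": ch.encode("utf-16-be"),   # 16-bit code units, big-endian
    }


info = describe("é")
print(info["name"])        # LATIN SMALL LETTER E WITH ACUTE
print(hex(info["code"]))   # 0xe9
print(info["iso-8859-1"])  # b'\xe9'
print(info["utf-8"])       # b'\xc3\xa9'
print(info["utf-16-be"])   # b'\x00\xe9'
```

The same character and the same code number produce three different byte sequences, which is why a bare "character set" label is ambiguous.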

Internationalization Guide for Java Web Applications

One World, One Character Set. I've spent enough time solving internationalization problems, which can be very time-consuming bugs to track down. If I can help you out, great, but even better if you have something more to share. Projects come and go, and every project has its own problems. Please send me more information on the subject!
http://tomi.panula-ont.to/i18n/

Character Conversions from Browser to Database

En route to their final storage destination on the World Wide Web, characters move through various layers of programming interfaces and can cross software and hardware boundaries. This article provides helpful hints and best practices for accurately transporting character data from browser to database … and back again.
http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
http://www.unicode.org/

Unicode Home Page

Welcome! The Unicode Consortium enables people around the world to use computers in any language. Our freely-available specifications and data form the foundation for software internationalization in all major operating systems, search engines, applications, and the World Wide Web. An essential part of our mission is to educate and engage academic and scientific communities, and the general public.
http://www.javaworld.com/javaworld/jw-05-2004/jw-0524-i18n.html

End-to-end internationalization of Web applications - Java World

A typical Web application workflow involves a user loading one of your Webpages into her browser, filling out HTML form parameters, and submitting data back to the server. The server makes decisions based on this data, sends the data to other components such as databases and Web services, and renders a response back to the browser. At each step along the way, a globally aware application must pay attention to the user's locale and the text's character encoding.
ASCII was very simplistic, and so various manufacturers extended it by adding 'extended' sets. Apart from being confusing, this was still restricted to 256 characters. Now that computers are more widely established around the world, the need to show other characters, such as those of Japanese and Chinese, along with various symbols has become more important. Unicode is an attempt to standardise every character possible, and the latest version (4) is shown below. Tables are in PDF format, so you will need Adobe Acrobat Reader to view them.
http://www.unicodetables.com/

Unicode Tables - All Unicode Tables and other charts

BMP, Plane 1, Plane 2, Plane 3, Plane 4, Plane 5, Plane 6, Plane 7, Plane 8, Plane 9, Plane 10, Plane 11, Plane 12, Plane 13, Plane 14, Plane 15, Plane 16. To get a list of code charts for a character, enter its code in the search box at the top. To access a chart for a given block, click on its entry in the table.

Code Charts - Scripts
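
The plane names listed for the code charts follow directly from a character's code point: each plane holds 65,536 code points, so the plane number is the code point shifted right by 16 bits. A minimal Python sketch, where `plane_of` and `plane_label` are hypothetical helper names and not part of the code charts site:

```python
def plane_of(ch: str) -> int:
    """Return the Unicode plane number (0 to 16) of a single character."""
    return ord(ch) >> 16  # each plane spans 0x10000 code points


def plane_label(ch: str) -> str:
    """Label plane 0 as BMP (Basic Multilingual Plane), others as 'Plane N'."""
    plane = plane_of(ch)
    return "BMP" if plane == 0 else f"Plane {plane}"


print(plane_label("A"))           # BMP      (U+0041)
print(plane_label("田"))          # BMP      (U+7530)
print(plane_label("\U0001F600"))  # Plane 1  (U+1F600)
print(plane_label("\U00020000"))  # Plane 2  (U+20000)
```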

Posted by joconner on July 27, 2005 at 3:13 PM CDT. The J2SE platform has come a long way in internationalization. Some things are just easy...like entering your name in a Swing text field regardless of whether your name is John, José, or 田中 (Tanaka). Unicode prevails within the Java core. Unfortunately, entering non-ASCII text in the J2EE world isn't nearly as easy.

John O'Conner's Blog: Charset Pitfalls in JSP/Servlet Containers
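
One common way non-ASCII form input goes wrong is that bytes sent in one encoding are decoded with another. The Python sketch below simulates that with the names from the excerpt; it is an illustration of the general failure mode, not necessarily the exact pitfall the blog post describes, and `misdecode` and `repair` are hypothetical helper names.

```python
def misdecode(text: str, actual: str = "utf-8", assumed: str = "iso-8859-1") -> str:
    """Encode text as the sender would, then decode it with the wrong charset."""
    return text.encode(actual).decode(assumed)


def repair(garbled: str, actual: str = "utf-8", assumed: str = "iso-8859-1") -> str:
    """Undo misdecode(); only works if no bytes were lost along the way."""
    return garbled.encode(assumed).decode(actual)


print(misdecode("John"))          # John   (ASCII survives unchanged)
print(misdecode("José"))          # JosÃ©  (two UTF-8 bytes shown as two characters)
print(repair(misdecode("田中")))  # 田中
```

ASCII-only names pass through untouched, which is why this kind of bug often goes unnoticed until a user enters accented or non-Latin text.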

Internationalization (I18n), Localization (L10n), Standards, and Amusements

The I18n Guy web site is about regional and cultural differences, internationalization (i18n), localization (l10n), globalization (g11n), translation and software engineering. Pages are added and updated frequently.

Test page for 8-bit encodings

The table contains alternating rows with octets in the range 128–255 (decimal) and ISO 8859-1 characters with the same code numbers. The first two rows for ISO 8859-1 have been greyed out, since those code positions do not contain printable characters but are reserved for control character purposes. This document has no character encoding information. This allows you to view it using various 8-bit encodings.
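
What the test page demonstrates in a browser can be reproduced in a few lines of Python: one octet, interpreted under several 8-bit encodings. This is a sketch under stated assumptions; `decode_octet` is a hypothetical helper, and the three encodings are chosen for illustration rather than taken from the test page.

```python
def decode_octet(octet: int, encodings=("iso-8859-1", "iso-8859-15", "cp1252")) -> dict:
    """Decode a single byte value under each encoding.

    Byte values an encoding leaves undefined come back as U+FFFD.
    """
    raw = bytes([octet])
    return {enc: raw.decode(enc, errors="replace") for enc in encodings}


# Octet 164 is the currency sign in ISO 8859-1 but the euro sign in ISO 8859-15.
print(decode_octet(0xA4))

# Octet 128 is a control character in ISO 8859-1 but the euro sign in Windows-1252.
print(decode_octet(0x80))
```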

Unicode - Wikipédia

Unicode is a computing standard, developed by the Unicode Consortium, that aims to enable the encoding of written text by giving every character of any writing system a name and a numeric identifier, in a unified way, regardless of the computing platform or software. The latest version, Unicode 6.1.0, was published on 31 January 2012 [1].