Unicode Table

java - JSF chars get double UTF-8 encoded

C syntax The syntax of the C programming language, the rules governing the writing of software in the language, is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. The development of this syntax was a major milestone in the history of computer science, as it was the first widely successful high-level language for operating-system development. C syntax makes use of the maximal munch principle. Primitive data types: The C language represents numbers in three forms: integral, real and complex. All C integer types have signed and unsigned variants. Integer types: C's integer types come in different fixed sizes, capable of representing various ranges of numbers. The representation of some types may include unused "padding" bits, which occupy storage but are not included in the width. Integer constants may be specified in source code in several ways.
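As a rough illustration of the last two points in the excerpt above, the sketch below spells one value as several kinds of integer constant and shows maximal munch on `a+++b`. The names `munch` and `wrap` are made up for this example, and the character constant assumes an ASCII-compatible execution character set.

```c
#include <limits.h>

/* One value, four spellings: decimal, octal, hexadecimal, and a
   character constant (the last assumes an ASCII-compatible charset). */
enum { DEC = 65, OCT = 0101, HEX = 0x41, CHR = 'A' };

/* Maximal munch: the lexer always takes the longest valid token, so
   a+++b is tokenized as a ++ + b, i.e. (a++) + b, never a + (++b). */
int munch(int a, int b)
{
    return a+++b;
}

/* Unsigned arithmetic wraps modulo 2^N, so UINT_MAX + 1 is 0. */
unsigned wrap(void)
{
    return UINT_MAX + 1u;
}
```

Here `munch(2, 3)` yields 5, because the post-increment returns the old value of `a`.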

Non-UTF-8 encoding in ZIP file (Xueming Shen's Oracle Blog) The ZIP specification historically did not specify which character encoding should be used for embedded file names and comments; the original IBM PC character set, commonly referred to as IBM Code Page 437, is supposed to be the only encoding supported. The Jar specification, meanwhile, explicitly specifies UTF-8 as the encoding for all file names and comments in Jar files. Our java.util.jar and java.util.zip implementation therefore strictly followed the Jar specification and used UTF-8 as the sole encoding when dealing with the file names and comments stored in Jar/Zip files. The consequence? A ZIP file created by a "traditional" ZIP tool is not accessible to a java.util.jar/zip based tool, and vice versa, if a file name contains characters that are not compatible between Cp437 (as an alternative, tools might simply use the default platform encoding) and UTF-8. Something you might want to keep in mind when you use these new APIs and the new JDK7 bundles. Enjoy the APIs!

C Preprocessor 6.1: How can I write a generic macro to swap two values? There is no good answer to this question. If the values are integers, a well-known trick using exclusive-OR could perhaps be used, but it will not work for floating-point values or pointers, or if the two values are the same variable (and the "obvious" supercompressed implementation for integral types a^=b^=a^=b is in fact illegal due to multiple side-effects; see questions 4.1 and 4.2). If the macro is intended to be used on values of arbitrary type (the usual goal), it cannot use a temporary, since it does not know what type of temporary it needs, and standard C does not provide a typeof operator. The best all-around solution is probably to forget about using a macro, unless you're willing to pass in the type as a third argument. 6.2: I have some old code that tries to construct identifiers with a macro like #define Paste(a, b) a/**/b but it doesn't work any more. ANSI C provides the ## token-pasting operator for this purpose: #define Paste(a, b) a##b (See also question 5.4.)

a rough guide to character encoding It can be tricky figuring out the difference between character handling code that works and code that just appears to work because testing did not encounter cases that exposed bugs. This is a post about some of the pitfalls of character handling in Java. I wrote a little bit about Unicode before. This post might be exhausting, but it isn't exhaustive. Unicode in source files Java source files include support for Unicode. One choice is to encode the source files as Unicode, write the characters and inform the compiler at compile time. javac provides the -encoding <encoding> option for this. Code saved as UTF-8, as might be written on an Ubuntu machine: public class PrintCopyright { public static void main(String[] args) { System.out.println("© Acme, Inc."); }} 1. javac -encoding UTF-8 PrintCopyright.java 2. javac -encoding Cp1252 PrintCopyright.java These compiler settings will produce different outputs; only the first one is correct.
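To see at the byte level why the second compiler setting goes wrong, the sketch below (written in C, so that it works on raw bytes) encodes U+00A9 (©) as UTF-8 and then re-encodes each resulting byte as if it were a character of its own, which is also the mechanism behind "double UTF-8 encoding". The functions `utf8_encode` and `double_encode` are hypothetical helpers written for this example. The misreading step treats bytes as Latin-1, which agrees with Cp1252 only for bytes 0xA0 to 0xFF; that is enough for this character but is not a full Cp1252 decoder.

```c
#include <stddef.h>

/* Encode one code point up to U+FFFF as UTF-8 (surrogates are not
   checked in this sketch). Returns the number of bytes written. */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Encode cp as UTF-8, misread every byte as a separate Latin-1
   character, and encode those characters as UTF-8 again. */
size_t double_encode(unsigned long cp, unsigned char *out)
{
    unsigned char first[3];
    size_t n = utf8_encode(cp, first);
    size_t total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += utf8_encode(first[i], out + total);
    return total;
}
```

For U+00A9, `utf8_encode` produces the bytes C2 A9, while `double_encode` produces C3 82 C2 A9, which a UTF-8 reader displays as "Â©".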

C Pointers Tutorial: Some more on Strings, and Arrays of Strings Well, let's go back to strings for a bit. In the following all assignments are to be understood as being global, i.e. made outside of any function, including main(). We pointed out in an earlier chapter that we could write: char my_string[40] = "Ted"; which would allocate space for a 40 byte array and put the string in the first 4 bytes (three for the characters in the quotes and a 4th to handle the terminating '\0'). Actually, if all we wanted to do was store the name "Ted" we could write: char my_name[] = "Ted"; and the compiler would count the characters, leave room for the nul character and store the total of the four characters in memory, the location of which would be returned by the array name, in this case my_name. In some code, instead of the above, you might see: char *my_name = "Ted"; which is an alternate approach. In the array notation, my_name is short for &my_name[0], which is the address of the first element of the array. char multi[5][10]; Just what does this mean?
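The declarations discussed in the excerpt above can be checked with a few `sizeof` expressions. This is a small sketch rather than the tutorial's own code: the pointer version is renamed `my_ptr` (so it can coexist with the array `my_name`) and declared `const`, since modifying a string literal is undefined behavior, and `strings_demo` is a made-up name.

```c
#include <assert.h>
#include <string.h>

char my_string[40] = "Ted";  /* 40 bytes: 'T', 'e', 'd', '\0', rest zeroed */
char my_name[] = "Ted";      /* compiler counts: 3 characters + '\0' = 4 */
const char *my_ptr = "Ted";  /* pointer to a string literal, not an array */
char multi[5][10];           /* 5 rows of 10 chars, stored contiguously */

void strings_demo(void)
{
    assert(sizeof my_string == 40);
    assert(sizeof my_name == 4);
    assert(my_name == &my_name[0]);  /* array name decays to &my_name[0] */
    assert(strlen(my_ptr) == 3);     /* strlen does not count the '\0' */

    assert(sizeof multi == 50);      /* 5 * 10 bytes in total */
    assert(sizeof multi[0] == 10);   /* each row is an array of 10 chars */
    assert(multi[1] - multi[0] == 10); /* rows sit 10 bytes apart */
}
```

One way to read `char multi[5][10]` is therefore as an array of 5 elements, each of which is itself an array of 10 chars.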

comp.lang.c FAQ The HTML version of the comp.lang.c FAQ has been taken over by the FAQ's author, Steve Summit. comp.lang.c Frequently Asked Que Please update your links, and thanks - jutta@pobox.com [Last modified August 1, 1995 by scs.] WARNING: A major update to this FAQ list is imminent, probably on September 1, 1995. Certain topics come up again and again on this newsgroup. This article, which is posted monthly, attempts to answer these common questions definitively and succinctly, so that net discussion can move on to more constructive topics without continual regression to first principles. No mere newsgroup article can substitute for thoughtful perusal of a full-length tutorial or language reference manual. If you have a question about C which is not answered in this article, first try to answer it by checking a few of the referenced books, or by asking knowledgeable colleagues, before posing your question to the net at large. Bibliography Samuel P. EoPS

Circular buffer A ring showing, conceptually, a circular buffer. This visually shows that the buffer has no real end and that it can loop around. However, since memory is never physically created as a ring, a linear representation is generally used. Uses: In some situations, an overwriting circular buffer can be used, e.g. in multimedia. How it works: A circular buffer first starts empty and of some predefined length. Assume that a 1 is written into the middle of the buffer (the exact starting location does not matter in a circular buffer). Then assume that two more elements are added, 2 and 3, which get appended after the 1. If two elements are then removed from the buffer, the oldest values inside the buffer are removed. If the buffer has 7 elements then it is completely full. A consequence of the circular buffer is that when it is full and a subsequent write is performed, it starts overwriting the oldest data.
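The behavior described above can be sketched in C as a fixed array plus an index of the oldest element and a count. This is one possible layout under assumed names (`struct cbuf`, `cb_write`, `cb_read`, `CB_CAPACITY`), with the capacity of 7 taken from the excerpt; it is the overwriting variant, where a write to a full buffer replaces the oldest value.

```c
#include <stddef.h>

#define CB_CAPACITY 7   /* matches the 7-element example in the text */

struct cbuf {
    int data[CB_CAPACITY];
    size_t head;    /* index of the oldest stored element */
    size_t count;   /* number of elements currently stored */
};

/* Append a value; when the buffer is full, overwrite the oldest one. */
void cb_write(struct cbuf *cb, int value)
{
    size_t tail = (cb->head + cb->count) % CB_CAPACITY;

    cb->data[tail] = value;
    if (cb->count == CB_CAPACITY)
        cb->head = (cb->head + 1) % CB_CAPACITY;  /* oldest was replaced */
    else
        cb->count++;
}

/* Remove the oldest value into *out. Returns 1 on success, 0 if empty. */
int cb_read(struct cbuf *cb, int *out)
{
    if (cb->count == 0)
        return 0;
    *out = cb->data[cb->head];
    cb->head = (cb->head + 1) % CB_CAPACITY;
    cb->count--;
    return 1;
}
```

Starting from a zero-initialized `struct cbuf`, writing 1, 2, 3 and then reading twice returns 1 and 2, leaving 3 as the oldest element. Keeping a count instead of a separate write index avoids the ambiguity between a full and an empty buffer when both indices coincide.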
