
Welcome to Apache Avro!

Related:  Data Serialization

Apache Thrift The Fat-Free Alternative to XML Extensible Markup Language (XML) is a text format derived from Standard Generalized Markup Language (SGML). Compared to SGML, XML is simple. HyperText Markup Language (HTML), by comparison, is even simpler. Even so, a good reference book on HTML is an inch thick. Most of the excitement around XML centers on its new role as an interchangeable data serialization format. It is text-based. These together encouraged a higher level of application-independence than other data-interchange formats. Unfortunately, XML is not well suited to data interchange, much as a wrench is not well suited to driving nails. The most informed opinions on XML (see for example …) suggest that XML has big problems as a data-interchange format, but that the disadvantages are compensated for by the benefits of interoperability and openness. JSON promises the same benefits of interoperability and openness, but without the disadvantages. From Simplicity Extensibility Openness
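
To make the comparison concrete, here is a minimal Python sketch that writes the same record as JSON and as XML using only the standard library. The `person` record and the helper names `to_json` and `to_xml` are hypothetical and chosen for illustration; they do not come from the article.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used only for this illustration.
person = {"id": 123, "name": "Bob"}


def to_json(record):
    """Serialize the record as compact JSON text."""
    return json.dumps(record, separators=(",", ":"))


def to_xml(record):
    """Serialize the record as XML, one child element per field."""
    root = ET.Element("person")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_json(person))
    print(to_xml(person))
    # JSON gives the integer back as an integer.
    print(json.loads(to_json(person))["id"] + 1)
    # XML gives back text; the reader has to know it was a number.
    print(ET.fromstring(to_xml(person)).find("id").text)
```

Both forms are text and both are open. In this sketch the JSON form maps straight back to the original data structure, while the XML form returns every value as a string unless the reader applies extra knowledge about the types.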

Thrift vs Protocol Buffers vs JSON | MirthLab Note: This article is a work in progress. If there is anything that needs correcting, please let me know by leaving a comment. Originally this comparison included a look at JSON. Much of this table was originally compiled by Stuart Sierra but has been edited to include additional information relevant to my own requirements. Thrift and Protocol Buffers are both great choices, and there seems to be no clear winner between them. I’d choose Protocol Buffers over Thrift if: You’re only using Java, C++ or Python. I’d choose Thrift over Protocol Buffers if: Your language requirements are anything but Java, C++ or Python.

protobuf - Protocol Buffers - Google's data interchange format What is it? Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats. Quick Example You write a .proto file like this:

    message Person {
      required int32 id = 1;
      required string name = 2;
      optional string email = 3;
    }

Then you compile it with protoc, the protocol buffer compiler, to produce code in C++, Java, or Python. Then, if you are using C++, you use that code like this:

    Person person;
    person.set_id(123);
    person.set_name("Bob");
    person.set_email("");
    fstream out("person.pb", ios::out | ios::binary | ios::trunc);
    person.SerializeToOstream(&out);
    out.close();

Or like this:

    Person person;
    fstream in("person.pb", ios::in | ios::binary);
    if (!

For a more complete example, see the tutorials.
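
To show what "efficient yet extensible" looks like on the wire, the following Python sketch hand-encodes the Person message from the quick example. It follows the Protocol Buffers wire format as commonly documented: each field is written as a key `(field_number << 3) | wire_type`, followed by a varint or by a length-prefixed byte string. This is an illustration written for this page, not the code protoc generates. The helper names `pb_varint`, `pb_key`, and `encode_person` are hypothetical, and negative ids are deliberately not handled.

```python
def pb_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def pb_key(field_number, wire_type):
    """Field key: field number and wire type packed into one varint."""
    return pb_varint((field_number << 3) | wire_type)


def encode_person(person_id, name, email=None):
    """Hand-encode Person {id = 1, name = 2, email = 3}."""
    out = pb_key(1, 0) + pb_varint(person_id)           # wire type 0: varint
    name_bytes = name.encode("utf-8")
    out += pb_key(2, 2) + pb_varint(len(name_bytes)) + name_bytes  # type 2: bytes
    if email is not None:                               # optional field
        email_bytes = email.encode("utf-8")
        out += pb_key(3, 2) + pb_varint(len(email_bytes)) + email_bytes
    return out


if __name__ == "__main__":
    print(encode_person(123, "Bob").hex())
```

Because every value carries its field number, a reader can skip fields it does not know, which is where the extensibility comes from. An unset optional field costs no bytes at all.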

Google Protocol Buffers vs Apache Avro | The Architect I have been looking into middleware solutions as a push mechanism between server and client. One of the aspects that I had to consider was latency. Hence it was important to find a lean, technology-agnostic transport format. Many languages offer native serialization APIs, but when data is serialized using the native API, metadata about the class is serialized into the output too. I also needed to identify the best data format to serialize to. I came across two technologies, ‘Google Protocol Buffers’ and ‘Apache Avro’: Google Protocol Buffers Protocol Buffers is a serialisation format with an interface description language developed by Google. It works by you defining how you want your data to be structured via proto files, which are simply structured text files. Protocol Buffers claims that parsing takes between 100 and 200 nanoseconds. Apache Avro Avro is another very recent serialisation system. Avro relies on a schema-based system that defines a data contract to be exchanged. Results
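
The "data contract" idea can be sketched in a few lines of Python. As described in the Avro specification, ints and longs are written with variable-length zig-zag coding, strings are written as a length followed by UTF-8 bytes, and a record is simply its fields written in schema order, with no field names or tags in the output. The schema below (a record with an int `id` and a string `name`) is a hypothetical example, and `avro_long`, `avro_string`, and `encode_person` are illustrative helper names, not part of any Avro library.

```python
def avro_long(n):
    """Zig-zag encode a signed 64-bit integer, then write it as a varint."""
    z = (n << 1) ^ (n >> 63)  # small magnitudes map to small unsigned values
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def avro_string(s):
    """Length (as a long) followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return avro_long(len(data)) + data


def encode_person(person_id, name):
    """Encode the hypothetical record {id: int, name: string}.

    Fields are written in schema order with nothing in between,
    so the reader needs the same schema to decode them.
    """
    return avro_long(person_id) + avro_string(name)


if __name__ == "__main__":
    print(encode_person(123, "Bob").hex())
```

In this sketch the record takes 6 bytes. The output contains no field identifiers at all, which is why Avro readers and writers must exchange the schema.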

Overview of the Syslog protocol The Syslog protocol is defined by the following RFCs: The Syslog protocol is a "text" protocol, meaning that it uses only ASCII characters. It uses UDP on port 514, although implementations of Syslog over TCP or even SSL, and on other port numbers, also exist. The total length of a Syslog frame must be 1024 bytes or less. A Syslog frame consists of 3 parts: The PRI part. The PRI part of a Syslog message must consist of 3, 4, or 5 characters. The first character is always "<", followed by a number representing the priority of the message (in base 10), followed by ">". The only time the character "0" may follow "<" is to encode a priority whose value is exactly 0. The HEADER part of a Syslog message contains 2 fields: The TIMESTAMP field. The HOSTNAME field can contain:
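
The PRI rules above translate directly into a small parser. The Python sketch below checks the 1024-byte limit, the ASCII-only constraint, and the shape of the PRI part ("<", one to three decimal digits with no leading zero unless the value is 0, then ">"). The split of the priority into facility and severity (`priority = facility * 8 + severity`) is taken from RFC 3164 rather than from the excerpt above, and the function name `parse_pri` is hypothetical.

```python
import re

# "<", 1 to 3 decimal digits, ">": 3 to 5 characters in total.
# A leading "0" is accepted only when the whole value is 0.
_PRI_RE = re.compile(r"^<(0|[1-9][0-9]{0,2})>")

MAX_FRAME_BYTES = 1024  # frame length limit stated in the text


def parse_pri(frame):
    """Return (priority, facility, severity, rest) for a Syslog frame.

    `frame` is the raw datagram as bytes. Raises ValueError if the
    frame is too long, is not ASCII, or has a malformed PRI part.
    """
    if len(frame) > MAX_FRAME_BYTES:
        raise ValueError("frame longer than 1024 bytes")
    text = frame.decode("ascii")  # UnicodeDecodeError is a ValueError
    match = _PRI_RE.match(text)
    if match is None:
        raise ValueError("missing or malformed PRI part")
    priority = int(match.group(1))
    # Assumption from RFC 3164, not stated in the excerpt:
    # priority = facility * 8 + severity.
    facility, severity = divmod(priority, 8)
    return priority, facility, severity, text[match.end():]


if __name__ == "__main__":
    print(parse_pri(b"<34>Oct 11 22:14:15 mymachine su: example"))
```

A frame such as `<034>...` is rejected because of the leading zero, in line with the rule that "0" may follow "<" only when the priority is exactly 0.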