
Extract-Transform-Load & Extraction Engine Tools


Such as XSLT, Pentaho

Extraction Engine Tools

Open Studio Integration Software Platform. Talend Open Studio is a powerful and versatile set of open source products for developing, testing, deploying and administrating data management and application integration projects.


Talend delivers the only unified platform that makes data management and application integration easier by providing a unified environment for managing the entire lifecycle across enterprise boundaries. Developers achieve vast productivity gains through an easy-to-use, Eclipse-based graphical environment that combines data integration, data quality, MDM, application integration and big data. Talend's products dramatically lower the adoption barrier for businesses wanting powerful packaged solutions to operational challenges like data cleansing, master data management, and enterprise service bus deployment.

Excel To JSON - Source Code.

Community Wiki Home - Pentaho Community - Pentaho Wiki.

MongoDB Instaview Sample template - Pentaho Big Data - Pentaho Wiki. Sample Instaview template for use with MongoDB that can be easily added to your installation.


It demonstrates how to select clickstream data from MongoDB and immediately start to explore it and create visualizations. In order to follow along with this how-to guide you will need the following:

Pentaho Data Integration (Enterprise Edition only): a desktop installation of Pentaho Data Integration. A 30-day evaluation is available.

MongoDB: a single-node local cluster is sufficient for these exercises, but a larger and/or remote configuration will work as well.

Sample Files: this sample uses the page_successions.txt.zip data from the Write Data To MongoDB How To.

Setup.

Gui tools for MongoDB.

Big Data Community Home - Pentaho Big Data - Pentaho Wiki. Welcome to the Big Data space in the Pentaho Community wiki.


This space is the community home for Big Data and NoSQL technologies within the Pentaho ecosystem. It is the place to find information, how-to's, developer info, technology previews and other information about employing Pentaho technology as part of your overall Big Data Strategy. It is also where you can share your own information and experiences. We look forward to your participation and contribution! If you are not a developer, are looking for more product specific information, or are interested in commercial support, PentahoBigData.com is the place to find those resources. New and recently updated Big Data content on the What's New?

MongoDB SQL Server Importer - Home.

XML validator and editor for your business.

Getting Started for Java Developers - Pentaho Big Data - Pentaho Wiki. The Pentaho Big Data Plugin Project provides support for an ever-expanding Big Data community within the Pentaho ecosystem.


It is a plugin for the Pentaho Kettle engine which can be used within Pentaho Data Integration (Kettle), Pentaho Reporting, and the Pentaho BI Platform.

Pentaho Big Data Plugin Features. This project contains the implementations for connecting to or performing the following:

Pentaho MapReduce: visually design MapReduce jobs as Kettle transformations.

HDFS File Operations: read/write directly from any Kettle step, made possible by the ubiquitous use of Apache VFS throughout Kettle.

Data Sources: JDBC connectivity (Apache Hive); native RPC connectivity for reading/writing (Apache HBase, Cassandra, MongoDB, CouchDB).

Csvtojson. All you need nodejs csv to json converter.

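For orientation, the basic job of a CSV-to-JSON converter can be sketched in a few lines. The following is only an illustration in Python using the standard library; it is not csvtojson's API, and the function names csv_to_records and csv_to_json are hypothetical.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text whose first row is a header into a list of dicts.

    Every value stays a string; no type inference is attempted.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def csv_to_json(csv_text):
    """Serialize the parsed rows as a JSON array of objects."""
    return json.dumps(csv_to_records(csv_text))


if __name__ == "__main__":
    sample = "name,age\nAda,36\nLin,41\n"
    print(csv_to_json(sample))
```

A streaming converter would emit one record at a time instead of building the whole list in memory, which matters for large inputs.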

Support big json data, CLI, web server, powerful nested JSON, customised parser, stream, pipe, and more! Since version 0.3, the core class of csvtojson has inherited from the stream.Transform class. Therefore, it will behave like a normal Stream object and CSV features will not be available any more. Now the usage is like: To convert from a string, previously the code was: csvConverter.from(csvString); Now it is: csvConverter.fromString(csvString, callback); The callback function above is optional. See Parse String.

XSLTJSON: Transforming XML to JSON using XSLT - bramstein.com.

XSLTJSON: Transforming XML to JSON using XSLT. XSLTJSON is an XSLT 2.0 stylesheet to transform arbitrary XML to JavaScript Object Notation (JSON).

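To give a feel for what an XML-to-JSON mapping involves, here is a minimal Python sketch of the BadgerFish convention (attributes become "@name" keys, text content goes under "$", repeated sibling elements become arrays). It is not XSLTJSON itself and does not use XSLT; it ignores namespaces and mixed content, and the function names badgerfish and xml_to_json are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def badgerfish(element):
    """Convert one element to a BadgerFish-style dict.

    Simplifications (assumptions of this sketch): namespaces and
    text that follows a child element (tail text) are ignored.
    """
    node = {}
    # Attributes become properties prefixed with "@".
    for name, value in element.attrib.items():
        node["@" + name] = value
    # Non-empty text content is stored under "$".
    text = (element.text or "").strip()
    if text:
        node["$"] = text
    # Children become nested properties; repeated tags become lists.
    for child in element:
        converted = badgerfish(child)
        if child.tag in node:
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(converted)
        else:
            node[child.tag] = converted
    return node


def xml_to_json(xml_text):
    """Parse XML text and return a JSON string keyed by the root tag."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: badgerfish(root)})


if __name__ == "__main__":
    print(xml_to_json('<alice charlie="david">bob</alice>'))
```

Because attributes and text keep distinct keys, the original XML structure can be rebuilt from the JSON, which is what makes round-trips possible.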

JSON is a lightweight data-interchange format based on a subset of the JavaScript language, and often offered as an alternative to XML in, for example, web services. To make life easier XSLTJSON allows you to transform XML to JSON automatically. XSLTJSON supports several different JSON output formats, from a compact output format to support for the BadgerFish convention, which allows round-trips between XML and JSON.

Extracting information from a JSON file using XSLT version 1.0.

XSLT is easy, even for transforming JSON!

Most developers I talk to will cringe if they hear the acronym XSLT.


I suspect that reaction is derived from some past experience where they have seen some horrendously complex XML/XSLT combination. There is certainly lots of that around. However, for certain types of document transformations, XSLT can be a very handy tool and, with the right approach and as long as you avoid edge cases, it can be fairly easy. When I start building an XSLT transform, I always start with the “identity” transform. The identity transform simply traverses the nodes in the input document and copies them into the output document.

Make some changes

In order to make changes to the output document you need to add templates that will do something other than simply copy the existing node. This template matches on any element named foo and in its place creates an element named bar that contains a copy of everything that foo contained.
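The XSLT listings are not reproduced in this excerpt. As a stand-in, the same behaviour (copy every node unchanged, except that foo elements are emitted as bar) can be sketched in Python with the standard library. This is an analogue of the idea, not XSLT and not the author's stylesheet; copy_with_rename and transform are hypothetical names.

```python
import xml.etree.ElementTree as ET


def copy_with_rename(element, renames):
    """Recursively copy an element tree, renaming tags listed in `renames`.

    With an empty mapping this behaves like an identity transform:
    every element, attribute and text node is copied as-is.
    """
    new_tag = renames.get(element.tag, element.tag)
    new = ET.Element(new_tag, dict(element.attrib))
    new.text = element.text
    new.tail = element.tail
    for child in element:
        new.append(copy_with_rename(child, renames))
    return new


def transform(xml_text, renames=None):
    """Parse XML text, apply the copy-with-rename pass, return XML text."""
    root = ET.fromstring(xml_text)
    result = copy_with_rename(root, renames or {})
    return ET.tostring(result, encoding="unicode")


if __name__ == "__main__":
    source = '<root><foo id="1">hello<baz/></foo><qux/></root>'
    print(transform(source, {"foo": "bar"}))
```

The design mirrors the XSLT approach: the default rule copies, and each specific rule (here, an entry in the renames mapping) overrides the default for the nodes it matches.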

Given that XSLT template, an input XML document containing foo elements would be transformed into the same document with each foo replaced by bar.

JetSet.