Big Data

The computing industry is seeing dramatic growth in the use of "shared nothing" database architectures, in which each node is self-sufficient and functions independently of the others (the Hadoop Distributed File System, for example). For the sake of performance, these architectures dedicate storage resources to each node, i.e. no shared disk, which avoids contention among nodes for shared disk resources (SAN and NAS). While these computing architectures are best known in the context of Web-based applications and development activities, they are no longer confined to the Web.
Shared storage in a 'shared nothing' environment | Data-driven
Scale Unlimited is based in Nevada City, California, and provides consulting and training services for big data analytics, search, and web mining. The company was founded in 2008 by Stefan Groschupf, Chris Wensel, and Ken Krugler, three of the world’s leading experts in scalable, reliable data analytics, workflow design, and web mining. All are well-known community members and contributors to key open source projects, including Hadoop, Bixo, Cascading, Solr, Lucene, Katta, and Tika. Solutions from Scale Unlimited are built using these and other widely used and well-supported open source packages, providing maximum flexibility with no commercial lock-in.
About | Elastic Web Mining | Bixo Labs
Mobclix Selects Aster Data to Move Analytics to the Cloud | Linux/Unix Hosting
How do you query hundreds of gigabytes of new data each day streaming in from over 600 hyperactive servers? If you think this sounds like the perfect battleground for a head-to-head skirmish in the great MapReduce Versus Database War, you would be correct. Bill Boebel, CTO of Mailtrust (Rackspace’s mail division), has generously provided a fascinating account of how they evolved their log processing system from an early amoeba’ic text-file-stored-on-each-machine approach, to a Neanderthal’ic relational database solution that just couldn’t compete, and finally to a Homo sapien’ic Hadoop-based solution that works wisely for them and has virtually unlimited scalability potential. Rackspace faced a now-familiar problem: lots and lots of data streaming in. Where do you store all that data?
[repost] How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data « New IT Farmer
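For readers new to the MapReduce model mentioned in this entry, here is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to log lines. The log format (`server message`) and all function names are hypothetical, chosen only for illustration; this is not Rackspace's actual pipeline, and Hadoop would run these phases distributed across many machines rather than in one process.

```python
from collections import defaultdict


def map_log_line(line):
    """Map step: emit (server, 1) for one log line.

    Assumes a hypothetical format: '<server> <message>'.
    """
    server, _, _message = line.partition(" ")
    yield server, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one server."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_log_line(line))
    return dict(reduce_counts(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["mx1 delivered", "mx2 bounced", "mx1 deferred"]
    print(run_job(sample))
```

Because each map call looks at a single line and each reduce call looks at a single key, both steps can in principle be spread across machines, which is the property the entry above relies on for scalability.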
Welcome to Hive! The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.
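As a rough illustration of the "custom mapper" idea, the sketch below is a streaming script of the kind HiveQL's `TRANSFORM ... USING` clause can invoke: Hive writes each row to the script's standard input as tab-separated text and reads tab-separated rows back. The table name `access_log`, the columns `url` and `bytes`, and the file name `domain_mapper.py` are all assumptions made up for this example.

```python
# Hypothetical HiveQL that would call this script (names are assumptions):
#
#   ADD FILE domain_mapper.py;
#   SELECT TRANSFORM (url, bytes)
#          USING 'python domain_mapper.py'
#          AS (domain, bytes)
#   FROM access_log;

import sys
from urllib.parse import urlparse


def transform_line(line):
    """Map one tab-separated row (url, bytes) to (domain, bytes)."""
    url, size = line.rstrip("\n").split("\t")
    domain = urlparse(url).netloc.lower()
    return f"{domain}\t{size}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read rows from stdin and write one transformed row per input row."""
    for line in stdin:
        if line.strip():  # skip blank lines
            stdout.write(transform_line(line) + "\n")


# Only consume stdin when rows are actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Keeping the per-row logic in `transform_line` makes it possible to check the script locally before wiring it into a query.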

Big Data 2011 by GigaOM - Infrastructure - Web - Eventbrite