Cassandra

Using Cassandra · thinkaurelius/titan Wiki.

This is the documentation for Titan 0.4. Documentation for the latest Titan version is available at

"The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages. The largest known Cassandra cluster has over 300 TB of data in over 400 machines." (Apache Cassandra Homepage)

Deploying on Managed Machines

The following sections outline the various ways in which Titan can be used in concert with Cassandra.

Local Server Mode

Cassandra can be run as a standalone database on the same local host as Titan and the end-user application.

Benchmarking High Performance I/O with SSD for Cassandra on AWS.

Cassandra Performance Guide.

TipsAndTricks/EncryptedFilesystem/Scripts.

Linux Hard Disk Encryption With LUKS [ cryptsetup Command ]

Dear nixCraft, I carry my Linux-powered laptop just about everywhere. How do I protect my private data stored on a partition or removable storage media against bare-metal attacks, where anyone can get their hands on my laptop or USB pen drive while traveling? Sincerely, Worried about my data.

Dear Worried Linux user, That’s actually a great question. Many enterprises, small businesses, and government users need to encrypt their laptops to protect confidential information such as customer details, files, contact information and much more. Linux supports the following cryptographic techniques to protect a hard disk, directory, and partition.

Linux encryption methods

There are two methods to encrypt your data:

#1: Filesystem stacked level encryption
#2: Block device level encryption

Loop-AES: Fast and transparent file system and swap encryption package for Linux.

Step #1: Install cryptsetup utility

You need to install the following package. Reading package lists...

Step #2: Configure LUKS partition.

Encrypting an existing Centos install | Nonsense and other useful things.

Edit: Meanwhile I have found a better way to migrate an existing unencrypted CentOS install to a fully encrypted install with /boot as the only unencrypted disk space. That solution is much preferred over the one described in this post. The new approach is here.

Inspired by yet another incident in the news of a laptop with sensitive information getting stolen, you start imagining what would happen if someone got hold of your laptop. How much sensitive data would be on it, and what would the consequences be? Therefore, I set out to investigate how to make this laptop more secure. The laptop is running CentOS 6.4 and uses a typical logical volume setup. The idea is to use LUKS for encryption and to use a single logical volume called encrypted instead of home and data.

This is a very interesting setup that shows the power of the device mapper on Linux.

Disk space

The first thing to consider is disk space.

Backups

dd if=/dev/sda bs=32M | gzip -c > sda.img.gz

Creating encrypted storage

#! BASH Total CPU Percentage.

TrueCrypt: A tool to encrypt volumes on-the-fly.

TrueCrypt

TrueCrypt is a powerful yet free open-source disk encryption software. I am so satisfied with the software that I decided to introduce it here in my blog; may all future releases remain free to use! With TrueCrypt you can maintain an on-the-fly-encrypted volume (data storage device). On their website they explain ‘on-the-fly encryption’ as an automatic, continuous process in which data is encrypted right before it is saved and decrypted right after it is loaded.

In simple words, you will end up having an encrypted volume to secure all your sensitive data inside. What I personally think makes this software special is the ease of using encrypted files right from the secured volume in the computer’s RAM. TrueCrypt ‘never saves any decrypted data to a disk – it only stores them temporarily in RAM (memory)’. To install and use it on Linux, download the appropriate package from their website: www.truecrypt.org. Then navigate to where it was moved and extract the file. Now let's mount them.

FAQ.

Why can't I make Cassandra listen on 0.0.0.0 (all my addresses)?

Cassandra is a gossip-based distributed system.

ListenAddress is also the "contact me here" address, i.e., the address it tells other nodes to reach it at. Telling other nodes "contact me on any of my addresses" is a bad idea; if different nodes in the cluster pick different addresses for you, Bad Things happen. If you don't want to manually specify an IP to ListenAddress for each node in your cluster (understandable!), leave it blank and Cassandra will use InetAddress.getLocalHost() to pick an address. Then it's up to you or your ops team to make things resolve correctly (/etc/hosts, DNS, etc.). One exception to this process is JMX, which by default binds to 0.0.0.0 (Java bug 6425769). See CASSANDRA-256 and CASSANDRA-43 for more gory details.

What ports does Cassandra use?

By default, Cassandra uses 7000 for cluster communication (7001 if SSL is enabled), 9160 for clients (Thrift), and 7199 for JMX. See [CassandraHardware]. Yes.

All the right moves.

One of the principles of effective text editing is moving around very efficiently. Following are some pointers which may help you do that.

h  move one character left
j  move one row down
k  move one row up
l  move one character right
w  move to beginning of next word
b  move to beginning of previous word
e  move to end of word
W  move to beginning of next word after a whitespace
B  move to beginning of previous word before a whitespace
E  move to end of word before a whitespace

All the above movements can be preceded by a count; e.g. 4j will move down 4 lines. See :help {command} (for example, :help g_) for all of the above if you want more details.

Ctrl-i  jump to previous cursor position

<C-i> (or <Tab>) goes to the next cursor position in the jump list, and does nothing unless you've already moved to an older position in the jump list using <C-o>.

(Spiiph 12:37, October 5, 2009 (UTC))

Encryption with TrueCrypt LG #165.

By Ariel Maiorano

About TrueCrypt

From its Web site, we learn that TrueCrypt is free, open-source disk encryption software for Windows Vista/XP, Mac OS X, and Linux. Its more common use would be to create a virtual encrypted disk within a file (called a volume file), and mount it as a real disk. Anyhow, it also implements mechanisms to provide plausible deniability, a hidden volume inside another one, and, of course, the possibility to encrypt an entire partition or storage device. Operating system encryption is supported only on Windows at the moment.

Encryption is automatic, real-time (on-the-fly), and transparent. On-the-fly encryption means that data is automatically encrypted right before it is saved and decrypted right after it is loaded, without any user intervention. Although more popular on Windows operating systems [1], TrueCrypt runs well on Linux, and its volume files are fully cross-platform.

License conflicts

The algorithms

Installation

Step-by-step installation instructions follow.

Encrypting.

Cassandra-chef-cookbook/attributes/default.rb at master · michaelklishin/cassandra-chef-cookbook.

Getting Started with Apache Cassandra.

Cassandra High Performance Cookbook: Over 150 recipes to design and optimize large scale Apache Cassandra deployments, by Edward Capriolo | July 2011 | Cookbooks Open Source

Apache Cassandra is a fault-tolerant, distributed data store which offers linear scalability, allowing it to be a storage platform for large, high-volume websites.

In this article by Edward Capriolo, author of Cassandra High Performance Cookbook, you will learn the following recipes:

The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together a fully distributed design and a ColumnFamily-based data model. Cassandra is a highly scalable distributed database.

Getting ready

Visit in your web browser and find a link to the latest binary release.

How to do it...

How it works...

Cassandra comes as a compiled Java application in a tar file.

Automating Cassandra Operations and Management With Netflix's Priam Tool.

Netflix Open Sources Curator ZooKeeper Library.

I’m excited to read that Netflix is getting even more involved in the open source world and the team there is starting to open source some of the tools developed internally.

As some of you may already know, Netflix has been experimenting with quite a few, and it is a heavy user of NoSQL databases, running the majority of their services in the cloud. The first project announced and already available on GitHub is Curator, a ZooKeeper client wrapper and rich ZooKeeper framework (nb: ZooKeeper just released version 3.4.0). Curator deals with ZooKeeper complexity in the following ways:

Retry Mechanism: Curator supports a pluggable retry mechanism. All ZooKeeper operations that generate a recoverable error get retried per the configured retry policy. Curator comes bundled with several standard retry policies (e.g. exponential backoff).

Connection State Monitoring: Curator constantly monitors the ZooKeeper connection.

Next on the list of open source projects from Netflix:
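The retry mechanism described in the Curator excerpt (recoverable errors retried according to a configured policy such as exponential backoff) can be illustrated with a minimal Python sketch. This is not Curator's API: `RecoverableError`, `backoff_delays`, and `call_with_retry` are hypothetical names made up for illustration, and the doubling schedule is only one possible backoff policy.

```python
import time


class RecoverableError(Exception):
    """Hypothetical stand-in for an error that is safe to retry."""


def backoff_delays(base_delay, max_retries):
    """Return the wait before each retry: base, 2*base, 4*base, ..."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(operation, base_delay=0.1, max_retries=3, sleep=time.sleep):
    """Call operation(); on RecoverableError, wait and try again.

    After max_retries failed attempts, one final attempt is made and any
    error from it propagates to the caller.
    """
    for delay in backoff_delays(base_delay, max_retries):
        try:
            return operation()
        except RecoverableError:
            sleep(delay)
    return operation()
```

Passing `sleep` as a parameter keeps the policy testable without real waiting; a pluggable policy, as the excerpt describes, would amount to swapping out `backoff_delays`.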

Announcing Priam.

We talked in the past about our move to NoSQL, and Cassandra has been a big part of that strategy. Cassandra hit a big milestone recently with the announcement of the v1 release. We recently announced Astyanax, Netflix's Java Cassandra client with an improved API and connection management, which we open sourced last month. Today, we're excited to announce another milestone on our open source journey with an addition to make operations and management of Cassandra easier and more automated. As we embarked on making Cassandra one of our NoSQL databases in the cloud, we needed tools for managing configuration, providing reliable and automated backup/recovery, and automating token assignment within and across regions. Priam was built to meet these needs. The name 'Priam' refers to the king of Troy, in Greek mythology, who was the father of Cassandra.

What is Priam?

Priam is a co-process that runs alongside Cassandra on every node to provide the following functionality: Backup and recovery.

Compatibility · Netflix/Priam Wiki.

Compatibility, branches, and you

In general, the master branch of priam coordinates with the current major version of the apache-cassandra project. The version numbers in priam branch names also correlate to the version of cassandra it works with. Thus the priam 1.1 branch works with apache-cassandra 1.1, priam 1.0 works with cassandra 1.0 (and 0.8), and so on. The current compatibility matrix (August 2013) is like this:

priam master - cassandra 2.0. Note: vnode deployments are not supported yet, but you can use priam for the "one token per node" paradigm (basically, everything before vnodes).
priam 1.2 - cassandra 1.2 only
priam vnodes - a Work In Progress that is integrating c* 1.2's vnodes into priam. DO NOT USE IN PRODUCTION, it's far from ready (August 2013).
priam 1.1 - cassandra 1.1 only, no longer under active development
priam dse - this was an experimental branch to get the basic integration with Datastax Enterprise going.

Cassandra Service Script ← Automate Everything.

Without an existing cassandra service script, I decided to go ahead and create one, to make things a little easier to manage, and to make the whole experience a little more user friendly. The script includes a few nodetool basics, such as repair, cleanup, info, netstats etc.
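As an illustration of the Priam compatibility matrix above, the branch-to-version pairs can be written down as a small lookup table. This is only a sketch: the dictionary and the `priam_branches_for` helper are hypothetical names, the data is transcribed from the August 2013 matrix, and the work-in-progress "vnodes" and experimental "dse" branches are deliberately left out.

```python
# Priam branch -> Cassandra versions it works with (August 2013 matrix).
# The experimental "vnodes" and "dse" branches are intentionally omitted.
PRIAM_COMPATIBILITY = {
    "master": ["2.0"],
    "1.2": ["1.2"],
    "1.1": ["1.1"],
    "1.0": ["1.0", "0.8"],
}


def priam_branches_for(cassandra_version):
    """Return the Priam branches listed for a Cassandra version like '1.2.8'."""
    major_minor = ".".join(cassandra_version.split(".")[:2])
    return [
        branch
        for branch, versions in PRIAM_COMPATIBILITY.items()
        if major_minor in versions
    ]
```

Only the major and minor components are compared, since the matrix is stated at that granularity; a version with no entry simply yields an empty list.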

It also logs the start and end times of repair and cleanup in its own log, allowing you to see how long the process takes without having to trawl through all the cassandra logs to find a start and end time (very useful for us when it takes over 5 hours to complete a repair). Here is the script; simply copy the content into /etc/init.d/cassandra and make it executable.

Installing cassandra on Centos 5 ← Automate Everything.

Just a quick post on how to install cassandra on Centos 5, and getting the required bits on to stop all the errors you will see, such as JNA and MX4J missing. First you need to get all the required modules from yum, to prepare the server.

yum -y install gcc-c++ make cmake python-devel bzip2-devel zlib-devel
yum -y install log4cpp-devel git git-core cronolog google-perftools-devel
yum -y install readline-devel ncurses-devel libtool autoconf expat
yum -y install libevent-devel flex byacc expat-devel
# Perl Modules for Thrift Install
yum -y install perl-Bit-Vector perl-Class-Accessor

Next you will want to download the latest version of cassandra available at

I have chosen to install cassandra in the following location: /usr/local/share/cassandra

Installing JNA is done as follows:

wget " --no-check-certificate -O /usr/local/share/cassandra/lib/jna.jar
chmod 755 /usr/local/share/cassandra/lib/jna.jar
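The idea behind the service script described earlier (recording when a repair or cleanup starts and ends in a dedicated log) can be sketched in Python. This is not the author's init script: `run_and_log`, its log line format, and the log path in the usage note are assumptions made for illustration.

```python
import subprocess
import time
from datetime import datetime


def run_and_log(command, log_path, runner=subprocess.call):
    """Run a command and append its start time, end time, and duration to a log.

    `command` is a list such as ["nodetool", "repair"]. `runner` defaults to
    subprocess.call and can be replaced when experimenting without Cassandra.
    """
    start = datetime.now()
    began = time.monotonic()
    exit_code = runner(command)
    elapsed = time.monotonic() - began
    end = datetime.now()
    with open(log_path, "a") as log:
        log.write(
            f"{' '.join(command)} start={start.isoformat()} "
            f"end={end.isoformat()} elapsed_s={elapsed:.1f} exit={exit_code}\n"
        )
    return exit_code
```

A call such as `run_and_log(["nodetool", "repair"], "/var/log/cassandra/maintenance.log")` (the path is hypothetical) would leave one line per run, so the duration of a multi-hour repair can be read directly from that file.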

JMeter Plugin for Cassandra. By Vijay Parthasarathy and Denis Sheahan A number of previous blogs have discussed our adoption of Cassandra as a NoSQL solution in the cloud. We now have over 55 Cassandra clusters in the cloud and are moving our source of truth from our Datacenter to these Cassandra clusters. As part of this move we have not only contributed to Cassandra itself but developed software to ease its deployment and use.

It is our plan to open source as much of this software as possible. We recently announced the open sourcing of Priam, which is a co-process that runs alongside Cassandra on every node to provide backup and recovery, bootstrapping, token assignment, configuration management and a RESTful interface to monitoring and metrics. At Netflix we have recently started to standardize our load testing across the fleet using Apache JMeter.

Cassandra JMeter Plugin

JMeter allows us to customize our test cases based on our application logic/datamodel. An example screenshot is shown below.

Benchmark Setup.

Cassandra.

By Adrian Cockcroft

Most of the talks and panel sessions at AWS Re:Invent were recorded, but there are so many sessions that it's hard to find the Netflix ones.

Here's a link to all of the videos posted by AWS that mention Netflix:

They are presented below in what seems like a natural order that tells the Netflix story, starting with the migration and video encoding talks, then talking about availability, Cassandra-based storage, "big data" and security architecture, ending up with operations and cost optimization.

Embracing the Cloud

Presented by Neil Hunt, Chief Product Officer, and Yury Izrailevsky, VP Cloud and Platform Engineering. Join the product and cloud computing leaders of Netflix to discuss why and how the company moved to Amazon Web Services.

Slides: Netflix's Encoding Transformation Presented by Kevin McEntee, VP Digital Supply Chain.