EC2

How and Why Swiftype Moved from EC2 to Real Hardware. This is a guest post by Oleksiy Kovyrin, Head of Technical Operations at Swiftype. Swiftype currently powers search on over 100,000 websites and serves more than 1 billion queries every month. When Matt and Quin founded Swiftype in 2012, they chose to build the company’s infrastructure using Amazon Web Services. The cloud seemed like the best fit because it was easy to add new servers without managing hardware and there were no upfront costs. Unfortunately, while some of the services (like Route53 and S3) ended up being really useful and incredibly stable for us, the decision to use EC2 created several major problems that plagued the team during our first year. Swiftype’s customers demand exceptional performance and always-on availability and our ability to provide that is heavily dependent on how stable and reliable our basic infrastructure is.

The more time we spent working around the problems with EC2, the less time we could spend developing new features for our customers.

Who Stole My CPU? One of the most important features of the cloud is the sharing of resources among multiple tenants. Without sharing and the ability to optimize resource utilization, the cloud operator can’t provide scalability or support “economies of scale” for its business. The public IaaS cloud consists of its “cloud magic” as well as real hardware such as computing, storage and network devices. Utilization of these resources should be optimized by meeting demand over time, hence they must be shared between the cloud consumers.

What is Steal Time? The basic metric for how a server utilizes its CPU is the idle capacity – the amount of CPU that is free. CPU utilization is made up of the following allocations:

User – the running application
System – the operating system
Interrupt – hardware interrupts
Wait – waiting for I/O jobs to end
Steal – cycles the hypervisor spent on work unrelated to this virtual machine
Idle – no work is being done

ST on AWS. Another important aspect regarding CPU utilization is the workload model.
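As a rough illustration of the steal allocation described above, the sketch below computes the steal share of CPU time between two samples of the aggregate `cpu` line in Linux's `/proc/stat`. The function names (`parse_cpu_line`, `steal_percent`, `read_cpu_line`) are made up for this example and are not taken from any of the excerpted articles. It assumes a Linux guest whose kernel reports the steal counter as the eighth value on that line.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters.

    Field order after the 'cpu' label:
    user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def steal_percent(before_line, after_line):
    """Return steal time as a percentage of all CPU ticks between two samples."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    deltas = [a - b for a, b in zip(after, before)]
    # Sum only the first eight counters: guest and guest_nice are already
    # counted inside user and nice, so adding them would double count.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    steal = deltas[7] if len(deltas) > 7 else 0
    return 100.0 * steal / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

To use it on a live machine, call `read_cpu_line()` twice with a pause in between (a few seconds is a reasonable assumption) and pass both lines to `steal_percent`. The `st` column in `top` reports the same kind of figure.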

Save up to 30% by Selecting Better Performing Amazon Instances. If you like the idea of exploiting market inconsistencies to lower your costs then you will love this paper and video from the Hot Cloud '12 conference: Exploiting Hardware Heterogeneity within the Same Instance Type of Amazon EC2. The conclusion is interesting and is a source of good guidance: Amazon EC2 uses diversified hardware to host the same type of instance. The hardware diversity results in performance variation. In general, the variation between the fast instances and slow instances can reach 40%. In some applications, the variation can even approach up to 60%. By selecting fast instances within the same instance type, Amazon EC2 users can acquire up to 30% of cost saving, if the fast instances have a relatively low probability.

The abstract: Cloud computing providers might start with near-homogeneous hardware environment.

Configuring SQL Server in EC2. The cloud is robust and reliable. The cloud solves all of our scaling needs. The cloud makes my poop smell like roses. While all of these statements are theoretically true, it takes some effort to make them true in reality, especially when a database is involved. Who Is Deploying SQL Server in EC2? A question I hear a lot is, “Who is putting SQL Server into EC2?” Back to the question of who’s deploying SQL Server in EC2… I work with a startup building a platform on a Microsoft stack. Another company I work with is an ISV transitioning to a Software as a Service (SaaS) model. What are some of the problems? There are, of course, problems with every way that you can possibly deploy SQL Server. Cost. One of the perceived problems with deploying into the cloud is the cost. How much would it cost to keep a reasonable SQL Server running 365 days a year?

What would a similar server cost you from HP? Despite the cost, Amazon’s cloud is still incredibly popular with many businesses.

Noisy Neighbor.

Strategy: Kill Off Multi-tenant Instances with High CPU Stolen Time.
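The strategy named in that last title can be sketched as a simple decision rule: collect a few steal-time readings for an instance and flag it for replacement when the average is too high. This is only a sketch under stated assumptions. The function name `should_replace_instance`, the 10% threshold and the minimum of three samples are illustrative choices, not values from the excerpted article.

```python
def should_replace_instance(steal_samples, threshold_percent=10.0, min_samples=3):
    """Decide whether an instance looks like it has a noisy neighbor.

    steal_samples: steal-time percentages measured over successive intervals.
    threshold_percent: assumed cutoff for the average steal percentage.
    min_samples: assumed minimum number of readings before deciding, so a
        single spike does not trigger a replacement.
    """
    if len(steal_samples) < min_samples:
        return False
    average = sum(steal_samples) / len(steal_samples)
    return average > threshold_percent
```

Averaging over several readings is a deliberate choice: steal time can spike briefly, and replacing an instance is only worthwhile when the contention persists.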