Availability


Disaster Recovery Planning for the Developer. If you are a developer, you probably haven't spent much time thinking about your personal disaster recovery plan. You've probably heard or read about large corporations planning for disasters, but didn't think that such plans applied to you. In today's world of major hurricanes, terrorism, brownouts, hackers, etc., planning for disaster recovery is everyone's burden. I happen to live in Florida, and we are once again getting ready for yet another busy hurricane season. So what does all this have to do with web development? My office happens to be in my home, and I've got three networked PCs that are always on, as well as a laptop that is on most of the time. One of the other computers is my personal computer. Right off the bat, I realized that I need to have my computers on a backup system so that if I lose power during a storm, I won't lose the data that I'm currently working on.

Stay safe, and spend some time developing your own personal disaster recovery plan.

Disaster Recovery Planning for the Developer, Part 2. In our last article we touched upon the most basic steps that a developer can take to survive and recover from a disaster, be it a hurricane, terrorist attack, ice storm, hackers, virus--any of the typical disasters we could face today or in the future. In 2005, Hurricane Katrina brought home the reality of disaster recovery, or more precisely, what happens when one doesn't prepare. In this article we're going to take disaster recovery planning to the next level--we're going to discuss the Business Continuity Plan. What is a Business Continuity Plan? The first step in designing your Disaster Recovery Plan is to put together your Business Continuity Plan (BCP).

According to Wikipedia, Business Continuity Planning is a "methodology used to create a plan for how an organization will resume partially or completely interrupted critical function(s) within a predetermined time after a disaster or disruption." A BCP is a dynamic document that is constantly being revisited, revised and tested.

Disaster Recovery Planning for the Developer, Part 3. Making a Disaster Recovery Plan for Don's Web Design. When we last left off, we had discussed the creation and planning of a BCP, Business Continuity Plan, and a DRP, Disaster Recovery Plan, for your business. In this third part of our series, we'll walk you through the steps it takes to create a DRP for our fictitious business, "Don's Web Design". Mission Critical Aspects for Don. As we discussed previously, the first step in creating the DRP is to decide which aspects of Don's Web Design are the most critical to keeping his business up and running after a disaster. Don's office is located in his home, along with the most critical aspects of his business.

Likely Threats That Don Faces. Next in Don's planning is the task of deciding which types of disasters he should plan for. Now that he's got a list of potential disasters for which he must prepare, he'll need to define how they would each affect his business. Computer Failure; Power Outages; Hurricanes, Flooding and Other Natural Disasters.

Disaster Recovery Planning for the Developer, Part 4. In our last article we went through the steps of creating a Disaster Recovery Plan for our fictitious business, Don's Web Design. This article, our last in this series, will provide you with the resources you need to complete your own DRP. We've scoured the web for the best resources to help you build your data recovery plan, and we've broken them down into the following four categories. Note that inclusion in this resource listing does not mean endorsement of any company listed; we're providing these resources for the reader's convenience.

Descriptions that follow come from the respective site's own pages, hence the quotations. Software: Strohl Systems' LDRPS - Their Living Disaster Recovery Planning System is a "flexible, central data repository with built-in, proven business continuity planning expertise. Hosted Services. Tools and Templates: The BCP Generator - "The template covers everything from initial business impact analysis through to return to business as usual following an incident.

Microsoft Word - 0303 - virtualizatiion_1 - virtualizatiion_1.pdf.

Microsoft Word - 0304 - virtualizatiion_2 - virtualizatiion_2.pdf.

Microsoft Word - 0306 - virtualizatiion_3 - virtualizatiion_3.pdf.

For an Always-On World | Stratus Technologies.

Integrity NonStop Servers - HP Servers | HP® Official Site. For over 35 years, HP NonStop fault-tolerant computing has supported enterprises in industries that require continuous access to information, support for high volumes of online transactions, modern infrastructure, and low operational costs.

Financial services, telecommunications, retail, manufacturing, healthcare, and public sector are leading the way in delivering a continuous business environment, and HP NonStop continues to be the platform of choice for industries that never stop. HP NonStop is engineered for the highest availability level. HP NonStop is designed and integrated for the very highest availability level. According to IDC 1, “Availability Level 4 (AL4) defines fully fault-tolerant servers, in which redundancy of components, combined with special software, ensure that processing will continue, even if there is a failure in a single hardware component in one area of the system.”

That means NO interruption of work, NO transactions lost, and NO degradation in performance.

Xplore Abstract - Fault-Tolerant Computing--Concepts and Examples.

Open source high-availability clustering. By Matthew O'Keefe and John Ha. Introduction. As Linux is used more widely for mission-critical applications, support for high availability through application failover is becoming more important. Improving Linux high availability involves employing both hardware and software technologies, including: server redundancy, including application failover and server clustering; storage redundancy, including RAID and I/O multipathing; network redundancy; and power system redundancy. These features provide a way to achieve scalable performance and high availability at low cost. In this article, we focus on the open source Red Hat Cluster Manager application failover software package, describing its basic principles and operation.
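The effect of the redundancy listed above can be illustrated with some simple availability arithmetic. The sketch below is not taken from the article; it assumes components fail independently, and the 99% figure is a made-up illustration rather than a measured value.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which every component is required.

    Assumes independent failures, so the availabilities multiply.
    """
    return prod(availabilities)


def parallel_availability(availability, copies):
    """Availability of `copies` identical, independent redundant components.

    The group is down only when every copy is down at the same time.
    """
    return 1.0 - (1.0 - availability) ** copies


def downtime_hours_per_year(availability, hours_per_year=8760):
    """Expected downtime per year for a given availability."""
    return (1.0 - availability) * hours_per_year


if __name__ == "__main__":
    single_server = 0.99  # hypothetical figure, for illustration only
    for copies in (1, 2, 3):
        a = parallel_availability(single_server, copies)
        print(f"{copies} server(s): availability {a:.6f}, "
              f"about {downtime_hours_per_year(a):.2f} hours down per year")
```

Under these assumptions each added copy shrinks the expected downtime sharply, while any non-redundant component left in series (a single power feed or network link, for example) caps the availability of the whole system.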

Architecture and operation. Red Hat Cluster Manager is an application failover software package that allows a group of connected Linux servers (known as a cluster) to run the same application. Figure 1. Configuring a cluster: cluster nodes, fence devices, failover domains, resources.

High-availability cluster. HA clusters are often used for critical databases, file sharing on a network, business applications, and customer services such as electronic commerce websites. HA cluster implementations attempt to build redundancy into a cluster to eliminate single points of failure, including multiple network connections and data storage which is redundantly connected via storage area networks.
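Application failover, as described above, comes down to checking whether a service is still running on some node and restarting it on another node when it is not. The following is a toy, in-memory sketch of that idea only. The `Node` class and `failover` function are hypothetical names invented for this illustration and are not the Red Hat Cluster Manager API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node that can host services."""
    name: str
    healthy: bool = True
    running: set = field(default_factory=set)

    def start(self, service):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down; cannot start {service}")
        self.running.add(service)

    def stop(self, service):
        # Safe to call even if the service is not running here.
        self.running.discard(service)

    def status(self, service):
        return self.healthy and service in self.running


def failover(service, nodes):
    """One monitoring pass: return the node now running `service`, or None.

    `nodes` is an ordered preference list (earlier nodes are preferred).
    """
    for node in nodes:
        if node.status(service):
            return node  # still running somewhere; nothing to do
    for node in nodes:
        node.stop(service)  # clear stale state before restarting
    for node in nodes:
        if node.healthy:
            node.start(service)
            return node
    return None  # no healthy node is left


if __name__ == "__main__":
    a, b = Node("node-a"), Node("node-b")
    a.start("web")
    a.healthy = False  # simulate a node failure
    active = failover("web", [a, b])
    print("web is now on", active.name if active else "no node")
```

A real cluster manager would also fence the failed node before restarting the service elsewhere, so that two nodes never write to the same shared storage at once; that step is omitted here.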

Application design requirements. Not every application can run in a high-availability cluster environment, and the necessary design decisions need to be made early in the software design phase. In order to run in a high-availability cluster environment, an application must satisfy at least the following technical requirements, the last two of which are critical to its reliable function in a cluster, and are the most difficult to satisfy fully: There must be a relatively easy way to start, stop, force-stop, and check the status of the application. Node configurations. Node reliability.

The Bathtub Curve and Product Failure Behavior (Part 1 of 2)

By Dennis J. Wilkins, Retired Hewlett-Packard Senior Reliability Specialist, currently a ReliaSoft Reliability Field Consultant. This paper is adapted with permission from work done while at Hewlett-Packard. Reliability specialists often describe the lifetime of a population of products using a graphical representation called the bathtub curve. The bathtub curve consists of three periods: an infant mortality period with a decreasing failure rate followed by a normal life period (also known as "useful life") with a low, relatively constant failure rate and concluding with a wear-out period that exhibits an increasing failure rate.
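The three periods described above can be sketched numerically. One common way to do so, which is an assumption here and not something the excerpt states, is to add three Weibull failure-rate terms: a shape parameter below 1 gives a decreasing rate, a shape of exactly 1 gives a constant rate, and a shape above 1 gives an increasing rate. All parameter values below are invented for illustration.

```python
def weibull_hazard(t, shape, scale):
    """Weibull failure rate h(t) = (shape / scale) * (t / scale) ** (shape - 1).

    Valid for t > 0. shape < 1: decreasing, shape == 1: constant,
    shape > 1: increasing.
    """
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three failure modes; every parameter is a made-up example."""
    infant = weibull_hazard(t, shape=0.5, scale=2000.0)    # infant mortality
    normal = weibull_hazard(t, shape=1.0, scale=10000.0)   # constant rate
    wearout = weibull_hazard(t, shape=5.0, scale=20000.0)  # wear-out
    return infant + normal + wearout


if __name__ == "__main__":
    # Print the combined failure rate at a few operating times (hours).
    for hours in (10, 100, 1000, 5000, 20000, 40000):
        print(f"t = {hours:>6} h   failure rate = {bathtub_hazard(hours):.6f} per hour")
```

With these example values the combined rate starts high, falls toward a low plateau, and then climbs again as the wear-out term takes over, which is the bathtub shape for the population as a whole.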

This article provides an overview of how infant mortality, normal life failures and wear-out modes combine to create the overall product failure distributions. Figure 1: The Bathtub Curve. Also note that the actual time periods for these three characteristic failure distributions can vary greatly. Infant Mortality: What Causes It and What to Do About It?

The Bathtub Curve and Product Failure Behavior (Part 2 of 2). By Dennis J. Wilkins. Introduction. Part One of this article (presented in last month's HotWire) introduced the concept of the reliability bathtub curve. This is a graphical representation of the lifetime of a population of products, which consists of three key periods. Part One examined the first period of the curve, infant mortality, and also discussed issues related to burn-in, a common practice to reduce the occurrence of this type of failure during the useful life of the product.

Reliability Bathtub Curve Review. As described in more detail in Part One, the bathtub curve, displayed in Figure 1 below, does not depict the failure rate of a single item. Figure 1: The Bathtub Curve. Normal Life Period: Does It Really Exist?

Soft Error Rate (SER) is a fact of life for systems using solid state memory chips.