Fault Tolerance

Michael R. Lyu 吕荣聪. B.Sc. (National Taiwan University), M.Sc. (UC Santa Barbara), Ph.D. (UCLA). IEEE Fellow (2004); AAAS Fellow (2007); Croucher Senior Research Fellow (2008); IEEE Reliability Society Engineer of the Year (2010). Book: Handbook of Software Reliability Engineering.

Software Forensics Centre

SFC is primarily concerned with: symptoms of failure in software projects; patterns of failure; learning from failure in software projects; using methods for analysing complex systems to predict failure in software projects; recording of assumptions during projects; feedback, improvement and evolution; designing fault-tolerant software projects; incremental development and achieving retained value after failure of a software project; developing narrative methods for analysing failures; and improving the practice of managing complex projects.

The general aim of our work is to improve software development and management practice through empirical work. The Centre holds information and cases pertaining to a large volume of international failures.

Your Contribution: We welcome your contributions and suggestions.

PROMISE DATASETS PAGE.

Memory Failure Project

Overview: Our research focuses on the characteristics of memory hardware errors and their implications for software systems.

A plethora of research can be found on memory fault tolerance. Researchers often use accelerated tests in controlled environments to collect data. We have taken a different approach: field tests that monitor computers in real time and record the actual errors occurring on them. Memory errors can be categorized as soft errors and hard errors.

Project Members: Xin Li (PhD candidate, U of Rochester); Kai Shen (U of Rochester); Michael Huang (U of Rochester); Lingkun Chu.

We would also like to thank Tao Yang and Alex Wang, who have made this research and the data release possible.

Contact: Xin Li.

Los Alamos National Laboratory: Computer Science Research: HPC-5

To enable open computer science research, access to computer operational data is desperately needed. Data in the areas of failure, availability, usage, environment, performance, and workload characterization are some of the most desperately needed by computer science researchers. The following sets of data are provided under universal release to any computer science researcher to use to enable computer science work. All we ask is that, if you use these data in your research, you recognize Los Alamos National Laboratory for providing them.

The first set of data was made available in 2005 for times spanning 1995-2005; an update is being made available that adds 2005-09/2011 failure data. The page lists the computer systems from which operational data was drawn, system layout files for download, and System 20 job exit status information. We are happy to host these data to enable research.

USENIX - The Computer Failure Data Repository (CFDR)

Software Fault Tolerance: A Tutorial

Software Fault Tolerance

Behrooz Parhami

Page last updated on 2013 March 20.

Affiliation and Contact Information: Professor Behrooz Parhami, Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106-9560, USA
- Office location: Harold Frank Hall, Room 5155
- Deliveries: Harold Frank Hall, Room 4155
- Office phone: +1 805 893 3211
- Departmental fax: +1 805 893 3262

Three-Sentence Technical Biography: Behrooz Parhami (PhD, UCLA 1973) is Professor of Electrical and Computer Engineering, and former Associate Dean for Academic Personnel, College of Engineering, at University of California, Santa Barbara, where he teaches and does research in computer arithmetic, parallel processing, and dependable computing.

