SEMATECH e-Handbook of Statistical Methods

Principal Component Analysis Often, it is not helpful or informative to look only at the pairwise correlations or covariances among all the variables in a dataset. A preferable approach is to derive new variables from the original ones that preserve most of the information carried by their variances. Principal component analysis is a widely used statistical method for reducing data with many dimensions (variables) by projecting the data onto fewer dimensions using linear combinations of the variables, known as principal components.
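As a rough illustration of the idea, the sketch below computes the principal components of a small two-variable dataset using only the Python standard library. The data values and the function names (`pca_2d`, `project`) are made up for this example and do not come from the linked article; for two variables the eigen-decomposition of the covariance matrix has a closed form, which keeps the code short. Real analyses would normally rely on a numerical library.

```python
import math

def pca_2d(points):
    """PCA for two variables.

    Returns (mean, eigenvalues, eigenvectors), with components sorted by
    decreasing variance. Uses the closed-form eigen-decomposition of the
    symmetric 2x2 sample covariance matrix.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix
    half_trace = (sxx + syy) / 2
    root = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + root, half_trace - root)

    # Direction of largest variance; the second component is orthogonal to it
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return (mean_x, mean_y), eigenvalues, (v1, v2)

def project(points, mean, direction):
    """Scores of each point on one principal component."""
    return [(p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]
            for p in points]

# Illustrative (made-up) data: two strongly correlated variables
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

mean, eigenvalues, eigenvectors = pca_2d(data)
explained = eigenvalues[0] / sum(eigenvalues)
scores = project(data, mean, eigenvectors[0])
print(f"Share of variance kept by the first component: {explained:.3f}")
```

Here the single column of `scores` replaces the two original variables while keeping most of their total variance, which is the dimension reduction the excerpt describes.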

Design of Experiments (DOE) Tutorial Design of experiments (DOE) is a powerful tool that can be used in a variety of experimental situations. DOE allows multiple input factors to be manipulated to determine their effect on a desired output (response). By manipulating multiple inputs at the same time, DOE can identify important interactions that may be missed when experimenting with one factor at a time. Either all possible combinations can be investigated (full factorial) or only a portion of them (fractional factorial). Fractional factorials will not be discussed here.
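To make the full factorial idea concrete, here is a minimal sketch that enumerates every combination of factor levels and estimates each factor's main effect. The factor names, the coded levels (-1 and +1), and the response function are all hypothetical, chosen only so the example runs; they are not taken from the tutorial.

```python
from itertools import product
from statistics import mean

def full_factorial(factors):
    """Return one run (a dict of factor -> level) per combination of levels."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[name] for name in names))]

def main_effect(runs, responses, factor, low, high):
    """Average response at the high level minus average at the low level."""
    at_high = [y for run, y in zip(runs, responses) if run[factor] == high]
    at_low = [y for run, y in zip(runs, responses) if run[factor] == low]
    return mean(at_high) - mean(at_low)

# Hypothetical two-level factors in coded units (-1 = low, +1 = high)
factors = {"temperature": (-1, 1), "pressure": (-1, 1), "time": (-1, 1)}
runs = full_factorial(factors)  # 2 * 2 * 2 = 8 runs

def simulated_response(run):
    """Made-up response with a temperature-pressure interaction."""
    t, p = run["temperature"], run["pressure"]
    return 50 + 4 * t + 2 * p + 3 * t * p

responses = [simulated_response(run) for run in runs]
effects = {name: main_effect(runs, responses, name, -1, 1) for name in factors}
print(len(runs), effects)
```

Because every combination is run, the effect of temperature can be compared at each pressure level, which is how a factorial design exposes an interaction that one-factor-at-a-time experiments could miss.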

Basic Steps of Applying Reliability Centered Maintenance (RCM) Part II Although there is a great deal of variation in the application of Reliability Centered Maintenance (RCM), most procedures include some or all of the following seven steps: (1) Prepare for the Analysis; (2) Select the Equipment to Be Analyzed; (3) Identify Functions; (4) Identify Functional Failures; (5) Identify and Evaluate (Categorize) the Effects of Failure; (6) Identify the Causes of Failure; (7) Select Maintenance Tasks. If we were to group the seven steps into three major blocks, these blocks would be: DEFINE (Steps 1, 2 and 3), ANALYZE (Steps 4, 5 and 6) and ACT (Step 7). The previous issue of Reliability HotWire discussed the DEFINE stage.

Implementation of a basic reproducible data analysis workflow In a previous post, I described the principles of my basic reproducible data analysis workflow. Today, let’s be more practical and see how to implement it. Note that this is a basic workflow. The goal is to find a good balance between a minimal reproducible analysis and ease of deployment on any platform. This workflow will allow you to run a complete analysis based on multiple files (data files, R scripts, Rmd files…) just by launching a single R script. There are 3 main components in this workflow:

Exploratory data analysis In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments. EDA is different from initial data analysis (IDA),[1] which focuses more narrowly on checking assumptions required for model fitting and hypothesis testing, and handling missing values and making transformations of variables as needed.
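A small example of the kind of summary EDA starts from is Tukey's five-number summary together with his 1.5 × IQR rule for flagging unusual points. The sketch below uses only the Python standard library; the measurements and the function names are invented for illustration, and the "inclusive" quartile method is just one of several common conventions.

```python
from statistics import quantiles

def five_number_summary(data):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": min(data), "q1": q1, "median": median,
            "q3": q3, "max": max(data)}

def tukey_outliers(data, k=1.5):
    """Values lying more than k * IQR outside the quartiles."""
    summary = five_number_summary(data)
    iqr = summary["q3"] - summary["q1"]
    lower = summary["q1"] - k * iqr
    upper = summary["q3"] + k * iqr
    return [x for x in data if x < lower or x > upper]

# Made-up measurements with one suspicious value
measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(five_number_summary(measurements))
print(tukey_outliers(measurements))
```

These are the numbers a box plot draws, so the same summary can be read off visually once the data are plotted.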

Reliability Centered Maintenance (RCM) An Overview of Basic Concepts Reliability Centered Maintenance (RCM) analysis provides a structured framework for analyzing the functions and potential failures for a physical asset (such as an airplane, a manufacturing production line, etc.) with a focus on preserving system functions, rather than preserving equipment. RCM is used to develop scheduled maintenance plans that will provide an acceptable level of operability, with an acceptable level of risk, in an efficient and cost-effective manner. According to the SAE JA1011 standard, which describes the minimum criteria that a process must comply with to be called "RCM," a Reliability Centered Maintenance Process answers the following seven questions: What are the functions and associated desired standards of performance of the asset in its present operating context (functions)? In what ways can it fail to fulfill its functions (functional failures)?

Empirical Software Engineering using R: first draft available for download A draft of my book Empirical Software Engineering using R is now available for download. The book essentially comes in two parts: statistical techniques that are useful for analyzing software engineering data.

What Can Classical Chinese Poetry Teach Us About Graphical Analysis? - Statistics and Quality Data Analysis A famous classical Chinese poem from the Song dynasty describes the views of a mist-covered mountain called Lushan. The poem was inscribed on the wall of a Buddhist monastery by Su Shi, a renowned poet, artist, and calligrapher of the 11th century. Deceptively simple, the poem captures the illusory nature of human perception. Written on the Wall of West Forest Temple --Su Shi From the side, it's a mountain ridge. Looking up, it's a single peak.

Introduction to the Temperature-Humidity Relationship When performing accelerated life testing analysis, a life distribution and a life-stress relationship are required. The temperature-humidity (T-H) relationship, a variation of the Eyring relationship, has been proposed for predicting the life at use conditions when temperature and humidity are the accelerated stresses in a test. This combination model is given by: ... where: ...

The “Ten Simple Rules for Reproducible Computational Research” are easy to reach for R users “Ten Simple Rules for Reproducible Computational Research” is a freely available paper in PLOS Computational Biology. As I’m currently very interested in the subject of reproducible data analysis, I will go through these ten rules and their possible implementation in R from my point of view as an epidemiologist interested in healthcare data reuse. I will also check whether my workflow complies with these rules. For those who are in a hurry, I have summarised these rules and their possible implementation in R in a table at the end of this post.