Empirical Software Engineering using R: first draft available for download
A draft of my book Empirical Software Engineering using R is now available for download. The book essentially comes in two parts: statistical techniques that are useful for analyzing software engineering data.

Design of Experiments (DOE) Tutorial
Design of experiments (DOE) is a powerful tool that can be used in a variety of experimental situations. DOE allows multiple input factors to be manipulated to determine their effect on a desired output (response). By manipulating multiple inputs at the same time, DOE can identify important interactions that may be missed when experimenting with one factor at a time. Either all possible combinations can be investigated (full factorial) or only a portion of them (fractional factorial). Fractional factorials will not be discussed here.
When to Use DOE
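The full factorial idea from the DOE excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own method: the factor names (`temperature`, `pressure`), the coded levels and the response values are all hypothetical. It enumerates every combination of levels and then estimates the two main effects and the interaction effect that a one-factor-at-a-time experiment would miss.

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


def effect(contrast, responses):
    """Mean response where the contrast is +1 minus mean response where it is -1."""
    high = [y for c, y in zip(contrast, responses) if c > 0]
    low = [y for c, y in zip(contrast, responses) if c < 0]
    return sum(high) / len(high) - sum(low) / len(low)


# Hypothetical two-level factors, coded -1 (low) and +1 (high).
factors = {"temperature": [-1, 1], "pressure": [-1, 1]}
runs = full_factorial(factors)

# Hypothetical measured responses, one per run, in the same order as `runs`.
responses = [20.0, 30.0, 40.0, 70.0]

temperature_effect = effect([r["temperature"] for r in runs], responses)
pressure_effect = effect([r["pressure"] for r in runs], responses)
# The interaction contrast is the product of the two coded factor columns.
interaction_effect = effect(
    [r["temperature"] * r["pressure"] for r in runs], responses
)

for run, y in zip(runs, responses):
    print(run, y)
print(temperature_effect, pressure_effect, interaction_effect)
```

With two factors at two levels each, the design has four runs. A nonzero interaction effect means the effect of one factor depends on the level of the other.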
Basic Steps of Applying Reliability Centered Maintenance (RCM) Part II
Although there is a great deal of variation in the application of Reliability Centered Maintenance (RCM), most procedures include some or all of the seven steps shown below:
1. Prepare for the Analysis
2. Select the Equipment to Be Analyzed
3. Identify Functions
4. Identify Functional Failures
5. Identify and Evaluate (Categorize) the Effects of Failure
6. Identify the Causes of Failure
7. Select Maintenance Tasks
If we were to group the seven steps into three major blocks, these blocks would be:
DEFINE (Steps 1, 2 and 3)
ANALYZE (Steps 4, 5 and 6)
ACT (Step 7)
The previous issue of Reliability HotWire discussed the DEFINE stage.
The “Ten Simple Rules for Reproducible Computational Research” are easy to reach for R users
“Ten Simple Rules for Reproducible Computational Research” is a freely available paper in PLOS Computational Biology. As I’m currently very interested in the subject of reproducible data analysis, I will review these ten rules and their possible implementation in R from my point of view as an epidemiologist interested in healthcare data reuse. I will also check whether my workflow complies with these rules. For those who are in a hurry, I have summarised these rules and their possible implementation in R in a table at the end of this post.

Exploratory data analysis
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments. EDA is different from initial data analysis (IDA), which focuses more narrowly on checking assumptions required for model fitting and hypothesis testing, and handling missing values and making transformations of variables as needed.
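One simple way of summarizing a data set's main characteristics, in the spirit of the EDA excerpt above, is a five-number summary (minimum, lower hinge, median, upper hinge, maximum). The Python sketch below is illustrative only. The function name `five_number_summary` and the sample data are hypothetical, and the hinges are computed as the medians of the lower and upper halves, with the overall median included in both halves when the number of observations is odd. Other quartile conventions give slightly different values.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) for a sequence of numbers."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("five_number_summary requires at least one value")
    # Size of each half; the middle value is shared by both halves when n is odd.
    half = (n + 1) // 2
    lower_half = data[:half]
    upper_half = data[n - half:]
    return (
        data[0],
        median(lower_half),
        median(data),
        median(upper_half),
        data[-1],
    )


# Hypothetical sample data.
sample = [7, 1, 9, 3, 5, 8, 2, 6, 4]
print(five_number_summary(sample))
```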
Reliability Centered Maintenance (RCM): An Overview of Basic Concepts
Reliability Centered Maintenance (RCM) analysis provides a structured framework for analyzing the functions and potential failures for a physical asset (such as an airplane, a manufacturing production line, etc.) with a focus on preserving system functions, rather than preserving equipment. RCM is used to develop scheduled maintenance plans that will provide an acceptable level of operability, with an acceptable level of risk, in an efficient and cost-effective manner. According to the SAE JA1011 standard, which describes the minimum criteria that a process must comply with to be called "RCM," a Reliability Centered Maintenance Process answers the following seven questions:
What are the functions and associated desired standards of performance of the asset in its present operating context (functions)?
In what ways can it fail to fulfill its functions (functional failures)?
Version Control, File Sharing, and Collaboration Using GitHub and RStudio
This is Part 3 of our “Getting Started with R Programming” series. For previous articles in the series, click here: Part 1, Part 2. This week, we are going to talk about using git and GitHub with RStudio to manage your projects. Git is a version control system, originally designed to help software developers work together on big projects.
What Can Classical Chinese Poetry Teach Us About Graphical Analysis? - Statistics and Quality Data Analysis
A famous classical Chinese poem from the Song dynasty describes the views of a mist-covered mountain called Lushan. The poem was inscribed on the wall of a Buddhist monastery by Su Shi, a renowned poet, artist, and calligrapher of the 11th century. Deceptively simple, the poem captures the illusory nature of human perception.
Written on the Wall of West Forest Temple
--Su Shi
From the side, it's a mountain ridge.
Looking up, it's a single peak.

Introduction to the Temperature-Humidity Relationship
When performing accelerated life testing analysis, a life distribution and a life-stress relationship are required. The temperature-humidity (T-H) relationship, a variation of the Eyring relationship, has been proposed for predicting the life at use conditions when temperature and humidity are the accelerated stresses in a test. This combination model is given by: where:
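A commonly cited form of the T-H relationship is L(V, U) = A * exp(phi / V + b / U), where V is temperature in absolute units, U is relative humidity, and A, phi and b are model parameters estimated from test data. The Python sketch below assumes that form. The function names and all numeric values are hypothetical and chosen only for illustration. It computes the predicted life at a given condition and the acceleration factor between a use condition and an accelerated test condition.

```python
import math


def th_life(temperature_k, relative_humidity, a, phi, b):
    """Predicted life under the assumed T-H form L = A * exp(phi / V + b / U)."""
    return a * math.exp(phi / temperature_k + b / relative_humidity)


def th_acceleration_factor(use, stress, phi, b):
    """Ratio of life at the use condition to life at the accelerated condition.

    `use` and `stress` are (temperature in kelvin, relative humidity) pairs.
    The constant A cancels in the ratio, so it is not needed here.
    """
    v_use, u_use = use
    v_stress, u_stress = stress
    return math.exp(
        phi * (1.0 / v_use - 1.0 / v_stress) + b * (1.0 / u_use - 1.0 / u_stress)
    )


# Hypothetical parameter values and conditions, for illustration only.
A, PHI, B = 1.0e-3, 5000.0, 0.3
use_condition = (313.15, 0.50)     # 40 degrees C, 50% relative humidity
stress_condition = (358.15, 0.85)  # 85 degrees C, 85% relative humidity

life_at_use = th_life(*use_condition, A, PHI, B)
life_at_stress = th_life(*stress_condition, A, PHI, B)
af = th_acceleration_factor(use_condition, stress_condition, PHI, B)
print(life_at_use, life_at_stress, af)
```

Under this form, with positive phi and b, raising either temperature or humidity shortens the predicted life, so the acceleration factor is greater than 1 when the test condition is harsher than the use condition.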
How to combine multiple CSV files into one using CMD
This is a trick that can save you a lot of time when working with a dataset spread across multiple CSV files. Using a simple CMD command, it is possible to combine all the CSVs into a single file ready for all your pivot and table wizardry.
Step 1
Save all of the CSV files into a single folder. Make sure that the folder is free from any CSVs you do not want included in the combined file.
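As an alternative to a CMD command, the same merge can be sketched in Python. This is not the article's method. The function name `combine_csv_files` is hypothetical, and the sketch assumes every file in the folder shares the same header row, which is written only once. The output file is skipped if it already sits in the same folder, so rerunning the script does not merge the result into itself.

```python
import csv
from pathlib import Path


def combine_csv_files(folder, output_path):
    """Concatenate every .csv in `folder` into `output_path`, keeping one header row.

    Returns the number of data rows written (the header is not counted).
    """
    output_path = Path(output_path).resolve()
    # Collect the inputs before opening the output, and never read the output itself.
    sources = sorted(
        p for p in Path(folder).glob("*.csv") if p.resolve() != output_path
    )
    rows_written = 0
    header_written = False
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for source in sources:
            with open(source, newline="") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to copy
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as `combine_csv_files("data", "data/combined.csv")` would merge every CSV in a hypothetical `data` folder. Files are processed in sorted filename order.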