
Data warehouse
Data Warehouse Overview
In computing, a data warehouse (DW, DWH), or an enterprise data warehouse (EDW), is a database used for reporting and data analysis. Integrating data from one or more disparate sources creates a central repository of data: the data warehouse. Data warehouses store current and historical data and are used to create trending reports for senior management, such as annual and quarterly comparisons. The data stored in the warehouse is uploaded from operational systems (such as marketing and sales). The data may pass through an operational data store for additional operations before it is used in the DW for reporting. A data warehouse built directly on integrated data source systems does not require ETL, staging databases, or operational data store databases. A data mart is a small data warehouse focused on a specific area of interest. This definition of the data warehouse focuses on data storage.
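The excerpt above describes the typical flow from operational systems, optionally through an operational data store, into the warehouse. As a minimal sketch of one extract-transform-load step (my illustration, not from the source; the databases, tables, and column names are invented, using Python's built-in sqlite3 module):

```python
import sqlite3

# Hypothetical databases: "ops" stands in for an operational system,
# "dw" for the warehouse. In-memory so the sketch is self-contained.
ops = sqlite3.connect(":memory:")
dw = sqlite3.connect(":memory:")

# Seed the operational side with a couple of invented orders.
ops.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, "
            "amount_cents INTEGER, order_date TEXT)")
ops.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "south", 1250, "2024-01-15"),
    (2, "north", 4990, "2024-01-15"),
])

dw.execute("CREATE TABLE fact_sales (order_id INTEGER, region TEXT, "
           "amount_usd REAL, order_date TEXT)")

# Extract: pull rows from the operational system.
rows = ops.execute(
    "SELECT order_id, region, amount_cents, order_date FROM orders"
).fetchall()

# Transform: normalize units (cents -> dollars), standardize region names.
cleaned = [(oid, region.upper(), cents / 100.0, date)
           for oid, region, cents, date in rows]

# Load: append into the warehouse fact table used for reporting.
dw.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", cleaned)
dw.commit()
print(dw.execute("SELECT * FROM fact_sales").fetchall())
```

In a real pipeline the extract and load sides would be separate systems and the transform step far richer, but the shape is the same: copy out of the operational store, clean and conform, append to the warehouse.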

http://en.wikipedia.org/wiki/Data_warehouse


Data Warehousing Concepts
This chapter provides an overview of the Oracle data warehousing implementation. Note that this book is meant as a supplement to standard texts about data warehousing: it focuses on Oracle-specific material and does not reproduce material of a general nature in detail.

Data lake
A data lake is a large storage repository and processing engine that provides "massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs".[1] The term was coined by James Dixon, Pentaho chief technology officer.[2] Dixon used the term initially to contrast with "data mart", a smaller repository of interesting attributes extracted from the raw data. He wrote: "If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples."[3] Dixon argued that data marts have several inherent problems and that data lakes are the optimal solution.

Using VLOOKUP in Excel
VLOOKUP is one of Excel's most useful functions, and it's also one of the least understood. In this article, we demystify VLOOKUP by way of a real-life example: we'll create a usable invoice template for a fictitious company. So what is VLOOKUP? Well, of course it's an Excel function. This article assumes that the reader already has a passing understanding of Excel functions and can use basic functions such as SUM, AVERAGE, and TODAY.
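For readers who want the shape of the function before the walkthrough: VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]) scans the first column of a table for a value and returns the cell from a chosen column of the matching row. Here is a minimal Python sketch of the same exact-match behavior (my illustration, not from the article; the price table is invented):

```python
# A tiny price table standing in for an Excel range;
# the first column is the lookup key.
price_table = [
    ["A001", "Widget", 2.50],
    ["A002", "Gadget", 4.75],
    ["A003", "Gizmo",  9.99],
]

def vlookup(lookup_value, table, col_index):
    """Exact-match lookup: find lookup_value in the first column,
    return the cell in column col_index (1-based, as in Excel)."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return "#N/A"  # Excel's error value for a missing key

# Roughly equivalent to =VLOOKUP("A002", price_table, 3, FALSE)
print(vlookup("A002", price_table, 3))  # 4.75
```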

Data mining
Data mining is the computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.[1] It is an interdisciplinary subfield of computer science.[1][2][3] The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.[1] Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[1] Data mining is the analysis step of the "knowledge discovery in databases" (KDD) process.[4]

Etymology
In the 1960s, statisticians used terms like "data fishing" or "data dredging" to refer to what they considered the bad practice of analyzing data without an a priori hypothesis.
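To make "discovering patterns" concrete, here is a small self-contained sketch (my illustration, not from the excerpt; the transactions are invented) of one classic mining task, counting which pairs of items frequently occur together:

```python
from collections import Counter
from itertools import combinations

# Invented shopping baskets for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

# Count how often each pair of items co-occurs in a basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# "Frequent" pairs: those appearing in at least 3 baskets.
frequent = {pair: n for pair, n in pair_counts.items() if n >= 3}
print(frequent)  # {('bread', 'butter'): 3}
```

Real frequent-pattern miners (e.g. Apriori) prune the search space rather than enumerating all pairs, but the underlying idea of extracting an understandable structure (here, a co-occurrence rule) from raw records is the same.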

JasperReports
JasperReports is an open source reporting library that can be embedded into any Java application. It can be used in Java-enabled applications, including Java EE or web applications, to generate dynamic content, and it reads its instructions from an XML or .jasper file. JasperReports is part of the Lisog open source stack initiative.

Business intelligence
Business intelligence (BI) is the set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes. BI technologies are capable of handling large amounts of unstructured data to help identify, develop, and otherwise create new strategic business opportunities. The goal of BI is to allow for the easy interpretation of these large volumes of data. Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability.[1] BI technologies provide historical, current, and predictive views of business operations. Common functions of business intelligence technologies are reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics.

Spark for Data Padawans Episode 3: Spark vs MapReduce
After learning about Hadoop and distributed data storage, and what exactly Spark is, in the previous episodes, it's time to dig a little deeper to understand why, even if Spark is great, it isn't necessarily a miracle solution to all your data processing issues. It's time for Spark for super beginners, episode 3! As always, I try to keep these articles as easy to understand as possible, but if you really are a super data padawan, you should probably have a quick look at episode 1 and episode 2 first to understand what I'm talking about. You can always go back to a previous episode later.
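For readers who have never seen Spark code, here is a minimal word-count sketch in PySpark (my illustration, not from the series; the input lines are invented). The whole job is a chain of transformations kept in memory where possible, in contrast with MapReduce, which writes intermediate results to disk between jobs:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("padawan-wordcount").getOrCreate()

# Hypothetical input; a real job would read from a file or HDFS.
lines = spark.sparkContext.parallelize([
    "spark keeps intermediate data in memory",
    "mapreduce writes intermediate data to disk",
])

# flatMap -> map -> reduceByKey mirrors the classic map/shuffle/reduce
# cycle, but as lazy transformations on a distributed dataset.
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

print(counts.collect())  # e.g. [('spark', 1), ('data', 2), ...]
spark.stop()
```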

Backdoor webserver using MySQL SQL Injection
By David Maman, GreenSQL CTO. MySQL is a great database product used by thousands of websites, and various web applications use MySQL as their default database. Some of these applications are written with security in mind, and some are not. In this article, I would like to show you how SQL injection can be exploited to gain almost full control over your web server. Most people know that SQL injection allows attackers to retrieve database records, bypass login screens, and change database content through the creation of new administrative users.

Entity–attribute–value model
Entity–attribute–value model (EAV) is a data model for describing entities where the number of attributes (properties, parameters) that could be used to describe them is potentially vast, but the number that will actually apply to a given entity is relatively modest. In mathematics, this model is known as a sparse matrix. EAV is also known as the object–attribute–value model, vertical database model, and open schema. There are certain cases where an EAV schema is an optimal approach to data modelling for a problem domain. However, in many cases where data can be modelled in static, relational terms, an EAV-based approach is an anti-pattern that can lead to longer development times, poor use of database resources, and more complex queries when compared to a relationally modelled data schema.
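To make the EAV structure concrete, here is a small sketch (my illustration, not from the excerpt; the schema and data are invented) using Python's built-in sqlite3 module: one row per entity-attribute-value triple. It also uses parameterized placeholders throughout, which happens to be the standard defense against the SQL injection discussed above:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Classic EAV layout: one row per (entity, attribute, value) triple.
db.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")

# Patient 1 has two attributes recorded, patient 2 only one:
# exactly the sparse situation EAV is designed for.
triples = [
    (1, "blood_pressure", "120/80"),
    (1, "allergy", "penicillin"),
    (2, "blood_pressure", "135/90"),
]
# The ? placeholders keep user-supplied values out of the SQL text,
# which is what prevents injection.
db.executemany("INSERT INTO eav VALUES (?, ?, ?)", triples)

# Fetch all recorded attributes for one entity.
for attribute, value in db.execute(
        "SELECT attribute, value FROM eav WHERE entity_id = ?", (1,)):
    print(attribute, value)
```

Note how reassembling a full record requires one row per attribute (or a pivot query), which is the query-complexity cost the excerpt warns about when EAV is used where a plain relational schema would do.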

CO-4 grass used as fodder increases milk yield considerably
In Kerala, where 60 per cent of the milk requirement is met by procurement from other states such as Tamil Nadu, Karnataka, and Maharashtra, cattle rearing is fast declining due to the high cost of production, labour shortage, and shrinking land. Heavy dependence on other states for raw materials pushes up the cost of concentrate feeds. "Dry straw (hay) used to feed cattle has become scarce due to decline in area under rice cultivation. It becomes a dire necessity for dairy farmers to start growing green fodder (grass) if they desire to run their unit profitably," says Dr. S. Prabhu Kumar, Zonal Project Director, ICAR, Zonal Project Directorate, Bangalore.
