
Jen Underwood on Twitter: "Data Prep for #AzureML w/#SSIS #analytics... Blogs. Microsoft Azure Machine Learning enables businesses to perform cloud-based predictive analytics to understand what their data means.


Machine learning algorithms learn from data. It is critical that you feed them the right data for the problem you want to solve. Such data can be business transactional data or sensitive business data that resides either on-premises or in the cloud. Even if you have good data, you need to make sure that it is in a useful scale and format. The more disciplined you are in handling your data, the better and more consistent the results you are likely to achieve.
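As a rough illustration of putting data into "a useful scale and format", the sketch below parses amounts stored as text and rescales them to the range 0 to 1 (min-max scaling). The function names and the sample values are hypothetical, and the assumption that a comma is a thousands separator would need checking against the real source data. This is plain Python, not an Azure Machine Learning module.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def parse_amount(text):
    """Convert text such as ' 2,500 ' to a float.

    Assumption: ',' is a thousands separator, not a decimal mark.
    """
    return float(text.strip().replace(",", ""))


# Hypothetical raw values as they might arrive from a transactional source.
raw = ["1,000", " 2,500 ", "4,000"]
amounts = [parse_amount(t) for t in raw]   # fix the format first
scaled = min_max_scale(amounts)            # then fix the scale
print(scaled)
```

Fixing the format before the scale matters here: the scaling step only makes sense once every value is a number.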

Select Data, Preprocess Data, Transform Data. Today, Azure Machine Learning Studio already provides ways for you to select data, preprocess data and transform data. Many businesses have multiple data sources within their IT infrastructure. Tenfold productivity and performance. It is relatively easy to write a query that retrieves data for a report, displays it in one of the various report controls, and then deploys the report to users.

But what happens when even the best-written query takes too long to run and slow report execution starts to irritate users? There is no cure-all for a poorly written query, but the following SSRS tips can improve the overall performance of your reports. Using snapshots to avoid bottlenecks. You can avoid the bottlenecks caused by excessively long-running reports by creating a report snapshot that runs on your system overnight or during off-peak hours.

Defining pagination to hide processing. Implementing filters for performance. Enabling drill-down for access to details. SSIS Community Tasks and Components - Home. Developing Integration Services Packages for High Performance. The process for designing SQL Server Integration Services (SSIS) packages is typically iterative.

You start by getting the components working individually or in small sets, then concentrate on ensuring that the components will work in the correct sequence. During later iterations, you add in more components or adjust properties to perform error handling. Then, in a final pass, you might add in abstractions, taking advantage of variables and expressions to enable runtime changes for your package. But your work is not yet complete at this stage. Before you put your package into production, you need to take some more time to review your package with an eye toward preventing, or at least mitigating, performance problems. Bear in mind that there are various factors that can affect the performance of SSIS packages. Understanding Control Flow Performance. Every SSIS package has at least one task in the control flow. Figure 1: Running executables in parallel to speed up control flow processing. SSIS - Faster, Simpler Alternatives to the SCD Transform - Benefic.

Many of the tables in your databases contain dimensional data – descriptive information about objects that can be grouped and organized at a higher level than an individual transaction.

And most of these dimension tables are sure to fit one of the definitions of a slowly changing dimension. When loading data into these tables using SSIS, you’ve likely used the slowly changing dimension (SCD) transform SSIS provides to handle at least a few of these data flow tasks. After all, you know you’re dealing with a SCD table and SSIS provides a SCD transform, so why not? Well, there are a few reasons why an alternative approach to maintaining your SCD tables may be to your benefit.

SSIS ETL: Handling Character Encodings and Conversions - Benefic. The Character Encoding Issue. You’ve been happily loading data into your data warehouse for years.

But suddenly your SSIS ETL processes have started to fail. You dig into the issue and find that there are some funny-looking text characters now arriving in your data. So now what? I often see this scenario occur for the first time in a company in the names of people or locations. The cause could be as simple as an upgrade to a system that your data warehouse sources its data from. Handling Extended Character Encodings. Support Them. Supporting the new character encodings is one way to solve the problem. A Framework for SSIS ETL Development - Benefic. The following is a brief generalized overview of a framework I developed on one of my projects for ETL processing using SQL Server Integration Services (SSIS).

It is meant to provide an organized, consistent, centrally configured and managed, and disaster-recovery and audit friendly environment in which SSIS ETL processes can be developed and executed. This SSIS framework utilizes a master package that all ETL processes are initiated from, a set of auxiliary packages that support the master package, a small set of tables that store ETL configuration, status, and execution log information, and a collection of data load packages that do the actual ETL processing.
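The division of labour described above (a master package, configuration and log tables, and separate data load packages) can be sketched loosely in Python. This is only an analogy for the control pattern, not SSIS code and not the author's framework: `run_master`, the configuration fields `name`, `order` and `enabled`, and the log fields are all hypothetical names chosen for illustration.

```python
from datetime import datetime, timezone


def run_master(config, loaders, log):
    """Run every enabled load in its configured order and record the outcome.

    config  : list of dicts standing in for a configuration table
    loaders : dict mapping a load name to a callable that returns a row count
    log     : list standing in for an execution log table
    """
    for entry in sorted(config, key=lambda e: e["order"]):
        if not entry["enabled"]:
            continue  # disabled centrally, so the load is skipped
        name = entry["name"]
        started = datetime.now(timezone.utc)
        try:
            rows = loaders[name]()
            status, detail = "Succeeded", f"{rows} rows"
        except Exception as exc:
            # A failed load is logged and the remaining loads still run.
            status, detail = "Failed", str(exc)
        log.append({"package": name, "status": status,
                    "detail": detail, "started": started})
    return log
```

Keeping the order and the enabled flag in configuration rather than in the loads themselves is what makes the setup centrally managed: a load can be resequenced or switched off without editing it.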

The Master Package. SSIS Reporting Pack - Home. Alternatives to SSIS SCD Wizard Component. SSIS comes with an out-of-box SCD Wizard to handle Type 1 and Type 2 Slowly Changing Dimensions (SCD) which is a fundamental ETL requirement.

However, the SCD Wizard component has some serious drawbacks, from both operational and functional perspectives, that make it unusable for practical purposes. A good summary of the shortcomings of the SCD Wizard component can be found here. Several workarounds have evolved over time, and in this post I would like to explore the different alternative options for handling Type 1 and Type 2 SCD without using the out-of-box SCD Wizard component.
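For readers unfamiliar with the two types, the following minimal sketch shows the logic any alternative has to reproduce: a Type 1 change overwrites the attribute in place, while a Type 2 change expires the current row and adds a new version. It works on an in-memory list of dictionaries rather than in SSIS, and the function name, the `valid_from`, `valid_to` and `is_current` columns, and the one-incoming-row-per-key assumption are all illustrative choices, not taken from the post.

```python
from datetime import date


def apply_scd(dimension, incoming, key, type1_cols, type2_cols, today):
    """Apply Type 1 and Type 2 changes to a dimension held as a list of dict rows.

    Assumes at most one incoming row per business key.
    """
    current = {row[key]: row for row in dimension if row["is_current"]}
    for src in incoming:
        row = current.get(src[key])
        if row is None:
            # New member: insert it as the current version.
            dimension.append({**src, "valid_from": today,
                              "valid_to": None, "is_current": True})
        elif any(row[c] != src[c] for c in type2_cols):
            # Type 2: expire the current row and keep history in a new row.
            row["valid_to"] = today
            row["is_current"] = False
            dimension.append({**src, "valid_from": today,
                              "valid_to": None, "is_current": True})
        elif any(row[c] != src[c] for c in type1_cols):
            # Type 1: overwrite in place, no history kept.
            # Simplification: only the current row is updated.
            for c in type1_cols:
                row[c] = src[c]
    return dimension


dim = [{"customer_id": 1, "city": "Lyon", "email": "a@example.com",
        "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]
# A changed city is tracked as Type 2 here, so a second row is added.
apply_scd(dim, [{"customer_id": 1, "city": "Paris", "email": "a@example.com"}],
          "customer_id", ["email"], ["city"], date(2024, 6, 1))
```

Which columns count as Type 1 and which as Type 2 is a modelling decision made per dimension; the sketch simply takes the two lists as parameters.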

One of the best open source components available out there is the SSIS Dimension Merge SCD Component (formerly known as the Kimball Method SSIS Slowly Changing Dimension Component). This component can be downloaded from CodePlex.