This is the second course in the Data Warehousing for Business Intelligence specialization. Ideally, the courses should be taken in sequence.
In this course, you will learn exciting concepts and skills for designing data warehouses and creating data integration workflows. These are fundamental skills for data warehouse developers and administrators. You will gain hands-on experience with data warehouse design and use open source products for manipulating pivot tables and creating data integration workflows. In the data integration assignment, you can use an Oracle, MySQL, or PostgreSQL database. You will also gain conceptual background about maturity models, architectures, multidimensional models, and management practices, providing an organizational perspective about data warehouse development. If you are currently a business or information technology professional and want to become a data warehouse designer or administrator, this course will give you the knowledge and skills to do that. By the end of the course, you will have the design experience, software background, and organizational context that prepare you to succeed with data warehouse development projects.
In this course, you will create data warehouse designs and data integration workflows that satisfy the business intelligence needs of organizations. When you’re done with this course, you’ll be able to:
* Evaluate an organization for data warehouse maturity and business architecture alignment;
* Create a data warehouse design and reflect on alternative design methodologies and design goals;
* Create data integration workflows using prominent open source software;
* Reflect on the role of change data, refresh constraints, refresh frequency trade-offs, and data quality goals in data integration process design; and
* Perform operations on pivot tables to satisfy typical business analysis requests using prominent open source software.
Data Warehouse Concepts and Architectures
-Module 1 introduces the course and covers concepts that provide a context for the remainder of this course. In the first two lessons, you’ll understand the objectives for the course and know what topics and assignments to expect. In the remaining lessons, you will learn about historical reasons for the development of data warehouse technology, learning effects, business architectures, maturity models, project management issues, market trends, and employment opportunities. This informational module will ensure that you have the background for success in later modules that emphasize details and hands-on skills. You should also read about the software requirements in the lesson at the end of module 1. I recommend that you try to install the software this week before assignments begin in week 2.
Multidimensional Data Representation and Manipulation
-Now that you have the informational context for data warehouse development, you’ll start using data warehouse tools! In module 2, you will learn about the multidimensional representation of a data warehouse used by business analysts. You’ll apply what you’ve learned in practice and graded problems using WebPivotTable or Pivot4J, open source tools for manipulating pivot tables. At the end of this module, you will have a solid background to communicate with and assist business analysts who use a multidimensional representation of a data warehouse. After completing this module, you should proceed to module 3 to complete an assignment and quiz with either WebPivotTable or Pivot4J. Because Pivot4J can be difficult to install, I recommend completing the assignment and quiz using WebPivotTable.
Multidimensional Data Representation and Manipulation: Lesson Choices
-Choices 1 and 2: If completing the WebPivotTable assignment (choice 1), you should also complete the WebPivotTable quiz (choice 2). | Choices 3 and 4: If completing the Pivot4J assignment (choice 3), you should also complete the Pivot4J quiz (choice 4). Due to potential difficulty with installing Pivot4J, I recommend that you complete the WebPivotTable assignment and quiz.
Data Warehouse Design Practices and Methodologies
-This module emphasizes data warehouse design skills. Now that you understand the multidimensional representation used by business analysts, you are ready to learn about data warehouse design using a relational database. In practice, the multidimensional representation used by business analysts must be derived from a data warehouse design using a relational DBMS. You will learn about design patterns, summarizability problems, and design methodologies. You will apply these concepts to mini case studies about data warehouse design. At the end of the module, you will have created data warehouse designs based on data sources and business needs of hypothetical organizations.
Data Integration Concepts, Processes, and Techniques
-Module 4 extends your background about data warehouse development. After learning about schema design concepts and practices, you are ready to learn about data integration processing to populate and refresh a data warehouse. The informational background in module 4 covers concepts about data sources, data integration processes, and techniques for pattern matching and inexact matching of text. Module 4 provides a context for the software skills that you will learn in module 5.
Architectures, Features, and Details of Data Integration Tools
-Module 5 extends your background about data integration from module 4. Module 5 covers architectures, features, and details about data integration tools to complement the conceptual background in module 4. You will learn about the features of two open source data integration tools, Talend Open Studio and Pentaho Data Integration. You will use Pentaho Data Integration in a guided tutorial in preparation for a graded assignment involving Pentaho Data Integration. For the tutorial and assignment, you need to connect to a database server: Oracle, MySQL, or PostgreSQL. See module 1 for installation instructions for Pentaho Data Integration and these database servers.