click below
click below
Normal Size Small Size show me how
DATABASE - 6
Data Warehousing and OLAP
| Question | Answer |
|---|---|
| is the process of collecting, integrating, storing and managing data from multiple sources in a central repository. It enables organizations to organize large volumes of current and historical data for efficient querying, analysis and reporting | Data warehousing |
| Key Characteristics of Data Warehousing | Subject-Oriented Integrated Time-variant Non-volatile |
| This means that all relevant data about a subject is gathered and stored as a single set in a useful format such as customers, products and sales. | Subject-Oriented |
| Data Warehousing Architecture | • Bottom Tier – Data Sources and ETL • Middle Tier – OLAP Server • Top Tier – Front-end Tools |
| Data Sources: ETL (Extract, Transform, Load) Process: Data Warehouse Database: Metadata Data Marts OLAP (Online Analytical Processing) Tools End-User Access Tools | Components of Data Warehouse |
| These are the various operational systems, databases and external data feeds that provide raw data to be stored in the warehouse. | Data Sources |
| is responsible for extracting data from different sources, transforming it into a suitable format and loading it into the data warehouse. | ETL (Extract, Transform, Load) Process: |
| This is the central repository were cleaned and transformed data is stored. It is typically organized in a multidimensional format for efficient querying and reporting. | Data Warehouse Database |
| Describes the structure, source and usage of data within the warehouse, making it easier for users and systems to understand and work with the data | Metadata |
| These are smaller, more focused data repositories derived from the data warehouse, designed to meet the needs of specific business departments or functions. | Data Marts |
| OLAP tools allow users to analyze data in multiple dimensions, providing deeper insights and supporting complex analytical queries. | OLAP (Online Analytical Processing) Tools: |
| These are reporting and analysis tools, such as dashboards or Business Intelligence (BI) tools, that enable business users to query the data warehouse and generate report | End-User Access Tools |
| Components of Star Schema: | Fact Table Dimension Tables Measures Attributes OLAP operations |
| summarizes detailed data into higher-level information. | Roll-Up Operation |
| breaks summarized data into more detailed information. | Drill-Down Operation |
| selects one specific dimension from the data cube | Slice Operation |
| selects multiple dimensions or conditions from the data cube. | Dice Operation |
| changes the orientation of data to view it from another perspective | Pivot Operation |
| is the process of dividing a large database table into smaller parts to improve performance, storage management, and faster data access. | Partitioning |
| Types of Partitioning : | Horizontal Partitioning Vertical Partitioning |
| Horizontal Partitioning divides a table by rows | Horizontal Partitioning |
| Vertical Partitioning divides a table by columns. | Vertical Partitioning |
| Common Partitioning Strategies : | Range Partitioning Hash Partitioning List Partitioning |
| Data is divided based on value ranges. | Range Partitioning |
| A hash function determines where data is stored | Hash Partitioning |
| Data is divided based on predefined categories. | List Partitioning |