Trennlinie
Navigation: >Startseite >Informatik >Business Intelligence >>Data Warehousing
Trennlinie
Data Warehousing - Design Methodologies for Data Warehouses (DWHs)
(Based on the thesis [Kon04])

This section introduces into the fundamental data warehousing objects and design concepts as well as multidimensional data models. In the area of analysis-orientated database systems and DWH design, there are various modelling concepts for representing multidimensional schemas, namely:

Each DWH schema is build on fact tables and dimension tables. The fact tables are the core tables in the middle of a dimension schema. A fact table is structured by numeric facts (variables or measures [BuG01, page 200]) and foreign keys to join dimension tables as displayed in figure 2. The facts are usually historical and quantitative business transactions for analysis such as turnover or sales volume. In general, fact tables provide numeric, additive fields and can been aggregated by arithmetical operations [ORA03, Data Warehousing Objects] and they are associated with multiple dimension tables. Dimension tables, also called "lookup or reference tables" [ORA03, DataWarehousing Objects], stores usually textual and qualitative values for specification the measures in the fact tables. In general, they are composed of a primary key and dimension attributes (see figure 2). Dimension attributes are organised in certain logical structures, which enables to drill down or up fact data along a defined dimension hierarchy "to view different levels of granularity" [ORA03, Data Warehousing Objects]. For instance, an example Time Dimension table could be contain attributes in the following hierarchical order: Year - Quarter - Month - Day (see figure 1).

Trennlinie

Drilling

Figure 1: Example for a dimension hierarchy and navigation along a dimension table.

Trennlinie

An overall DWH and also appropriate data marts are main parts of BI architectures. They can be implemented by different relational or multidimensional schemas as mentioned previously. In the following, there is a short outline of the common used and aforementioned multidimensional data schemas:

  • The star schema is a star-shaped representation of denormalised dimension tables around a fact table [BuG01, page 201] (see figure 2a)). In general, the design of dimension tables and fact tables are based on 1:n relationships. The denormalised star schema provides an optimised query performance.
  • The snowflake schema is an extended star schema that includes further normalised dimensions to prevent possible occurrence of update anomalies [BuG01, page 201] and to eliminate redundancy. In other words, the dimension tables are broken up into subdimension tables instead of one large table which causes a reduced query performance. Figure 2b) presents a graphical representation of a snowflake schema.

Trennlinie

BIPlatform

Figure 2: Variety of common schema models for data warehousing (derived from [BuG01, page 201]) - details.

Trennlinie

  • The starflake schema is a hybrid of the star and snowflake schema [BuG01, page 204] (see figure 2c) ). The starflake is composed of a central fact table and a set of constituent denormalised and normalised dimension tables. In contrast to the denormalised tables, the normalised tables are further broken up into subdimension tables.
  • The galaxy schema or “multiple fact tables schema” [BuG01, page 205] is composed of multiple fact tables, which are associated partially with the same dimension tables [BuG01, page 204] as shown in figure 3d).
Link Icon
Trennlinie

Download article: Design_Methodologies_for_Data_Warehouses.pdf

Recent Changes

Valid XHTML 1.0 Strict