Scenario:
What is a Star Schema? Which schema is preferable from a performance standpoint? Why?
Solution:
A Star Schema is composed of two kinds of tables: one Fact Table and multiple Dimension Tables.
It is called a star schema because the entity-relationship diagram between the dimension and fact tables resembles a star, with one fact table connected to multiple dimensions. The center of the star is a large fact table, which points towards the dimension tables. The advantages of a star schema are simple slicing of data, better query performance, and data that is easy to understand.
The Fact Table contains the actual transactions or values that are being analyzed.
Dimension Tables contain descriptive information about those transactions or values.
In a Star Schema, Dimension Tables are denormalized and Fact Tables are highly normalized.
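As a minimal sketch of this layout, the snippet below builds one fact table and two dimension tables in an in-memory SQLite database. The table names, column names, and sample rows (`fact_sales`, `dim_product`, `dim_location`, and so on) are assumptions made for illustration only; they are not taken from any particular warehouse.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, kept denormalized.
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    branch_name  TEXT,
    region_name  TEXT
);
-- Fact table: foreign keys to each dimension plus the measure.
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    sales_dollar REAL
);
INSERT INTO dim_product  VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_location VALUES (1, 'Downtown', 'East'), (2, 'Airport', 'West');
INSERT INTO fact_sales   VALUES (1, 1, 1000.0), (1, 2, 1500.0), (2, 1, 300.0);
""")

# Slice the measure by a descriptive attribute from one dimension.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_dollar)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(sales_by_category)  # [('Electronics', 2500.0), ('Furniture', 300.0)]
```

The fact table holds only keys and the measure, while every descriptive column sits in a dimension table one join away.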
Star Schema
A Star Schema is preferable because queries need fewer joins, which improves performance.
Because Dimension Tables are denormalized, a query reaches every descriptive attribute with a single join per dimension and does not have to join through further tables.
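The difference in join count can be illustrated with a small comparison. This is only a sketch: the `star_product`, `snow_product`, and `snow_category` tables and their rows are hypothetical, and the snippet counts joins rather than measuring speed, since actual performance depends on the database engine, indexes, and data volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (product_key INTEGER, sales_dollar REAL);

-- Star layout: category lives in the product dimension itself.
CREATE TABLE star_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);

-- Snowflake layout: category is split out into a parent table.
CREATE TABLE snow_category (category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE snow_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category_key INTEGER);

INSERT INTO fact_sales    VALUES (1, 1000.0), (2, 300.0), (1, 500.0);
INSERT INTO star_product  VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO snow_category VALUES (10, 'Electronics'), (20, 'Furniture');
INSERT INTO snow_product  VALUES (1, 'Laptop', 10), (2, 'Desk', 20);
""")

# Star: one join from the fact table to the dimension.
star_query = """
    SELECT p.category, SUM(f.sales_dollar)
    FROM fact_sales AS f
    JOIN star_product AS p ON p.product_key = f.product_key
    GROUP BY p.category ORDER BY p.category
"""

# Snowflake: a second join is needed to reach the parent table.
snowflake_query = """
    SELECT c.category, SUM(f.sales_dollar)
    FROM fact_sales AS f
    JOIN snow_product AS p ON p.product_key = f.product_key
    JOIN snow_category AS c ON c.category_key = p.category_key
    GROUP BY c.category ORDER BY c.category
"""

star_result = conn.execute(star_query).fetchall()
snowflake_result = conn.execute(snowflake_query).fetchall()

print(star_result == snowflake_result)                            # True
print(star_query.count("JOIN"), snowflake_query.count("JOIN"))    # 1 2
```

Both queries return the same totals; the star version simply gets there with one join instead of two.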
Steps in designing Star Schema
• Identify a business process for analysis (like sales).
• Identify measures or facts (sales dollar).
• Identify dimensions for facts (product dimension, location dimension, time dimension, organization dimension).
• List the columns that describe each dimension (region name, branch name).
• Determine the lowest level of summary in a fact table (sales dollar).
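The steps above can be sketched as a small design record that is then turned into table definitions. Everything here is an assumption for illustration: the `Dimension` and `StarDesign` classes, the column names, and the chosen grain are hypothetical, and real designs need proper data types and constraints.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Dimension:
    name: str            # e.g. "product"
    columns: List[str]   # descriptive columns for this dimension


@dataclass
class StarDesign:
    business_process: str        # step 1: what is being analyzed
    measures: List[str]          # step 2: the facts
    dimensions: List[Dimension]  # steps 3 and 4: dimensions and their columns
    grain: str                   # step 5: lowest level of summary (documentation only)

    def ddl(self) -> List[str]:
        """Return CREATE TABLE statements: one per dimension, then the fact table."""
        statements = []
        for dim in self.dimensions:
            cols = ", ".join(f"{c} TEXT" for c in dim.columns)
            statements.append(
                f"CREATE TABLE dim_{dim.name} "
                f"({dim.name}_key INTEGER PRIMARY KEY, {cols})"
            )
        keys = ", ".join(
            f"{d.name}_key INTEGER REFERENCES dim_{d.name}({d.name}_key)"
            for d in self.dimensions
        )
        measures = ", ".join(f"{m} REAL" for m in self.measures)
        statements.append(
            f"CREATE TABLE fact_{self.business_process} ({keys}, {measures})"
        )
        return statements


design = StarDesign(
    business_process="sales",
    measures=["sales_dollar"],
    dimensions=[
        Dimension("product", ["product_name", "category"]),
        Dimension("location", ["branch_name", "region_name"]),
        Dimension("time", ["sale_date", "sale_month", "sale_year"]),
    ],
    grain="one row per product, location, and day (assumed)",
)

# Run the generated statements against SQLite to check that they are valid.
conn = sqlite3.connect(":memory:")
statements = design.ddl()
for statement in statements:
    conn.execute(statement)

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
print(tables)  # ['dim_location', 'dim_product', 'dim_time', 'fact_sales']
```

Writing the design down before creating tables keeps the grain and the list of dimensions explicit, which is the point of the steps.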
Important aspects of Star Schema & Snowflake Schema
• In a star schema, every dimension will have a primary key.
• In a star schema, a dimension table will not have any parent table.
• In a snowflake schema, by contrast, a dimension table will have one or more parent tables.
• In a star schema, hierarchies for the dimensions are stored in the dimension table itself.
• In a snowflake schema, hierarchies are broken into separate tables. These hierarchies help to drill down the data from the topmost level to the lowermost level.
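To illustrate a hierarchy stored in the dimension table itself, the sketch below drills down from region to branch using a single dimension. The `dim_location` table, its rows, and the `sales_by` helper are hypothetical names introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star-style dimension: both hierarchy levels live in the same table.
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    branch_name  TEXT,   -- lower level of the hierarchy
    region_name  TEXT    -- upper level of the hierarchy
);
CREATE TABLE fact_sales (location_key INTEGER, sales_dollar REAL);

INSERT INTO dim_location VALUES
    (1, 'Downtown', 'East'), (2, 'Harbor', 'East'), (3, 'Airport', 'West');
INSERT INTO fact_sales VALUES (1, 100.0), (2, 250.0), (3, 400.0), (1, 50.0);
""")

HIERARCHY_LEVELS = ("region_name", "branch_name")  # topmost to lowermost


def sales_by(level):
    """Total sales grouped by one level of the location hierarchy."""
    if level not in HIERARCHY_LEVELS:
        # Only known column names are ever placed into the SQL text.
        raise ValueError(f"unknown hierarchy level: {level}")
    query = f"""
        SELECT d.{level}, SUM(f.sales_dollar)
        FROM fact_sales AS f
        JOIN dim_location AS d ON d.location_key = f.location_key
        GROUP BY d.{level}
        ORDER BY d.{level}
    """
    return conn.execute(query).fetchall()


by_region = sales_by("region_name")   # top of the hierarchy
by_branch = sales_by("branch_name")   # drill down one level

print(by_region)  # [('East', 400.0), ('West', 400.0)]
print(by_branch)  # [('Airport', 400.0), ('Downtown', 150.0), ('Harbor', 250.0)]
```

Drilling down changes only the grouping column; in a snowflake layout the region would sit in a parent table and the upper-level query would need an additional join.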