Skip to main content

What is a Star Schema? Which Schema is preferable in performance oriented way? Why?



                                                                                                                                                                      
Scenario:

What is a Star Schema? Which Schema is preferable in performance oriented way? Why?


Solution:
 A Star Schema is composed of 2 kinds of tables, one Fact Table and multiple Dimension Tables.

                        It is called a star schema because the entity-relationship diagram between dimensions and fact tables resembles a star where one fact table is connected to multiple dimensions. The center of the star schema consists of a large fact table and it points towards the dimension tables. The advantage of star schema are slicing down, performance increase and easy understanding of data.


F1act Table contains the actual transactions or values that are being analyzed.
Dimension Tables contain descriptive information about those transactions or values.

à In Star Schemas, Dimension Tables are denormalized tables and Fact Tables are highly   
     normalized.

Star Schema 

 

Star Schema is preferable because less number of joins will result in performance.
Because Dimension Tables are denormalized, there will be no need to go for joins all the time.


Steps in designing Star Schema
• Identify a business process for analysis(like sales).
• Identify measures or facts (sales dollar).
• Identify dimensions for facts(product dimension, location dimension, time dimension, organization dimension).
• List the columns that describe each dimension.(region name, branch name, region name).
• Determine the lowest level of summary in a fact table(sales dollar).
Important aspects of Star Schema & Snow Flake Schema
• In a star schema every dimension will have a primary key.
• In a star schema, a dimension table will not have any parent table.
• Whereas in a snow flake schema, a dimension table will have one or more parent tables.
• Hierarchies for the dimensions are stored in the dimensional table itself in star schema.
• Whereas hierachies are broken into separate tables in snow flake schema. These hierachies helps to drill down the data from topmost hierachies to the lowermost hierarchies.

Comments

Popular posts from this blog

CMN_1650 A duplicate row was attempted to be inserted into a dynamic lookup cache Dynamic lookup error.

Scenario: I have 2 ports going through a dynamic lookup, and then to a router. In the router it is a simple case of inserting new target rows (NewRowLookup=1) or rejecting existing rows (NewRowLookup=0). However, when I run the session I'm getting the error: "CMN_1650 A duplicate row was attempted to be inserted into a dynamic lookup cache Dynamic lookup error. The dynamic lookup cache only supports unique condition keys." I thought that I was bringing through duplicate values so I put a distinct on the SQ. There is also a not null filter on both ports. However, whilst investigating the initial error that is logged for a specific pair of values from the source, there is only 1 set of them (no duplicates). The pair exists on the target so surely should just return from the dynamic lookup newrowlookup=0. Is this some kind of persistent data in the cache that is causing this to think that it is duplicate data? I haven't got the persistent cache or...

SQL Transformation with examples

============================================================================================= SQL Transformation with examples   Use : SQL Transformation is a connected transformation used to process SQL queries in the midstream of a pipeline . We can insert, update, delete and retrieve rows from the database at run time using the SQL transformation. Use SQL transformation in script mode to run DDL (data definition language) statements like creating or dropping the tables. The following SQL statements can be used in the SQL transformation. Data Definition Statements (CREATE, ALTER, DROP, TRUNCATE, RENAME) DATA MANIPULATION statements (INSERT, UPDATE, DELETE, MERGE) DATA Retrieval Statement (SELECT) DATA Control Language Statements (GRANT, REVOKE) Transaction Control Statements (COMMIT, ROLLBACK) Scenario: Let’s say we want to create a temporary table in mapping while workflow is running for some intermediate calculation. We can use SQL transformat...

Load the session statistics such as Session Start & End Time, Success Rows, Failed Rows and Rejected Rows etc. into a database table for audit/log purpose.

                                                                                                                                                                     ...