CIS 4400 Lecture 4: Dimesional Modeling Process

2/13/19: Dimensional Modeling Process
http://holowczak.com/data-warehouse-dimensional-modeling/1/
• Dimensional Modeling Process (follow Kimball’s 4 Step)
o Requirements
▪ Document overall processes
▪ Identify one core business
▪ Identify source that supports this core business function
▪ Gather business user requirements. Develop KPI’s
o Define grain
▪ The smallest unit we can represent in a fact
▪ Build rest of the model around it
▪ Fundamental level of detail
▪ Three types of grains:
• Transaction: fact table contains a fact for each transaction. One to one
• Periodic/Screenshot: sales over a day. The grouping/collection of a
transaction. Time-based summary OR based on stores/location, ettc.
• Accumulating: summarizes a collection of business transactions over a
time. Custom ordering because it doesn’t happen one instance. E.g:
insurance claim. Different dates for this transaction. Unfolds over time.
Very transaction unfolds over time. Missing values @ first
▪ Example of country store
• Periodic grain: store, date, customer, salesperson, total $. Can’t drill
down on this
• Transaction grain: date, time, customer, salesperson, description, qty,
total $$. Each line item is a separate fact.
▪ Kimball advocates that we use transaction grains because we can create
additional snapshots. Not too worried about the process
▪ Inmon cares about the process time. He prefers periodic screenshots. Store
aggregates
▪ Hard to tell apart transactions if they don’t have a unique ID. Fine grain, fact
record.
o Dimensions: what surrounds the model
▪ To start off with sales fact, go and develop the surrounding dimensions
▪ Supply the context for the fact
▪ STARR schema
▪ Answer questions like who, what, where, why
▪ Extract ome plots of the grain and map out the dimension
• Normalize the relation
a. Product table, product category table. Combine both and denormalize
• Embellish dimension with additional attributes. Add more detail.
(Dimensional hierarchy)
. Time → calendar → fiscal
a. Used to navigate the information and drill down
b. Adding region to the database
• Add a surrogate key. Primary key. Distinct from the key that comes.
RTis key used in the rest of the model
Document Summary
2/13/19: dimensional modeling process http://holowczak. com/data-warehouse-dimensional-modeling/1: dimensional modeling process (follow kimball"s 4 step, requirements, document overall processes, gather business user requirements. One to one: periodic/screenshot: sales over a day. Time-based summary or based on stores/location, ettc: accumulating: summarizes a collection of business transactions over a time. Custom ordering because it doesn"t happen one instance. Missing values @ first: example of country store, periodic grain: store, date, customer, salesperson, total $. Can"t drill down on this: transaction grain: date, time, customer, salesperson, description, qty, total 7365. Each line item is a separate fact: kimball advocates that we use transaction grains because we can create additional snapshots. Store aggregates: hard to tell apart transactions if they don"t have a unique id. Combine both and denormalize: embellish dimension with additional attributes. Used to navigate the information and drill down. Adding region to the database: add a surrogate key. Rtis key used in the rest of the model.