INFO 340 Lecture Notes - Lecture 7: Data Warehouse, Star Schema, Fact Table

26 views2 pages
17 May 2018
School
Department
Course
INFO 340 Lecture 7
Concepts
o Distinguishing characteristics from relational database
Subject-oriented, rather than all aspects of business
E.g., customers
Integrated multiple data sources
Likely have different formatting Make consistent to present
unified view of data
E.g., insurance companies, municipal courts, etc.
Time-variant depth over time
Accurate for a very specific point in time; records based on fact
Non-volatile storable + doesn’t change over time
New data added to supplement, but not replace
Read only
- Queries range from simple to complex
- Problems
o Could be incomplete
o Data ownership
o High maintenance
o LEADS TO: increased risk + cost of data warehouses
- Star schema (dimensional model)
o Star shape allows data to be easily accessible; only one join
o Simple queries against complex data
- Data Warehouse Tools: ETL Processes
o Extraction identify where data is
2 methods
Top down - identify what is needed first, then go out + obtain
what is missing
Bottom down look at available data + use it
Data sources normally internal
o Transformation find data, assemble/aggregate it, and summarize it
Implemented w series of rules/functions
Unlock document

This preview shows half of the first page of the document.
Unlock all 2 pages and 3 million more documents.

Already have an account? Log in

Document Summary

Concepts: distinguishing characteristics from relational database, subject-oriented, rather than all aspects of business, e. g. , customers. Queries range from simple to complex. Problems: could be incomplete, data ownership, high maintenance, leads to: increased risk + cost of data warehouses. Star schema (dimensional model: star shape allows data to be easily accessible; only one join, simple queries against complex data. Implemented w series of rules/functions: loading load data + load any previously defined constraints. A su(cid:271)set; (cid:862)(cid:373)i(cid:374)i (cid:449)arehouse(cid:863), (cid:272)e(cid:374)tered arou(cid:374)d a depart(cid:373)e(cid:374)t or di(cid:448)isio(cid:374: not a data warehouse, support analytical requirements of a particular business unit (e. g. , sales department) Improve response time due to less data: why, future users more easily defined, lower cost, drawback less robust data for answering higher level questions, less complex, less expensive, see efficiency gains sooner. I(cid:374)(cid:373)o(cid:374)"s corporate i(cid:374)for(cid:373)atio(cid:374) fa(cid:272)tory (cid:894)cif(cid:895: ki(cid:373)(cid:271)all"s busi(cid:374)ess di(cid:373)e(cid:374)sio(cid:374)al life(cid:272)y(cid:272)le, most common bc less risky + easier to implement.

Get access

Grade+
$40 USD/m
Billed monthly
Grade+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
10 Verified Answers
Class+
$30 USD/m
Billed monthly
Class+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
7 Verified Answers

Related Documents