MIS 325 Lecture Notes - Lecture 18: Transaction Processing System, Data Warehouse, Data Mart
Review: Data Warehousing and Business Intelligence
● Data sources
○ Transaction processing system:
■ Identify purchasing behavior of customers
■ Tracked transactions → Increased insights → Increased sales
○ Enterprise software (CRM, SCM, ERP):
■ Constantly creating new databases
■ Some data is coming from organization itself
■ Some data may be coming from outside the organization
○ Surveys
○ External sources
■ Partners
■ Data aggregators
● Poor information
○ Legacy systems: Older information systems that are incompatible with other
systems, technologies, and ways of conducting business
○ Operational data cannot always be queried
■ Most transactional databases are not set up to be simultaneously
accessed for reporting and analysis
■ Database analysis requires significant processing
● Data warehouse:
○ Subject-oriented:
■ Customers
■ Patients
■ Students
■ Products
○ Integrated:
■ Consistent naming conventions
■ Formats
■ Encoding structures
■ From multiple data sources
○ Time-variant: Can study trends and changes
○ Non-updated:
■ Read-only
■ Periodically refreshed
○ Collection of data used in support of management decision-making processes
● Data mart: A data warehouse that is limited in scope
● The ETL process
○ Extract operational data
○ Transform
○ Load to data warehouse
● History leading to data warehousing
○ Improvement in database technologies, especially relational DBMSs
Document Summary
Tracked transactions increased insights increased sales. Some data is coming from organization itself. Some data may be coming from outside the organization. Legacy systems: older information systems that are incompatible with other systems, technologies, and ways of conducting business. Most transactional databases are not set up to be simultaneously accessed for reporting and analysis. Collection of data used in support of management decision-making processes. Data mart: a data warehouse that is limited in scope. Improvement in database technologies, especially relational dbmss. Advances in computer hardwares, including mass storage and parallel architectures. Emergence of end-user computing with powerful interfaces and tools. Advances in middleware, enabling heterogeneous database connectivity. Recognition of difference between operational and informational systems. Integrated, company-wide view of high quality information (from disparate databases) Separation of operational and informational systems and data (for improved performance) A system that is used to run a business in real time. Narrow, planned, and simple updates and queries.