Study Guides (390,000)
CA (150,000)
UW (7,000)
CS (400)
CS338 (7)
Final

Review of Entire Course! Contains review content for the entire course! Neatly organized, with notes and definitions.


Department
Computer Science
Course Code
CS338
Professor
Khuzaima Daudjee
Study Guide
Final

This preview shows pages 1-2. to view the full 8 pages of the document.
Chapter 5 Indexing
File Systems and Disks
- Databases are stored in files
o 1 file per relation OR 1 file for entire database
- Files reside on disks
- DBMS must transfer to buffer for access (read, modify, update, delete)
- Data transferred between disk and memory in blocks (slow process)
Indexing
- Index consists of extra information (data structure) added to file to provide faster access to data
- Index is defined on one or more attributes of relation
- Index fined on attribute A of relation R
o Reduce execution time for selections that specify conditions involving A
o Increase execution time for insertions and deletions of tuples from R
o Increase size of file required to store R
B-trees
- Widely-used index structures
- Fully dynamic, easily grow and shrink
- Index blocks & data blocks (half full)
o Index blocks stores maximum of m keys and m+1 pointers
Each block stores at least |m/2| keys and |m/2|+1 pointers
o Data blocks stores maximum of n rows
Each block stores at least |(n+1)/2| records, contains 2 pointers
Clustering vs. Non-clustering Index
- Index on attribute A is a clustering index if tuples with similar values for A are stored together in
same block
- Other indices are non-clustering (or secondary) indices
- Relation may have at most 1 clustering index, and any number of non-clustering indices
Index vs. Sequential Scan
- Using an index is not always the best way to answer range query
o Depends on how many tuples the query will retrieve

Only pages 1-2 are available for preview. Some parts have been intentionally blurred.

Chapter 6 - Transactions
Transaction Properties (ACID)
- Atomic: indivisible, all-or-nothing
- Consistency: execution preserves database integrity
- Isolation: Updates are not visible until it commits (finishes successfully)
- Durable: survives failures
- Occurs either entirely or not at all
- Its effects will not be erased or undone by subsequent failures
- Concurrent transactions must appear to have been executed sequentially (one at a time)
Abort & Commit
- Transaction terminates by aborting, or by committing
o Commit: any updates become durable and visible to other transactions (“all”)
o Abort: Any updates are undone (erased) as if transaction never ran at all (“nothing”)
- Transaction that has started but not yet aborted or committed is active
- Commit
o Append Ti, ensure all log records are written (in case of buffering)
o Inform transaction manager commit complete, end transaction
- Abort
o Scan log backwards looking for items updated by Ti, restore old value, append Ti
o Inform transaction manager abort complete, end transaction
Using Transactions
- Concurrency Control: Guarantees that committed transactions appear to execute sequentially
- Recovery Management: Guarantees that committed transactions are durable, aborted
transactions have no effect on database
Storage Management
- Disks are persistent storage, permanent, infinite size (relatively), but SLOW
- Memory is transient storage, temporary, limited size, but FAST
- DBMS fetches data from disk into cache memory
o DB instance organized into pages
o Need page cache management strategy
o Handle “out of cache memory” problems
o When to write cache back to disk (page flushing)
- In-place update of disk blocks
Log-file Structure
- Recovery management done with database log
You're Reading a Preview

Unlock to view full version

Only pages 1-2 are available for preview. Some parts have been intentionally blurred.

- Log is read/append data structure, stored in a file
- Log records DBMS transactions
o UNDO: “before” modifications; Undo database changes made by transactions that abort
o REDO: “after” modifications; Redo work done by a transaction that commits
o BEGIN/COMMIT/ABORT: record transaction boundaries
Write-ahead Logging
- To ensure log is consistent with main database
o Log record must be written before corresponding page is flushed
o All log records must be written before commit
- RULE 1 (Atomicity): Each operation is known and can be undone if necessary
- RULE 2 (Durability): Effect of committed transaction is known
Log for Recovery
- Recover from system failure
o Determine which transactions were active when failure occurred, undo their database
updated
o Recreate the committed updates that may have been lost
- METHOD
o Scan log from tail to head (backward)
Create list of committed and rolled-back transactions
Undo updates of active transactions
o Scan log from head to tail (forward)
Redo updates of committed transactions
Ignore rolled-back transactions
o Maybe restart active transactions
Concurrency Control
- DBMS processes several transactions simultaneously, faster than serially
- Must ensure concurrent transactions appear to be processed serially (serializable)
- Ri [x] means transaction Ti reads object x
- Wi [x] means transaction Ti writes object x
- Schedule: Arbitrary ordering of read and write operations for a set of transactions (preserving
relative orders within each)
Two-phase Locking
- Most DBMSs use locking to guarantee that only serializable executions occure
- Shared Lock: required to read an object
- Exclusive Lock: required to write an object
- Rules for Locks
You're Reading a Preview

Unlock to view full version