Review of Entire Course! Contains review content for the entire course! Neatly organized, with notes and definitions.

137 views8 pages

For unlimited access to Study Guides, a Grade+ subscription is required.

Chapter 5 Indexing
File Systems and Disks
- Databases are stored in files
o 1 file per relation OR 1 file for entire database
- Files reside on disks
- DBMS must transfer to buffer for access (read, modify, update, delete)
- Data transferred between disk and memory in blocks (slow process)
Indexing
- Index consists of extra information (data structure) added to file to provide faster access to data
- Index is defined on one or more attributes of relation
- Index fined on attribute A of relation R
o Reduce execution time for selections that specify conditions involving A
o Increase execution time for insertions and deletions of tuples from R
o Increase size of file required to store R
B-trees
- Widely-used index structures
- Fully dynamic, easily grow and shrink
- Index blocks & data blocks (half full)
o Index blocks stores maximum of m keys and m+1 pointers
Each block stores at least |m/2| keys and |m/2|+1 pointers
o Data blocks stores maximum of n rows
Each block stores at least |(n+1)/2| records, contains 2 pointers
Clustering vs. Non-clustering Index
- Index on attribute A is a clustering index if tuples with similar values for A are stored together in
same block
- Other indices are non-clustering (or secondary) indices
- Relation may have at most 1 clustering index, and any number of non-clustering indices
Index vs. Sequential Scan
- Using an index is not always the best way to answer range query
o Depends on how many tuples the query will retrieve
Unlock document

This preview shows pages 1-3 of the document.
Unlock all 8 pages and 3 million more documents.

Already have an account? Log in
Chapter 6 - Transactions
Transaction Properties (ACID)
- Atomic: indivisible, all-or-nothing
- Consistency: execution preserves database integrity
- Isolation: Updates are not visible until it commits (finishes successfully)
- Durable: survives failures
- Occurs either entirely or not at all
- Its effects will not be erased or undone by subsequent failures
- Concurrent transactions must appear to have been executed sequentially (one at a time)
Abort & Commit
- Transaction terminates by aborting, or by committing
o Commit: any updates become durable and visible to other transactions (“all”)
o Abort: Any updates are undone (erased) as if transaction never ran at all (“nothing”)
- Transaction that has started but not yet aborted or committed is active
- Commit
o Append Ti, ensure all log records are written (in case of buffering)
o Inform transaction manager commit complete, end transaction
- Abort
o Scan log backwards looking for items updated by Ti, restore old value, append Ti
o Inform transaction manager abort complete, end transaction
Using Transactions
- Concurrency Control: Guarantees that committed transactions appear to execute sequentially
- Recovery Management: Guarantees that committed transactions are durable, aborted
transactions have no effect on database
Storage Management
- Disks are persistent storage, permanent, infinite size (relatively), but SLOW
- Memory is transient storage, temporary, limited size, but FAST
- DBMS fetches data from disk into cache memory
o DB instance organized into pages
o Need page cache management strategy
o Handle “out of cache memory” problems
o When to write cache back to disk (page flushing)
- In-place update of disk blocks
Log-file Structure
- Recovery management done with database log
Unlock document

This preview shows pages 1-3 of the document.
Unlock all 8 pages and 3 million more documents.

Already have an account? Log in
- Log is read/append data structure, stored in a file
- Log records DBMS transactions
o UNDO: “before” modifications; Undo database changes made by transactions that abort
o REDO: “after” modifications; Redo work done by a transaction that commits
o BEGIN/COMMIT/ABORT: record transaction boundaries
Write-ahead Logging
- To ensure log is consistent with main database
o Log record must be written before corresponding page is flushed
o All log records must be written before commit
- RULE 1 (Atomicity): Each operation is known and can be undone if necessary
- RULE 2 (Durability): Effect of committed transaction is known
Log for Recovery
- Recover from system failure
o Determine which transactions were active when failure occurred, undo their database
updated
o Recreate the committed updates that may have been lost
- METHOD
o Scan log from tail to head (backward)
Create list of committed and rolled-back transactions
Undo updates of active transactions
o Scan log from head to tail (forward)
Redo updates of committed transactions
Ignore rolled-back transactions
o Maybe restart active transactions
Concurrency Control
- DBMS processes several transactions simultaneously, faster than serially
- Must ensure concurrent transactions appear to be processed serially (serializable)
- Ri [x] means transaction Ti reads object x
- Wi [x] means transaction Ti writes object x
- Schedule: Arbitrary ordering of read and write operations for a set of transactions (preserving
relative orders within each)
Two-phase Locking
- Most DBMSs use locking to guarantee that only serializable executions occure
- Shared Lock: required to read an object
- Exclusive Lock: required to write an object
- Rules for Locks
Unlock document

This preview shows pages 1-3 of the document.
Unlock all 8 pages and 3 million more documents.

Already have an account? Log in

Get access

Grade+
$10 USD/m
Billed $120 USD annually
Homework Help
Class Notes
Textbook Notes
40 Verified Answers
Study Guides
1 Booster Class