Class Notes (835,107)
Canada (508,933)
CS 240 (49)

Cheat Sheet #3 I tried to fit in as much details as I could in one page- its useful for reviewing

1 Page
Unlock Document

Computer Science
CS 240
Therese Biedl

Dictionary as: Unsorted array: Multi-Dimensional data Dictionaries for text strings Claim: sorting items by their probabilities of Storage: Binary Tries: time O(# of bits in x) space O( sum access in non-increasing order minimizes the cost -Combine multi-dimensional key into one key: of the bits in all keys) Insert: 1- leaf: no action is of Search. impossible to search by one aspect of keys required 2- internal: add nodes or ( - no bits left : Proof Idea: If not in the right order, then if we - Use dictionary for each dimension: wastes space nothing - bits left :add ) exchange two items which are in the wrong -Use dictionaries for multi-dimensional data: Fix prefix issue: - allow interior nodes to store order, then the cost decreases. quad, k-d, range keys – add $ at end of every word Move-To-Front heuristic: Upon a successful Running Time 1D: Equal(n) reportall keys , Tries for String: alphabet sigma, store multiway search, bring the element forward to the front. O(k+logn) where k is # of reported elements tree, each node has up to sigma + 1 children, (used when not knowing prob ahead of time)The Quadtrees: Consider a square that encloses all might have unnecessary node cost of shift is already dealt with in the search, points .Break it into four quadrants.Break each Compressed Tries: eliminate nodes with only one works well in practice quadrant into quadrants recursively until every child , & nodes store an index indicating the Sorted Array: square contains at most one point. Worst Case character it splits by Interpolation Search: when dividing quadrants but not efficiently Idea: take advantage of the value of the key to [x.x+x’/2] x [y,y+y’/2] , Range search Equal(n) String Matching determine its location even if empty. very easy to compute & handle , Applications: information retrieval, compression, data- mining, classification Steps: - search(x) in array in index range[L,R] space wasteful , they don’t reflect the nodes, they -compute ration t = (A[R]-x)/(A[R]- A[L]) just divide into quadrants Naïve alg (brute Force) : worst input P= (a^m-1)*b , T = a^n - compare result with key at floor(t*L + (1-t)*R) KdTrees: kd-trees split rectangle into two part of , worst case performance O (n-m+1)*m , start with first char and decide the part to recurse in median of the point in it. Build kd time in pattern and skip by one in Tif miss match Run Time: O(n) worst case but if keys are random Equal(nlogn) height Equal(logn) assume no two Boyer-Moore ( Matching P backwards ) alg: heuristics : then O(loglogn) search cost points with same x,y coordinates, fix by breaking - Character jump: check char occurrence in pattern & shift - Suffix shift: check suffix occurrence in pattern & shift Gallop Search: ties according to the other coordinate, costly after Good for large arrays The number of comparison insert & delete causes unbalance, so do periodic No improvement in worst case to naïve P=ba..a T=aa.aa is still O (logn) (we have at most doubled the rebalancing. Range search O(k+G) & G is the # of Worst case running time O(m*(n-m+1) + |alphabet| ) number of comparisons compared to binary regions unsuccessfully & not fully in R. # of On typical English text the alg probes 30% of chars in T search. region that intersect a line satisfies recurrence KMP alg: Failing fxn run time O(m) , KMP run time : 1- Ordered Lists: relation Q(n)=2*Q(n/4)+1 , and solves to increase I & j by one, 2- decrease j by atleast one, 3- increase Insert/delete: O(1) , search O(n) ( no BS) Q(n)=O(root(n)) I by one .(1) and (3) execute in time at most n, and (2) Skip Lists: Complexity RSKD O(k+root(n)), not that easy but executes no more often than (1),hence a total of 2n Idea: each key (node) has more than one link ( we no bad cases like quadtree comparison. Hence total run time O(m+n) make binary search work for lists , Nodes are Range Search: Each item will be stored Suffix tree: idea: Preprocess text T rather than pattern P. ordered in increasing order and have multiple repeatedly O(logn) times. Use BST inside BS, Observation: P is a substring of T ifand only if P is a prefix of some suffix of T. build a compressed trie that stores all varied links . Expected time for all operations is space O(nlogn) Equal(log n). In practice, performsbetter Insert point by x, from leaf walk back up & insert suffixes of text T. -insert suffixes in decreasing order of than balanced BSTs. P[#links =i] = 1/ 2^I , lower point in subtrees of the nodes. Problem length.- if a suffix is a prefix of another suffix, we do not or increase the dummy node heightwhen needed rebalancing at most liner time . Range search insert it. Store two indexes l , r on each node v (both O(k+log^2(n)) internal nodes and leaves) where node v corresponds to Naïve Boyer-Moore KMP Orthogonal Range Search : we specify a range for substring T[l ..r ]. How: goes throu the tree looking for Preprocess P - O(m+|sigma|) O(m) - Brute-Force prefix matching P , if node reached is less than M then Preprocess T - - - certain O(n^2)s of the Naive(P[0..m −1],T[0..n − 1]) Search P in T O(n*m) O(n*m) O(n+m) O(m) for i 0 to n − m do unsuccessful, only check for at least M for successful search Extra
More Less

Related notes for CS 240

Log In


Join OneClass

Access over 10 million pages of study
documents for 1.3 million courses.

Sign up

Join to view


By registering, I agree to the Terms and Privacy Policies
Already have an account?
Just a few more details

So we can recommend you notes for your school.

Reset Password

Please enter below the email address you registered with and we will send you a link to reset your password.

Add your courses

Get notes from the top students in your class.