Algorithm: A step-by-step process for carrying out a series of computations.

Program: An implementation of an algorithm.

Running time and memory requirements of programs depend on hardware, OS, and compiler.

Efficiency is measured in time and memory.

Worst-case: Tworst(n) = max { T(I) : |I| = n }

Best-case: Tbest(n) = min { T(I) : |I| = n }

Average-case: Tavg(n) = (1 / |{I : |I| = n}|) · Σ { T(I) : |I| = n }

Notation / Definition

f(n) is O(g(n)): there exist constants c > 0 and n0 > 0 s.t. for all n >= n0, 0 <= f(n) <= c·g(n)

f(n) is Ω(g(n)): there exist constants c > 0 and n0 > 0 s.t. for all n >= n0, 0 <= c·g(n) <= f(n)

f(n) is Θ(g(n)): there exist constants c1, c2 > 0 and n0 > 0 s.t. for all n >= n0, 0 <= c1·g(n) <= f(n) <= c2·g(n)

f(n) is o(g(n)): for all constants c > 0 there exists a constant n0 > 0 s.t. for all n >= n0, 0 <= f(n) < c·g(n)

f(n) is ω(g(n)): for all constants c > 0 there exists a constant n0 > 0 s.t. for all n >= n0, 0 <= c·g(n) < f(n)
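As a quick sanity check, the O definition can be tested numerically in Python; the function f(n) = 3n + 10 and the witnesses c = 4, n0 = 10 are illustrative choices, not from the notes:

```python
# Numerically check the Big-O definition: f(n) = 3n + 10 is O(n),
# witnessed by the constants c = 4 and n0 = 10 (chosen for illustration).
def f(n):
    return 3 * n + 10

def g(n):
    return n

c, n0 = 4, 10

# 0 <= f(n) <= c*g(n) must hold for every n >= n0
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 10_000))

# ...but the bound may fail below n0, which is why n0 exists:
assert not f(1) <= c * g(1)   # f(1) = 13 > 4 = c*g(1)
```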

x = log_a b  <=>  a^x = b

log_a b = log_c b / log_c a

log a + log b = log(a·b)

sum(i=0 to n) i = n(n+1)/2

sum(i=0 to n−1) q^i = (1 − q^n)/(1 − q)   (q ≠ 1)

sum(i=1 to n) 1/i = O(log n)
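These identities can be spot-checked in Python; the concrete values a = 2, b = 8, c = 10, n = 100, q = 0.5 are arbitrary choices:

```python
import math

# Spot-check the logarithm and summation identities above.
a, b, c = 2, 8, 10

# x = log_a(b)  <=>  a^x = b
x = math.log(b, a)
assert math.isclose(a ** x, b)

# change of base: log_a(b) = log_c(b) / log_c(a)
assert math.isclose(math.log(b, a), math.log(b, c) / math.log(a, c))

# log a + log b = log(a*b)
assert math.isclose(math.log(a) + math.log(b), math.log(a * b))

# sum_{i=0}^{n} i = n(n+1)/2   and   sum_{i=0}^{n-1} q^i = (1 - q^n)/(1 - q)
n, q = 100, 0.5
assert sum(range(n + 1)) == n * (n + 1) // 2
assert math.isclose(sum(q ** i for i in range(n)), (1 - q ** n) / (1 - q))
```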

Priority queue implementations:

1- Unsorted array (selection sort): Insert O(1); Extract Max O(n)

2- Sorted array (insertion sort): Insert O(n); Extract Max O(1)

3- Heap: Insert O(log n); Extract Max O(log n); Build PQ O(n); little space used; sorting in O(n log n)
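A minimal sketch of the heap-based priority queue in Python, using the standard-library heapq module; heapq is a min-heap, so keys are negated to get max-heap behavior (the class name MaxPQ is my own):

```python
import heapq

# Python's heapq is a min-heap, so a max-priority-queue can be simulated
# by negating the keys on insert and extract.
class MaxPQ:
    def __init__(self):
        self._h = []

    def insert(self, key):          # O(log n)
        heapq.heappush(self._h, -key)

    def extract_max(self):          # O(log n)
        return -heapq.heappop(self._h)

pq = MaxPQ()
for k in [3, 1, 4, 1, 5, 9, 2, 6]:
    pq.insert(k)
print(pq.extract_max())  # → 9
print(pq.extract_max())  # → 6
```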

Heap: a type of binary tree that stores items with keys; all levels are full except possibly the last, which is filled from the left; the key of a parent is larger than its children's keys (max-heap); height: Θ(log n)

Advantages of the array representation: easier to implement, more space efficient

Indexes (1-based): Left(i) = 2i, Right(i) = 2i + 1, Parent(i) = floor(i/2)

Building heaps:

Solution 1: start with an empty heap and insert the elements one by one, performing bubble-up (1 to n); worst-case time Θ(n log n)

Solution 2: heapify, which bubbles down positions n/2 down to 1; worst-case time Θ(n)

Sorting using heaps (heapsort): O(n log n)
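Solution 2 can be sketched as a minimal Python version; 1-based indexing is simulated with an unused slot A[0] so it matches the index formulas above:

```python
# Bottom-up heap construction (Solution 2): bubble down positions n//2 .. 1.
# A[0] is left unused so that Left(i) = 2i, Right(i) = 2i+1, Parent(i) = i//2.
def bubble_down(A, i, n):
    while 2 * i <= n:
        child = 2 * i
        if child + 1 <= n and A[child + 1] > A[child]:
            child += 1                     # pick the larger child
        if A[i] >= A[child]:
            break                          # max-heap property already holds
        A[i], A[child] = A[child], A[i]
        i = child

def build_max_heap(A):                     # heapify A[1..n] in Theta(n) total
    n = len(A) - 1
    for i in range(n // 2, 0, -1):
        bubble_down(A, i, n)

A = [None, 2, 9, 7, 6, 5, 8]               # A[0] is a placeholder
build_max_heap(A)
print(A[1])  # → 9 (the maximum ends up at the root)
```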

QuickSort running time:

Best case: splits into two equal parts, Θ(n log n)

Worst case: pivot is the min or max, Θ(n^2)

Average case: splits into parts of sizes (k−1) and (n−k), Θ(n log n)

Auxiliary space: Θ(n) in the worst case, because of the recursion depth

Recursing on the smaller part first reduces the auxiliary space to O(log n)
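A sketch of this space optimization: the loop continues on the larger side and recursion happens only on the smaller side, so the recursion depth stays O(log n). The random pivot matches the randomized variant discussed below; the Lomuto partition scheme is my assumption, since the notes don't name one:

```python
import random

# In-place quicksort; recursing only on the smaller side and looping on the
# larger side bounds the recursion depth (auxiliary space) by O(log n).
def quicksort(A, lo=0, hi=None):
    if hi is None:
        hi = len(A) - 1
    while lo < hi:
        p = partition(A, lo, hi)
        if p - lo < hi - p:                # recurse on the smaller part...
            quicksort(A, lo, p - 1)
            lo = p + 1                     # ...and loop on the larger part
        else:
            quicksort(A, p + 1, hi)
            hi = p - 1

def partition(A, lo, hi):                  # Lomuto partition, random pivot
    r = random.randint(lo, hi)
    A[r], A[hi] = A[hi], A[r]
    pivot, i = A[hi], lo
    for j in range(lo, hi):
        if A[j] <= pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[hi] = A[hi], A[i]
    return i

data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)  # → [1, 2, 5, 5, 6, 9]
```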

A randomized algorithm is an algorithm that uses random numbers in its execution, and its behavior depends on them.

Texp(x) = E[T(x, R)] = Σ_R T(x, R) · Pr[R]

QuickSort is the fastest comparison-based sorting algorithm in practice in the average case.

Comparison-based sorting: lower bound Ω(n log n)

Sort                  / Running time / Analysis
Selection Sort          Θ(n^2)         worst case
Insertion Sort          Θ(n^2)         worst case
Merge Sort              Θ(n log n)     worst case
Heap Sort               Θ(n log n)     worst case
Quick Sort              Θ(n^2)         worst case (fastest in practice in the average case, but not in the worst case)
Randomized Quick Sort   Θ(n log n)     expected

Linear-time sorting

Counting Sort: stable, but keys must take values in {0..k}

Running time: Θ(n + k), linear when k = O(n)

Radix Sort: elements with the same number of digits (d), each digit with a value in {0..k−1}

Running time: Θ(d(n + k)), linear when k = O(n) and d is constant
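A minimal stable counting sort in Python; the optional key parameter is my addition, to make the stability visible and to hint at how radix sort applies it digit by digit:

```python
# Stable counting sort for keys in {0..k}; Theta(n + k) time.
def counting_sort(A, k, key=lambda x: x):
    count = [0] * (k + 1)
    for x in A:                            # count occurrences of each key
        count[key(x)] += 1
    for v in range(1, k + 1):              # prefix sums give final positions
        count[v] += count[v - 1]
    out = [None] * len(A)
    for x in reversed(A):                  # reversed scan preserves stability
        count[key(x)] -= 1
        out[count[key(x)]] = x
    return out

print(counting_sort([4, 1, 3, 4, 3], 4))  # → [1, 3, 3, 4, 4]
```

Radix sort is then d passes of this routine, keyed on one digit at a time from least to most significant.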

Dictionary ADT     / insert    / search    / delete
Unordered array      Θ(1)        Θ(n)        Θ(1)
Ordered array        Θ(n)        Θ(log n)    Θ(n)
Binary search tree   Θ(height)   Θ(height)   Θ(height)

Balanced node: A node is balanced if the heights of its two subtrees differ by at most one (an empty subtree is defined to have height −1).

AVL Tree: a binary search tree in which every node is balanced; the height of an AVL tree is O(log n)

Insert: O(log n); performs at most one single or double rotation

Delete: O(log n); performs as many rotations as needed

Rotations take O(1)
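The balance condition can be checked directly; representing trees as (key, left, right) tuples is my choice, and the naive height recomputation makes this a checker for the definition, not how AVL trees track heights in practice:

```python
# Check the AVL balance condition on a tree of (key, left, right) tuples;
# an empty subtree (None) has height -1, as defined above.
def height(t):
    if t is None:
        return -1
    _, left, right = t
    return 1 + max(height(left), height(right))

def is_avl(t):
    if t is None:
        return True
    _, left, right = t
    return (abs(height(left) - height(right)) <= 1
            and is_avl(left) and is_avl(right))

balanced = ('b', ('a', None, None), ('c', None, None))
linear   = ('a', None, ('b', None, ('c', None, None)))
print(is_avl(balanced), is_avl(linear))  # → True False
```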

B-Trees: a generalization of binary search trees

- A node x stores k keys in non-decreasing order and references to its k + 1 children.

- The keys in the subtrees at the children lie between the corresponding boundary keys at x.

- All leaves are at the same level.

- B-tree of order M: each node has at most M − 1 keys (and therefore at most M children); each non-root node has at least ⌈M/2⌉ − 1 keys (and therefore at least ⌈M/2⌉ children).

A B-tree of order M with n keys has height Θ(log_{M/2} n) = Θ(log n).

Search time: O(height · log_2 M) = O(log_{M/2} n · log_2 M) = O(log n).

Insert: put the key in a leaf; on the way down, if a node is full (i.e. has M − 1 keys), preemptively split it into two nodes with ⌈M/2⌉ − 1 and ⌊M/2⌋ − 1 keys and move the median key up to the parent.

Delete: if the key is in a leaf, delete it; if it is in an internal node, swap it with its predecessor (or successor) and then delete; if a leaf is left with fewer than ⌈M/2⌉ − 1 keys, replenish:

- Transfer: borrow a key from a sibling (the sibling's key moves to the parent, and the parent's key moves into the node)

- Merge: if a transfer doesn't work, merge with a sibling and bring the separating key down from the parent

The smallest value of M is 4.

Hashing:

- Keys range over {0..U−1} and are mapped to buckets (slots)

- Distinct keys mapped to the same slot cause collisions

- Direct addressing: space O(U); insert, delete, and search take Θ(1)

- Uniform hashing: make every slot equally likely

- Interpret keys as natural numbers; division method: h(k) = k mod M, where M is a prime; multiplication method: h(k) = ⌊M(kA − ⌊kA⌋)⌋, where 0 < A < 1 is a suitable constant
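Both hash functions sketched in Python; M = 13 and A = (√5 − 1)/2 are illustrative choices (the latter is a commonly suggested constant, not one fixed by the notes):

```python
import math

# Division and multiplication hashing methods. M = 13 (a prime) and
# A = (sqrt(5) - 1) / 2 are illustrative choices.
M = 13
A = (math.sqrt(5) - 1) / 2                 # 0 < A < 1

def h_division(k):
    return k % M

def h_multiplication(k):
    frac = k * A - math.floor(k * A)       # fractional part of k*A
    return math.floor(M * frac)

for k in [0, 1, 17, 123456]:               # both always land in 0..M-1
    assert 0 <= h_division(k) < M
    assert 0 <= h_multiplication(k) < M

print(h_division(17))        # → 4
print(h_multiplication(17))  # some slot in 0..12
```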

Dealing with collisions

Chaining: each bucket is an unsorted linked list; load factor α = n/M, where n is the number of elements; expected chain length is α; expected search time is Θ(α + 1); keep the load factor constant; no explicit need to rehash

- Open addressing: keys are stored in the table itself, using a probe sequence; keep the load factor smaller than 1; rehashing is often required; fast computation (linear probing) vs. better average time (double hashing); deletion is problematic; no space is wasted on pointers

- Linear probing: h(k, i) = (h(k) + i) mod M; easy to compute, but long runs of occupied buckets tend to build up (primary clustering)

- Quadratic probing: h(k, i) = (h(k) + c1·i + c2·i^2) mod M; fairly easy to compute; c1 and c2 must be chosen carefully to obtain a permutation (no primary clustering, but two keys with the same hash value have the same probe sequence)

- Double hashing: h(k, i) = (h1(k) + i·h2(k)) mod M; h2 must be chosen carefully to avoid clustering; not too hard to perform and compute in practice

Deleting causes the slot to be tagged as deleted; this requires rehashing to avoid over-accumulation of deleted cells
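A minimal open-addressing insert with double hashing; M = 11 and the particular h1, h2 are illustrative choices (h2 never returns 0, and with M prime the probe sequence reaches an empty slot while the table is not full):

```python
# Open-addressing sketch with double hashing: h(k, i) = (h1(k) + i*h2(k)) % M.
M = 11
EMPTY = None

def h1(k):
    return k % M

def h2(k):
    return 1 + (k % (M - 1))               # in 1..M-1, never 0

def insert(table, k):
    for i in range(M):                     # probe sequence i = 0, 1, 2, ...
        slot = (h1(k) + i * h2(k)) % M
        if table[slot] is EMPTY:
            table[slot] = k
            return slot
    raise RuntimeError("table full (keep the load factor below 1)")

table = [EMPTY] * M
for k in [10, 21, 32]:                     # all three collide at h1(k) = 10
    insert(table, k)
print(table.count(EMPTY))  # → 8  (three keys placed in distinct slots)
```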

Average search cost in linear probing: successful (1/2)(1 + 1/(1−α)); unsuccessful (1/2)(1 + 1/(1−α)^2)

Average search cost in double hashing: successful (1/α)·ln(1/(1−α)); unsuccessful 1/(1−α)
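Plugging a sample load factor into these formulas shows why double hashing has the better average time; α = 0.5 is an arbitrary choice:

```python
import math

# Evaluate the average-probe formulas above at load factor alpha = 0.5.
alpha = 0.5

linear_hit  = 0.5 * (1 + 1 / (1 - alpha))
linear_miss = 0.5 * (1 + 1 / (1 - alpha) ** 2)
double_hit  = (1 / alpha) * math.log(1 / (1 - alpha))
double_miss = 1 / (1 - alpha)

print(linear_hit, linear_miss)   # → 1.5 2.5
print(double_hit, double_miss)   # double hashing probes fewer slots on average
```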

Primary clustering: the phenomenon where long blocks of consecutive non-empty buckets tend to grow even larger; it slows down linear probing (long probe sequences).

In practice, hashing is usually preferred over balanced trees; avoid patterns in the input data.

Predecessor: the rightmost node in the left subtree. Successor: the leftmost node in the right subtree.

Extendable hashing:

- Hashing in external memory (we need to minimize the number of memory-block transfers): build a hash table to which we can add more external memory without rehashing everything

- Problems with ordinary hashing: open addressing incurs many block transfers; chaining spreads to more than a single block; rehashing

Hash function h: U → {0,1}^L (an L-bit binary sequence)

Parts:

- Root: a hash table (the directory) with references to leaf blocks, indexed by the d leading bits of the hash value

- Leaves: leaf blocks in external memory, each of size at most S (keys are ordered within a leaf)

The directory (root) has size 2^d (d is the order), where d <= L

Local depth k: the keys within a block agree on their first k bits, and all keys with those k leading bits are in that block

For a block with local depth k, 2^(d−k) directory references point to that block

Limit test: let L = lim_{n→∞} f(n)/g(n). Then:

- f(n) = o(g(n)) if L = 0

- f(n) = Θ(g(n)) if 0 < L < ∞

- f(n) = ω(g(n)) if L = ∞

Preorder: root-L-R

Inorder: L-root-R

Postorder: L-R-root
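The three orders as one-line Python functions; representing trees as (key, left, right) tuples is my choice:

```python
# The three depth-first traversal orders on a (key, left, right) tuple tree.
def preorder(t):
    return [] if t is None else [t[0]] + preorder(t[1]) + preorder(t[2])

def inorder(t):
    return [] if t is None else inorder(t[1]) + [t[0]] + inorder(t[2])

def postorder(t):
    return [] if t is None else postorder(t[1]) + postorder(t[2]) + [t[0]]

tree = ('b', ('a', None, None), ('c', None, None))
print(preorder(tree))   # → ['b', 'a', 'c']
print(inorder(tree))    # → ['a', 'b', 'c']  (sorted order for a BST)
print(postorder(tree))  # → ['a', 'c', 'b']
```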

Extendable hashing operations: Insert: insert normally, or:

- Block is full (holds S keys) and k < d: split the block into two, separating the keys according to their first k + 1 bits; update the references; increase the local depth k of both blocks; repeat if the block is still full (possibly until k = d)

- Block is full (holds S keys) and k = d: double the size of the directory (d = d + 1), update the references, then split as above

Disadvantage: wasted space; but if the hash values are uniformly distributed, pages are expected to be about 69% full

The hash table (directory) is much smaller than the number of stored keys, so it may fit in main memory; most operations should access O(1) leaf pages; to gain space we add a block (we rarely need to copy the hash table, and we never move all items in the dictionary).

GallopSearch: exponential (galloping) search for x in a sorted array A over the index range [left, right):

def gallop_search(x, A, left, right):
    if left >= right or x < A[left]:
        return None                                  # not found
    if right - left == 1:
        return left if A[left] == x else None
    step = 1
    while left + step < right and A[left + step] <= x:
        step *= 2                                    # gallop: double the step
    # now x can only lie in A[left + step//2 .. left + step - 1]
    return gallop_search(x, A, left + step // 2, min(left + step, right))
