CSI 2110 Summary.pdf

17 Pages

Computer Science
Course Code
Paola Flocchini

This preview shows pages 1,2,3,4. Sign up to view the full 17 pages of the document.
CSI 2110 Summary Fall 2012 CSI 2110 Summary, Fall 2012 Analysis of Algorithms Algorithm - a step by step procedure for solving a problem in a finite amount of time. Analyzing an algorithm means determining its efficiency. Primitive operations - low-level computations independent from the programming language that can be identified in the psuedocode. Big-Oh - given two functions and , we say is if and only if there are positive constant and such that for . It means that an algorithm has the complexity of AT MOST .  Just multiply all terms by the highest degree of , add, and you have your (the coefficient) and (the function).   Logarithms alwaysin base 2 in this class unless otherwise stated  Always want the lowest possible bound, approximation should be as tight as possible Drop lower order terms and constant factors  Big-Omega - is such that for all . It means an algorithm has the complexity of AT LEAST . Big-Theta - is if it is( ) ( ) It means the complexity IS EXACTLY . Properties of logarithms and exponentials:   ( )       Remember the following equations:  ∑  ∑ ( )  ∑   ∑  ∑ Only if the graph is implemented as an adjacency list Stacks and Queues Stacks - First in, last out. Queues - First in, first out. 1 | P a g e CSI 2110 Summary, Fall 2012 Extendable Arrays  Growth function is Regular push: 1 Special push: Cost of creating the new array ( ) + copying the old elements into the new array (length of old array) + 1 Phase: starts at creation of the new array and ends with the last element being pushed. The one pushed after that is a special push, and starts a new phase. Complexities Method ArrayList LinkedList Unsorted Seq SortedSeq size isEmpty get replace insert remove minKey/minElement removeMin Selection Sort Using an external data structure - insert your elements into the first data structure, find the smallest one, add it into your second data structure. Takes time. In place (Not using an extra data structure) - search through the array, find the smallest, add it to the front of the array. Repeat until it's sorted.  Complexity: Average and worst case: performs in operations regardless of input, since is always executed in . Insertion Sort Using an external data structure - insert the first element of the first data structure into the other one, but adding each element before or after the existing elements according to size. Then, and add it into the old ADT. In place - Take the second element and switch it with the first if needed. Take the third element and switch it with the second, and then the first if need be, and so on.  Complexity: Average and worst case is . Trees Graph - consists of vertices and edges. Tree - a graph that contains no cycles. Root - a node without a parent. Internal node - a node with at least one child. External node - a node without any children. Ancestor - parent, grandparent, great-grandparent, etc. Descendent - child, grandchild, great-grandchild. Subtree - a tree consisting of a node and its descendants. Distance - number of edges between two nodes. 2 | P a g e CSI 2110 Summary, Fall 2012 Depth - number of ancestors. Height - the maximum depth. Preorder - visit, left, right. Postorder - left, right, visit. Inorder - left, visit, right. Binary Trees  Each node has at most two children.  Examples: decision trees (yes/no), arithmetic expressions, Full binary tree - each node is either a leaf, or has two children.  In the book, children are completed with dummy nodes, and all trees are considered full. Perfect binary tree - a full binary tree with all the leaves at the same level. Complete binary tree - perfect until the level , then one or more leaves at level . Properties of Binary Trees  o In Perfect Binary Trees:     Maximum number of nodes at each level is Properties of Height  Binary:  Binary (Full):  Binary(Complete): (integer part of )  Binary(Perfect): Implementing Complete Binary Trees with ArrayList  , , , , , , , , all have complexity .  Left child of is if .  Right child of is if .  Parent of is if . Root is if   Is a leaf? True if . Representing General Trees A tree T is represented by a binary tree T' with the following algorithm, seen in class:  .  First child in in is the left child of in  First sibling of in is right child of in  If is a leaf and has no siblings, then the children of are leaves. 3 | P a g e CSI 2110 Summary, Fall 2012  If is internal and v is its first child theis the left child of in  If has a sibling immediately following it, is the right child of in . Heaps MinHeap - All children are larger than its parent. MaxHeap - All children are smaller than its parent. Height of a heap: a heap storing keys has a height of Removal - Remove the top element   Replace with last key in the heap.  Begin the downheap. Downheap  Compares the parent with the smallest child.  If the child is smaller, switch the two.  Keep going.  Stops when the key is greater than the keys of both its children or the bottom of the heap is reached.  Insertion  Add key into the next available position.  Begin upheap. Upheap  Similar to downheap.  Swap parent-child keys out of order. Regular Heap Construction -We could insert items one at a time with a sequence of heap insertions, ∑ . But we can do better with bottom-up heap construction. Bottom-up Heap Construction (WILL be a question on midterm and/or final exam!)  Idea: recursively rearrange each subtree in the heap starting with the leaves.  Switch starting from the bottom, work your way up to bigger subtrees and rearrange them.  To construct a heap bottom-up, construct a tree with the number of nodes you need to insert. Start from the back of the array, and start adding from the right of the subtree, filling all the leaves.  These are all okay. Move on to the right of the level and fill in the entire level (again, working backwards from the array). Rearrange those subtrees.  Continue to the next level. Fill, rearrange, etc. Number of Swaps: How many nodes at level ? There are . Will do swaps.   Complexity is which is the best we can possibly get. Heaps are implemented the same way as binary trees. Heap Sort - Cost is 4 | P a g e CSI 2110 Summary, Fall 2012 Heap Sort In Place  Phase 1 - We build a max-heap to occupy the whole structure  Phase 2 - We start the part "sequence" empty and we grow it by removing at each step ( ) the max value from the heap and by adding it to the part "sequence", always maintaining the heap properties for the "heap" part.  Take the root and switch it with the rightmost child. Reheap. Keep going along the last row the same way. Switch with root, reheap, etc. Maps and Dictionaries Map - multiple items with the same key are NOT allowed Dictionary - multipleitems with the same key ARE allowed Unordered Sequence  Searching and removing takes time  Inserting takes time.  Applications to log files (frequent insertions, rare searches and removals) Ordered Sequence - Array-based  Searching takes with binary search.  Inserting and removing takes time.  Applications to look-up tables (frequent searches, rare insertions and removals).  Start in the middle, if the key we're searching for if larger, go to the right. If it's smaller, go to the left. Binary Search Trees  A binary tree such that keys stored to the left are less than , and keys stored to the right are greater than or equal to .  External nodes do not hold elements but serve as place holders.  Inorder traversal gives you the keys in increasing order 5 | P a g e CSI 2110 Summary, Fall 2012  Complexity: o Worst case: where all keys are to the right or to the left. o Best case: where leaves are on the same level or on an adjacent level. Insertion  Always insert at the end of the search tree based on the correct order  If you're adding a double, always insert to the right, not to the left of the tree. Deletion  If it’s at the end, you can just remove it.  If it's not at the end, replace it with the next in the inorder traversal. AVL Trees  AVL trees are balanced.  They are binary search trees that for every internal node of , the heights of the children can differ by at most .  Height of an AVL tree is always .  The height is . Insertion  Balanced - if for ever node , the height of the 's children differ by at most 1.  It tree becomes unbalanced, we need to rebalance.  Rebalance o Identify three nodes (grandparent, parent, child) and the 4 subtrees attached to them. o Find the node whose grandparent is unbalanced. This node (child) is , the parent is , and grandparent is .  Choice of is not unique o Identify the four subtrees, left to right, as and o The first who comes in the inorder traversal is , second is , and third is . Removal  Remove the same way as in a binary search tree. However, this may cause an imbalance. 6 | P a g e CSI 2110 Summary, Fall 2012 Complexity  Searching, inserting, and removing are all , that's what makes AVL trees so nice. Trees  A tree is a multi-way search tree with the following properties: o Node size property: every internal node has at most four children o Depth property: all external nodes have the same depth  Can't have more than four children or less than two children.  Depending on the number of children, an internal node of a tree is called a -node, -node, or -node.   Searching in a tree with items takes time  Min number of items: when all internal nodes have key and children: ,  Maximum number of items :when all internal nodes have three keys and four children. ∑ Insertion  Insert similar to a binary search tree. Insert at the end after (binary) searching where it goes.  May cause overflow, since you're only allowed three elements in one node. o Take the third element, send it up, and make new nodes out of the first two (one) and the fourth one (two).  Insertion takes time Deletion  Replace the deleted key with the inorder successor  Can cause underflow, might need to fuse nodes together to fix this. To handle an underflow at node with parent , we consider two cases: o Case 1: the adjacent siblings ofare 2-nodes.  Fusion operation: we merge with an adjacent sibling and move an item from to the merged node .  After a fusion, the underflow may propagate to the parent . o Case 2: an adjacent sibling of is a -node or a -node.  Transfer operation: 1. We move a child from to 2. We move an item from to 3. We move an item from to  After a transfer, no underflow occurs.  Deleting takes time. Note that searching, inserting, and deleting all take time. 7 | P a g e CSI 2110 Summary, Fall 2012 Hash Tables Problem A / Address Generation: Construction of the function . It needs to be simple to calculate, and must uniformly distribute the elements in the table. For all keys , is the position of in the table. This position is an integer. Also, ( ) if . Searching for a key and inserting a key (all dictionary ADT operations) takes time. We have the function ( ) where we have the two following sub-functions: 1. Hash code map o They reinterpret a key as an integer. They need to: i. Give the same result for the same key ii. Provide a good "spread" o Polynomial accumulation  We partition the bits of the key into a sequence of components of fixed length, .  Then we evaluate the polynomial at a fixed value , ignoring overflows.  Especially suitable for strings. o Examples  Memory address (we reinterpret the memory address of the key object as an integer, this is the default hash code for all Java objects)  Integer cast (Reinterpret the bits of the key as an integer)  Component sum (We partition the bits of the key into components of fixed length and we sum the components). 2. Compression map o They take the output of the hash code and compress into the desired range. o If the result of the hash code was the same, the result of the compression map should be the same. o Compression maps should maximize "spread" so as to minimize collisions o Examples  Division: where is usually chosen to be a prime number (number theory).  Multiply Add Divide (MAD): where and are nonnegative integers such that . Problem B / Collision Resolution: What strategy do we use if two keys map to the same location ? Load factor of a Hash table: where is the number of elements and is the number of cells. The smaller the load factor, the better. Linear Probing: We have . Consider a hash table A that uses linear probing. In order to search, we do the following:  :We start at cell , and we probe consecutive locations until one of the following occurs: o An item with key is found o An empty cell is found o cells have been unsuccessfully probed To handle insertions and deletions, we introduce a special object, called AVAILABLE, denoted , which replaces deleted elements. 8 | P a g e CSI 2110 Summary, Fall 2012  : We search for an item with key . If such an item is found, re replace it with the special object and we return element . Otherwise, we return  : We throw an exception if the table is full. We start at cell . We probe consecutive cells until one of the following occurs: o A cell is found that is either empty or labelled o cells have been unsuccessfully probed We store item in cell . Performance of Linear Probing: Average number of probes is where is the loading factor. Even if it'd 90% full, on average you'll find it in 5 or 6 tries, so it, which is very good. In the worst case, hash is not better than AVL, when clustering occurs, it's . Quadratic Probing: We have . The problem with this is that modulo is hard to calculate and it only visits half of the table but it's not a big deal. Similar problem: you avoid linear clustering, but every key that's mapped to the same cell will follow the same path. There's more distributed clustering, which should be avoided. Called secondary clustering. Double Hashing: We have where is the primary hashing function and is the secondary hashing function. Bubble Sort You literally bubble up the largest element. Move from the front to the end, Bubble the largest  value to the end using pairwise comparisons and swapping. Once you go through the whole array, you start from the first element again and go through the same way.  In order to detect that an array is already sorted so we don't have to go through it again unnecessarily, we can use a boolean flag. If no swaps occurred, we know that the collection is already sorted. The flag needs to be reset after each "bubble up".  Complexity is . Recursive Sorts Divide and Conquer paradigm:  Divide: div
More Less
Unlock Document

Only pages 1,2,3,4 are available for preview. Some parts have been intentionally blurred.

Unlock Document
You're Reading a Preview

Unlock to view full version

Unlock Document

Log In


Join OneClass

Access over 10 million pages of study
documents for 1.3 million courses.

Sign up

Join to view


By registering, I agree to the Terms and Privacy Policies
Already have an account?
Just a few more details

So we can recommend you notes for your school.

Reset Password

Please enter below the email address you registered with and we will send you a link to reset your password.

Add your courses

Get notes from the top students in your class.