Class Notes
(811,039)

Canada
(494,452)

University of Toronto Scarborough
(30,849)

Computer Science
(300)

CSCC73H3
(10)

Vassos Hadzilacos
(10)

Lecture

# Assignment 2 solution

by
OneClass4706

Unlock Document

University of Toronto Scarborough

Computer Science

CSCC73H3

Vassos Hadzilacos

Fall

Description

Computer Science C73 October 19, 2010
Scarborough Campus University of Toronto
Solutions for Homework Assignment #2
Answer to Question 1. Let D and D be 1he two d2tabases, each containing n elements. Recall that
the median of 2n (distinct) elements has exactly n elements larger than it, and exactly n − 1 elements
smaller than it. Let Query(t,D ) return the element ranked t-th in the database D . Let m = ⌊n/2⌋. By
i i
applying Query(m,D ) and 1uery(m,D ) we can ﬁnd2the values of the m-th elements, a and a , of D 1 2 1
and D ,2respectively. Now we claim that by comparing a to a , w1 can e2iminate as potential candidates
for the overall median half of the elements of each database. To see why, suppose a < a (th1 cas2 a < a 2 1
is symmetric). Then, none of the ﬁrst m elements in D (i.e., th1 elements ranked 1..m in D ) can be 1
the overall median, because there are more than n larger elements in D and D — 1amely, t2e elements
ranked m + 1..n in D and 1he elements ranked m..n in D . Similarly,2none of the last m elements in
D (i.e., the elements ranked n − m + 1..n in D can be the overall median, because there are more than
2 2
n − 1 smaller elements in D and1D — name2y, the elements ranked 1..m in D and the elemen1s ranked
1..n − m in D . 2hus, with two queries we can eliminate from consideration half the elements of each
database. Continuing in the same way, after ⌈log n⌉ steps we will have eliminated from consideration all
2
but one element in each database. At this point, we can retrieve the remaining two elements from the
databases and return the minimum as the median.
This divide-and-conquer algorithm is shown below as the procedure DB-Median(f ,f ,k). The inter-
1 2
pretation of the parameters is that the search for the overall median of the elements in D and D has 1 2
been narrowed to the elements ranked f ..f +1k −11 in D and the 1lements ranked f ..f + k − 1 i2 D 2 2
To ﬁnd the overall median of the elements in the two databases, we call DB-Median(1,1,n).
DB-Median(f ,f ,k1 2
if k = 1 then
▯ ▯
return min Query(f ,D 1,Qu1ry(f ,D ) 2 2
else
m 1= f +1⌊k/2⌋ − 1; m :=2f + ⌊2/2⌋ − 1;
if Query(m ,1 ) 1 Query(m ,D ) t2en2return DB-Median(m + 1,f ,⌈k/2⌉) 1 2
else return DB-Median(f ,m +11,⌈2/2⌉)
The correctness of this algorithm can be shown by proving that for any k, 1 ≤ k ≤ n, if f ,f are 1 2
positive integers such that f + 1 − 1 ≤ n and f + k − 12≤ n, then DB-Median(f ,f ,k) returns th1 2
median of the 2k elements ranked f ..f 1k −1 in D and f ..1 +k −1 2n 2 . This is pro2ed by complete
induction on k, where the essential observation is the point we saw earlier: depending on the outcome of the
comparison between the elements ranked ⌊k/2⌋ in these two ranges, we can eliminate from consideration
half the elements in each range.
Let Q(n) denote the number queries applied to D and D by the call DB-Median(1,1,n). This
1 2
quantity is described by the recurrence Q(n) = Q(⌈n/2⌉)+2: we make two queries to retrieve the medians
of two databases each of size n, and narrow our search to two databases each of size ⌈n/2⌉. In terms of
the parameters of the Master Theorem, we have a = 1, b = 2 and d = 0, i.e., a = b . Thus, Q(n) = d
Θ(n logn) = Θ(logn).
Answer to Question 2.
a. The standard algorithm from merging two sorted lists of length n and n tak1s time p2oportional to
n + n . The suggested algorithm proceeds in k − 1 stages. In stage 1 it merges two lists of length n each,
1 2
producing a list of length 2n. In stage i, 1 ≤ i < k, it merges the list produced in stage i − 1 (of length
i ▯ n) and a list of length n, producing a list of length (i + 1)n. Thus, for all i such that 1 ≤ i < k, stage
1 i requires time proportional to (i + 1)n. The overall time complexity (over all k − 1 stages) is therefore
Θ ▯ k−1 n(i + 1) = Θ(nk ).
i=1
b. A better algorithm is to (a) divide the k lists into two groups, each consisting of (about) k/2 lists;
(b) recursively merge the lists in each group; and (c) merge the resulting two lists (each of size kn/2 into a
single list. The recursion continues until k = 1, in which it simply returns the input list (there is nothing
to merge).
The time complexity of this algorithm is described by the following recurrence:
▯
T(k) = 2T(k/2) + kn, if k > 1
1, if k = 1
(Here we assume, for simplicity, that k is a power of 2. If not, the general term of the recurrence becomes
T(k) = T(⌈k/2⌉)+T(⌊k/2⌉)+kn, and the solution is within a constant factor of the solution for the special
case.) Unwinding the recurrence using standard techniques (e.g., from CSCB36) yields T(k) = Θ(nk logk),
which is better than Θ(nk ).
Answer to Question 3. Professor N. O’Bright’s proposed algorithm is wrong. Here is a counterexample.
The graph is a “triangle” with nodes a, b, and c, and edges {a,b}, {b,c}, and {c,a} with weights 100, 1,
and 1, respectively. Suppose the initial partition of the set of nodes is int1 the node sets V = {a,b}, and
V2= {c}. In that case, N. O’Bright’s algorithm will construct a spanning tree consisting of edge {a,b},
and one of the other two edges. This spanning tree has cost 101, while the spanning tree consisting of
{b,c} and {c,a} has cost 2. Thus, N. O’Bright’s algorithm doesn’t always ﬁnd the MST.
Answer to Question 4. The product of 1 + x + 2x and 2 + 3x is a polynomial of degree 3. In order to
interpolate the product, we must therefore evaluate each of the given polynomials in at least 4 points —
conveniently a power of 2. The polynomial multiplication algorithm that uses the FFT proceeds as follows:
(1) Use the FFT to evaluate each of the given polynomials at the four 4th roots of unity, namely 1,i,−1,−i.
Speciﬁcally, for each of the polynomials, computa,i), whera is the vector of the polynomial’s
coeﬃcients padded out with 0s to length four. Note that here we are using i as the primitive 4th root
of unity. This computation yields, for each of the given polynomials, a vector of length four that is a
value representation (at four points) for each of the given polynomials.
(2) Multiply component-wise the two vectors

More
Less
Related notes for CSCC73H3