CSE 4412 Study Guide - Data Mining, Association Rule Learning, The Fp

88 views6 pages
York University
CSE 4412: Data Mining
Solutions to Assignment 1
(Fall 2012)
1. We prove this by contradiction. Let’s assume there is a frequent itemset of length k+1 which is
missed by the join step. Let {i1, i2, …, ik, ik+1} be that frequent itemset. Since {i1, i2, …, ik, ik+1} is
frequent, its subsets, {i1, i2, …, ik-1, ik}and {i1, i2, …, ik-1, ik+1}, must be frequent according to the
Apriori property. This means that {i1, i2, …, ik-1, ik}and {i1, i2, …, ik-1, ik+1} are in Lk. Because they
are in Lk and they share the first k-1 items, they will be joined into {i1, i2, …, ik, ik+1} when
generating Ck+1. This contradicts the assumption made at the beginning.
2.
(a) Finding all frequent itemsets containing item i.
min_sup_count=min_sup* (number of transactions) =30%*10=3
So 1-item frequent itemset as follow in the Header Table;
Since we are only looking for frequent itemsets containing item i, I put i to be the last item in the Head
Table in order to prune the tree because we only need to scan the i-conditional pattern base from the
leaves of the FP-tree.
Original F-list: F-list after moving i at the end:
Item Count Item Count
a 7 a 7
i 6 c 5
c 5 r 5
r 5 s 4
s 4 e 3
e 3 l 3
l 3 i 6
The transactions after removing infrequent items and having items ordered according to the modified F-
lis:
TID Items bought Ordered frequent items
1 {s, u, s, a, n} { a, s}
2 {s, a, r, a} {a, r, s}
3 {s, a, m} {a, s}
4 {r, i, c, h, a, r, d} {a, c, r, i}
5 {e, r, i, c} {c, r, e, i}
6 {n, i, c, k} {c, i}
7 {p, a, t, r, i, c, k} {a, c, r, i}
8 {e, m, i, l, y} {e, l, i}
Unlock document

This preview shows pages 1-2 of the document.
Unlock all 6 pages and 3 million more documents.

Already have an account? Log in
9 {c, h, a, r, l, e, s} {a, c, r, s, e, l}
10 {l, i, d, a} {a, l, i}
In the following step, we could have only built the tree with transactions that contain item i. So all the
paths with a leaf other than i can be removed. But we build the whole anyways.
The FP- tree is as follows:
i-conditional-pattern base:
acr: 2 al: 1 cre: 1 c: 1 el: 1
After removing local infrequent items: car: 2 a: 1 cr: 1 c: 1
In this step, first generate frequent itemset: {a, i} {r, i} {c, i}
To learn longer patterns containing i, we build the i-conditional FP-tree:
Unlock document

This preview shows pages 1-2 of the document.
Unlock all 6 pages and 3 million more documents.

Already have an account? Log in

Get OneClass Grade+

Unlimited access to all notes and study guides.

Grade+All Inclusive
$10 USD/m
You will be charged $120 USD upfront and auto renewed at the end of each cycle. You may cancel anytime under Payment Settings. For more information, see our Terms and Privacy.
Payments are encrypted using 256-bit SSL. Powered by Stripe.