# CSE 4412 Study Guide - Data Mining, Association Rule Learning, the FP-Tree

19 Nov 2012
School: York University
Course: CSE 4412: Data Mining
Solutions to Assignment 1
(Fall 2012)
1. We prove this by contradiction. Assume there is a frequent itemset of length k+1 that is missed by the join step, and let {i1, i2, …, ik, ik+1} be that itemset. Since {i1, i2, …, ik, ik+1} is frequent, its subsets {i1, i2, …, ik-1, ik} and {i1, i2, …, ik-1, ik+1} must also be frequent by the Apriori property, so both are in Lk. Because they are in Lk and share their first k-1 items, the join step combines them into {i1, i2, …, ik, ik+1}. This contradicts the assumption that the itemset was missed, so the join step generates every frequent (k+1)-itemset.
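The join step argued above can be sketched in code; `apriori_join` is a hypothetical helper (not from the course materials) that joins two frequent k-itemsets exactly when their first k-1 items agree:

```python
def apriori_join(L_k):
    """Apriori join step: generate (k+1)-candidates from frequent k-itemsets.

    Two k-itemsets are joined iff they share their first k-1 items,
    so any frequent (k+1)-itemset is produced from two of its subsets.
    """
    L_k = sorted(tuple(sorted(s)) for s in L_k)
    candidates = set()
    for i in range(len(L_k)):
        for j in range(i + 1, len(L_k)):
            a, b = L_k[i], L_k[j]
            if a[:-1] == b[:-1]:  # first k-1 items agree
                candidates.add(tuple(sorted(a + (b[-1],))))
    return candidates

# {i1, i2} and {i1, i3} share their first item, so the join
# produces the candidate {i1, i2, i3}:
print(apriori_join([("i1", "i2"), ("i1", "i3")]))  # {('i1', 'i2', 'i3')}
```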
2.
(a) Finding all frequent itemsets containing item i.
min_sup_count = min_sup × (number of transactions) = 30% × 10 = 3
Since we are only looking for frequent itemsets containing item i, we place i last in the header table. This prunes the tree, because we then only need to scan the i-conditional pattern base starting from the leaves of the FP-tree.
| Original F-list | Count | F-list after moving i to the end | Count |
|---|---|---|---|
| a | 7 | a | 7 |
| i | 6 | c | 5 |
| c | 5 | r | 5 |
| r | 5 | s | 4 |
| s | 4 | e | 3 |
| e | 3 | l | 3 |
| l | 3 | i | 6 |
The transactions after removing infrequent items, with items ordered according to the modified F-list:

| TID | Items bought | Ordered frequent items |
|---|---|---|
| 1 | {s, u, s, a, n} | {a, s} |
| 2 | {s, a, r, a} | {a, r, s} |
| 3 | {s, a, m} | {a, s} |
| 4 | {r, i, c, h, a, r, d} | {a, c, r, i} |
| 5 | {e, r, i, c} | {c, r, e, i} |
| 6 | {n, i, c, k} | {c, i} |
| 7 | {p, a, t, r, i, c, k} | {a, c, r, i} |
| 8 | {e, m, i, l, y} | {e, l, i} |
| 9 | {c, h, a, r, l, e, s} | {a, c, r, s, e, l} |
| 10 | {l, i, d, a} | {a, l, i} |
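The filtering and reordering applied to each transaction can be sketched as follows; this is an illustrative snippet (not from the solution) using the modified F-list a, c, r, s, e, l, i:

```python
# Modified F-list (item i moved to the end) and each item's rank in it.
flist = ["a", "c", "r", "s", "e", "l", "i"]
rank = {item: pos for pos, item in enumerate(flist)}

def order_transaction(items):
    # Keep only frequent items, drop duplicates, sort by F-list position.
    return sorted(set(items) & set(flist), key=rank.get)

# TID 4: {r, i, c, h, a, r, d} -> h and d are infrequent, rest reordered.
print(order_transaction(list("richard")))  # ['a', 'c', 'r', 'i']
# TID 9: {c, h, a, r, l, e, s} -> h is infrequent, rest reordered.
print(order_transaction(list("charles")))  # ['a', 'c', 'r', 's', 'e', 'l']
```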
In the following step, we could have built the tree using only the transactions that contain item i, so every path whose leaf is not i could be removed. We build the whole tree anyway.
The FP-tree is as follows (tree diagram not reproduced in this preview):
i-conditional pattern base:
acr: 2, al: 1, cre: 1, c: 1, el: 1
After removing locally infrequent items (e and l each have count 2 < 3): acr: 2, a: 1, cr: 1, c: 1
In this step, we first generate the frequent itemsets {a, i}: 3, {r, i}: 3, {c, i}: 4.
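The local counting that decides which items survive in the conditional pattern base can be sketched like this (a small illustration using the counts above, not code from the solution):

```python
from collections import Counter

# The i-conditional pattern base: prefix path -> count.
pattern_base = {("a", "c", "r"): 2, ("a", "l"): 1, ("c", "r", "e"): 1,
                ("c",): 1, ("e", "l"): 1}

# Sum each item's support over all prefix paths.
counts = Counter()
for prefix, cnt in pattern_base.items():
    for item in prefix:
        counts[item] += cnt

min_sup_count = 3
frequent = {item: c for item, c in counts.items() if c >= min_sup_count}
print(frequent)  # {'a': 3, 'c': 4, 'r': 3}
```

Each surviving item x immediately yields the frequent 2-itemset {x, i}, giving {a, i}, {c, i}, {r, i}.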
To learn longer patterns containing i, we build the i-conditional FP-tree:
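A minimal FP-tree insertion sketch (hypothetical `Node`/`insert` helpers, not from the solution) showing how the surviving conditional patterns acr:2, a:1, cr:1, c:1 would be accumulated into the i-conditional FP-tree:

```python
class Node:
    """One FP-tree node: an item, its count, and its children by item."""
    def __init__(self, item):
        self.item, self.count, self.children = item, 0, {}

def insert(root, prefix, count):
    """Walk/extend the path for `prefix`, adding `count` to each node."""
    node = root
    for item in prefix:
        node = node.children.setdefault(item, Node(item))
        node.count += count
    return root

root = Node(None)  # the tree's null root
conditional = {("a", "c", "r"): 2, ("a",): 1, ("c", "r"): 1, ("c",): 1}
for prefix, cnt in conditional.items():
    insert(root, prefix, cnt)

# Counts of the root's children after all insertions:
print({it: ch.count for it, ch in root.children.items()})  # {'a': 3, 'c': 2}
```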