Department: Computer Science
Course Code: CMPSC 448
Professor: Daniel Kifer

CMPSC 448: Machine Learning and AI: HW 4 (Due March 25)

Name:

1. Instructions

• Upload to ANGEL one zipped file (to avoid an ANGEL bug) containing:

– Your version of hw4stub.py

• You cannot look at anyone else's code.

• All code in hw4stub.py should be inside functions (importing hw4stub.py should not cause code to execute).

• To check your code, type "python hw4tester.py" at a command prompt. Your code will be graded based on correctness on different inputs (using Python version 2.x).

2. Gradient Descent

In the following questions, we use the following notation. $\vec{w} = (w_1, w_2, \ldots, w_k)$ is the weight vector with dimension $k$. The data is $\{(\vec{x}_1, t_1), \ldots, (\vec{x}_n, t_n)\}$ and has $n$ records. Each $\vec{x}_j$ is a feature vector whose components are represented as $\vec{x}_j = (x_{j1}, x_{j2}, \ldots, x_{jk})$.

We will be using the linear regression model whose prediction for a feature vector $\vec{x}_j$ is $\vec{w} \cdot \vec{x}_j = \sum_{i=1}^{k} w_i x_{ji}$.
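For illustration only (not part of the assignment handout), the prediction is simply a dot product; the concrete values of `w` and `x_j` below are made up for this example:

```python
import numpy as np

# Hypothetical example values, not from the assignment.
w = np.array([1.0, -2.0, 0.5])    # weight vector, shape (k,)
x_j = np.array([3.0, 1.0, 4.0])   # one feature vector, shape (k,)

# The model's prediction: sum_i of w_i * x_ji.
prediction = np.dot(w, x_j)       # -> 3.0
```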

Question 1. If $t_j$ is the target and $\vec{w} \cdot \vec{x}_j$ is the prediction, we can measure the discrepancy using one-half squared error: $f(t_j, \vec{w} \cdot \vec{x}_j) = \frac{1}{2}(t_j - \vec{w} \cdot \vec{x}_j)^2$. The average error over the training set is then:

$$\frac{1}{2n}\sum_{j=1}^{n}(t_j - \vec{w} \cdot \vec{x}_j)^2 = \frac{1}{2n}\sum_{j=1}^{n}\left(t_j - \sum_{i=1}^{k} w_i x_{ji}\right)^2 = \frac{1}{2n}\sum_{j=1}^{n}(t_j - w_1 x_{j1} - w_2 x_{j2} - \cdots - w_k x_{jk})^2$$
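For reference, differentiating the average error with respect to a single weight $w_i$ gives (a standard calculus step, not spelled out in the handout itself):

```latex
\frac{\partial}{\partial w_i}\left[\frac{1}{2n}\sum_{j=1}^{n}\bigl(t_j - \vec{w}\cdot\vec{x}_j\bigr)^2\right]
  = -\frac{1}{n}\sum_{j=1}^{n}\bigl(t_j - \vec{w}\cdot\vec{x}_j\bigr)\, x_{ji}
```

Collecting these partial derivatives for $i = 1, \ldots, k$ yields the gradient vector of shape $(k,)$.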

• In hw4stub.py, fill in the function lsq_gradient(t, data, w), which will return the gradient as a numpy array with shape (k,) (e.g., numpy.array([1,2,3])). The parameter t is the numpy array (with shape (n,)) of target values, so t[j] is the target for the jth record. The parameter data is a numpy array with shape (n, k) where each row corresponds to a feature vector (row $j$ is $\vec{x}_j$). The parameter w is a numpy array with shape (k,) and corresponds to the current value of the model parameters at which we want the gradient.

• In hw4stub.py, fill in the function gradient_descent_lsq(t, data, w, eta), where the parameter eta is a scalar (the learning rate). This function takes in w and performs one iteration of gradient descent with learning rate $\eta$. The output is the new weight vector.
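One possible vectorized sketch of these two functions is shown below. It assumes the standard least-squares gradient $-\frac{1}{n}\sum_j (t_j - \vec{w}\cdot\vec{x}_j)\,\vec{x}_j$; treat it as an illustration, not as the graded solution you submit in hw4stub.py:

```python
import numpy as np

def lsq_gradient(t, data, w):
    # Gradient of (1/2n) * sum_j (t_j - w.x_j)^2 with respect to w.
    n = data.shape[0]
    residuals = t - data.dot(w)        # shape (n,): t_j - w . x_j
    return -data.T.dot(residuals) / n  # shape (k,)

def gradient_descent_lsq(t, data, w, eta):
    # One gradient-descent iteration: step against the gradient.
    return w - eta * lsq_gradient(t, data, w)
```

Vectorizing with `data.T.dot(residuals)` avoids an explicit Python loop over records, which matters for large n.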

Question 2. We will now change the error function to be

$$\phi(t_j, \vec{w} \cdot \vec{x}_j) = \begin{cases} 2 \cdot \frac{1}{2}(t_j - \vec{w} \cdot \vec{x}_j)^2 & \text{if } t_j - \vec{w} \cdot \vec{x}_j \geq 0 \\ 1 \cdot \frac{1}{2}(t_j - \vec{w} \cdot \vec{x}_j)^2 & \text{if } t_j - \vec{w} \cdot \vec{x}_j < 0 \end{cases}$$

The average error over the training set is then:

$$\frac{1}{n}\sum_{j=1}^{n} \phi(t_j, \vec{w} \cdot \vec{x}_j)$$

• In hw4stub.py, fill in the function phi_gradient(t, data, w) to return the gradient.

• In hw4stub.py, fill in the function gradient_descent_phi(t, data, w, eta) to return the weight vector after one iteration of gradient descent.
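A sketch of the asymmetric case, again for illustration rather than submission: each record's contribution to the gradient is scaled by 2 when its residual $t_j - \vec{w}\cdot\vec{x}_j$ is nonnegative and by 1 otherwise, matching the two branches of $\phi$ (and using the $\frac{1}{n}$ averaging given above):

```python
import numpy as np

def phi_gradient(t, data, w):
    # Gradient of (1/n) * sum_j phi(t_j, w.x_j) with respect to w.
    n = data.shape[0]
    r = t - data.dot(w)             # residuals, shape (n,)
    c = np.where(r >= 0, 2.0, 1.0)  # per-record weight from phi's two cases
    return -data.T.dot(c * r) / n   # shape (k,)

def gradient_descent_phi(t, data, w, eta):
    # One gradient-descent iteration with learning rate eta.
    return w - eta * phi_gradient(t, data, w)
```

Note that the gradient of each branch of $\phi$ is $-c_j (t_j - \vec{w}\cdot\vec{x}_j)\,\vec{x}_j$ with $c_j \in \{2, 1\}$, so `np.where` selects the correct coefficient record by record.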

