BIOD08H3 Lecture Notes - Lecture 7: Computational Neuroscience, Linear Separability, Sigmoid Function

Lecture 08 Deep Learning
[slide 1]
Deep learning is actually a very broad research program with roots in computational neuroscience, and it has now started to
influence neuroscience in return, with its impact on neuroscience increasing exponentially
[slide 2]
How do we optimize a neural network? This is the perspective from which we are going to investigate deep learning, and we
already started talking about it last week
Recall from last week that the synaptic plasticity and memory hypothesis (abbrev: SPM hypothesis) has been generally accepted
in the field
Our ability to learn and remember things is fundamentally the result of the ability to change the synaptic connections
between neurons, in order to alter the way information flows through the circuits
The way we are going to frame this problem of how the synaptic connections are altered in order to learn something is by
defining a loss function
A loss function simply says how badly you are doing something; it can be any function that measures that badness
The goal of the learning agent is to minimize the loss in order to maximize learning.
Mathematically, this means we are interested in finding the synaptic weights, W*, that give the minimal amount of loss
At a minimum, learning has to involve minimizing the loss function, meaning that over time the loss
must go down
o This is expressed with time t using the loss function L: the loss at time t+1 should be less than the loss at time t
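A minimal sketch of this idea in Python is shown below; the mean-squared-error loss, the linear "network", and the random example data are assumptions made for illustration, not the specific example used in lecture.

```python
import numpy as np

# Illustrative loss function (an assumption for this sketch): mean squared
# error between the network's outputs and the targets, written as a function
# of the synaptic weight matrix W.
def loss(W, inputs, targets):
    outputs = inputs @ W                      # a simple linear "network"
    return np.mean((outputs - targets) ** 2)  # how badly the network is doing

# Learning means finding W* that minimizes L(W); over training time t the
# loss should go down, i.e. L at time t+1 is less than L at time t.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(5, 3))    # 5 example inputs from 3 presynaptic neurons
targets = rng.normal(size=(5, 2))   # desired outputs of 2 postsynaptic neurons
W = rng.normal(size=(3, 2))         # synaptic weight matrix
print(loss(W, inputs, targets))     # a single number measuring the "badness"
```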
[slide 3]
However, the problem we face is that we have a stupidly large number of synapses, so how are we going to update all of their
weights?
Recall that the algorithm we were taught for this is gradient descent
Gradient descent basically expresses the loss as a function of the synaptic weights: the weights are on the x-axis and the loss
function is on the y-axis
We update the weights by going down the gradient of this loss function, meaning we determine the
slope of the function, and whichever direction the function decreases in is the direction we go
One important thing to keep in mind is that these updates need to be done in small steps, because you can imagine what
happens if a step is too big
o For example, if the slope at this particular synaptic weight tells us the loss is decreasing, we are going in the right
direction. However, if you go too far in the next step, you are going to jump to the other side of the function, overshooting the minimum
o This is why the variable α is introduced; it is responsible for controlling the step size
So W becomes W minus the slope of the loss function multiplied by α, i.e. W ← W − α·(∂L/∂W), as sketched below
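Here is a small Python sketch of that update rule; the toy loss L(W) = (W − 3)² and the chosen values of α and the starting weight are assumptions for illustration only.

```python
def gradient_descent(grad_loss, W, alpha=0.1, steps=100):
    """Repeatedly step down the slope of the loss.

    grad_loss(W) returns dL/dW, the slope of the loss at the current weight.
    alpha is the step size (learning rate): keeping it small prevents each
    update from jumping over the minimum to the other side of the function.
    """
    for _ in range(steps):
        W = W - alpha * grad_loss(W)   # W <- W - alpha * dL/dW
    return W

# Toy loss (assumed for illustration): L(W) = (W - 3)^2, so dL/dW = 2 * (W - 3)
W_star = gradient_descent(lambda W: 2 * (W - 3), W=0.0)
print(W_star)  # converges toward 3.0, the weight that minimizes this toy loss
```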