Learning Theory Definitions
Learning: a change in behavior that lasts for a long period of time.
Performing a behavior does not necessarily mean that something was learned.
4 stages of learning: acquisition, fluency, generalization, maintenance.
parsimony: unless there is evidence to the contrary, you must
account for a phenomenon with the simplest explanation available.
The study of behavior
this is a very simple form of learning.
Classical Conditioning: a.k.a. Pavlovian conditioning. An association
between things: learning that things go together, in a predictable
relationship in which one thing is related to another. A conditioned
stimulus predicts an unconditioned stimulus and causes a conditioned response.
Operant Conditioning: a.k.a. Skinnerian conditioning. An animal learns that
its behavior has consequences. Discriminative stimulus (your command)
4 possibilities: Positive Reinforcement, Negative Reinforcement, Positive
Punishment, Negative Punishment.
learning: learning that occurs when something happens that is
not related to anything else. A stimulus causes a response.
the behavior of turning the head and attention towards a new voice or
Habituation: the ability to get used to and stop reacting to meaningless
stimuli.
Recovery: return of the pre-habituation response to the stimulus.
Sensitization: the opposite of habituation. The reaction to a stimulus
becomes even stronger when the stimulus is presented repeatedly.
Adaptation: sometimes confused with habituation. Adaptation, however, has
nothing to do with learning. It is simply the sensory neurons of the
nervous system tiring so that they no longer perceive the stimulus.
a.k.a. pre-exposure effect. This is very similar to habituation. It is
learning to ignore things that have no meaning in the animal's life. I
learn to ignore the doorbell when I realize that it doesn't mean people
Primary reinforcers: anything the dog likes intrinsically: food, water,
cuddling, etc.
Secondary reinforcers: these are reinforcers that become associated with a
primary reinforcer and hence become important to the dog (clicker).
Continuous Reinforcement Schedule (CRF): every occurrence of the response
is followed by a reward. The best schedule for first teaching a behavior.
Partial Reinforcement Schedule (PRF): a.k.a. Intermittent Reinforcement
Schedule. Responding is rewarded only after certain responses have
been completed. Schedules are:
- Fixed ratio (FR): a reward is given after a set number of responses.
On an FR-5 schedule, every 5th response receives a reward. Very high
and steady response rate, except for the post-reinforcement pause
after the reward.
- Variable ratio (VR): the number of responses required for a reward
changes from one reward to the next. On a VR-5 schedule the average is
5 responses. The response rate is high and steady, with minimal
post-reinforcement pause. Watch for ratio strain, which happens when
the variable ratio average is increased too fast.
- Random ratio (RR): similar to variable ratio.
- Fixed interval (FI): a reward is given only after a specific interval
of time has elapsed since the previous reward. On FI-5 the reward comes
only if a response occurs at least 5 seconds after the last reward.
Responses can be weak right after the reward; this is called
the fixed interval scallop.
- Variable interval (VI): the interval that must elapse before a reward
can be earned changes from one reward to the next. On VI-5
the average is 5 seconds, but it can be 10 seconds once and 1
second the next time, etc.
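The ratio and interval schedules above can be written down as simple reward rules. The following is only a minimal sketch in Python, not a training protocol: the function names are made up for illustration, and the way the variable schedules pick their next requirement (a uniform draw around the average) is an assumption, since the definitions above only say that the requirement changes and has a given average.

```python
import random


def fixed_ratio(n):
    """FR-n: every n-th response is rewarded."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng=random):
    """VR-mean: the required number of responses is redrawn after each reward."""
    def draw():
        # Assumption: uniform over 1..(2*mean - 1), which averages to `mean`.
        return rng.randint(1, 2 * mean - 1)

    count, required = 0, draw()

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count, required = 0, draw()
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI-seconds: a response is rewarded once `seconds` have passed since the last reward."""
    last_reward = 0.0  # assumption: the clock starts at time 0

    def respond(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False

    return respond


def variable_interval(mean, rng=random):
    """VI-mean: like FI, but the required interval is redrawn after each reward."""
    def draw():
        # Assumption: uniform over 0..2*mean seconds, which averages to `mean`.
        return rng.uniform(0, 2 * mean)

    last_reward, required = 0.0, draw()

    def respond(now):
        nonlocal last_reward, required
        if now - last_reward >= required:
            last_reward, required = now, draw()
            return True
        return False

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(5)])  # [False, False, False, False, True]
```

Each function returns a small "respond" rule that answers one question: does this response earn a reward? The ratio rules count responses, while the interval rules only look at the time since the last reward.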
Differential Reinforcement Schedule (DRF): only certain rates or certain
types of responses are reinforced. Schedules are:
- Differential reinforcement of high rates of behavior (DRH): the response
has to happen within a certain time after the last response.
- Differential reinforcement of low rates of behavior (DRL): the response
has to happen only after a certain time has elapsed after the last
response.
- Differential reinforcement of other behaviors (DRO): reward is given to
behaviors that are different from one specific behavior.
- Differential reinforcement of incompatible behaviors (DRI): only responses
that cannot happen while doing another (unwanted) behavior
are reinforced.
- Differential reinforcement of excellent behavior (DRE): very important
for dog training. We reward those behaviors that are better
than the ones already accomplished: the best sit, the
best down, etc.
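The two rate-based schedules (DRH and DRL) differ only in the direction of the time test between one response and the next. A minimal sketch in Python, assuming response times are recorded in seconds; the function name `reinforce_by_rate` and the `gap` parameter are hypothetical names chosen for this illustration:

```python
def reinforce_by_rate(response_times, gap, schedule):
    """Return one True/False decision per response after the first.

    schedule="DRH": reward a response that comes within `gap` seconds
    of the previous response.
    schedule="DRL": reward a response that comes at least `gap` seconds
    after the previous response.
    """
    if schedule not in ("DRH", "DRL"):
        raise ValueError("schedule must be 'DRH' or 'DRL'")
    decisions = []
    # Compare each response time with the one just before it.
    for prev, now in zip(response_times, response_times[1:]):
        elapsed = now - prev
        if schedule == "DRH":
            decisions.append(elapsed <= gap)
        else:
            decisions.append(elapsed >= gap)
    return decisions


times = [0.0, 1.0, 1.5, 6.0, 6.5]
print(reinforce_by_rate(times, 2.0, "DRH"))  # [True, True, False, True]
print(reinforce_by_rate(times, 2.0, "DRL"))  # [False, False, True, False]
```

With the same response times, the two schedules reward opposite responses: DRH pays for quick repeats, DRL pays for waiting.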
Duration Reinforcement Schedule: the response must be maintained for an
entire interval for a reward to be given. Schedules are:
- Fixed Duration (FD): maintain a behavior for a fixed time to get a reward.
The stay exercise is a good example.
- Variable Duration (VD): maintain a behavior for a period of time that
changes from one reward to the next.
- Random Duration (RD): similar to variable duration.
Premack Theory of Reinforcement: developed by David Premack in the
mid 1960s. The opportunity to engage in certain behaviors is
reinforcement on its own. Another way to put it: in order to get item B
you must complete item A. If you want to eat dessert, you must eat your
veggies first. If you want to chase that Frisbee, you must sit first.
the ability to respond to a specific stimulus. For example, sitting only
when the word "sit" is said.
Generalization: the response has to happen in every place and at every
time. The dog needs to learn, for example, that the cue "sit" means that
it has to sit regardless of location, time or distractions.
Achieving great generalization by the dog. This means that the dog can
respond in the same manner to a discriminative stimulus every time, all
A discriminative stimulus that is very visible or noticeable. It tends to
get most of the dog's attention.
Stimuli that are not noticed by the animal because there are more
salient stimuli around. A salient stimulus overshadows many other
The phenomenon in which a stimulus is disregarded by an animal if it is
presented together with an already salient and established stimulus.
This is why you need to introduce the new cue before the lure and not
The ability to choose between two or more different things. Can be
hard to teach. See pages 90-96 in Excel-Erated Learning.
the predisposition of the animal to learn classical conditioning more
easily with certain pairings of unconditioned and conditioned stimuli than
with others. For example, it is easier to associate a flavor with illness
than a visual sign with illness. This is very adaptive.
Neurosis: An outcome of a dog being forced to make a discrimination
that is no longer possible. The dog may show great anxiety.
Extinction: what happens when rewards are no longer being given. The
behavior degrades until it is no longer offered. This is not unlearning.
The dog simply learns a new rule.
Spontaneous Recovery: a behavior that happens after it was supposedly
extinguished. The behavior that was extinguished suddenly reappears. Aka
Partial Reinforcement Extinction Effect (PREE): Continuing to engage in
a behavior despite the fact that reward is no longer being given. Usually
due to a variable ratio schedule of reinforcement.
Response: The response to a negative reinforcement or positive
punishment method of training. The dog engages in a behavior in
order to avoid or escape a certain aversive outcome. Can be signaled or
Learned Helplessness: if the aversive does not follow a signal, or the
dog is not allowed to escape the aversive, the dog will eventually lie
down and become immobile, after learning that there is nothing it can do
to stop the aversive from happening.
Shaping by successive approximations: a method of teaching a new
behavior in which any behavior that begins to resemble the wanted
behavior is reinforced. Gradually, the standard for reinforcement is
raised until the reinforced behavior matches the wanted behavior.
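The gradual raising of the standard can be sketched as a small loop. This is only an illustration under stated assumptions: each attempt is given a made-up numeric quality score, and the rule "raise the standard after a few reinforced attempts in a row" (the `streak_needed` parameter) is a hypothetical choice, not something the definition above prescribes.

```python
def shape(attempts, start, target, step, streak_needed=3):
    """Reinforce attempts that meet the current standard, raising it gradually.

    attempts: quality score of each attempt (hypothetical numeric scale)
    start:    the first, easy standard
    target:   the standard that counts as the wanted behavior
    step:     how much the standard is raised each time
    Returns (log, final_standard); log holds (score, standard, reinforced).
    """
    standard = start
    streak = 0
    log = []
    for score in attempts:
        reinforced = score >= standard
        log.append((score, standard, reinforced))
        if reinforced:
            streak += 1
            # Raise the standard only after several reinforced attempts in a row.
            if streak >= streak_needed and standard < target:
                standard = min(target, standard + step)
                streak = 0
        else:
            streak = 0
    return log, standard


log, final_standard = shape([1, 1, 1, 2, 2, 2, 1, 3, 3, 3], start=1, target=3, step=1)
print(final_standard)                         # 3
print([reinforced for _, _, reinforced in log])
```

In this run the attempt scored 1 after the standard has reached 3 is no longer reinforced, even though the same attempt earned a reward at the start.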
Manipulating the animal or the environment in a way that makes the dog
do the behavior. There can be visual prompts as well, known as lures. The
problem is that the dog learns that the prompt means the behavior needs
to be done, so the prompt must be faded as soon as possible.
Chaining: a method of teaching a complex sequence of behaviors. Each
behavior signals the next behavior, and the last one signals a reward.
Backward chaining is usually the most efficient way: in this method the
last behavior is trained first, followed by a reward. Then we go backwards and
a very strong animal learning process in which the animal starts to
touch or manipulate a conditioned reinforcer (clicker) in order to get
the primary reinforcer. Can be a strong training tool, for example when
teaching retrieve to a non-retriever: make the ball or dumbbell a
conditioned reinforcer for food and the dog will start trying to
Conditioned Emotional Response (CER): establishing a classically
conditioned emotional response (usually fear). This is the basis for many
fears and phobias in many dogs. This process is very resistant to
Counterconditioning and desensitization: The methods used
to try to eliminate CERs.
Desensitization is the
process in which we present a very low level of the stimulus that
produces fear and slowly work up to the full stimulus. Together with this
we need to use counterconditioning, which is the association of this
stimulus with a positive consequence.
Learning: the process in which animals learn to avoid a certain
food. This happens very fast and is very adaptive.
Flooding / Response Prevention: the process in which the fear-eliciting
stimulus is presented at full intensity without the subject being able to
escape. This can sometimes work but is unethical and, more often than
not, only makes