Learning Theory Definitions and Concepts
Learning: a change in behavior that lasts for a long period of time.
Performance: the doing of a behavior. Performing a behavior does not
necessarily mean that something was learned.
4 stages of learning: acquisition, fluency, generalization, maintenance.
Principle of parsimony: unless there is evidence to the contrary, you
must account for a phenomenon with the simplest explanation available.
Behaviorism: the study of behavior.
Conditioning: a very simple form of learning.
Classical Conditioning: a.k.a. Pavlovian conditioning. An association
between things: learning that things go together, that there is a
predictable relationship between one thing and another. A conditioned
stimulus predicts an unconditioned stimulus and causes a conditioned
response.
Operant Conditioning: a.k.a. Skinnerian conditioning. An animal learns
that its behavior has consequences. The sequence is: Discriminative
stimulus (your command) -> Response -> Consequence.
4 possible consequences: Positive Reinforcement, Negative Reinforcement,
Positive Punishment, Negative Punishment.
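The four possibilities can be read as a two-by-two grid. The sketch below assumes the standard usage of the terms, which these notes do not spell out: "positive" means something is added, "negative" means something is taken away, "reinforcement" makes the behavior more likely and "punishment" makes it less likely. The function name and its parameters are made up for illustration.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant quadrant for one consequence (illustrative helper)."""
    # Positive = a stimulus is added; Negative = a stimulus is removed.
    kind = "Positive" if stimulus_added else "Negative"
    # Reinforcement = behavior becomes more likely; Punishment = less likely.
    effect = "Reinforcement" if behavior_increases else "Punishment"
    return f"{kind} {effect}"


# A treat is given after a sit and sitting becomes more frequent.
print(classify_consequence(stimulus_added=True, behavior_increases=True))
# Attention is withdrawn after jumping up and jumping becomes less frequent.
print(classify_consequence(stimulus_added=False, behavior_increases=False))
```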
Single event learning: learning that occurs when something happens that
is not related to anything else. A stimulus causes a response.
Orienting Response: the behavior of turning the head and attention
towards a new sound or visual stimulus.
Habituation: the ability to get used to and stop reacting to meaningless
stimuli.
Spontaneous Recovery: return of the pre-habituation response to the
stimulus.
Sensitization: the opposite of habituation. The reaction to a stimulus
becomes even stronger when the stimulus is presented repeatedly.
Adaptation: sometimes confused with habituation. Adaptation, however,
has nothing to do with learning. It is simply the sensory neurons of the
nervous system tiring and no longer perceiving the stimulus.
Learned Irrelevance: a.k.a. pre-exposure effect. This is very similar to
habituation. It is learning to ignore things that have no meaning in the
animal's life. For example, I learn to ignore the doorbell when I realize
that it doesn't mean people are coming.
Primary Reinforcer: anything the dog likes intrinsically: food, water,
cuddling, etc.
Secondary Reinforcer: a reinforcer that becomes associated with a primary
reinforcer and hence becomes important to the dog (e.g. a clicker).
Schedules of Reinforcement:
- Continuous Reinforcement Schedule (CRF): every occurrence of the
  response is followed by a reward. The best schedule for first teaching
  a behavior.
- Partial Reinforcement Schedule (PRF): a.k.a. Intermittent Reinforcement
  Schedule. A reward is given only after certain responses have been
  completed. Schedules are:
  - Fixed ratio (FR): a reward is given after every set number of
    responses. On an FR-5 schedule every 5th response receives a reward.
    Produces a very high and steady response rate, except for the
    post-reinforcement pause after the reward.
  - Variable ratio (VR): the number of responses required for a reward
    changes from one reward to the next. On a VR-5 schedule the average
    is 5 responses. The response rate is high and steady with a minimal
    post-reinforcement pause. Watch out for ratio strain, which happens
    when the variable ratio average is increased too fast.
  - Random ratio (RR): similar to variable ratio.
  - Fixed interval (FI): a reward is given only after a specific interval
    of time has elapsed since the previous reward. On an FI-5 schedule
    the reward comes only for a response that occurs at least 5 seconds
    after the last reward. Responding can be weak right after the
    reward; this is called the fixed interval scallop.
  - Variable interval (VI): the interval that must elapse before a reward
    is available changes from one reward to the next. On a VI-5 schedule
    the average is 5 seconds, but it can be 10 seconds one time and 1
    second the next, etc.
- Differential Reinforcement Schedule (DRF): only certain rates or certain
  types of responses are reinforced. Schedules are:
  - Differential reinforcement of high rates of behavior (DRH): the
    response has to happen within a certain time after the last response.
  - Differential reinforcement of low rates of behavior (DRL): the
    response is reinforced only if a certain time has elapsed since the
    last response.
  - Differential reinforcement of other behaviors (DRO): a reward is given
    for behaviors that are different from one specific behavior.
  - Differential reinforcement of incompatible behaviors (DRI): only
    responses that cannot happen at the same time as another (unwanted)
    behavior are reinforced.
  - Differential reinforcement of excellent behavior (DRE): very important
    for dog training. We reward those behaviors that are better than the
    ones already accomplished: the best sit, the best down, etc.
- Duration Reinforcement Schedule: the response must be maintained for an
  entire interval for a reward to be given. Schedules are:
  - Fixed Duration (FD): maintain a behavior for a fixed time to get a
    reward. The stay exercise is a good example.
  - Variable Duration (VD): maintain a behavior for a period of time that
    changes from one reward to the next.
  - Random Duration (RD): similar to variable duration.
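The ratio and interval rules above can be written as small decision functions that answer one question: is this response rewarded? The following is only a sketch under stated assumptions, not a standard implementation. All function names are hypothetical, the VR requirement is assumed to be drawn uniformly from 1 to 2*mean - 1 (which averages to the stated mean), and the FI clock is assumed to start at time 0 as if a reward had just been given.

```python
import random


def fixed_ratio(n):
    """FR-n: every n-th response is rewarded."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """VR-mean: the required number of responses varies around `mean`."""
    # Assumption: requirement is uniform on 1..(2*mean - 1), so it averages `mean`.
    required = rng.randint(1, 2 * mean - 1)
    count = 0

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI-seconds: a response is rewarded only if at least `seconds`
    have passed since the last reward."""
    last_reward = 0.0  # assumption: the session starts as if just rewarded

    def respond(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])
# [False, False, False, False, True, False, False, False, False, True]

fi5 = fixed_interval(5)
print([fi5(t) for t in (1, 4, 5, 6, 10)])
# [False, False, True, False, True]

vr5 = variable_ratio(5, random.Random(0))
print(sum(vr5() for _ in range(10000)))  # roughly one reward per 5 responses
```

A variable interval schedule could be sketched the same way as `fixed_interval`, drawing a new interval after each reward instead of reusing a fixed one.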
Premack Theory of Reinforcement: developed by David Premack in the mid
1960s. The opportunity to engage in certain behaviors is a reinforcement
in its own right. Another way to put it: in order to get item B you must
complete item A. If you want to eat dessert, you must eat your veggies
first. If you want to chase that Frisbee, you must sit first.
Discrimination: the ability to respond to a specific stimulus. For
example, sitting only when the word "sit" is said.
Generalization: the response has to happen in every place and at every
time. The dog needs to learn, for example, that the cue "sit" means that
it has to sit regardless of location, time or distractions.
Proofing: achieving great generalization by the dog. This means that the
dog can respond in the same manner to a discriminative stimulus every
time, all the time.
Salience: the quality of a discriminative stimulus that is very visible
or noticeable. A salient stimulus tends to get most of the dog's
attention.
Overshadowing: stimuli are not noticed by the animal because there are
more salient stimuli around. A salient stimulus overshadows many other
stimuli.
Blocking: the phenomenon in which a stimulus is disregarded by an animal
if it is presented together with an already salient and established
stimulus. This is why you need to introduce the new cue before the lure
and not with it.
Discrimination: the ability to choose between two or more different
things. Can be hard to teach. See pages 90-96 in Excel-Erated Learning.
Preparedness: the predisposition of the animal to learn classical
conditioning more easily with certain combinations of unconditioned and
conditioned stimuli than with others. For example, it is easier to learn
to associate a flavor with illness than a visual sign with illness. This
is very adaptive.
Experimental Neurosis: the outcome when a dog is forced to make a
discrimination that is no longer possible. The dog may show great
anxiety.
Extinction: what happens when rewards are no longer given. The behavior
degrades until it is no longer offered. This is not unlearning; the dog
simply learns a new rule.
Spontaneous Recovery: a behavior that was apparently extinguished
suddenly reappears. Not to be confused with an extinction burst, which
is a temporary increase in the behavior at the start of extinction.
Partial Reinforcement Extinction Effect (PREE): continuing to engage in
a behavior despite the fact that the reward is no longer being given.
Usually due to a variable ratio schedule of reinforcement.
Escape/Avoidance Response: the response to a negative reinforcement or
positive punishment method of training. The dog engages in a behavior in
order to avoid or escape a certain aversive outcome. Can be signaled or
unsignaled.
Learned Helplessness: if the aversive does not follow a signal or the
dog is not allowed to escape the aversive, the dog will eventually lie
down and become immobile, after learning that there is nothing it can do
to stop the aversive from happening.
Shaping: a.k.a. shaping by successive approximations. A method of
teaching a new behavior in which any behavior that begins to resemble
the wanted behavior is reinforced. Gradually, the standard for
reinforcement is raised until the behavior matches the wanted behavior.
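The idea of a gradually rising standard can be sketched as a loop over attempts. This is a simplified illustration, not a training protocol: it assumes each attempt can be scored as an integer from 0 to 10 for how closely it resembles the wanted behavior, and that the criterion is raised by a fixed step after every rewarded attempt. The function name, the scoring scale and the step rule are all assumptions made for the example.

```python
def shape(attempt_scores, target, start_criterion, step):
    """Reward attempts that meet the current criterion and raise the
    criterion after each reward until it reaches `target`."""
    criterion = start_criterion
    log = []
    for score in attempt_scores:
        rewarded = score >= criterion
        log.append((score, rewarded))
        # Assumed rule: tighten the standard after every rewarded attempt.
        if rewarded and criterion < target:
            criterion = min(target, criterion + step)
    return log, criterion


log, final_criterion = shape([2, 1, 4, 3, 6, 9, 10], target=10,
                             start_criterion=2, step=2)
for score, rewarded in log:
    print(score, "reward" if rewarded else "no reward")
print("final criterion:", final_criterion)  # final criterion: 10
```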
Prompting: manipulating the animal or the environment in a way that makes
the dog do the behavior. There can be visual prompts as well, known as
lures. The problem is that the dog learns that the prompt means the
behavior needs to be done, so the prompt must be faded as soon as
possible.
Chaining: a method of teaching a complex sequence of behaviors. Each
behavior signals the next behavior, and the last one signals a reward.
Backward chaining is usually the most efficient way: in this method the
last behavior is trained first, followed by a reward. Then we go
backwards and add behaviors.
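The order in which a backward chain is built can be listed mechanically. The short sketch below assumes the finished chain is given as a list of behavior names (the names and the function are hypothetical) and returns what is practiced at each training stage, with every stage ending in the reward.

```python
def backward_chain_stages(behaviors):
    """Return the practice sequence for each stage of backward chaining:
    last behavior alone first, then one earlier behavior added per stage."""
    return [behaviors[i:] for i in range(len(behaviors) - 1, -1, -1)]


# Hypothetical three-step retrieve chain.
for stage in backward_chain_stages(["pick up", "carry", "deliver"]):
    print(" -> ".join(stage), "-> reward")
# deliver -> reward
# carry -> deliver -> reward
# pick up -> carry -> deliver -> reward
```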
Autoshaping: a very strong animal learning process in which the animal
starts to touch or manipulate a conditioned reinforcer (e.g. a clicker)
in order to get the primary reinforcer. Can be a strong training tool,
for example for teaching retrieve to a non-retriever: make the ball or
dumbbell a conditioned reinforcer for food and the dog will start trying
to manipulate it.
Conditioned Emotional Response (CER): a classically conditioned
emotional response (usually fear). This is the basis for many fears and
phobias in dogs. This response is very resistant to extinction.
Counterconditioning and desensitization: the methods used to try to
eliminate CERs. Desensitization is the process in which we present a
very low level of the fear-producing stimulus and slowly work up to the
full stimulus. Together with this we need to use counterconditioning,
which is the association of this stimulus with a positive consequence.
Taste Aversion Learning: the process in which animals learn to avoid a
certain food. This happens very fast and is very adaptive.
Flooding / Response Prevention: the process in which the fear-eliciting
stimulus is presented at full intensity without the subject being able
to escape. This can sometimes work but is unethical and, more often than
not, only does more harm.