Clicker Training USA

An Online Clicker Training Community


Learning Theory Definitions and Concepts
 

Learning: a change in behavior that lasts for a long period of time.

Performance: the doing of a behavior. Performing a behavior does not necessarily mean that something was learned.

4 stages of learning: Acquisition, fluency, generalization, maintenance

Principle of parsimony: unless there is evidence to the contrary, you must account for a phenomenon with the simplest explanation available.

Behaviorism: The study of behavior

Conditioning: a very simple form of learning.

Classical Conditioning: a.k.a. Pavlovian conditioning – an association between things. Learning that things go together: a predictable relationship in which one thing is related to another.

Conditioned stimulus predicts an unconditioned stimulus and causes a conditioned response.

Operant Conditioning: Skinnerian conditioning – an animal learns that its behavior has consequences. Discriminative stimulus (your command) – Response – Consequence.

4 possibilities: Positive Reinforcement, Negative Reinforcement, Positive Punishment, Negative Punishment
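
The four possibilities come from two questions: is something added after the behavior (positive) or taken away (negative), and does the behavior become more likely (reinforcement) or less likely (punishment)? A minimal Python sketch of that grid follows. The function name `quadrant` and its argument names are illustrative only, not standard terminology.

```python
def quadrant(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant conditioning quadrant for one consequence.

    stimulus_added: True if something is added after the behavior,
        False if something is taken away.
    behavior_increases: True if the behavior becomes more likely in
        the future, False if it becomes less likely.
    """
    sign = "Positive" if stimulus_added else "Negative"
    effect = "Reinforcement" if behavior_increases else "Punishment"
    return f"{sign} {effect}"


# A treat is given for a sit, and the dog sits more often.
print(quadrant(stimulus_added=True, behavior_increases=True))
# Attention is withdrawn from a jumping dog, and the dog jumps less often.
print(quadrant(stimulus_added=False, behavior_increases=False))
```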

Single event learning: learning that occurs when something happens that is not related to anything else. A stimulus causes a response.

Orienting Response: the behavior of turning the head and attention towards a new sound or visual stimulus.

Habituation: The ability to get used to and stop reacting to meaningless stimuli.

Spontaneous Recovery: the return of the pre-habituation response to the stimulus.

Sensitization: the opposite of habituation. The reaction to a stimulus becomes even stronger when the stimulus is presented repeatedly.

Adaptation: sometimes confused with habituation. Adaptation, however, has nothing to do with learning. It is simply the tiring of the sensory neurons, which stop registering the stimulus.

Learned Irrelevance: aka the pre-exposure effect. This is very similar to habituation. It is learning to ignore things that have no meaning in the animal’s life. I learn to ignore the doorbell when I realize that it doesn’t mean people are coming.

Primary Reinforcer: anything the dog likes intrinsically – food, water, cuddling, etc.

Secondary Reinforcer: these are reinforcers that become associated with a primary reinforcer and hence become important to the dog (clicker).

Schedules of Reinforcement:

  1. Continuous Reinforcement Schedule (CRF) – every occurrence of the response is followed with a reward. The best for first teaching a behavior.
  2. Partial Reinforcement Schedule (PRF) – aka Intermittent Reinforcement Schedule. Responding is rewarded only after certain responses have been completed. Schedules are:
    1. Fixed ratio (FR) – a reward is given after every set number of responses. On an FR-5 schedule, every 5th response receives a reward. The response rate is very high and steady except for the post-reinforcement pause after the reward.
    2. Variable ratio (VR) – the number of responses required for a reward changes from one reward to the next. On a VR-5 schedule the average is 5 responses. The response rate is high and steady with a minimal post-reinforcement pause. Watch for ratio strain, which happens when the variable ratio average is increased too fast.
    3. Random ratio (RR) – similar to variable ratio
    4. Fixed interval (FI) – a reward is given only after a specific interval of time has elapsed since the previous reward. On FI-5 the reward comes only if a response occurs at least 5 seconds after the last reward. Responses can be weak right after the reward – this is called the fixed interval scallop.
    5. Variable interval (VI) – the interval that must elapse before a reward is available changes from one reward to the next. On VI-5 the average is 5 seconds, but it can be 10 seconds one time and 1 second the next, etc.
  3. Differential Reinforcement Schedule (DRF) – only certain rates or certain types of responses are reinforced. Schedules are:
    1. Differential reinforcement of high rates of behavior (DRH) – the response has to happen within a certain time after the last response.
    2. Differential reinforcement of low rates of behavior (DRL) – the response has to happen only after a certain time has elapsed since the last response.
    3. Differential reinforcement of other behaviors (DRO) – reward is given to behaviors that are different from one specific behavior.
    4. Differential reinforcement of incompatible behaviors (DRI) – only responses that cannot happen at the same time as another (unwanted) behavior are reinforced.
    5. Differential reinforcement of excellent behavior (DRE) – very important for dog training – we reward those behaviors that are better than the ones already accomplished – we reward the best sit, the best down, etc.
  4. Duration Reinforcement Schedule – the response must be maintained for an entire interval for a reward to be given. Schedules are:
    1. Fixed Duration (FD) – maintain a behavior for a fixed time to get a reward – the stay exercise is a good example
    2. Variable Duration (VD) – maintain a behavior for a period of time that changes from one reward to the next
    3. Random Duration (RD) – similar to variable duration
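
The ratio and interval schedules above can be sketched as small Python functions that decide, response by response, whether a reward is due. This is only an illustration under stated assumptions: the function names are hypothetical, the variable ratio draws its required count uniformly from 1 to 2 × mean − 1 (one of many ways to get the stated average), and time is passed in as plain seconds.

```python
import random


def fixed_ratio(n):
    """FR-n: reward every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """VR-mean: reward after a varying number of responses.

    Assumption: the required count is drawn uniformly from
    1..(2 * mean - 1), which averages to `mean`.
    """
    count = 0
    required = rng.randint(1, 2 * mean - 1)

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI-seconds: reward the first response made at least `seconds`
    after the previous reward (or after the start of the session)."""
    last_reward = 0.0

    def respond(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])  # rewards on the 5th and 10th responses

vr5 = variable_ratio(5, random.Random(0))
print(sum(vr5() for _ in range(1000)))  # about one reward per 5 responses

fi5 = fixed_interval(5)
print([fi5(t) for t in (1, 4, 5, 6, 9, 10)])  # rewards at t=5 and t=10
```

A variable interval schedule could be built the same way as `fixed_interval`, drawing a new required interval after each reward.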

Premack Theory of Reinforcement: developed by David Premack in the mid-1960s. The opportunity to engage in certain behaviors is a reinforcement on its own. Another way to put it: in order to get item “B” you must complete item “A”. If you want to eat dessert, you must eat your veggies first. If you want to chase that Frisbee, you must sit first.

Discrimination: the ability to respond to a specific stimulus. For example, sit only when the word ‘sit’ is being said.

Generalization: The response has to happen in every place and at every time. The dog needs to learn, for example, that the cue ‘sit’ means that it has to sit regardless of location, time or distractions.

Proofing: Achieving great generalization by the dog. This means that the dog can respond in the same manner to a discriminative stimulus every time, all the time.

Salience: how visible or noticeable a discriminative stimulus is. A very salient stimulus tends to get the most attention from the dog.

Overshadowing: Stimuli go unnoticed by the animal because there are more salient stimuli around. A salient stimulus overshadows many other stimuli.

Blocking: The phenomenon in which a stimulus is being disregarded by an animal if presented together with an already salient and established stimulus. This is why you need to introduce the new cue before the lure and not with it.

Discrimination: The ability to choose between two or more different things. Can be hard to teach. See pages 90-96 in Excel-Erated Learning.

Preparedness: the predisposition of the animal to form classically conditioned associations more easily with certain pairings of conditioned and unconditioned stimuli than with others. For example, it is easier to learn to associate a flavor with illness than a visual sign with illness. This is very adaptive.

Experimental Neurosis: An outcome of a dog being forced to make a discrimination that is no longer possible. The dog may show great anxiety.

Extinction: what happens when rewards are no longer being given. The behavior is degraded until it is no longer offered. This is not unlearning. The dog simply learns a new rule.

Spontaneous Recovery: the sudden reappearance of a behavior after it was apparently extinguished. Not to be confused with an extinction burst, which is a temporary increase in the behavior when rewards first stop.

Partial Reinforcement Extinction Effect (PREE): Continuing to engage in a behavior despite the fact that the reward is no longer being given. Usually due to a variable ratio schedule of reinforcement.

Escape/Avoidance Response: The response to training with negative reinforcement or positive punishment. The dog engages in a behavior in order to avoid or escape a certain aversive outcome. The aversive can be signaled or unsignaled.

Learned Helplessness: if the aversive does not follow a signal, or the dog is not allowed to escape the aversive, the dog will eventually lie down and become immobile after learning that there is nothing it can do to stop the aversive from happening.

Shaping: aka shaping by successive approximations.

A method of teaching a new behavior in which any behavior that begins to resemble the wanted behavior is reinforced. Gradually, the standard of the behavior that is reinforced is increased to resemble the wanted behavior.
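
The way the standard is raised step by step can be sketched in Python. This is a toy model under loud assumptions: each response is given a made-up numeric score for how closely it resembles the wanted behavior, and the criterion goes up one step after two rewarded responses in a row. The function name `shape`, the scoring scale and the two-in-a-row rule are all illustrative choices, not a prescribed training protocol.

```python
def shape(responses, target, criterion=1, step=1, successes_needed=2):
    """Reward responses that meet the current criterion, and raise the
    criterion toward `target` after enough rewarded responses in a row.

    Returns a list of (score, criterion_at_that_time, rewarded) tuples.
    """
    log = []
    streak = 0
    for score in responses:
        rewarded = score >= criterion
        log.append((score, criterion, rewarded))
        streak = streak + 1 if rewarded else 0
        if streak >= successes_needed and criterion < target:
            # The dog is reliably meeting the standard, so ask for more.
            criterion = min(target, criterion + step)
            streak = 0
    return log


for score, criterion, rewarded in shape([1, 2, 1, 2, 3, 2, 3, 3], target=3):
    print(f"score {score}, criterion {criterion}, rewarded: {rewarded}")
```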

Prompting: Manipulating the animal or the environment in a way that makes the dog do the behavior. Prompts can also be visual, in which case they are known as lures. The problem is that the dog learns that the prompt means the behavior needs to be done, so the prompt must be faded as soon as possible.

Chaining: A method of teaching a complex sequence of behaviors. Each behavior signals the next behavior, and the last one signals a reward. Backward chaining is usually the most efficient way: the last behavior is trained first and followed by a reward. Then we work backwards and add the earlier behaviors.
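
The order of training in backward chaining can be sketched as follows. The function name `backward_chaining_plan` and the retrieve steps in the example are hypothetical, chosen only to show that every stage ends with the final behavior, which is followed by the reward.

```python
def backward_chaining_plan(chain):
    """Return the training stages for a behavior chain, last behavior first.

    Each stage is the part of the chain practiced at that stage. Stage 1
    is only the final behavior; each later stage adds the behavior that
    comes just before the ones already trained.
    """
    return [chain[i:] for i in range(len(chain) - 1, -1, -1)]


# Hypothetical retrieve chain, listed in the order it is finally performed.
retrieve = ["go out", "pick up dumbbell", "return", "deliver to hand"]
for number, stage in enumerate(backward_chaining_plan(retrieve), start=1):
    print(f"stage {number}: " + " -> ".join(stage) + " -> reward")
```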

Autoshaping: a very strong animal learning process in which the animal starts to touch or manipulate a conditioned reinforcer (clicker) in order to get the primary reinforcer. It can be a strong training tool, for example for teaching the retrieve to a non-retriever: make the ball or dumbbell a conditioned reinforcer for food and the dog will start trying to manipulate it.

Conditioned Emotional Response (CER): a classically conditioned emotional response (usually fear). This is the basis for many fears and phobias in dogs. This process is very resistant to extinction.

Counterconditioning and desensitization: The methods used to try to eliminate CERs.

Desensitization is the process in which we present the fear-producing stimulus at a very low level and slowly work up to the full stimulus. Together with this we need to use counter conditioning, which is the association of this stimulus with a positive consequence.
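
The gradual build-up can be sketched as a simple rule for choosing the next stimulus level. This is an assumption-laden illustration, not a treatment protocol: the 1 to 10 intensity scale, the function name `next_intensity`, and the rule of dropping back one step after any sign of fear are all invented for the example.

```python
def next_intensity(current, dog_relaxed, step=1, minimum=1, maximum=10):
    """Pick the stimulus intensity for the next trial.

    Assumption: move up one step after a relaxed trial, and drop back
    one step after a trial in which the dog showed fear.
    """
    if dog_relaxed:
        return min(maximum, current + step)
    return max(minimum, current - step)


level = 1
for relaxed in [True, True, False, True]:
    # Counter conditioning: every presentation is paired with something good.
    print(f"present stimulus at level {level}, then give a treat")
    level = next_intensity(level, relaxed)
```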

Taste Aversion Learning: the process in which animals learn to avoid a certain food. This happens very fast and is very adaptive.

Flooding / Response Prevention: the process in which the fear-eliciting stimulus is presented at full intensity without the subject being able to escape. This can sometimes work, but it is unethical and, more often than not, only does more harm.

 

 

Copyright © clickertrainusa.com  2005 - All Rights Reserved