What makes us choose between watching our favorite TV show and writing the paper due in a couple of weeks? Why do we opt to walk our dog instead of making that important phone call to a difficult client? Why does our dog bark for attention or steal socks out of the laundry basket? All through the day we’re faced with an endless selection of actions to choose from, some important, others not so much. Whether we’re a human, a dog, a cat or a horse, the choices we make are the result of a number of variables, shaped by the consequences we have experienced in the environment. In applied behavior analysis, the choices a person or an animal makes are described by the ‘matching law’. At the last conference, ‘out of the lab, into the field’, Parvene Farhoody revealed how this law of effect not only directly influences the animal’s behavior, but also the efficiency and accuracy of our training.
The choices we make are the direct result of a number of variables, such as the rate of reinforcement (how many times we’ve been reinforced for the behavior), the quality of the reinforcement (how much we appreciated that reinforcement), or the reinforcement delay (how soon we got the reinforcement). If you were paid $1,000 for every phone call you had to make, would you still choose to walk your dog over calling a difficult client? According to the matching law, the relative frequency with which we choose one behavior over another matches the relative rate at which those behaviors have been reinforced. In other words, faced with two options, A and B, if A was reinforced twice as much as B, we would choose option A twice as often as we would choose option B. Let’s say your dog got clicked and treated 10 times when sitting at a 90° angle from your body and 5 times when sitting parallel, in good alignment. As a result, when asked to ‘sit’, your dog would be twice as likely to sit at a 90° angle.
The ‘matching law’ was first brought to light by R.J. Herrnstein (1961). Working with pigeons in a Skinner box, Herrnstein realized that the number of times they pecked one button over another was directly correlated with the rate of rewards they had received for pecking each one. In other words, behavior matches reinforcement. If 70% of rewards were given for pecking the right button, the pigeon would hit the right button 70% of the time, hence the term ‘matching law’. The matching law questions the idea of free will, predicting our choices from a mathematical equation. It quickly became a tool for behavior analysts and psychologists to understand and predict many behavior patterns, such as tactics used by children during conflicts with their parents. But behavior outside the lab is subject to many more variables and couldn’t be quantified accurately without taking them into consideration. The original matching law has since been modified into a generalized matching equation (Baum, 1974) that describes behavior under a wider range of conditions, taking potential bias into account.
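The two equations above can be sketched numerically. Here is a minimal sketch for the simple two-choice case; the function names and the example counts are illustrative, not notation from Herrnstein’s or Baum’s papers:

```python
def strict_matching(r1, r2):
    """Herrnstein (1961): the proportion of responses allocated to
    option 1 equals the proportion of reinforcement it produced."""
    return r1 / (r1 + r2)

def generalized_matching(r1, r2, sensitivity=1.0, bias=1.0):
    """Baum (1974): response ratio B1/B2 = bias * (R1/R2)**sensitivity.
    sensitivity < 1 models 'undermatching'; bias captures a constant
    preference unrelated to reinforcement rate."""
    ratio = bias * (r1 / r2) ** sensitivity
    return ratio / (1 + ratio)  # convert the response ratio to a proportion

# The pigeon example: 70% of rewards came from the right button.
print(strict_matching(70, 30))   # 0.7 -> pecks the right button 70% of the time

# The 'sit' example: 10 clicks at a 90-degree angle vs 5 clicks in alignment.
print(strict_matching(10, 5))    # ~0.667 -> the 90-degree sit, 2:1

# Undermatching (sensitivity below 1) pulls choice toward indifference.
print(generalized_matching(70, 30, sensitivity=0.8))
```

With `sensitivity=1.0` and `bias=1.0` the generalized form reduces to strict matching, which is why Baum’s equation is described as covering a wider range of conditions.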
Why is the matching law important to keep in mind when working with dogs or animals in general? Here are a few examples of when we should think of the matching law:
1/ If faced with two possible behaviors, an animal will be more likely to choose the one that has been rewarded most often. It’s therefore important to provide conditions in which making the ‘right choice’ is easier than the alternative. Efficient training has to include manipulating the environment so that the ‘right choices’ require little effort and receive a high rate of immediate reinforcement. An example of this is keeping a puppy confined or closely supervised, taking him out every half hour as needed, and rewarding him for pottying outside. The fewer the accidents, the faster and more efficient the training.
2/ Every time we reinforce unwanted behaviors, we reduce the strength of wanted behaviors. If we’re looking for a perfect ‘sit’, in alignment with our body, we should avoid rewarding a ‘sit’ that is out of position and/or slowly executed. According to the matching law, every rewarded ‘sit’ that is less than ‘at criteria’ weakens the chances of a perfect ‘sit’.
3/ While shaping, the longer we stay on intermediate behaviors, the more we strengthen those behaviors over the target behavior. If we try to get each step perfect before moving to the next one, instead of moving on quickly, we make all those steps stronger and therefore more likely to be repeated. We often believe that shaping a behavior ultimately makes it stronger. If we apply the matching law, however, this simply doesn’t hold true. In shaping, it can take 20-40 clicks (or more) to get the target behavior. During the training session, we’re likely to have clicked the dog more often for intermediate behaviors than for the target behavior. When luring, however, the target behavior is reached sooner, so in the end, the target behavior will have a longer reinforcement history than the non-criterion behaviors.
4/ We need to make sure we don’t reward behaviors we don’t want. If, while training a dog to sit, we click when she shifts her weight onto one hip instead of sitting straight, we’re likely to see that behavior repeated. This may not be a problem when working with a pet, but when working at a professional or competition level, small details can make a big difference, and getting rid of such behaviors will ultimately require more work than making sure they’re not reinforced in the first place. According to Farhoody, ‘It’s better to miss reinforcing a wanted behavior than to reinforce an unwanted behavior’.
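The reinforcement-history arithmetic behind point 3 can be tallied with a quick sketch. The click counts below are hypothetical, invented only to illustrate the contrast between the two methods:

```python
# Hypothetical click counts for one training session; these numbers are
# made up for illustration, not data from the article.
shaping_clicks = {"intermediate": 30, "target": 8}
luring_clicks = {"intermediate": 4, "target": 20}

def target_share(clicks):
    """Fraction of all reinforcement that went to the target behavior."""
    return clicks["target"] / sum(clicks.values())

# By the matching law, the behavior holding the larger share of the
# reinforcement history is the one more likely to reappear.
print(f"shaping: {target_share(shaping_clicks):.0%} of clicks on target")  # 21%
print(f"luring:  {target_share(luring_clicks):.0%} of clicks on target")   # 83%
```

Under these assumed counts, luring gives the target behavior the dominant reinforcement history, which is exactly the point the article makes about intermediate approximations in shaping.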
These are just a few examples of how the matching law directly influences the efficiency and accuracy of our training. We can also see how it applies to behavior modification protocols. It’s not always easy to identify all the reinforcements in a particular situation, especially if there are many different variables, but overall, the matching law has become a very practical tool for behavior analysts when describing behavior. McDowell (1988), for instance, showed how the self-mutilating behaviors of a young boy matched the rate of reprimands that he received, revealing their reinforcement value. In animal training, we could certainly identify many instances where this law applies and could be used to modify the environment for better and faster results.
Jennifer Cattet Ph.D.
Excellent stuff!
Now I KNOW why I am not a fan of “pure shaping” as advocated by the “pure clicker trainers”. 🙂
“When luring, however, the target behavior is reached sooner, so in the end, the target behavior will have a longer reinforcement history over the non-criterion behaviors”.
Luring relies on pattern learning and many repetitions because the dog is merely following food and doing little thinking. A good shaping protocol will have more steps but a quicker learning time, because the criteria are broken down and errors or plateaus in progress are massively reduced. The steps are so well split that progress is smooth and successive. Also, the dog is actively thinking and not blindly following a food lure. Luring can often take longer than shaping, because extraneous body-language cues need to be faded out. There are few to no extraneous cues that need to be faded out in shaping.
Interesting read. I have a question about it.
You write “While shaping, the longer we stay on intermediate behaviors, the more we strengthen those behaviors over the target behavior.” Ok, sounds very logical. You get what you reinforce, right?
In practice, I haven’t really noticed this happening with horses, though.
What if you move fast through those intermediate behaviours (like only 2-3 reinforcements before raising the criterion)?
Behaviour that doesn’t get rewarded undergoes extinction. Isn’t it the extinction burst here that seems to strengthen the intermediate behaviours?
If the target behaviour is shaped, every intermediate behaviour is only reinforced 2-3 times, while the target behaviour gets rewarded much more often. After all, we want to train the target behaviour in different contexts. This is how I strengthen the target behaviour before changing to a variable ratio reward schedule.