Sunday, November 22, 2009

The Negative Effects of Positive Reinforcement

Here's another article originally posted at my Psychology Today blog:

In the first article in this series, Why Behavioral Science Is Losing the Training Wars, I described two examples of learning in dogs that can't be explained through either the pack leader model of training or learning theory, and suggested that the reason the positive training movement hasn't dominated the current training landscape is that behavioral science isn't as scientific as positive (or +R) trainers claim.

In the second, Is Behavioral Science Failing Our Dogs, I described how my two examples can only be explained completely and satisfactorily through a simple energy theory which operates primarily on the principle that all behaviors, instinctive or learned, are designed to reduce a dog's internal tension or stress:

stimulus (energy-in) > increased tension > behavior (energy-out) > release

It's all pure energy.

This idea may seem strange at first, but after all, the universe started out as energy. It then differentiated into subatomic particles, then into atoms of hydrogen, then helium, and up the periodic table. At a certain point some atoms were joined, energetically, into various kinds of molecules. At a point beyond that some of these molecules developed into living organisms, which then evolved and developed into the rich complexity of nature we see all around us (and inside of us) today. From the Big Bang to the dog run, energy continues to manifest itself in everything your dog does, from the way the neurons fire inside his brain to the way his tail wags when you come home from work.

In presenting his energy theory, former police dog trainer and natural philosopher Kevin Behan writes: "The irreducible essence of anything is always a function of energy. I'm proposing that the nature of dogs is also a function of an energetic makeup rather than a [mental or] psychological one." (Read more here.)

My good friend Alexandra Semyonova -- a highly respected and well-known dog trainer in the Netherlands, who uses only positive reinforcement techniques -- wrote to me not long ago, saying, "Your energy theory[1] is not as far-fetched as people may think. It's just that you have to think interdisciplinarily to get it. You could say that an energy exchange with the environment doesn't only take place through food. As two dogs look at each other [or play together], the electrical patterns in their brains change. This can trigger changes in physical structure. And because those brains are a sort of solidified past, those dogs will be responsive to [the] kind of energy related to that past, and not to some other kind of energy that wasn't present or important at the time[2]."

To me, that's brilliant. And it's exactly how operant conditioning works (when it does work). The reinforcement for "good" behaviors isn't the result of an external object, event, or marker; it's due to the way a dog's emotional energy flows and finds a satisfying release. The more satisfying the release, the more deeply the behavior it's coupled with is learned[3].

Mind you, when I talk about energy I'm not being vague and new-agey. I'm talking about nervous or emotional energy. Nervous energy is essentially electric: the movement of electrons through one neuron into the next. It's choppy; it has an unpleasant stop/start feel. Plus it's hard to control; it operates on its own, almost forcing an animal to obey its (the energy's) own needs. True, nervous energy is necessary for an animal's survival, but it has nothing to do with animal happiness. That's a problem, because both dominance training and operant conditioning rely primarily on survival feelings to get their effects: with a dominance-trained dog it's the need to avoid danger (i.e., a correction); with a positively reinforced dog it's the need for food. (Remember, behavioral science got its start with Pavlov's dogs salivating at the sound of a bell, and continued with Skinner's rats pressing levers to obtain food pellets.)

Emotional energy, though, is magnetic, flowing, and can be very pleasant. Yes, a dog may occasionally feel stressed if he has more emotional energy than his system can carry, especially if he has no way to resolve or release it. But at least he has more control over what he can do with it. And as long as he has that feeling, he's not distressed or thrown completely off-balance by the weight of excess emotion.

So it seems to me that despite Skinner's brilliance, instinctive biological needs actually interfere with an animal's capacity to learn, while positive emotions are the bedrock of learning. Behavior modification via survival needs is also almost wholly dependent on repetition and artificial reinforcement, not to mention the process of occasionally withholding rewards through a variable reinforcement ratio, which can be very stressful for a dog. Karen Pryor, one of the key figures of the positive training movement, writes in a 2006 article: "Reinforcement may go from predictable to a little unpredictable back to predictable, as you climb, step by step, toward your ultimate goal. Sometimes a novice animal may find this [variability] very disconcerting. If two or three expected reinforcers fail to materialize, the animal may simply give up and quit on you. You can see this clearly on the video of my fish learning to swim through a hoop. When three tries 'didn't work' the fish not only quit trying, he had an emotional collapse, lying on the bottom of the tank in visible distress[4]."

Not only is this kind of training not positive, it actually proves that all behavior is learned, not through reinforcers, but through the reduction of internal tension or stress. The more stressed a dog is -- as with a variable ratio of reinforcement -- the deeper a behavior is learned when that stress is resolved. But learning through flow is anything but stressful. It also doesn't require reinforcements because it's an immensely pleasurable experience on its own. Plus it takes place instantly and automatically.

So no matter how well-conditioned our dogs become, no matter how much a part of their brain "salivates" at the sound of a clicker[5], or works to gain a reward, on a certain level dogs are not very happy when they're subjected to learning through operant conditioning.

Not happy? Are you serious?

Deadly serious. I mean, think about it. Somewhere in the back of every dog trainer's mind is that image of Pavlov's dogs, salivating at the sound of a bell ringing. That's the apex of conditioning. Yet no one seems to consider how unhappy those dogs must have been. And let's not even get into the stress Skinner's lab rats and pigeons were feeling. So, yes, on a certain level, positive reinforcement is actually an unpleasant experience.

I know that may sound crazy, but the current trend in child-rearing and education tells us that positive reinforcement is undermining learning and happiness in our kids. In her Psychology Today blog, Creating in Flow, social psychologist Susan K. Perry quotes Teresa Amabile of Harvard: "If rewards become prominent in children's minds, they may overwhelm the intrinsic joy of doing something interesting and personally challenging." Kids who are given rewards for reading, for example, tend to choose shorter books in order to get more rewards, while children who are motivated by a love of learning will read anything that catches their fancy, just for the pure joy of it.

Every positive trainer reading this will assert that they see that kind of joy in the eyes of their clients' dogs when those dogs are learning through +R. I can only say that they must be seeing things differently than I do[6]. I would also argue that whatever happiness dogs do experience in a clicker class, or by working for variable-ratio food rewards, isn't because of the technique. It's probably because dogs, just like young children, are so hungry for learning, and are designed to latch onto anything that gives them something to do with their energy (especially in a social context), that they're supplying their own emotional flow in order to help them move past the unpleasant aspects of conditioning techniques.

There's no doubt that there has to be a payoff for learning. That's the one simple truth of Skinner's theory. But if the payoff doesn't reduce internal tension, or spark feelings of pure joy, it will automatically create unhappiness and resistance in dogs just as it creates uncertainty and resentment in children.

The positive training movement defined itself from the outset as being a kinder and more scientific alternative to dominance training. And that's true. But dogs aren't lab rats. And there's a "new kid" in town, a method that's even kinder and may be more scientific.

If you've read my first article, you know that my primary reason for discussing the holes I see in behavioral science is that dogs are trying to tell us something about the nature of consciousness. Lab rats and helper monkeys don't have the emotional capacity dogs do, so using survival feelings to condition them works fine most of the time. But dogs are different. Only canines, Homo sapiens, and some cetaceans have the ability to override instinct in favor of emotion. That's an amazing thing. And it's part of what makes dogs the current "it" species for cognitive scientists[7].

We all love our dogs and we all want what's best for them. So I would challenge anyone reading this: if you believe operant conditioning is scientific, then be scientific and test Kevin Behan's energy theory for yourself. Next time I'll give you one simple exercise that will not only enable you to do that, it might just improve the life of every dog you know.

LCK

Footnotes:

1) It's not my energy theory, though for some reason Semyonova likes to think it is. As this article states, it was developed by Kevin Behan. Oddly enough though, Semyonova and I are the first two people to describe canine social structure as part of a self-emergent system, long before we had a meeting of the minds online. Semyonova did it in 2002 in her longitudinal study found at www.nonlineardogs.com, while I did it as a bit of passing dialogue in my first novel, A Nose for Murder, also published in 2002. Meanwhile, Kevin Behan described pack social structure -- particularly while hunting -- as a bottom-up, self-emergent system in his 1992 book, Natural Dog Training, even though he hadn't heard of emergence theory at the time: "Since each individual has different sensitivity to prey making, we observe the emergence of order -- the creation of a group and a pack -- out of what was chaos."

2) It's a physiological fact that certain sensory details associated with past emotional experiences can not only bring memories flooding back, they can often make you feel as if you're actually re-living that past event. One of the strongest of these mnemonic triggers comes through the sense of smell. For example, when your present-day nostrils inhale the same perfume worn by that wonderful girl you were in love with back in college, your olfactory nerves and the part of your hippocampus holding memories of her vibrate at the same frequency once again.

From Discover Magazine: "Quantum physics may explain the mysterious biological process of smell ... says biophysicist Luca Turin, who first published his controversial hypothesis in 1996 while teaching at University College London. Then, as now, the prevailing notion was that the sensation of different smells is triggered when molecules called odorants fit into receptors in our nostrils like three-dimensional puzzle pieces snapping into place. The glitch here, for Turin, was that molecules with similar shapes do not necessarily smell anything like one another. Pinanethiol [C10H18S] has a strong grapefruit odor, for instance, while its near-twin pinanol [C10H18O] smells of pine needles. Smell must be triggered, [Turin] concluded, by some criteria other than an odorant's shape alone.

"What is really happening, Turin posited, is that the approximately 350 types of human smell receptors perform an act of quantum tunneling when a new odorant enters the nostril and reaches the olfactory nerve. After the odorant attaches to one of the nerve's receptors, electrons from that receptor tunnel through the odorant, jiggling it back and forth. In this view, the odorant's unique pattern of vibration is what makes a rose smell rosy and a wet dog smell wet-doggy. It is the frequency of vibration, not the shape, that determines the scent of a molecule."

So Alexandra Semyonova's statement -- "dogs will be responsive to [the] kind of energy related to that past, and not to some other kind of energy that wasn't present or important at the time" -- is right on target.

3) While it's true that learning still takes place in relation to a dog's history (as behavioral scientists tell us), the key element isn't a conscious mental process such as thinking of past experiences and figuring out how to apply them to the present moment, or of learning through consequences or trial-and-error (which would all require that the dog be able to engage in mental time travel and/or propositional thinking). It's simply about the dog vibrating at the same frequency in the here-and-now moment as he did in the past: in other words, learning is a function of energy, not a mental thought process.

4) Pryor goes on to say, "Casinos, believe me, use the power of the variable ratio schedule to develop behaviors, such as playing slot machines, that are very resistant to extinction, despite highly variable and unpredictable reinforcement."

So are we training dogs or creating gambling addicts?

5) Clicker training was invented by Keller Breland -- a student and later a colleague of B.F. Skinner -- as a way of marking behaviors while working with hunting dogs at a distance. Breland later taught Karen Pryor how to use clicks and whistles to train dolphins. Here's how Pryor describes the process:

"The trainer clicks at the moment the behavior occurs: the horse raises its hoof, the trainer clicks simultaneously. The dog sits, the trainer clicks. Clicking is like taking a picture of the behavior the trainer wishes to reinforce. After 'taking the picture,' the trainer gives the animal something it likes, usually a small piece of food. Very soon (sometimes within two or three clicks), an animal will associate the sound of the click with something it likes: the reward. Since it wishes to repeat that pleasurable experience, it will repeat the action it was doing when it heard the click."

So again we're using Pavlov's dogs as a template.

6) Another figurehead of the positive training movement, Jean Donaldson, clicker trained her dog to hump her leg on cue. (I know!) In my experience dogs only exhibit that behavior when they're in a state of frustration, not joy. Yet Donaldson insists that her dog "seems to have fun" doing it. Plus it makes her (Donaldson) laugh. (For the full article, click here.)

Finally (on this point), the fact that +R trainers see joy in the dogs they train doesn't mean much; after all, I'm sure Cesar Millan -- the nemesis of the positive training movement -- sees joy in the eyes of the dogs he works with too.

7) Virginia Morell writes in Science Magazine: "Dogs are fast becoming the it animal for evolutionary cognition research. Our canine pals, researchers say, are excellent subjects for studying the building blocks underlying mental abilities, particularly those involving social cognition. Their special relationship with humans is also seen as worthy of study in its own right; some researchers see Canis familiaris as a case of convergent evolution with humans because we share some similar behavioral traits. ... Some researchers even think that dogs may teach us more about the evolution ... of our social mind than can our closest kin, the chimpanzee, because Fido is so adept at reading and responding to human communication cues."

1 comment:

  1. Thank you. This is the first time I have been able to clearly understand why positive reinforcement is not the best course of action.
