BEHAVIOR/SYSTEMS/COGNITION

Learning processes and the neural analysis of conditioning

Luiz Guilherme Gomes Cardim Guerra; Maria Teresa Araujo Silva

Universidade de São Paulo, São Paulo, SP, Brazil

Correspondence regarding this article should be directed to: Luiz Guilherme G. C. Guerra, Rua Dona Antônia de Queirós, 504, cj. 25, São Paulo, SP, Brazil, CEP 01307-013. Phone: +55-11-3257-1899. E-mail: lugui@conscientia.com.br

ABSTRACT

Classical and operant conditioning principles, such as the behavioral discrepancy-derived assumption that reinforcement always selects antecedent stimulus and response relations, have been studied at the neural level, mainly by observing the strengthening of neuronal responses or synaptic connections. A review of the literature on the neural basis of behavior provided extensive scientific data that indicate a synthesis between the two conditioning processes based mainly on stimulus control in learning tasks. The resulting analysis revealed the following aspects. Dopamine acts as a behavioral discrepancy signal in the midbrain pathway of positive reinforcement, leading toward the nucleus accumbens. Dopamine modulates both types of conditioning in the Aplysia mollusk and in mammals. In vivo and in vitro mollusk preparations show convergence of both types of conditioning in the same motor neuron. Frontal cortical neurons are involved in behavioral discrimination in reversal and extinction procedures, and these neurons preferentially deliver glutamate through conditioned stimulus or discriminative stimulus pathways. Discriminative neural responses can reliably precede operant movements and can also be common to stimuli that share complex symbolic relations. The present article discusses convergent and divergent points between conditioning paradigms at the neural level of analysis to advance our knowledge on reinforcement.

Keywords: behavior analysis, classical conditioning, dopamine, neuronal plasticity, operant conditioning, reinforcement.

Introduction

I. P. Pavlov's work represents a historical landmark in the development of modern neuroscience. Using the tools available at the time, he focused on cerebral processes in his theoretical and experimental investigation of behavior. The process of classical conditioning was extensively demonstrated in his laboratory and remains one of the most fruitful models in modern neuroscience.

Pavlov and his group found that decerebrated animals do not exhibit conditioned reflexes, whereas unconditioned responses were preserved, including orientation reflexes toward a sound source, escape from movement restraint, and rejection of acid dropped onto the mucous membrane of the mouth (Pavlov, 1927). To Pavlov, the conditioned reflex clearly recruited a higher level of cortical activity than the simple unconditioned reflex arc. At that higher level, a set of reflexes could occur, ranging from conditioned stimulus (CS) perception to conditioned response (CR) evocation. Pavlov can also be considered indirectly responsible for the original exploration of operant conditioning. J. Konorski, a pioneer in operant behavior studies, made several visits to Russia to interact with Pavlov's laboratory (Chilingaryan, 2001). Konorski and Miller (1937) conducted an experiment showing that paw flexion elicited by shock could also be reinforced by food in deprived animals. Commenting on this experiment, Pavlov (1936/1986) concluded that the operant behavior had occurred because of the following events: (1) the paw's movement was perceived by kinesthetic neurons, (2) this neural perception was paired with the unconditioned food reflex, and (3) as a result of that pairing, food deprivation began triggering the kinesthetic neurons, which, in turn, elicited the output paw flexion. Supposedly, the kinesthetic and motor neurons send projections to each other, so the paw's movement can both control and be controlled by the sensory neurons. In summary, in Pavlov's view of the operant experiment conducted by Konorski and Miller, the antecedent stimulation of the kinesthetic neurons evoked the operant motor response. This conclusion follows from Pavlov's reflex paradigm, which considers that any behavior should be attached to some kind of antecedent stimulus control.

Pavlov repeatedly emphasized that reflex physiology should provide the foundations for psychological functions. Although he usually described functional relations between observable variables, such as stimuli and responses, his focus was on the supposed underlying biological dynamics. Later, a different approach was used by B. F. Skinner, who dismissed any reference to a physiological level when analyzing psychological phenomena. The behavioral laws described by Skinner refer exclusively to relations between environmental stimuli and responses. For Skinner, these environmental relations would be sufficient to understand and control behavior, and any mention of biology would be useless or even counterproductive in the construction of the independent science of behavior.

In the Skinnerian tradition, classical and operant contingencies are treated as independent units of analysis because the causal stimulus of the reflex is generally the antecedent stimulus, whereas the causal stimulus of operant behavior is the consequent stimulus (termed reinforcer). Nevertheless, a common element between both analytical units is the environmental selection of relations between antecedent stimuli and responses. In classical conditioning, the reinforcer is responsible for the selection of CS-CR relations and is the unconditioned stimulus (US) that precedes the unconditioned response (UR). In operant conditioning, the reinforcer (SR) is the stimulus that follows the response and selects relations between discriminative stimulus (SD) and operant response (SD-R relations).

Skinner was limited by the physiological knowledge of his lifetime and could not fully evaluate the role of neurophysiologic events in operant and classical contingencies of reinforcement. Although he admitted the importance of the developing physiology in showing the organism's inner changes during and after learning tasks (Skinner, 1974), he apparently subordinated behavioral neuroscience to behavior analysis (Skinner, 1938, 1988a). The present article holds the view that the relationship between the behavioral and physiological levels of analysis is bidirectional. Today's psychology can take advantage of studies with drugs, specific brain lesions, and biochemical markers of neural networks correlated with behavioral events, in the same way that the psychology of Pavlov's time benefited from implanting stomach fistulas or extirpating the cerebral hemispheres of dogs.

A proposal for a unified principle of conditioning

Donahoe and Palmer (1994) suggested that the UR produced by the presentation of an intense or biologically relevant US sensitizes the organism to new sources of learning. This US-induced change in ongoing responses (i.e., the production of the UR) is called behavioral discrepancy, a theoretical concept that basically indicates a difference between ongoing and elicited behavior (Donahoe, 2003; Donahoe & Palmer, 1994).

Behavioral discrepancy would be the internal signal of environmental stimuli relevant to survival. The UR then emerges as the biological correlate of the US and also as the actual cause of conditioning. The primary function of the UR in classical conditioning was confirmed by Donahoe and Vegas (2004), who used the pigeon's throat reflex (UR) in response to water (US), taking advantage of the natural delay between the US and UR typical of such a reflex. These authors varied the pairing parameters between the CS and UR, including presenting the CS within the interval between the US and UR. In this last situation, CR evocation was verified (i.e., the US-CS-UR sequence reliably produced the CR, similar to the usual CS-US-UR sequence). Therefore, reverse conditioning, in which the CS follows the US, was effective.

Operant conditioning is also addressed here because the reinforcer, among other functions, is always an eliciting stimulus. Donahoe and Palmer (1994) considered that food has a reinforcing function and conditioning power because it first elicits changes in current responses (e.g., salivation) that sensitize the organism to the forthcoming (SR) and surrounding (SD) critical events.

Behavioral discrepancy appears to offer a parsimonious principle that serves both types of conditioning. Among the wide range of responses and stimuli occurring in a time continuum, the US is never preceded solely by stimuli or solely by responses; it is always preceded by both. As a result, the US-produced discrepancy selects the precedent events best correlated with it (i.e., environmental events in the case of classical conditioning and behavioral events in the case of operant conditioning). When stimuli reliably precede the US, classical conditioning is in effect. When responses reliably precede the US, operant conditioning occurs. In this latter case, the SD is also selected because of its regular presence when discrepancy occurs.
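
To make the discrepancy idea concrete, the sketch below is a minimal numerical illustration in the spirit of discrepancy-based models such as Rescorla-Wagner: the associative strength of every event that precedes the US grows in proportion to the difference between the US-elicited reaction and the reaction already predicted. The function names and parameter values are illustrative assumptions, not taken from Donahoe and Palmer (1994).

```python
# Illustrative discrepancy-driven selection (Rescorla-Wagner style).
# The associative strength V of each antecedent event (a CS or a response)
# grows in proportion to the discrepancy between the US-elicited reaction
# (lambda_us) and the reaction already predicted (the summed V values of
# the events present). All values are assumptions for illustration.

def train(trials, events, learning_rate=0.2, lambda_us=1.0):
    """events: dict mapping event name -> associative strength V.
    All listed events are assumed to precede the US on every trial."""
    for _ in range(trials):
        predicted = sum(events.values())
        discrepancy = lambda_us - predicted  # large at first, shrinks with learning
        for name in events:
            events[name] += learning_rate * discrepancy
    return events

# A tone (CS) and a lever press (R) both reliably precede food (US):
# one and the same discrepancy term selects both environment-behavior relations.
strengths = train(trials=30, events={"tone_CS": 0.0, "lever_press_R": 0.0})
print({name: round(v, 3) for name, v in strengths.items()})
# Once the US is fully predicted, the discrepancy approaches zero and learning stops.
```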

The aforementioned theoretical considerations led Donahoe and Palmer (1994) to propose the unified principle of reinforcement, which states that classical and operant conditioning share the same unit of selection by reinforcement: environment-behavior relations. For convenience, the word reinforcement will be used herein as a generic term to refer to the selection of environment-behavior relations. If specifying whether the contingency is classical or operant is necessary, then the terms unconditioned stimulus (US) or primary reinforcer (SR) will be used, respectively.

The unified principle of conditioning settles on a neural basis

Donahoe and Palmer (1994) and Donahoe, Palmer, and Burgos (1997) inferred that antecedent stimuli always have a causal function in evoked responses. Responses are intrinsically linked to stimulus control. At the neural level, this relational function can be characterized by alterations in anatomic connections and in the efficacy of synapses between sensory neurons (the stimulus pathways) and motor neurons (the response pathways). The learning history of an organism requires changes in brain anatomy and physiology because environment-behavior relations are strengthened as neural connections. Such connections would cover the temporal gaps between the current learning and the later expression of learning.

The neural connections proposed by Donahoe and Palmer (1994) and Donahoe et al. (1997) add a pivotal biological feature to the "unified principle of reinforcement." This biobehavioral principle brings the neurosciences into behavior analysis. From a biological perspective of behavioral phenomena, reinforcement simply strengthens connections among neurons. Neural circuits of reinforcement would take part in ontogenetic selection, similar to genes in phylogenesis (Silva, 2005). This is why stimulus control by an antecedent stimulus guides behavior: in a given context, the currently activated synapses are virtually the same synapses strengthened during learning within this context.

Role of neural events in behavioral contingencies

The traditional view of behavior analysis of contingencies avoids intraorganic relations and considers only interrelations between the organism and the external environment as truly behavioral phenomena. Skinner and Skinnerians always maintained that only a purely behavioral approach is able to predict and control behavior (Baum, 1994; Catania, 1998; Chiesa, 1994; Donahoe & Palmer, 1994; Sidman, 1960; Skinner, 1953), and that the neurosciences could only describe, in mechanical terms, what occurs in an organism while it behaves (Skinner, 1953, 1974).

Several behavior analysts, however, assert that internal/biological events may occur based on reinforcement contingencies (Silva, Gonçalves, & Garcia-Mijares, 2007). Morris, Lazo, and Smith (2004, p. 165) consider that, in a behavioral system, "the variables of which behavior is a function lie outside behavior, but not outside the organism." McIlvane and Dube (1997) share this opinion and affirm that neural antecedents might serve as links of a behavioral chain, rather than merely a physiological chain.

DiFiore et al. (2000) suggest that the knowledge of cerebral wave patterns may provide insights into correct interventions in people with developmental deficits. For example, two wave patterns are related to stimulus processing. An initial pattern is related to sensory contact with the stimulus, and another wave pattern correlates with abstract behavior milliseconds later. Abnormal patterns could then reveal whether perceptual or language deficits are present. Normal patterns, in contrast, could reveal potential performance even in the absence of overt response and help people with motor disorders such as cerebral palsy. Therefore, brain waves may be a reliable indicator of behavioral discrimination when responses are not readily detectable.

Skinner (1988b, p. 485) mentioned the feasibility of giving neural events the same status as muscular responses: "I see no reason why we should not call the action of efferent nerves behavior either, since no muscular response is needed for reinforcement."

Neural analysis and the integration of conditioning paradigms

Brain activity is topographically very distinct from responses observed at the behavioral level, which does not mean that they have a distinct nature. The organism is always the one that behaves, and behavior is expressed through the hands, the speech organs, and the brain (Silva et al., 2007). Behavior analysis follows the same principles for both neural and muscular responses. Reinforced responses may bypass the brain and involve only the spinal cord and muscles (e.g., the operant strengthening of motor responses in transected animals) (Gómez-Pinilla et al., 2007). Reinforced responses may also be only cerebral. For example, rats accurately learned to obtain a reinforcer only by emitting a neural pattern that correlated with the bar-pressing response (Nicolelis & Chapin, 2002). Additionally, rats discriminated the cortical stimulation SD that signaled different directions to follow to obtain the reinforcer (i.e., stimulation of the medial forebrain bundle) (Talwar et al., 2002). In both examples, which will be discussed in more detail later, neural events served as the SD, response, or SR, depending on the programmed contingency.

An important issue is the analytic unit of behavior analysis that is assessed in the central nervous system. Is it a single neuron, a set of neurons, or a circuit? Even a single neuron emits an operant response analog (Fetz & Finocchio, 1971). If learning depends on strengthening synapses, considering individual neuron activity as the main dependent variable of brain activity is unfeasible (Donahoe et al., 1997). Possibly because of this, increased activity was not found when reinforcement contingencies existed for individual neuronal discharges in a hippocampal slice. However, reinforcement was successful for discharges in trains, which are more representative of synaptic activity (Stein, Xue, & Belluzzi, 1993). Should more neurons be observed so that their joint action establishes a response to be reinforced? A relevant datum came from the aforementioned experiment by Nicolelis and Chapin (2002), in which the reinforced activity of a small set of neurons replaced operant motor responses.

Brain activity can play a role in behavior analysis. A detailed description of neural function could contribute to improvements in our knowledge of conditioning paradigms. Classical and operant learning share not only the aforementioned common theoretical basis, but also common brain structures, pathways, and neurotransmitters. Could common neural mechanisms match stimulus control for both types of conditioning? Could brain data strengthen Donahoe and Palmer's (1994) view that reinforcement, regardless of the type of conditioning, simply selects environment-behavior relations (i.e., associations between antecedent stimuli and responses)? The present paper evaluates whether the behavioral analysis of neural variables may reveal similarities between the nature of classical and operant conditioning by focusing mainly on stimulus control in learning tasks. The aim is to assess the feasibility of a single conditioning category based on the vast literature on the neural side of behavioral phenomena.

Description and analysis of neural events in conditioning

Brief description of the reinforcement circuit

In the historic experiment by Olds and Milner (1954), non-deprived rats tirelessly worked for electrical stimulation of the limbic septal area. That study opened a vast field directed toward elucidating the structures, circuits, and cellular and molecular processes activated by conditioning. Today, the most accepted theories of the neural basis of conditioning in mammals involve a set of limbic structures activated by the neurotransmitter dopamine. Dopamine is released when the organism is presented with stimuli relevant to learning, such as unconditioned, conditioned, reinforcing, and discriminative stimuli. The activity of dopaminergic neurons is sensitive to the different presentation probabilities of such stimuli.

A very simplified description of the main structures and neural pathways related to learning will now be presented. Dopaminergic pathways linking mesencephalic structures to limbic structures are pivotal for reinforcement efficacy. Specifically, activation of the substantia nigra and ventral tegmental area causes dopamine release in the striatum and nucleus accumbens, where dopamine serves as a neural signal of behavioral discrepancy.

The caudate nucleus and putamen are part of the striatum, a complex whose influence on motor function has been widely confirmed by experimental and clinical findings. For example, the caudate-putamen and its two efferent pathways, the globus pallidus and substantia nigra, are involved in the difficulty initiating movements and the muscular rigidity observed in Parkinson's disease (Heimer, 1994; Saint-Cyr, 2003). The caudate receives cortical inputs from associative areas, including the frontal, parietal, and temporal cortices, and the putamen receives glutamatergic excitatory inputs from the motor and somatosensory cortices. Because of such cortical inputs, Heimer (1994) concluded that the putamen constitutes the main striatal motor structure. Conversely, clinical evidence indicates that pathologies related to the caudate hinder complex behaviors, thus leading to behavioral disturbances such as abulia or impulsiveness (Mesulam, 2000).

The nucleus accumbens is located in the ventral portion of the striatal complex, and because of its specific input and output networks, it forms an interface between motor (e.g., dorsal striatum), motivational (e.g., ventral tegmental area, hippocampus, and hypothalamus), and associative systems (e.g., cerebral cortex). The main dopaminergic input of the nucleus accumbens comes from the ventral tegmental area. Other inputs mainly deliver glutamate to the nucleus accumbens. Such projections come from frontal and orbitofrontal cortices, the temporal lobe, the basolateral amygdala, and the hippocampal formation, the latter of which is considered an important structure in strengthening synapses of memory for recent events (Nestler, Hyman, & Malenka, 2001). Efferent pathways of motor areas are similar to those of the caudate-putamen and link to the substantia nigra, globus pallidus, subthalamic nucleus, and prefrontal cortex via the thalamus (Martin, 1998). The nucleus accumbens is intimately involved in emotional and cognitive function, particularly because of its connections with the thalamus, limbic system, and virtually the entire cerebral cortex (DeLong, 2000).

Similar cytological characteristics indicate that the nucleus accumbens and caudate-putamen form a single striatal complex. Therefore, the nucleus accumbens and caudate-putamen are commonly referred to as the ventral striatum and dorsal striatum, respectively. However, behavioral, pharmacological, and physiological data show that the nucleus accumbens and caudate-putamen are involved in distinct aspects of behavior. For example, dopaminergic enhancers, such as amphetamine and apomorphine, produce locomotor hyperactivity at low doses and motor stereotypy at high doses. These patterns are probably attributable to dopaminergic release in the nucleus accumbens and dorsal striatum, respectively, because (1) lesions of the entire striatal complex hinder both hyperactivity and stereotypy, and (2) lesions of only the nucleus accumbens or intra-accumbens administration of dopaminergic antagonists hinders only hyperactivity (Deutch, Bourdelais, & Zahm, 1993).

The nucleus accumbens is also internally divided into anatomically and functionally distinct structures: the core and shell. The core has morphology similar to the caudate-putamen, whereas the shell resembles the extended amygdala (Heimer, 1994). Behavioral data appear to corroborate the nucleus accumbens anatomy because the core is intimately related to motor responses, and the shell is mainly related to motivation. For example, the shell is mainly involved in moderate stress, and the involvement of the core increases when an intense or long-lasting stressor evokes motor responses of high magnitude (Deutch et al., 1993). The shell of the nucleus accumbens is also more sensitive to addictive drugs, such as cocaine, amphetamine, nicotine, and opioids, which increase dopamine release in the shell (Kupfermann, Kandel, & Iversen, 2000). The shell networks are also "motivational" and substantially connected to the amygdala and autonomic and endocrine centers of the hypothalamus and brainstem, structures that control sexual, defensive, aggressive, and fear behavior (Heimer, 1994).

Importantly, although the nucleus accumbens represents the most promising convergent site of conditioning events, the behavioral function of dopamine is not restricted to its activity in the nucleus accumbens. For example, phencyclidine and cocaine are self-administered directly into the nucleus accumbens and prefrontal cortex, and cocaine is more avidly self-administered directly into the olfactory tubercle than the nucleus accumbens. Additionally, both dopamine and dopaminergic agonists apparently facilitate conditioning and improve performance even when they are administered into the amygdala, hippocampus, and caudate nucleus (dorsal striatum). Therefore, the relationship between dopaminergic activity and behavioral learning goes beyond the nucleus accumbens (Wise, 2004).

Behavioral data derived mainly from appetitive conditioning studies appear to converge on the aforementioned anatomical pathways. The corresponding physiological processes will be detailed in the following sections.

Brief note on dopamine and behavior

Dopamine is a basic neurotransmitter for several complex behaviors, and its malfunctioning, for example, is related to motor symptoms in Parkinson's disease and attentional and cognitive deficits in schizophrenia (Alves, Guerra, & Silva, 1999; Heimer, 1994; Lubow, 1998). The importance of dopamine in positive reinforcement mechanisms has been widely supported by several experiments and is frequently mentioned in review articles and textbooks (e.g., Donahoe & Palmer, 1994; McKim, 2007; Wise, 2004).

Dopaminergic system manipulations have substantial effects on conditioning. Systemic administration of dopaminergic antagonists attenuates the reinforcing value of several stimuli, such as water, food, sexual contact, amphetamine, cocaine, and electrical stimulation of the hypothalamus (Wise, 2004). Dopaminergic depletion, therefore, simulates the reinforcer devaluation observed in alimentary satiation (Morgan, 1974). When administration of dopaminergic antagonists occurs before conditioning, learning simply does not occur; if administration occurs after conditioning, then learned responses resurge as soon as the dopaminergic system returns to normal functioning (Wise, 2004). Moreover, intermittent administration of the dopaminergic antagonist haloperidol causes resistance to extinction indistinguishable from that produced by intermittent positive reinforcement (Ettenberg & Camp, 1986a, b). This finding suggests that partial non-reinforcement by dopaminergic activation may constitute a biological parallel of partial non-reinforcement by food or water in traditional operant procedures. Conversely, dopamine agonists facilitate response learning in either primary or secondary reinforcement (Wise, 2004). Better operant performance is obtained, for example, after the injection of amphetamine, an indirect dopamine agonist, into the nucleus accumbens (Taylor & Robbins, 1984), whereas dopamine antagonists reverse the amphetamine-induced effect (Wolterink et al., 1993).

Dopaminergic cells also influence classical conditioning, notably in association with attentional behavior. For example, a substantial dopamine discharge of single neurons has been observed upon the presentation of a novel appetitive US (Schultz, 1999).

Glutamate is intimately related to dopamine in conditioning processes (Kelley, 2004; Nestler et al., 2001). The most accepted hypothesis of the neural modulation of reinforcement states that dopaminergic inputs from the ventral tegmental area modulate the activity of synaptic terminals within the nucleus accumbens coming from glutamatergic cortical neurons. However, the role of glutamate as a neural signal of discrepancy and as a modulatory agent for γ-aminobutyric acid-ergic nucleus accumbens neurons remains a hypothesis that requires further substantiation. The role of dopamine in behavioral processes is uncontested, and this neurotransmitter is given priority in the present article.

Dopaminergic mesencephalic afferents of the nucleus accumbens and conditioning

Knowledge of dopaminergic function in conditioning processes has substantially improved with the electrophysiological research of W. Schultz and colleagues. In their studies, the focus was on the firing of individual dopaminergic neurons in two mesencephalic structures, the substantia nigra and ventral tegmental area, which are related to motor activity and motivation, respectively. According to Schultz (1999), dopaminergic neurons signal the presentation of relevant events, which can be an appetitive stimulus, its antecedent, or a novel stimulus. If the event is aversive, then the dopaminergic signal is not sustained.

To signal that a relevant event has occurred, most mesencephalic dopaminergic neurons momentarily increase their firing rate, releasing dopamine in various limbic and cortical regions. This general dopaminergic signal, as assumed by Schultz, modulates active corticostriatal synapses at the moment of reinforcement. For the purpose of learning, the reinforcer and its biological correlate (i.e., dopamine neuron firing) become linked to their accompanying contiguous events (the CS or SD and its related corticostriatal synapses). Temporal contiguity (in addition to actual causality, according to Rescorla, 1988) between events is a basic condition of the processes of classical and operant conditioning. Additionally, dopamine reuptake is not immediate, which permits its prolonged action on the corticostriatal synapses being selected.
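
Schultz's proposal is often summarized as a three-factor rule: a corticostriatal synapse changes only when recent pre- and postsynaptic activity coincides with the arrival of dopamine, and the slow reuptake mentioned above is what lets the two overlap in time. The sketch below is an illustrative rendering of that idea, not code from any cited study; the decay and learning-rate constants are assumptions.

```python
# Illustrative three-factor rule: a corticostriatal weight changes only when
# the synapse was recently active (an eligibility trace) AND dopamine arrives.
# The constants are assumptions chosen for illustration.

def step(weight, pre_active, post_active, dopamine, eligibility,
         decay=0.9, learning_rate=0.1):
    # Tag the synapse as "recently active" when pre and post fire together.
    eligibility = decay * eligibility + (1.0 if pre_active and post_active else 0.0)
    # Dopamine (the discrepancy signal) converts eligibility into weight change;
    # slow reuptake is what allows the trace and the signal to overlap in time.
    weight += learning_rate * dopamine * eligibility
    return weight, eligibility

w, e = 0.5, 0.0
w, e = step(w, pre_active=True, post_active=True, dopamine=0.0, eligibility=e)   # tagged, no change yet
w, e = step(w, pre_active=False, post_active=False, dopamine=1.0, eligibility=e) # dopamine arrives a moment later
print(round(w, 3))  # weight increased: only the recently active synapse was selected
```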

In addition to supposedly acting as a global signal of unexpected reinforcement, dopamine has other important functions in conditioning, demonstrated by the examples below, extracted from experiments that measured, in real time, the activity of individual neurons in monkeys:

(1) Dopaminergic mesencephalic neurons showed an increased firing rate in response to an unpredicted appetitive US during the initial trials of classical conditioning. When this US no longer implied behavioral discrepancy or novelty, dopaminergic activity returned to baseline levels, and a transference of cell firing to the CS presentations occurred. After learning became stable, dopaminergic overactivation was no longer altered by the presentation of the (now predicted) US (Fiorillo, Tobler, & Schultz, 2003).

(2) Dopaminergic mesencephalic neurons showed a decreased firing rate when the predicted US was not presented. This datum could also be explained by behavioral discrepancy because a predicted event did not actually occur (Schultz, Dayan, & Montague, 1997).

(3) When an interval was interposed between the CS and US, the firing rate of dopaminergic mesencephalic cells gradually increased and reached maximum values immediately before the US, which is evidence of temporal discrimination. This phenomenon was noteworthy when distinct CSs were employed to indicate different reinforcement probabilities: the greater the level of uncertainty about reinforcement, the greater the neuronal activity (Fiorillo et al., 2003).

(4) In a go/no-go task, distinct visual SDs indicated to monkeys whether they should, for a few seconds, hold a joystick or release it and move their hand toward a target to obtain a delayed SR (fruit juice). Neurons in the caudate were gradually more active during the interval between the SD and SR, and putamen neurons became more active during the few seconds between the SD and the release movement. Neither caudate nor putamen neurons altered their baseline activity when a third SD indicated that the task would not be reinforced, suggesting discrimination of the context for reinforcement at the neuronal level (Schultz, 2000; Schultz, Tremblay, & Hollerman, 2003). Therefore, the behavioral data of Schultz (2000) and Schultz et al. (2003) are consistent with neuroanatomical data indicating that the caudate receives inputs from associative cortical areas, and the putamen receives inputs from the motor cortex (Heimer, 1994).

The dopaminergic system is involved in behavioral discrepancy caused by both US detection (items 1 and 2 above) and the level of uncertainty of US presentation (item 3 above). Moreover, the dopaminergic system can transfer the US-eliciting properties to the antecedent stimulus (item 1 above) and shows selectivity in procedures of contingency discrimination (item 4 above).
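
Findings (1) and (2) above are commonly summarized with prediction-error models of the temporal-difference family. The sketch below is a minimal illustration of that summary, not Schultz's analysis code: with training, the simulated error signal (a stand-in for phasic dopamine) shrinks at the US and appears at CS onset, and it goes negative when a predicted US is withheld. The learning rate and trial structure are assumptions.

```python
# Minimal prediction-error sketch of findings (1) and (2). The error signal
# is a stand-in for phasic dopamine; the learning rate is an assumption.

ALPHA = 0.2
v_cs = 0.0  # reinforcement value predicted by the CS

def trial(us_delivered=True):
    global v_cs
    r = 1.0 if us_delivered else 0.0
    error_at_cs = v_cs       # CS onset is unsignaled, so its predictive value is "surprise"
    error_at_us = r - v_cs   # discrepancy between the delivered and the predicted US
    v_cs += ALPHA * error_at_us
    return round(error_at_cs, 2), round(error_at_us, 2)

print("first trial:", trial())                   # (0.0, 1.0): strong response to the US, none to the CS
for _ in range(50):
    trial()
print("after training:", trial())                # (~1.0, ~0.0): the response has moved to the CS
print("US omitted:", trial(us_delivered=False))  # negative error when the predicted US is withheld
```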

W. Schultz proposed that the dopamine released by mesencephalic neurons in the substantia nigra and ventral tegmental area generates a global reinforcement signal to striatal and cortical areas. The dopaminergic signal could interact, for example, with the prefrontal cortex and one of its subdivisions, the orbitofrontal cortex, which will be shown below to react sensitively to antecedent stimuli and to reinforcer value.

Cortical afferents of the nucleus accumbens and conditioning

The prefrontal cortex greatly intervenes in behavioral processes because of its important projections to the nucleus accumbens and massive interconnections with associative cortical areas. These latter networks allow information coming from different sensory modalities (such as somatosensory, gustative, and visual inputs) to be codified and integrated in the prefrontal cortex (Mesulam, 2000). Such a richness of sensory afferents distinguishes prefrontal neurons from dopaminergic mesencephalic neurons, which provide a general signal for the presence of relevant stimuli (Rolls, 2000). The prefrontal cortex, therefore, is an excellent site for the establishment of neural connections correlated with associated stimuli. As such, it must be sensitive to tasks involving changes in stimulus control and learning of discriminative responses. For example, patients with prefrontal lesions show remarkable social maladjustment, indicating a loss of stimulus control at the level of environmental processes.

Orbitofrontal lesions cause substantial deficits in discrimination reversal performance in rats (Chudasama & Robbins, 2003) and monkeys (Butter, 1969; Jones & Mishkin, 1972): animals persist in responding under the original stimulus control. Consistent with these findings, monkeys with orbitofrontal lesions are also insensitive to the extinction procedure (Butter, 1969; Rolls, 2000) and continue to respond as if the previous reinforcement contingency were still active. When the brain is working normally, extinction is matched by increased orbitofrontal activity, as confirmed by mapping human brain activity when face figures (the CS) are no longer paired with unpleasant odors (the US) (Gottfried & Dolan, 2004). Discrimination reversal also depends on dopamine modulation in the ventral striatum, notably the caudate and nucleus accumbens, possibly because of striato-frontal networks (Cools, Lewis, Clark, Barker, & Robbins, 2007; Roberts, 2008).

Stimulus control is also linked to the orbitofrontal cortex in tasks involving the discrimination of reinforcing value. Discriminative responses of orbitofrontal neurons in monkeys were evoked by visual SDs that signaled two appetitive SRs, raisins and apples. These neurons responded differentially to the preferred SR (i.e., raisins) and its corresponding SD. Later, the raisins were replaced by cereal, and the apple then became the preferred choice. Interestingly, neurons now responded to the apple and its SD, similar to the previous responses to raisins. Thus, the crucial variables here are not the physical properties of the presented reinforcers, but rather their relative value (Hassani, Cromwell, & Schultz, 2001; Schultz, 2000, 2004; Tremblay & Schultz, 1999). These data appear to parallel an experiment by Nicola, Yun, Wakabayashi, and Fields (2004), which showed differential discharges of nucleus accumbens neurons in three situations: the absence of the SD and the presence of the SD followed or not by a discriminative response (i.e., a nosepoke reinforced by sucrose). Discriminative responses of nucleus accumbens neurons signaled the presence of the SD, and they also "informed" whether an operant response would be emitted. As noted by Nicola et al. (2004), the discriminative response of nucleus accumbens neurons possibly derives from the massive glutamatergic input from the orbitofrontal cortex.

Relevant data have also been derived from the prefrontal cortex. Matsumoto, Suzuki, and Tanaka (2003) recorded the activity of prefrontal neurons in monkeys exposed to a go/no-go procedure, in which distinct SDs signaled reinforcement for moving a joystick or holding it without changing its course. The peculiarity of this procedure is the interposition of a time interval between these SDs and a go stimulus, which merely signals that the motor response must be initiated. Several visual SDs were used, and their functions could be reversed in different trial blocks to guarantee independence from the physical properties of the SD. The neural records during the interval between SD presentation and response emission showed that (1) some neurons fired only in reinforced trials, (2) other neurons fired only in non-reinforced trials, and (3) a third set of neurons fired in response to specific contingencies, such as SD → move the joystick → SR+, SD → do not move the joystick → SR+, or SΔ → any response → absence of SR+. In summary, contingencies that produce overt discriminated responses also produce consistently differentiated neural activity, and prefrontal activity consistent with discriminated overt responses can precede them.

Even more impressive are the data from Schoenbaum, Chiba, and Gallagher (1999) on early neural discrimination. These authors recorded the activity of neurons from the basolateral amygdala (according to Mesulam, 2000, the amygdala is part of the cerebral cortex and, because of its simplified cytoarchitecture, is designated a corticoid structure), which, similar to frontal cortical areas, sends massive projections to the nucleus accumbens. In different trials, rats were exposed to different odors that signaled the availability of two drinking stimuli: a pleasant sucrose solution or an aversive quinine solution (the latter of which could be avoided by simply waiting for the end of the trial). New odors were introduced in every session, and odors that had already been used could have their function reversed to ensure control by the contingencies and not by the chemical properties of the odors. Therefore, the animals had to constantly relearn the contingencies signaled by the changeable SDs. The results clearly demonstrated discriminative neuronal responses to the positive or aversive contingencies. In fact, discrimination occurred as soon as the animals smelled the odors, before they could reach the solution dispenser, which showed that responding was controlled by the SD and not by the SR. However, the most impressive results showed that in every block the discrimination at the neuronal level was observed before the behavioral discrimination (i.e., neural accuracy always preceded behavioral accuracy), and stimulus control effectively impacted neural activity even when rats were still drinking quinine. Notice that this precocious response of amygdala neurons was observed trials before the achievement of good behavioral performance and did not merely anticipate behavior during single trials, as was observed by Matsumoto et al. (2003), Schultz (2000), and Schultz et al. (2003). Schoenbaum et al. (1999) also found that the discriminative activity of orbitofrontal cortex neurons precisely correlated with behavioral discrimination. The "delay" of the orbitofrontal cortex relative to the amygdala and its "punctuality" relative to the behavioral responses are compatible with the orbitofrontal anatomical-functional properties of integrating information in motor planning. Prefrontal regions receive many low-level inputs. Some of these, such as the amygdala input, do not promptly, if at all, control behavior (Wallis, 2008). In such a case, the early discrimination from the amygdala appears to be an important example of a subtle and complex cognitive process that supposedly competes with other associative areas prior to the selection of an output response.

The refined results from experiments using neural responses as dependent variables make even clearer the adequacy of neural events as an object of study for behavioral science. In the future, other cognitive or emotional signals will certainly become decodable to help solve human problems. Nature does not distinguish physical events from mental events, and this is the reason why even thoughts, in the form of discriminative responses or anticipation of movements, have a material component subject to measurement.

Up to this point, anatomical and functional properties of conditioning processes have been analyzed. Now we must understand what happens at the cellular level.

In vitro operant conditioning in mammalian neurons

The operant response extensively studied in intact organisms may also be emitted by single and isolated neurons. Stein and Belluzzi (1989), Stein et al. (1993), and Stein, Xue, and Belluzzi (1994) studied operant responses of individual in vitro cells from the hippocampus of rats. Their procedure involved basic and elegant response shaping.

After exhibiting a baseline firing pattern, the hippocampal cells received postsynaptic dopamine or dopamine agonists contingent on the emission of that pattern. The action potential train (burst) frequency requirement for the dopamine injection gradually increased. Other cells received glutamate contingent on the response. Neurons receiving dopamine and dopaminergic agonists, but not glutamate, showed an increased frequency of activity bursts, indicating that even individual in vitro neurons can show operant learning (notably, the neurons that were not reinforced by postsynaptic glutamate are known to show LTP under the action of presynaptic glutamate).
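
The burst criterion at the heart of this procedure can be schematized as follows. This is a hypothetical simulation of burst-contingent reinforcement in the spirit of Stein et al. (1993), not their actual apparatus code; the inter-spike interval and spike-count thresholds are assumptions.

```python
# Schematic burst-contingent reinforcement: only trains of spikes (bursts),
# not isolated action potentials, satisfy the response criterion.
# The threshold values are assumptions chosen for illustration.

MAX_ISI = 10     # maximum inter-spike interval (ms) within a burst
MIN_SPIKES = 3   # spikes required for a train to count as a "burst" response

def detect_bursts(spike_times):
    """Return onset times of spike trains that meet the burst criterion."""
    bursts, run = [], [spike_times[0]]
    for t in spike_times[1:]:
        if t - run[-1] <= MAX_ISI:
            run.append(t)
        else:
            if len(run) >= MIN_SPIKES:
                bursts.append(run[0])
            run = [t]
    if len(run) >= MIN_SPIKES:
        bursts.append(run[0])
    return bursts

# A toy spike train: isolated spikes plus two tight bursts (times in ms).
spikes = [5, 40, 100, 104, 109, 200, 300, 303, 308, 311, 450]
for onset in detect_bursts(spikes):
    print(f"burst at {onset} ms -> deliver contingent dopamine pulse")
# Isolated spikes at 5, 40, 200, and 450 ms earn nothing.
```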

In the studies led by L. Stein, the increase in frequency was obtained only when reinforcement was contingent on bursts, not on isolated action potentials (spikes). This suggested to Donahoe et al. (1997) that dopamine has the function of selecting synaptic connections (not only responses) through the mechanism of postsynaptic long-term potentiation (LTP) because the synaptic glutamate release that produces greater sensitivity of the postsynaptic neuron depends on the bursts of the hippocampal presynaptic neurons. Donahoe and colleagues further assumed that the sequence of bursts reinforced by dopamine in experiments by L. Stein may also be caused in intact organisms by signals released from potentiated postsynaptic neurons. These retrograde signals (a candidate is nitric oxide, according to Deutch & Roth, 1999) would reach presynaptic neurons and alter their activity.

In the previous topics, the systemic aspects of reinforcement were expressed within specific circuitry. In the present topic, operant learning also emerges as a property of hippocampal neurons and perhaps of neurons in other brain areas that, similar to the hippocampus, play a role in reinforcement circuitry.

In vitro and in vivo operant and classical conditioning in mollusk neurons

The cellular basis of learning has begun to be well established. However, the specificity of cellular processes can be even more refined when central nervous system features are more easily accessed. This is the case with the mollusk Aplysia californica. Aplysia has a very simple and well-known nervous system, containing only approximately 20,000 neurons, which makes it an excellent model for studying the relationships between neurons and behavior. Studies of the neuronal foundations of operant behavior based on this mollusk are developing. From a series of experiments with positive reinforcement, the team led by J. H. Byrne suggested a possible basic site for conditioning, located in neurons of the buccal ganglia.

In Aplysia, food intake has as its neuronal correlate the activation of esophageal neurons, which carry the dopaminergic reinforcement signal to other systems. In an in vitro preparation, Nargeot, Baxter, and Byrne (1997) and Brembs, Baxter, and Byrne (2004) applied electrical pulses to the esophageal nerve when the buccal nerve exhibited a motor pattern typical of alimentary behavior. Activation of the esophageal nerve was presented as the reinforcer for responses from the buccal nerves. These responses were also elicited by antecedent tonic stimulation of the buccal nerve to create a baseline. The results showed a significant increase in the response from the buccal nerves when the SR was administered immediately after the response or with a short delay between them, and the response was extinguished when the tonic stimulation was not followed by the reinforcer. Nargeot et al. (1997) also found discrimination with regard to the antecedent tonic stimulation signaling reinforcement: after a period without stimulation (and, consequently, with virtually no spontaneous response emissions), responding resumed its vigor as soon as the stimulation was presented again. The similarity to an autoshaping procedure should be noted. The response initially elicited by tonic stimulation during baseline came to be controlled also by its consequence, and the antecedent stimulus became an operant SD, in addition to its classical function.

Neurons of the esophageal nerve contain dopamine, which once more emerges as a neurotransmitter for sensory signals of external events. The operant conditioning of the buccal nerve response was hindered by administration of the dopaminergic antagonist methylergonovine (Nargeot, Baxter, Patterson, & Byrne, 1999).

A specific buccal neuron, called B51, was especially sensitive to operant conditioning and showed plasticity properties caused by learning. Dopamine reinforcement of the activity of B51 neurons removed from naive animals produced the same increase in membrane excitability that was observed in neuron B51 of intact animals after food reinforcement (Brembs, Lorenzetti, Reyes, Baxter, & Byrne, 2002). In addition to providing further evidence of the relevance of dopamine, this datum established that neuron B51 is a possible site of convergence for operant behavior and the reinforcing stimulus. Interest in the molecular mechanisms of Aplysia's buccal dopaminergic neurons then increased, and Barbas et al. (2006) succeeded in cloning an active D1-like dopamine receptor in that mollusk.

Byrne's team was interested in the similarities between classical and operant conditioning paradigms and investigated whether neuron B51 could be a cellular site that links reflexive and operant behavior (Lorenzetti, Mozzachiodi, Baxter, & Byrne, 2006). Neuron B51 is pivotal for alimentary behavior because its depolarization elicits neural patterns correlated with the biting response (which is basically a response for food ingestion), and its hyperpolarization inhibits such patterns. Therefore, it is a neuron critical for eliciting neural food patterns and is also sensitive to the consequences of these patterns. In the study by Lorenzetti et al. (2006), the objective was to verify whether classical conditioning could also change the properties of B51. In in vivo and in vitro preparations, a tactile or neural CS (i.e., stimulation of nerve AT4) was paired with presentation of food or stimulation of the esophageal nerve. These pairings resulted in conditioned elicitation of in vivo biting responses and in vitro neural biting patterns. Following training, neuron B51 of dissected animals and neuron B51 directly trained in vitro showed equal plasticity, demonstrating replication between the in vivo and in vitro data. Unlike what happened in operant conditioning, diminished excitability of B51 was found in classical conditioning. However, this diminished excitability was compensated by the increased efficacy of synaptic input over B51 through the CS pathway, which made B51 generally more active. Briefly, decreased excitability would hinder, and increased synaptic efficacy would facilitate, the production of biting by neuron B51. Overall, some results with classical conditioning are similar to those obtained with operant conditioning. For example, the reinforcement pathway (esophageal nerve) and the transmitter involved (dopamine) are the same, and B51 is a common cellular site of plasticity (Baxter & Byrne, 2006; Lorenzetti et al., 2006). Nonetheless, the excitability factor disclosed a fundamental difference: operant responses would be facilitated by the properties of the neuron itself, whereas reflex responses would be facilitated by synapse properties (Baxter & Byrne, 2006; Brembs et al., 2002).
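
The contrast drawn by Baxter and Byrne (2006) can be caricatured computationally: in a threshold unit, the operant route lowers the firing threshold (an intrinsic property), whereas the classical route raises the weight of the CS input (a synaptic property), and either change makes the biting pattern more likely. The numbers below are purely illustrative assumptions, not measurements from B51.

```python
# Caricature of the B51 contrast: operant conditioning alters an intrinsic
# property (firing threshold), whereas classical conditioning alters a
# synaptic property (CS input weight). Either route facilitates the biting
# pattern. All values are illustrative assumptions.

def produces_biting(cs_input, cs_weight, threshold):
    return cs_input * cs_weight >= threshold

naive     = dict(cs_weight=1.0, threshold=2.0)
operant   = dict(cs_weight=1.0, threshold=1.2)  # increased membrane excitability
classical = dict(cs_weight=1.8, threshold=2.0)  # increased synaptic efficacy

for label, cell in [("naive", naive), ("operant", operant), ("classical", classical)]:
    outcome = "biting pattern" if produces_biting(cs_input=1.5, **cell) else "no biting"
    print(f"{label}: {outcome}")
```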

Considering the above experimental descriptions, the significant similarity between reinforcement mechanisms found in different species of animals must be stressed. For example, dopamine's role is uncontested, and it commonly interacts with glutamate in intact organisms. Notably, reinforcement mechanisms involve aspects common to mollusks and mammals and to some of their individual neurons.

Neural events can replace behavioral events in operant contingency

Technical developments in the neurosciences have made it possible to simulate, directly in the brain, the elements of a contingency. In a study by Talwar et al. (2002), rats discriminated electrical stimulation of the cortical regions representing the vibrissae, which served as a signal for them to move to the left or right. Reinforcement was electrical stimulation applied directly to the medial forebrain bundle (MFB), which links the ventral tegmental area and nucleus accumbens. The animals learned the task very well, based only on internal stimulation serving as the SD and SR. Thus, the functions of an external SD and appetitive SR were simulated within the organism. The reinforced response can also be neural. In an experiment by Nicolelis and Chapin (2002), rats and monkeys obtained appetitive reinforcers contingent on the emission of a neural activity pattern from the motor cortex, which had been correlated with a previously shaped operant motor response. A noticeable increase in the frequency of such a pattern was recorded, which demonstrated reinforcement. Another relevant datum from this experiment is that capturing activity from only a small neuronal population (50-100 neurons) was sufficient for an algorithm to code real-time neural activity correlated with the operant motor response and to transmit that information for the system to release reinforcement contingent on the neural pattern. The precision of the algorithm was so great that it could anticipate the topography of arm movements performed by monkeys. Therefore, neural responses can be reinforced similarly to movements, and even the activity of a few dozen neurons can be the response unit to be reinforced.
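
The decoding step can be schematized as comparing the current population firing-rate vector against a stored template of the pattern that accompanied the shaped motor response, and "releasing the reinforcer" when the similarity crosses a criterion. The cosine-similarity template matcher below is an illustrative stand-in, not the actual algorithm of Nicolelis and Chapin (2002); all rates and the threshold are assumptions.

```python
import math

# Schematic stand-in for a real-time neural decoder: compare the current
# population firing-rate vector with a template recorded during the shaped
# motor response and release the reinforcer when similarity is high enough.
# Cosine similarity and all numbers are illustrative assumptions.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

template = [12.0, 3.0, 8.0, 15.0, 1.0]  # firing rates (Hz) during bar presses
THRESHOLD = 0.95

samples = {
    "resting":    [2.0, 6.0, 3.0, 2.0, 5.0],
    "press-like": [11.0, 4.0, 7.5, 14.0, 1.5],
}
for label, rates in samples.items():
    similarity = cosine(rates, template)
    action = "release reinforcer" if similarity >= THRESHOLD else "withhold"
    print(f"{label}: similarity = {similarity:.2f} -> {action}")
```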

When an associative cortical area is several synapses away from the neural output of the operant response, its activity record tends to reveal aspects of response planning rather than performance (Hoshi, 2008; Scherberger & Andersen, 2007). Among the research on antecedent stimulus function and motor planning, the experiment by Musallam, Corneil, Greger, Scherberger, and Andersen (2004) must be highlighted. These authors implanted electrodes in the brains of monkeys in a parietal area that intermediates the visual and premotor cortices and precedes reaching movements by various synapses. In their procedure, each time an SD was shown on a screen, its specific position had to be touched after approximately 1.5 s. An algorithm decoded neural activity prior to specific movements and thus could foresee, during the 1.5 s interval, the position that the animal's hand would reach. Thus, real movement could be dispensed with, and the animal's "intention" could be reinforced. Subsequently, two SDs indicated different quantities, qualities, and probabilities of the reinforcer, and the algorithm's prediction became even more accurate when the SD signaled a preferred reinforcer variation. In summary, operant neural responses can reliably precede operant motor responses, similar to the findings of Schoenbaum et al. (1999) regarding discrimination, showing that neural accuracy precedes behavioral accuracy. Moreover, neural responses also indicate the value of positive reinforcers. According to Musallam et al. (2004), the decoding of parietal activity revealed a correlate of "thought" that could be a neural basis of intentions and expectations. The assumption that thoughts can be collected at the neural level should be seriously considered because information from the parietal cortex must still traverse a long way to produce a motor response, passing at least through the premotor and primary motor cortices.

A possible application derived from these methods for reinforcing or reading neural responses is the development of equipment to assist people with motor dysfunction, who generally have vision-related areas preserved. These areas can therefore provide signals of intentional movement. Neural knowledge is turning to complex aspects of behavior, as in this case of the concomitant coding, in associative cerebral areas, of visual, motor, and motivational information (Musallam et al., 2004). As discussed below, other associative areas, such as the hippocampus and entorhinal cortex, also play a role in learning the complex relations involved in cognition.

Cerebral structures and neural events in symbolic relations

The stimulus equivalence paradigm involves creating arbitrary categories formed by stimuli that do not share physical similarities. The paradigm is thus considered a basis for symbolic stimulus association and complex stimulus control. In equivalence class formation, different pairs of sample and comparison stimuli (e.g., A1B1, A1C1, A2B2, and A2C2) must be grouped according to a procedure of conditional discrimination called matching-to-sample. Conditional on the sample stimulus presented in each trial (e.g., A1 in one trial, A2 in another trial), only one comparison stimulus is correct (comparison B1 when A1 is the sample, or comparison B2 when A2 is the sample). Each sample presentation is accompanied by at least two comparisons (B1 and B2), and only the choice of the correct comparison is reinforced (if the sample is A1, then the choice of B1 is reinforced, and the incorrect choice of B2 is not). Certain stimuli, called nodes, are the links that allow the grouping of stimuli that were not paired during the reinforced training (for reinforced AB and AC training, A is the node between B and C). After training, tests are conducted under extinction conditions, and the establishment of pairs that were not directly reinforced is expected. These novel correct pairings are referred to as emergent relations, including symmetry (B1A1, C1A1, B2A2, and C2A2) and transitivity relations (B1C1, C1B1, B2C2, and C2B2). Through such a procedure, arbitrary stimuli of the same class become interchangeable and share behavioral functions. An event may then be referred to through its substitute, and symbolic knowledge is said to emerge.
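
The logic of these derived relations can be stated compactly: treating the reinforced pairs as directed relations and closing them under symmetry and transitivity yields exactly the untrained pairs that the extinction tests probe. The sketch below is purely a logical illustration of the paradigm described above, using the class labels from that description.

```python
# Derived relations in a stimulus equivalence preparation: from the reinforced
# baseline pairs (AB and AC training), closure under symmetry and transitivity
# generates the untrained pairs probed in extinction tests.

trained = {("A1", "B1"), ("A1", "C1"), ("A2", "B2"), ("A2", "C2")}

def derived_relations(pairs):
    relations = set(pairs)
    while True:
        new = {(b, a) for a, b in relations}                          # symmetry
        new |= {(a, c) for a, b in relations
                       for b2, c in relations if b == b2 and a != c}  # transitivity
        if new <= relations:                  # fixed point: nothing new emerges
            return relations - set(pairs)     # keep only the untrained pairs
        relations |= new

print(sorted(derived_relations(trained)))
# Yields the symmetry pairs (B1, A1), (C1, A1), ... and the transitivity pairs
# (B1, C1), (C1, B1), ..., all linked through the nodal stimuli A1 and A2.
```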

In studies of neuroscience and symbolic behavior, much has been investigated about the function of the hippocampus and connected associative areas because they are involved in memory and appear to play a role in establishing symbolic relations between stimuli (Mesulam, 2000; Miyashita, 2004). Relations emerging from symmetry and transitivity in rats with an injured hippocampus were studied by Bunsey and Eichenbaum (1996). Conditional discriminations AB and BC were trained, in addition to the symmetric relation BA. Only olfactory stimuli were used because this modality is naturally involved when rats search for food. In every trial, after the rat dug for a cup containing cereal buried in sand scented with the sample odor, two other comparison cups scented with new odors were presented. The odor of the first comparison cup in which rats began to dig was considered the chosen comparison stimulus. In the extinction tests, injured rats learned neither symmetry CB nor transitivity AC, although control rats performed accurately in tests of these emergent relations. These data, however, were not replicated in pigeons with hippocampal injury, which displayed normal pecking of visual stimuli in transitivity tests (Strasser, Ehrlinger, & Bingman, 2004).

Other ways of altering hippocampal function were also studied by H. Eichenbaum, highlighting the importance of the entorhinal cortex (which has bulky interconnections with the hippocampus) in procedures of conditional discrimination. Results similar to those described by Bunsey and Eichenbaum (1996) were obtained by destroying the cholinergic afferents to the entorhinal cortex and consequently suppressing information transmitted from the entorhinal cortex to the hippocampus. McGaughy, Koene, Eichenbaum, and Hasselmo (2005) studied conditional discrimination with odorized stimuli and verified that already learned conditional discriminations were maintained after lesions of the cholinergic afferents, even at intervals from 15 min to 3 h after training. However, no learning occurred when new odors were presented in non-reinforced tests performed after the surgical lesion, reiterating the importance of hippocampal pathways for recent memories.

Results consistent with those of Bunsey and Eichenbaum (1996) and McGaughy et al. (2005) were also found with lesions of the entorhinal cortex itself. Buckmaster, Eichenbaum, Amaral, Suzuki, and Rapp (2004) trained monkeys in conditional discriminations AB and AC, in which the sample and comparison stimuli were cookies of different colors and shapes. The cookies used as samples and correct comparisons had the same appetitive flavor, whereas incorrect choices had a bitter flavor. Thus, the visual modality of the cookies defined the antecedent stimuli (sample and comparison), whereas the gustatory modality served as the positive or negative reinforcer for choices made during both training and testing because the flavor was inherent to the stimuli that the monkeys received and ingested. The authors verified that monkeys with entorhinal cortex lesions required longer training and did not show learning of transitive relations.

Confirming the importance of the entorhinal cortex in conditional discrimination, Coutureau et al. (2002) observed that lesions of the entorhinal cortex in rats, but not of the hippocampus, prevented the reinforcing stimulus from joining equivalence classes as a member. After training with stimuli of the visual modality (i.e., chambers with different pictures on the walls), thermal modality (i.e., chambers with different temperatures), and auditory modality (i.e., a sound or a click), two classes were formed, each containing one stimulus from each modality. Thus, one class contained the chamber with visual pattern 1 (V1), the chamber with temperature 1 (T1), and auditory stimulus 1 (A1), and the other class was formed by V2, T2, and A2. The auditory stimuli were the nodal stimuli of their respective classes. A substantial amount of free food was then provided in chamber V1, but not in chamber V2, and greater activity was observed in the chamber associated with plenty of food. When rats with hippocampal lesions were placed in chambers T1 and T2, they behaved as if they were in V1 and V2, respectively. However, rats with entorhinal lesions were not sensitive to this differential reinforcement and therefore did not respond in the T chambers as if they were in the V chambers (i.e., the entorhinal cortex lesion impaired the formation of equivalence classes).

Regarding cellular measures, if the learning of conditional discriminations corresponds to differential neuronal responses, then specific neural pathways or processes can be suggested to codify meanings. Sakai and Miyashita (1991) recorded responses from temporal cortex neurons of two monkeys in a matching-to-sample procedure, presenting arbitrary visual stimuli (geometric patterns) on a computer screen. Because the temporal cortex is intimately involved in memory processes, a 4 s delay was imposed between the end of the sample presentation and the presentation of the comparison stimuli. A correct comparison choice released fruit juice as the SR. Relations between 12 pairs of stimuli were reinforced (pairs 1-1' to 12-12'), as well as the respective symmetric relations (pairs 1'-1 to 12'-12). Two patterns of neuronal electrical activity appeared in the records after training. In the first pattern, some neurons responded consistently to both members of certain stimulus pairs. For example, neuron X responded to pairs 12-12' and 12'-12, and neuron Y responded to pairs 5-5' and 5'-5 and also to 6-6' and 6'-6. In the second pattern, other neurons responded better to one member of the pair. If, for example, the activity of neuron Z was greater for stimulus 7', then its response was elicited as soon as that stimulus was presented as the sample in pair 7'-7, and its activity gradually increased during the delay of pair 7-7'. Such activity during the delay between the sample and comparison presentations was not attributable to anticipation of motor activity because the monkeys could not foresee the position where the comparison stimulus would appear on the screen. Discriminative responses occurred for various stimulus pairs because individual neurons codified each element of a pair independently of the function it assumed in the contingency, either as sample or as correct comparison: the neurons reliably signaled in advance the presence of a particular member of the pair. Citing these results and indicating that behavior analysis cannot renounce neural analysis, depending on the research problem, Donahoe (1996) suggested that "direct effects of stimulus-stimulus relations can be observed only at the neural level" (p. 72).

Future research might well record the activity of dopaminergic mesencephalic neurons of monkeys during matching-to-sample. In general, greater accuracy in behavioral performance during training implies better prediction of which comparison signals reinforcement. Behavioral discrepancy would therefore be smaller, and the response of dopaminergic neurons should decrease. Verifying the activity of these neurons in a test of emergent relations would be even more interesting. Would the neural activity indicate that new relations between stimuli cause surprise and a large discrepancy? Or, conversely, could emergent relations show the same neural pattern observed for trained relations, revealing the absence of discrepancy?
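
The logic of this prediction can be illustrated with a standard Rescorla-Wagner-style update. The sketch below is a textbook formulation offered by us for illustration, not a model taken from Schultz's work; the learning rate and reward magnitude are arbitrary assumptions:

```python
# Illustrative sketch (assumed parameters): a Rescorla-Wagner-style
# prediction-error signal, the quantity that midbrain dopamine neurons
# are thought to report as behavioral discrepancy.

alpha = 0.3   # learning rate (arbitrary assumption)
reward = 1.0  # US/SR+ magnitude on reinforced trials
v = 0.0       # current prediction evoked by the CS/SD

for trial in range(1, 11):
    delta = reward - v  # discrepancy: obtained minus predicted reward
    v += alpha * delta  # the prediction improves across training
    print(f"trial {trial:2d}  prediction {v:.3f}  discrepancy {delta:.3f}")

# As training proceeds, the discrepancy at reward delivery approaches
# zero, so the phasic dopamine response should shift from the US/SR+
# to the antecedent CS/SD, as described in the text.
```

Under this formulation, a surprising emergent relation would correspond to a large delta at test, whereas a fully "pre-formed" relation would yield a delta near zero.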

A traditional question in the equivalence area concerns "whether emergent behavior exists before we actually see it" (Sidman, 1994, p. 274). Does it already exist after training but before the test of the emergence of new relations, or does it emerge only because of the variables present in the test? For some behavioral theorists (among them, M. Sidman himself), the assumption that classes of stimuli existed before the test contingency could dangerously slide into cognitivism. With regard to this issue, Haimson, Wilkinson, Rosenquist, Ouimet, and McIlvane (2009) reported an interesting datum that strengthens Sidman's view of the need for testing. As is typical with human participants, after arbitrary matching-to-sample training, the participants of that study successfully performed the matching-to-sample test for emergent relations. Before or after the test phase, however, related and unrelated stimulus pairs were alternately presented in non-reinforced trials. These stimulus pairs involved the potentially related and unrelated stimuli that would appear in the equivalence test. The participants were asked to silently judge whether each pair was related. Haimson and colleagues measured a brain wave pattern called the N400, a negative voltage deflection that typically appears approximately 400 ms after phrases or words are noticed to be semantically mismatched. The N400 pattern was immediately seen for unrelated pairs when the judgments followed the matching-to-sample testing phase. However, when the electrophysiological data were collected prior to the matching-to-sample test, the N400 tended to develop over the trials with unrelated pairs. Once the N400 pattern was clearly established, the participants immediately showed accuracy in the ensuing matching-to-sample test. This finding indicates that responding (silently judging) to stimulus pairs favored equivalence class formation. Thus, Haimson et al. (2009) indicated that the non-reinforced presentation of related and unrelated pairs might have the same effect as non-reinforced matching-to-sample testing. A testing context, then, whether or not it involves matching-to-sample, is probably necessary for the emergence of equivalence relations, as Sidman (1994) argued. In practice, the N400 waveform might be a good neural candidate for predicting accuracy in symbolic learning.
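
As a rough illustration of how such a measure can be quantified, the following sketch estimates an N400-like amplitude as the mean baseline-corrected voltage in a 300-500 ms post-stimulus window on synthetic epochs. The sampling rate, window bounds, and data are assumptions of ours, not the recording procedure of Haimson et al. (2009):

```python
# Minimal sketch (assumed data layout and window): estimating N400
# amplitude, then comparing related vs. unrelated stimulus-pair epochs.

import numpy as np

FS = 250   # sampling rate in Hz (assumption)
T0 = 0.2   # seconds of pre-stimulus baseline (assumption)

def n400_amplitude(epoch):
    """Mean baseline-corrected voltage (arbitrary units) in the
    300-500 ms window after stimulus onset; more negative values
    suggest an N400-like deflection."""
    onset = int(T0 * FS)
    baseline = epoch[:onset].mean()
    window = epoch[onset + int(0.3 * FS): onset + int(0.5 * FS)]
    return window.mean() - baseline

# Synthetic epochs: the unrelated pair carries a negative deflection
# centered near 400 ms; the related pair is noise only.
rng = np.random.default_rng(0)
t = np.arange(int((T0 + 0.8) * FS)) / FS - T0
related = rng.normal(0, 1, t.size)
unrelated = rng.normal(0, 1, t.size) - 5 * np.exp(-((t - 0.4) ** 2) / 0.005)

print("related:  ", round(n400_amplitude(related), 2))
print("unrelated:", round(n400_amplitude(unrelated), 2))  # more negative
```

In this toy setup, a growing gap between the two amplitudes over trials would play the role of the developing N400 that preceded accurate equivalence performance.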

The area of equivalence class formation opens multiple possibilities for research on the behavioral-neural interface. This is especially true when considering, as suggested by Matos (1999) and Sidman (2000), that antecedent stimuli, reinforcing stimuli, covert events such as drug effects, elicited responses such as skin conductance, and operant responses determined by reinforcement schedules may all become part of a stimulus class. Such inclusion suggests that all of these elements may be part of the environment-behavior units selected by reinforcement, according to the proposal by Donahoe and Palmer (1994). A potentially fertile field is revealed here for investigating the confluence of neurobiological and behavioral variables.

Final Considerations

The neural processes involved in stimulus control described in most of the articles selected for this paper parallel the conditioning paradigms. Such integration between the two paradigms demands careful analysis of the literature because almost all of the papers found used either the classical or the operant paradigm and thus did not need to address, for example, the comparable features of experimental designs. Few of the cited authors dealt with both paradigms; J. H. Byrne is a notable exception. Apparently, even W. Schultz, one of the most cited researchers in the present work, showed little concern with comparing the paradigms. His analyses are simply descriptions of relations in terms of predictor stimuli (CS or SD) and predicted stimuli (US or SR), without mention of any possible interference between classical and operant contingencies. For example, in classical procedures, Schultz neither suggested nor controlled for the possibility that superstitious operant responses (temporally contiguous with, but not causally contingent on, the consequence) interfered in the intervals between the presentation of the CS and US (Fiorillo et al., 2003). Schultz also did not consider that, although the antecedent stimulus normally serves as an SD in operant procedures, it is impossible to know with assurance, at the neural level, whether the pathway of the antecedent stimulus might actually elicit the processes that culminate in the measured neural activity (Hassani et al., 2001; Schultz, 2000, 2004; Schultz et al., 2003; Tremblay & Schultz, 1999). The SD for an operant task, in fact, could also be a CS for neuronal activity.

A summary of the discussed findings is presented below to clarify the similarities and differences between the mechanisms of the two conditioning paradigms. The similarities concern the following points:

• Enhanced dopaminergic activity facilitates classical and operant conditioning (Wise, 2004; Taylor & Robbins, 1984), and decreased dopaminergic activity attenuates both forms of learning (Wise, 2004; Wolterink et al., 1993).

• Dopamine has an uncontested role as a signal of behavioral discrepancy in neural pathways of positive reinforcement in mammals. When the presentation of the US or SR+ is signaled, responses from dopaminergic neurons decrease for these stimuli and are transferred to the CS or SD (Fiorillo et al., 2003). The CS also evokes neuronal discriminative responses that vary according to US probability (i.e., to the degree of discrepancy) (Fiorillo et al., 2003).

• Neuronal responses evoked by the SD inform whether there will be an emission of a motor response (Nicola et al., 2004) and also indicate the discrimination of preferred reinforcers (Hassani et al., 2001; Schultz, 2000, 2004; Tremblay & Schultz, 1999).

• Prefrontal and orbitofrontal cortices are important structures involved in stimulus control. Their neurons respond differentially to stimuli that precede reinforcers preferred by monkeys (Hassani et al., 2001; Schultz, 2000, 2004; Tremblay & Schultz, 1999). Lesions of these cortices hinder the successful reversal of stimulus control in rats, monkeys, and humans (Butter, 1969; Chudasama & Robbins, 2003; Jones & Mishkin, 1972; Rolls, 2000) and also block both classical and operant extinction (Butter, 1969; Gottfried & Dolan, 2004; Rolls, 2000).

• Glutamate modulates dopaminergic activity in both conditioning paradigms and is released by the antecedent stimulus (CS or SD) pathways (Wise, 2004).

• To date, only the nervous system of Aplysia has allowed both classical and operant conditioning to be observed within the same experimental design. Both forms of learning use the same neural reinforcement pathways and the same neurotransmitter (dopamine) and produce plasticity in the same buccal motor neuron, B51 (Baxter & Byrne, 2006; Lorenzetti et al., 2006).

The data reviewed so far with respect to stimulus control suggest that the boundaries between reflexive and operant behavior are tenuous. Nevertheless, an important difference was found by Lorenzetti et al. (2006). The plasticity shown by neuron B51 of Aplysia took distinct courses in operant and classical conditioning: in the former, neuronal excitability increased, whereas in the latter it diminished. The classical conditioning case appears incongruent because diminished excitability should produce weaker CR elicitation. However, a concomitant increase was found in the excitatory input from the presynaptic neuron, which compensated for the diminished excitability of B51 in classical conditioning. Differences were thus found in the intrinsic properties of neuron B51, which justifies, according to Lorenzetti and colleagues, new detailed investigations of the reinforcement (US or SR+) and CS pathways of Aplysia.

The study of neural variables may be applied to the entire research program of behavior analysis. Stimulus equivalence, for example, is an obvious field ripe for exploration. In addition to the promising use of waveforms such as the N400 for anticipating accuracy in complex learning (Haimson et al., 2009), researchers can search for the common pathways that are presumably used by all stimuli that share behavioral functions. If such pathways are found, then it becomes possible to study how neural convergence is created for the different stimuli assembled in a class. Some important methodologies were not included in this paper because they would add an unmanageable volume of text; among them are studies on biofeedback and neuroimaging techniques.

Finally, we hope the presented analysis has caused some positive behavioral discrepancy in the reader. If so, then some theoretical progress in the understanding of the biology of reinforcement has been achieved.

Received 13 September 2010; received in revised form 21 November 2010; accepted 22 November 2010. Available online 28 December 2010.

Luiz Guilherme Gomes Cardim Guerra and Maria Teresa Araujo Silva, Departamento de Psicologia Experimental, Instituto de Psicologia, Universidade de São Paulo, Brazil.

  • Alves, C.R.R., Guerra, L.G.G.C., & Silva, M.T.A. (1999). Inibição latente, um modelo experimental de esquizofrenia. Psiquiatria Biológica, 7, 111-117.
  • Barbas, D., Zappulla, J.P., Angers, S., Bouvier, M., Mohamed, H.A., Byrne, J.H., Castellucci, V.F., & DesGroseillers, L. (2006). An Aplysia dopamine1-like receptor: molecular and functional characterization. Journal of Neurochemistry, 96, 414-427.
  • Baum, W.M. (1994). Understanding behaviorism: science, behavior, and culture. New York: HarperCollins.
  • Baxter, D.A., & Byrne, J.H. (2006). Feeding behavior of Aplysia: a model system for comparing cellular mechanisms of classical and operant conditioning. Learning and Memory, 13, 669-680.
  • Brembs, B., Lorenzetti, F.D., Reyes, F.D., Baxter, D.A., & Byrne, J.H. (2002). Operant reward learning in Aplysia: neuronal correlates and mechanisms. Science, 296, 1706-1709.
  • Brembs, B., Baxter, D.A., & Byrne, J.H. (2004). Extending in vitro conditioning in Aplysia to analyze operant and classical processes in the same preparation. Learning and Memory, 11, 412-420.
  • Buckmaster, C.A., Eichenbaum, H., Amaral, D.G., Suzuki, W.A., & Rapp, P.R. (2004). Entorhinal cortex lesions disrupt the relational organization of memory in monkeys. Journal of Neuroscience, 24, 9811-9825.
  • Bunsey, M., & Eichenbaum, H. (1996). Conservation of hippocampal memory function in rats and humans. Nature, 379, 255-257.
  • Butter, C.M. (1969). Perseveration in extinction and in discrimination reversal tasks following selective frontal ablations in Macaca mulatta. Physiology and Behavior, 4, 163-171.
  • Catania, A.C. (1998). Learning. Upper Saddle River, N.J.: Prentice Hall.
  • Chiesa, M. (1994). Radical behaviorism: the philosophy and the science. Boston: Authors Cooperative.
  • Chilingaryan, L.I. (2001). I.P. Pavlov's theory of higher nervous activity: landmarks and developmental trends. Neuroscience and Behavioral Physiology, 31, 39-47.
  • Chudasama, Y., & Robbins, T.W. (2003). Dissociable contributions of the orbitofrontal and infralimbic cortex to Pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. Journal of Neuroscience, 23, 8771-8780.
  • Cools, R., Lewis, S.J.G., Clark, L., Barker, R.A., & Robbins, T.W. (2007). L-DOPA disrupts activity in the nucleus accumbens during reversal learning in Parkinson's disease. Neuropsychopharmacology, 32, 180-189.
  • Coutureau, E., Killcross, A.S., Good, M., Marshall, V.J., Ward-Robinson, J., & Honey, R.C. (2002). Acquired equivalence and distinctiveness of cues: II. Neural manipulations and their implications. Journal of Experimental Psychology: Animal Behavior Processes, 28, 388-396.
  • DeLong, M.R. (2000). The basal ganglia. In: E.R. Kandel, J.H. Schwartz, & T.M. Jessell (Eds.), Principles of neural science (pp. 853-867). New York: McGraw-Hill.
  • Deutch, A.Y., Bourdelais, A.J., & Zahm, D.S. (1993). The nucleus accumbens core and shell: accumbal compartments and their functional attributes. In: P.W. Kalivas & C.D. Barnes (Eds.), Limbic motor circuits and neuropsychiatry (pp. 45-88). Boca Raton, F.L.: CRC Press.
  • Deutch, A.Y., & Roth, R.H. (1999). Neurotransmitters. In: M.J. Zigmond, F.E. Bloom, S.C. Landis, J.L. Roberts, & L.R. Squire (Eds.), Fundamental neuroscience (pp. 193-234). San Diego, C.A.: Academic Press.
  • DiFiore, A., Dube, W.V., Oross, S. III, Wilkinson, K., Deutsch, C.K., & McIlvane, W.J. (2000). Studies of brain activity correlates of behavior in individuals with and without developmental disabilities. Experimental Analysis of Human Behavior Bulletin, 18, 33-35.
  • Donahoe, J.W. (1996). On the relation between behavior analysis and biology. Behavior Analyst, 19, 71-73.
  • Donahoe, J.W. (2003). Selectionism. In: K.A. Lattal & P.N. Chase (Eds.), Behavior theory and philosophy (pp. 103-128). New York: Kluwer/Plenum.
  • Donahoe, J.W., & Palmer, D.C. (1994). Learning and complex behavior. Boston: Allyn and Bacon.
  • Donahoe, J.W., Palmer, D.C., & Burgos, J.E. (1997). The S-R issue: its status in behavior analysis and in Donahoe and Palmer's Learning and complex behavior. Journal of the Experimental Analysis of Behavior, 67, 193-211.
  • Donahoe, J.W., & Vegas, R. (2004). Pavlovian conditioning: the CS-UR relation. Journal of Experimental Psychology: Animal Behavior Processes, 30, 17-33.
  • Ettenberg, A., & Camp, C.H. (1986a). Haloperidol induces a partial reinforcement extinction in rats: implications for a dopamine involvement in food reward. Pharmacology Biochemistry and Behavior, 25, 813-821.
  • Ettenberg, A., & Camp, C.H. (1986b). A partial reinforcement extinction effect in water-reinforced rats intermittently treated with haloperidol. Pharmacology Biochemistry and Behavior, 25, 1231-1235.
  • Fetz, E.E., & Finocchio, D.V. (1971). Operant conditioning of specific patterns of neural and muscular activity. Science, 174, 431-435.
  • Fiorillo, C.D., Tobler, P.N., & Schultz, W. (2003). Discrete coding of reward probability and uncertainty by dopamine neurons. Science, 299, 1898-1902.
  • Gómez-Pinilla, F., Huie, J.R., Ying, Z., Ferguson, A.R., Crown, E.D., Baumbauer, K.M., Edgerton, V.R., & Grau, J.W. (2007). BDNF and learning: evidence that instrumental training promotes learning within the spinal cord by up-regulating BDNF expression. Neuroscience, 148, 893-906.
  • Gottfried, J.A., & Dolan, R.J. (2004). Human orbitofrontal cortex mediates extinction learning while accessing conditioned representations of value. Nature Neuroscience, 7, 1144-1152.
  • Haimson, B., Wilkinson, K.M., Rosenquist, C., Ouimet, C., & McIlvane, W.J. (2009). Electrophysiological correlates of stimulus equivalence processes. Journal of the Experimental Analysis of Behavior, 92, 245-256.
  • Hassani, O.K., Cromwell, H.C., & Schultz, W. (2001). Influence of expectation of different rewards on behavior-related neuronal activity in the striatum. Journal of Neurophysiology, 85, 2477-2489.
  • Heimer, L. (1994). The human brain and spinal cord: functional neuroanatomy and dissection guide. New York: Springer-Verlag.
  • Hoshi, E. (2008). Differential involvement of the prefrontal, premotor, and primary motor cortices in rule-based motor behavior. In: S.A. Bunge & J.D. Wallis (Eds.), Neuroscience of rule-guided behavior (pp. 159-175). New York: Oxford University Press.
  • Jones, B., & Mishkin, M. (1972). Limbic lesions and the problem of stimulus-reinforcement associations. Experimental Neurology, 36, 362-377.
  • Kelley, A.E. (2004). Ventral striatal control of appetitive motivation: role in ingestive behavior and reward-related learning. Neuroscience and Biobehavioral Reviews, 27, 765-776.
  • Konorski, J., & Miller, S. (1937). On two types of conditioned reflex. Journal of General Psychology, 16, 264-272.
  • Kupfermann, I., Kandel, E.R., & Iversen, S. (2000). Motivational and addictive states. In: E.R. Kandel, J.H. Schwartz, & T.M. Jessell (Eds.), Principles of neural science (pp. 998-1013). New York: McGraw-Hill.
  • Lorenzetti, F.D., Mozzachiodi, R., Baxter, D.A., & Byrne, J.H. (2006). Classical and operant conditioning differentially modify the intrinsic properties of an identified neuron. Nature Neuroscience, 9, 17-19.
  • Lubow, R.E. (1998). Latent inhibition and behavior pathology: prophylactic and other possible effects of stimulus preexposure. In: W.T. O'Donohue (Ed.), Learning and behavior therapy (pp. 107-121). Boston: Allyn and Bacon.
  • Martin, J.H. (1998). Neuroanatomia: texto e atlas. Porto Alegre: Artes Médicas.
  • Matos, M.A. (1999). Controle de estímulo condicional, formação de classes conceituais e comportamentos cognitivos. Revista Brasileira de Terapia Comportamental e Cognitiva, 1, 159-178.
  • Matsumoto, K., Suzuki, W., & Tanaka, K. (2003). Neuronal correlates of goal-based motor selection in the prefrontal cortex. Science, 301, 229-232.
  • McGaughy, J., Koene, R.A., Eichenbaum, H., & Hasselmo, M.E. (2005). Cholinergic deafferentation of the entorhinal cortex in rats impairs encoding of novel but not familiar stimuli in a delayed nonmatch-to-sample task. Journal of Neuroscience, 25, 10273-10281.
  • McIlvane, W.J., & Dube, W.V. (1997). Units of analysis and the environmental control of behavior. Journal of the Experimental Analysis of Behavior, 67, 235-239.
  • McKim, W.A. (2007). Drugs and behavior: an introduction to behavioral pharmacology. Upper Saddle River, N.J.: Prentice Hall.
  • Mesulam, M.M. (2000). Principles of behavioral and cognitive neurology. New York: Oxford University Press.
  • Miyashita, Y. (2004). Cognitive memory: cellular and network machineries and their top-down control. Science, 306, 435-440.
  • Morgan, M.J. (1974). Resistance to satiation. Animal Behaviour, 22, 449-466.
  • Morris, E.K., Lazo, J.F., & Smith, N.G. (2004). Whether, when, and why Skinner published on biological participation in behavior. Behavior Analyst, 27, 153-169.
  • Musallam, S., Corneil, B.D., Greger, B., Scherberger, H., & Andersen, R.A. (2004). Cognitive control signals for neural prosthetics. Science, 305, 258-262.
  • Nargeot, R., Baxter, D.A., & Byrne, J.H. (1997). Contingent-dependent enhancement of rhythmic motor patterns: an in vitro analog of operant conditioning. Journal of Neuroscience, 17, 8093-8105.
  • Nargeot, R., Baxter, D.A., Patterson, G.W., & Byrne, J.H. (1999). Dopaminergic synapses mediate neuronal changes in an analogue of operant conditioning. Journal of Neurophysiology, 81, 1983-1987.
  • Nestler, E.J., Hyman, S.E., & Malenka, R.C. (2001). Molecular neuropharmacology: a foundation for clinical neuroscience. New York: McGraw-Hill.
  • Nicola, S.M., Yun, I.A., Wakabayashi, K.T., & Fields, H.L. (2004). Cue-evoked firing of nucleus accumbens neurons encodes motivational significance during a discriminative stimulus task. Journal of Neurophysiology, 91, 1840-1865.
  • Nicolelis, M.A.L., & Chapin, J.K. (2002). Controlling robots with the mind. Scientific American, 287, 24-31.
  • Olds, J., & Milner, P. (1954). Positive reinforcement produced by electrical stimulation of septal area and other regions of rat brain. Journal of Comparative and Physiological Psychology, 47, 419-427.
  • Pavlov, I.P. (1927). Conditioned reflexes: an investigation of the physiological activity of the cerebral cortex. London: Oxford University Press.
  • Pavlov, I.P. (1986). Mecanismo fisiológico de los movimientos voluntarios. In: A. Colodrón (Ed.), Fisiología y psicología (pp. 143-148). Madrid: Alianza Editorial. Original work published in 1936.
  • Rescorla, R.A. (1988). Pavlovian conditioning: it's not what you think it is. American Psychologist, 43, 151-160.
  • Roberts, A.C. (2008). Dopaminergic and serotonergic modulation of two distinct forms of flexible cognitive control: attentional set-shifting and reversal learning. In: S.A. Bunge, & J.D. Wallis (Eds.), Neuroscience of rule-guided behavior (pp. 283-312). New York: Oxford University Press.
  • Rolls, E.T. (2000). The orbitofrontal cortex and reward. Cerebral Cortex, 10, 284-294.
  • Saint-Cyr, J.A. (2003). Frontal-striatal circuit functions: context, sequence and consequence. Journal of the International Neuropsychological Society, 9, 103-127.
  • Sakai, K., & Miyashita, Y. (1991). Neural organization for the long-term memory of paired associates. Nature, 354, 152-155.
  • Scherberger, H., & Andersen, R.A. (2007). Target selection signals for arm reaching in the posterior parietal cortex. Journal of Neuroscience, 27, 2001-2012.
  • Schoenbaum, G., Chiba, A.A., & Gallagher, M. (1999). Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. Journal of Neuroscience, 19, 1876-1884.
  • Schultz, W. (1999). The reward signal of midbrain dopamine neurons. News in Physiological Sciences, 14, 249-255.
  • Schultz, W. (2000). Multiple reward signals in the brain. Nature Reviews Neuroscience, 1, 199-207.
  • Schultz, W. (2004). Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology. Current Opinion in Neurobiology, 14, 139-147.
  • Schultz, W., Dayan, P., & Montague, P.R. (1997). A neural substrate of prediction and reward. Science, 275, 1593-1599.
  • Schultz, W., Tremblay, L., & Hollerman, J.R. (2003). Changes in behavior-related neuronal activity in the striatum during learning. Trends in Neurosciences, 26, 321-328.
  • Sidman, M. (1960). Tactics of scientific research: evaluating experimental data in psychology. New York: Basic Books.
  • Sidman, M. (1994). Equivalence relations and behavior: a research story. Boston: Authors Cooperative.
  • Sidman, M. (2000). Equivalence relations and the reinforcement contingency. Journal of the Experimental Analysis of Behavior, 74, 127-146.
  • Silva, M.T.A. (2005). Análise biocomportamental. Neurociências, 2, 43-47.
  • Silva, M.T.A., Gonçalves, F.L., & Garcia-Mijares, M. (2007). Neural events in the reinforcement contingency. Behavior Analyst, 30, 17-30.
  • Skinner, B.F. (1938). The behavior of organisms: an experimental analysis. New York: D. Appleton-Century.
  • Skinner, B.F. (1953). Science and human behavior. New York: Macmillan.
  • Skinner, B.F. (1974). About behaviorism. New York: Knopf.
  • Skinner, B.F. (1988a). Skinner's reply to Harnad. In: A.C. Catania, & S.R. Harnad (Eds.), The selection of behavior: the operant behaviorism of B.F. Skinner-comments and consequences (pp. 468-473). Cambridge: Cambridge University Press.
  • Skinner, B.F. (1988b). Skinner's reply to Catania. In: A.C. Catania, & S.R. Harnad (Eds.), The selection of behavior: the operant behaviorism of B.F. Skinner-comments and consequences (pp. 483-488). Cambridge: Cambridge University Press.
  • Stein, L., & Belluzzi, J.D. (1989). Cellular investigations of behavioral reinforcement. Neuroscience and Biobehavioral Reviews, 13, 69-80.
  • Stein, L., Xue, B.G., & Belluzzi, J.D. (1993). A cellular analogue of operant conditioning. Journal of the Experimental Analysis of Behavior, 60, 41-53.
  • Stein, L., Xue, B.G., & Belluzzi, J.D. (1994). In vitro reinforcement of hippocampal bursting: a search for Skinner's atoms of behavior. Journal of the Experimental Analysis of Behavior, 61, 155-168.
  • Strasser, R., Ehrlinger, J.M., & Bingman, V.P. (2004). Transitive behavior in hippocampal-lesioned pigeons. Brain, Behavior and Evolution, 63, 181-188.
  • Talwar, S.K., Xu, S., Hawley, E.S., Weiss, S.A., Moxon, K.A., & Chapin, J.K. (2002). Rat navigation guided by remote control. Nature, 417, 37-38.
  • Taylor, J.R., & Robbins, T.W. (1984). Enhanced behavioural control by conditioned reinforcers following microinjections of d-amphetamine into the nucleus accumbens. Psychopharmacology, 84, 405-412.
  • Tremblay, L., & Schultz, W. (1999). Relative reward preference in primate orbitofrontal cortex. Nature, 398, 704-708.
  • Wallis, J.D. (2008). Single neuron activity underlying behavior-guiding rules. In: S.A. Bunge, & J.D. Wallis (Eds.), Neuroscience of rule-guided behavior (pp. 23-44). New York: Oxford University Press.
  • Wise, R.A. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5, 483-494.
  • Wolterink, G., Phillips, G., Cador, M., Donselaar-Wolterink, I., Robbins, T.W., & Everitt, B.J. (1993). Relative roles of ventral striatal D1 and D2 dopamine receptors in responding with conditioned reinforcement. Psychopharmacology, 110, 355-364.
1. According to Mesulam (2000), the amygdala is part of the cerebral cortex and, because of its simplified cytoarchitecture, is designated as a corticoid structure.
2. The neurons that were not reinforced by postsynaptic glutamate are known to show LTP under the action of presynaptic glutamate.