Modulation by Context of a Scene in Monkey Anterior Inferotemporal Cortex during a Saccadic Eye Movement Task

We investigated the effect of a scene on the activity of cells in the anterior inferotemporal (AIT) cortex while the monkey performed a saccadic eye movement (SEM) task with and without the context of a scene (gray frame). Most neurons did not code for the presence of a scene when it appeared alone (monkey free viewing) or when the monkey was fixating. Nevertheless, when a peripheral target was turned on and the monkey had to make a SEM to it, some cells were capable of differentially coding the presence of the scene before and after the saccade.


INTRODUCTION
The underlying circuitry and mechanisms subserving visual object recognition in the inferotemporal (IT) cortex are still poorly understood. Nevertheless, since the initial works of Gross and collaborators (Gross et al. 1969, 1972, Gross 1973, Schwartz et al. 1983, Desimone et al. 1984), models have been proposed based mainly on work carried out in anesthetized preparations. The procedure normally adopted to study the response profile of isolated neurons in IT cortex is to select stimuli to which the cell responds preferentially. Using an extensive set of stimuli, Tanaka and collaborators (Tanaka et al. 1991, Tanaka 1996) systematically studied the response profile of IT units by determining the stimulus features necessary and sufficient for the cell's maximal activation. Based on the finding that the selectivity of IT cells was rather sharp but not absolute, these investigators proposed the "combinatory code" model. Optical imaging data subsequently corroborated the electrophysiological evidence for a modular organization in IT cortex (Wang et al. 1996, 1998). Modules specific for particular parameters of the visual stimuli would be the neural substrate for "combinatory code" mechanisms, so that activation of a few of these modules would be capable of representing any natural object.
In the present work, we observed that some neurons in AIT could selectively code for the presence of a given visual object. With the monkey fixating a small square (fixation point, FP), the presence of a scene (a gray frame) could be totally ineffective in driving the neuron. On the other hand, the cell's response could be strongly modulated by the same object when the monkey was required to make a SEM to a peripherally presented target in the presence of the scene.

MATERIALS AND METHODS
All experimental procedures were conducted in accordance with the guidelines for care and use of laboratory animals (CAUAP) of the Institute of Biophysics Carlos Chagas Filho, which conform to the National Institutes of Health (Bethesda, MD, USA) guidelines. One Cebus apella monkey weighing 4 kg was used in this study. The methods of anesthesia, single unit recording and histological processing have been described in detail elsewhere (Gallyas 1979, Gattass and Gross 1981). In summary, the animal was implanted under sterile conditions with a recording chamber and a head bolt. For monitoring the eye position, a scleral search coil was also implanted (Judge et al. 1980).
The monkey was trained to fixate a FP and to make a SEM to a peripheral target in conditions with and without a contextual scene. The fixation window was a 3.5° × 3.5° square centered on the FP or on the target. Fig. 1a illustrates the behavioral tasks used. In the conditions with contextual scene, a gray frame initially appeared alone for 1250 ms at the beginning of the trial. The frame had a thickness of 0.25° and inner dimensions of 23° (width) by 17° (height). A FP (0.3° × 0.3°) then appeared and the animal had to hold fixation for 500 ms. Then, the FP was turned off at the same time that a peripheral target appeared. The stimulus used as target was a colored disc of 1° diameter. The center of the target was 2.5° away from the closest side of the frame. The animal had to make a SEM to the target and fixate it for 650 ms to receive a reward. For the conditions without the context of a scene, the task was identical except for the absence of the gray frame. The FP and the scene could appear in one of two positions and the target could appear in one of eight positions (Fig. 1b). This resulted in a total of 48 conditions (3 scene contexts × 2 FP positions × 8 target positions), each of which had to be performed correctly 10 times (total of 480 correct trials).
Because our aim was to analyze the effect of the frame, the conditions with the same context of scene were grouped together, independent of the FP and target position, resulting in 3 groups, each with 16 conditions (Fig. 1b): conditions with no contextual scene (NoSc), conditions with centered scene (CenSc) and conditions with shifted scene (ShiftSc). The average firing rate of the cell was computed in 4 different periods, named: FrON (when a gray frame was on and the monkey was free viewing); FixON (same as FrON but the monkey was fixating); Pre-Sac (when the peripheral target was on but the monkey had not yet moved the eyes) and Post-Sac (after the eye was inside the target fixation window). Considering the neuron's response latency, the initial 70 ms of activity after the monkey moved the eyes was usually attributed to Pre-Sac and not to Post-Sac. The average firing rate in each of the 4 periods was used for the analyses. Comparisons between the 3 contexts of scene were made using a one-way analysis of variance (ANOVA) with a 5% significance level. Comparisons were always made within the corresponding period of activity.
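The condition counts described above can be checked with a short enumeration. This is only an illustrative sketch: the scene labels come from the text, while the integer indices used for FP and target positions are placeholders, since the actual screen coordinates are given only in Fig. 1b.

```python
from itertools import product

# Scene contexts named in the text.
SCENES = ("NoSc", "CenSc", "ShiftSc")
# Placeholder indices (assumption): the real positions are shown in Fig. 1b.
FP_POSITIONS = (0, 1)
TARGET_POSITIONS = tuple(range(8))
# Correct trials required per condition (from the text).
REPEATS = 10

# Every combination of scene context, FP position and target position.
conditions = list(product(SCENES, FP_POSITIONS, TARGET_POSITIONS))

# Group by scene context, ignoring FP and target position.
groups = {scene: [c for c in conditions if c[0] == scene] for scene in SCENES}

n_conditions = len(conditions)
n_trials = n_conditions * REPEATS
group_sizes = {scene: len(members) for scene, members in groups.items()}

print(n_conditions, n_trials, group_sizes)
```

The enumeration gives 48 conditions, 480 correct trials and 16 conditions per scene group, matching the numbers stated in the text.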
Using 1.2 MΩ tungsten microelectrodes, we recorded extracellular potentials from neurons in the AIT cortex isolated with the aid of a spike sorter (SPS, Signal Processing System, Australia). No attempt was made to isolate cells that responded to the frame or to the target.
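As an illustration of the statistical comparison described in this section, the following sketch computes the F statistic of a one-way ANOVA across the three scene contexts within one period. It is not the analysis code used in the study; the function name `one_way_anova_f` and the firing-rate values are hypothetical and serve only to show the calculation.

```python
from statistics import fmean

def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = fmean([x for s in samples for x in s])
    # Variability of the group means around the grand mean.
    ss_between = sum(len(s) * (fmean(s) - grand_mean) ** 2 for s in samples)
    # Variability of single trials around their own group mean.
    ss_within = sum((x - fmean(s)) ** 2 for s in samples for x in s)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up per-trial firing rates (spikes/s) for one period, for illustration only.
post_sac_rates = {
    "NoSc": [5.0, 6.0, 7.0],
    "CenSc": [1.0, 2.0, 3.0],
    "ShiftSc": [2.0, 3.0, 4.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(post_sac_rates.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The P value follows from the upper tail of the F distribution with (df_between, df_within) degrees of freedom, and the 5% criterion is applied to that value. Each of the four periods would be tested separately, since comparisons were made only within the corresponding period.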

RESULTS
We recorded single unit activity from 28 neurons in the AIT cortex. Initially, we calculated the number of neurons that exhibited modulation by the scene context at the 5% significance level. Of the units, 29% (8/28) were modulated by the frame in one or more of the four periods analyzed. Only 25% (2/8) of these units were capable of coding the presence of the frame in the FrON period. Surprisingly, the rest of this population (75%, n = 6/8) failed to code for the presence of the scene during the FrON period, but showed a significant modulation in one or more of the other 3 periods analyzed. An example of contextual scene modulation is shown in Fig. 2. In this case, the frame seems to block the excitatory effect evoked by the target. Considering the whole FixON period, there was no statistical difference between the 3 contexts of scene. A subtle differential modulation can be observed, however, just before and after the monkey begins fixating the FP. Among other possibilities, the differential activity spanning the event of fixation could be correlated with the process of SEM, as will be discussed below. It may be argued that the general failure of the neurons to code the presence of the scene during the FrON period was due to a lack of stabilization of the image on the retina because the animal was free viewing. An argument against this possibility is the observation of other cells (2/8) that failed to code the context of scene during any part of the FixON period, when the animal was fixating. Nevertheless, these same units were capable of coding the presence of the scene after the target was turned on. An example of contextual scene modulation that is exclusively target-dependent is shown in Fig. 3. For this neuron there is a general increase in activity after fixation. Nevertheless, the modulation by the frame only occurs after the appearance of the target. Thus, it is possible to partially dissociate the phenomenon of scene modulation from the engagement of the monkey in the task or from possible attentional processes.

Fig. 1 - (A): Behavioral task adopted, illustrating one of the 48 possible conditions. In this condition, a gray frame (scene) appears on a black screen for 1250 ms while the monkey is free viewing (FrON period). Two periods of fixation then follow: the FixON (gazing at the FP) and the Post-Sac (gazing at the target). Between the FixON and Post-Sac periods, the Pre-Sac period was also analyzed (not shown). This interval consisted mainly of the latency for the onset of the SEM after the target was turned on. (B) illustrates the two possible FP positions (crosses), the eight possible target positions (upper and lower spots) and the three contexts of scene: context with no scene (NoSc), with centered scene (CenSc) and with shifted scene (ShiftSc). The square with rounded corners represents the border of a 21-inch monitor. The bold arrow represents the trial progression in time.

DISCUSSION
The data presented here indicate a flexibility in object coding by cells in IT cortex. An implication of these results is that passive presentation of objects while the monkey free views or fixates may not unambiguously establish the responsiveness of a neuron to a particular stimulus. For neurons that responded to the frame during the free viewing (FrON) or fixation (FixON) periods, we were probably recording from a site with some selectivity to the frame, possibly a module tuned to that object as described by Wang et al. (1996, 1998). For these situations, we would have obtained the neuron's response profile to an object as is customarily done for IT cortex. Nevertheless, for the example illustrated in Fig. 3, the differential response only occurred after a behaviorally relevant target enclosed in the object was turned on. So, even though it has been demonstrated that the modules of IT cortex are tuned in their selectivity, there seems to be some interaction between them. It would be interesting to know the intracellular membrane potential during the 4 periods analyzed and to verify when the modulation actually started: whether before (at subthreshold level) or after the behaviorally relevant target was turned on.

Fig. 2 - Example of a neuron with contextual scene modulation. On the raster plots at the top of the figure, the black dots and the black curve represent the cell's firing and the spike density function (Gaussian smoothed), respectively. The trial progresses from left to right. The FP and the target are turned on after the FrON and FixON periods, respectively. The raster length after these two periods varies because the time the animal took to acquire fixation varied. The average firing rate and standard error for each of the four periods in the three contexts of scene were computed and plotted in the graph below. One-way ANOVA showed a statistically significant difference (asterisks) between the scene contexts only after the target was turned on (P < 10⁻⁶). The differential modulation by the scene observed just before and after the monkey begins fixating the FP is discussed in the text. See also legend for Fig. 1.
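The legend of Fig. 2 mentions a Gaussian-smoothed spike density function. A minimal sketch of that kind of smoothing for a single trial is given below; the kernel width (`sigma_ms`) and sampling step (`step_ms`) are assumptions, since the values used for the figures are not stated in the text, and the function name is hypothetical.

```python
import math

def spike_density(spike_times_ms, t_start_ms, t_end_ms, sigma_ms=10.0, step_ms=1.0):
    """Return (times, rates): a Gaussian-smoothed firing rate in spikes/s.

    sigma_ms and step_ms are illustrative defaults, not values from the study.
    """
    n_steps = int(round((t_end_ms - t_start_ms) / step_ms)) + 1
    times = [t_start_ms + i * step_ms for i in range(n_steps)]
    # Normalization so that each spike contributes unit area (kernel in 1/ms).
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))
    rates = []
    for t in times:
        density = sum(
            norm * math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
            for s in spike_times_ms
        )
        rates.append(1000.0 * density)  # convert spikes/ms to spikes/s
    return times, rates

# A single made-up spike at 50 ms, sampled from 0 to 100 ms.
times, rates = spike_density([50.0], 0.0, 100.0)
peak_time = times[rates.index(max(rates))]
print(peak_time, round(max(rates), 1))
```

For several trials, the per-trial curves would be averaged after aligning them to the event of interest, such as target onset.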
We observed the phenomenon of modulation by scene context in a SEM task. Ringo et al. (1994) and Sobotka et al. (1997) have described modulation by eye movement in the activity of inferior and medial temporal cortical neurons. The work of Scheinberg and Logothetis (2001) has recently focused on natural vision (with its rich repertoire of SEM) and the activity of IT cells. The importance of SEM in visual perception may still be underestimated. In addition, the AIT cortex is a converging site of the ventral and dorsal streams of visual processing (Ungerleider and Mishkin 1982, Gattass et al. 1990, Suzuki and Amaral 1994). It is possible that this area integrates object identity with some kind of object spatial location. The scene context would then be processed by the cells in AIT cortex as a spatial reference for the identification and localization of the SEM targets.
An Acad Bras Cienc (2003) 75 (1)

Fig. 3 (continued): The neuron's firing rate only distinguished the presence of the scene after the target was turned on, as confirmed by one-way ANOVA between the scene contexts (Pre-Sac: P < 0.03 and Post-Sac: P < 10⁻⁷). See also legends for Figs. 1 and 2.

ACKNOWLEDGMENTS
We wish to thank Edil Saturato da Silva Filho and Theresa Monteiro for skillful technical assistance, and Paulo Coutinho and Gervasio Coutinho for animal care. This research was supported by grants from CNPq, PRONEX, FUJB and FAPERJ.

Fig. 3 - Example of a neuron with contextual scene modulation that is exclusively target-dependent. Conventions as in Fig. 2. Contrary to the unit illustrated in Fig. 2, this cell does not present differential modulation by the frame prior to the appearance of the target.