Effects of Reinforcement Learning on Gaze Following of Gaze and Head Direction in Early Infancy: An Interactive Eye-Tracking Study.

The current four experiments investigated gaze following behavior in response to gaze and head turns in 4-month-olds and how reinforcement learning influences this behavior (N = 99). Using interactive eye tracking, infants' gaze elicited an animation whenever infants followed a person's head or gaze orientation (Experiment 1.1, 2.1 and 2.2) or looked at the opposite side (Experiment 1.2). Infants spontaneously followed the direction of a turning head with and without simultaneously shifted gaze direction (Cohen's d: 0.93-1.05) but not the direction of isolated gaze shifts. We only found a weak effect of reinforcement on gaze following in one of the four experiments. Results will be discussed with regard to the impact of reinforcement on the maintenance of already existing gaze following behavior.

Gaze following behavior, the ability to align one's own gaze with another person's head or gaze orientation, is a fundamental ability underlying social communicative processes like joint attention and language acquisition (Brooks & Meltzoff, 2005, 2015). Despite its importance, the mechanisms underlying infants' gaze following behavior are not yet completely understood. The present experiments examined spontaneous gaze following in response to gaze or head direction in 4-month-olds. In addition, we investigated how reinforcement learning influences this behavior. In our experiments infants therefore repeatedly experienced that a specific behavior, here following the gaze or head direction of another person, is followed by a potentially reinforcing stimulus.
Gaze following behavior can be observed early in infancy. The exact onset and the mechanisms underlying this behavior are not yet clear. Empirical findings vary with the definition, with the measures applied to assess gaze following, and with the experimental setup, that is, screen-based versus live paradigms (Deák, 2015). Even newborns attend to information about the gaze direction of face-like stimuli (Farroni, Massaccesi, Pividori, & Johnson, 2004). To explain this observation, some authors assume that an early sensitivity to gaze direction in upright human faces is part of an evolved communication system (Farroni, Csibra, Simion, & Johnson, 2002; Farroni, Mansfield, Lai, & Johnson, 2003; Farroni, Menon, & Johnson, 2006). Others highlight that gaze information serves as an ostensive cue fostering social learning (Csibra & Gergely, 2006). It was suggested that newborns may be equipped with specified neural systems to detect and process information on gaze direction (Baron-Cohen, 1994; Perrett & Emery, 1994). Following this line of argument, infants' response to gaze direction may need little to no specific experience to emerge.
Examining covert shifts of attention, it was found that infants' attention and their object processing are shifted in response to both the simultaneous movement of head and gaze to the side and the isolated movement of either the head or the gaze of others as early as 4 months of age (Farroni et al., 2004; Hoehl, Wahl, & Pauen, 2014; Hood, Willen, & Driver, 1998; Reid, Striano, Kaufman, & Johnson, 2004; Wahl, Michel, Pauen, & Hoehl, 2013). Infants' overt gaze following behavior was investigated by analyzing infants' active head or gaze movements. Studies using a live paradigm and analyzing infants' eye movements reported gaze following emerging between 2 and 4 months of age (D'Entremont, 2000; D'Entremont, Hains, & Muir, 1997; Gredebäck, Fikke, & Melinder, 2010). However, gaze following was not found to be stable before 4-6 months of age when using eye-tracking devices for assessment (Astor & Gredebäck, 2019; Gredebäck et al., 2010; Gredebäck, Theuring, Hauf, & Kenward, 2008). In live studies that coded infants' first head turn instead of infants' eye movements, some infants only followed the combined head and gaze direction of another person at around 8-9 months of age (Deák, 2015). Studies showing a later onset of gaze following behavior, as well as work indicating that infants' gaze following behavior gets more sophisticated with age, support the idea that gaze following develops with increasing experience during the first postnatal months of life (Butterworth & Cochran, 1980; Deák, 2015).
This rather dynamic notion of gaze following is in line with the recently proposed perceptual narrowing account for gaze following by Del Bianco, Falck-Ytter, Thorup, and Gredebäck (2019): Similarly to other perceptual narrowing processes (Scott, Pascalis, & Nelson, 2007), gaze following gets more refined with age in the sense that infants rely less on broader cues like the movement of the head and focus more on specific directional cues like gaze direction instead. Results of a recent study with infants of deaf parents highlight the plasticity of gaze following behavior (Brooks, Singleton, & Meltzoff, 2019). Moore, Angelopoulos, and Bennett (1997) suggested that gaze following is the product of reinforcement learning: Infants learn that it pays off to follow the head and eye movement of adults as corresponding information is often predictive of seeing something interesting and presumably arousing and rewarding in the environment. According to the classical reward learning account, an increase in behavior is expected when someone experienced a reward (here: getting to see something interesting) after performing a certain behavior. If a behavior is associated with a negative outcome, one would expect a decrease in this behavior (e.g., Skinner, 1938). To test if reward learning is one of the mechanisms driving the early development of gaze following behavior, Moore and Corkum (1998) examined how infants change their gaze following behavior over the course of a three-phase experiment in a live situation. Spontaneous gaze following behavior was assessed during an initial baseline phase. In a subsequent shaping phase, the object looked at by the experimenter, in the following referred to as the "cued object," was animated about 2 s after the experimenter had turned towards it. To reward the infant for their gaze following behavior, the cued object was animated whenever the infant looked at it in the final test phase.
The majority of 6- to 7-month-olds did not show gaze following in any of the experimental phases. Starting at 8 months of age, infants acquired gaze following behavior over the course of the experiment, presumably because they were rewarded by the animation of the target.
To check whether any kind of following behavior can be elicited through reward or whether specific characteristics of a cue combined with reward can account for the change in behavior, Moore and Corkum (1998) performed a second experiment (dubbed the "unnatural condition"). This time, the animated object in the shaping and in the test phase was the object opposite the experimenter's head and gaze direction, in the following referred to as the "not-cued object." Even infants aged 8-9 months did not start to prefer the not-cued object at the end of the experiment. In contrast, replicating the effects of Experiment 1, infants started to follow the gaze direction of the experimenter even though this behavior was not reinforced. The authors concluded that the acquisition of gaze following is facilitated in response to specific characteristics like change in the head orientation of the experimenter (see also Deák, 2015).
The assumption that gaze following results from a combination of early existing mechanisms and learning processes is in line with the integrative model presented by Triesch, Teuscher, Deák, and Carlson (2006). They argued that a basic set of structures and mechanisms, like the ability for reward-driven learning and preferences for social stimuli, is sufficient to get gaze following started. Thus, gaze following may be acquired best if infants are rewarded for following a social stimulus, like head and gaze orientation.
Deák (2015) proposed motion-cued scanning, that is, scanning the environment in the direction of a movement, as the basic feature used for gaze following. Thus, gaze following effects may more parsimoniously be explained by infants' preference to follow motion, rather than social stimuli such as eyes. In line with this low-level account, the direction of the motion (i.e., the direction of gaze shifts or head turns) is often confounded with a higher perceptual load of high-contrast elements; for instance, eyes and the front of the face contain more perceptual details and higher contrast elements than the side or back of the head. Some studies highlighted the impact of specific perceptual contrast features in social cues like eyes for early attentional cueing (Michel, Pauen, & Hoehl, 2017; Michel, Wronski, Pauen, Daum, & Hoehl, 2019). Overall, it remains unclear whether early gaze following is based on attending to (a) low-level cues like motion or high-contrast elements or (b) a specific combination of cues (e.g., a human head with eyes in motion). Our aim is to investigate infants' gaze following capacities as naturalistically as possible but within a controlled computer-based setup. Therefore, the cue used in the current experiments (the turning face of an adult) contains both motion and a higher contrast level in the cueing direction.
The specific interplay between innate and early acquired information processing mechanisms like a preference for social stimuli and reinforcement learning still needs to be explored in more detail. To shed more light on this issue, we (a) investigated infants' early gaze following capacity in a computer-based setup measuring eye movements via eye tracking in 4-month-olds. We chose this age group as it was repeatedly shown that 4-month-olds shift their attention in the direction of head or gaze direction of another person in controlled computer-based setups (Reid et al., 2004; Wahl et al., 2013). We therefore assumed that infants as young as 4 months of age follow the head and gaze direction in our rather simple paradigm.
We (b) aimed to investigate the influence of reinforcement learning on infants' early gaze following behavior. We therefore adapted the paradigm by Moore and Corkum (1998) and developed an interactive screen-based eye-tracking paradigm with 2D stimuli. Eye tracking is suitable for testing 4-month-olds, thus younger infants than in the previous live experiment, and was used in previous experimental studies investigating infant gaze following behavior (e.g., Astor & Gredebäck, 2019; Senju & Csibra, 2008). In their study, Moore and Corkum (1998) coded gaze following behavior when the first head turn occurred in the direction of the target. This relatively crude measure may have underestimated young infants' gaze following competencies during the baseline phase. In addition, Moore and Corkum (1998) only reinforced infants' gaze following behavior during the test phase, thus at the end of the experiment. This procedure did not allow the authors to test how reinforcement affected infants' subsequent behavior. In our experiments, we therefore decided to use a baseline-training-test design. Baseline and test phase did not contain any reinforcement. Our training phase was identical to the test phase in Moore and Corkum (1998): We reinforced infants whenever they performed a gaze shift following the actor's head and eye movement versus whenever they looked at the target located opposite to the cued direction. This allowed us to analyze how infants' looking behavior changed from the baseline to the test phase varying with the amount of reinforcing experiences received during the training phases.
In Experiment 1, infants therefore saw a gaze-contingent animation of an interesting visual stimulus during the training phase whenever they looked at the object cued by the head and gaze direction of an actor on the computer screen (in the following referred to as the "cued object"; Experiment 1.1) or the object located in the opposite direction (the "not-cued object"; Experiment 1.2). In line with previous infant studies on gaze following at this age using eye movements as the dependent variable, we expected infants at 4 months of age to show spontaneous gaze following behavior already during the baseline phase in both experiments (Astor & Gredebäck, 2019; D'Entremont, 2000; D'Entremont et al., 1997; Gredebäck et al., 2010). As infants at this age are able to learn the association between visual cues and the location of an appearing target (Johnson, Posner, & Rothbart, 1991), we further assumed that gaze following behavior in response to social cues can be shaped by reinforcement. More specifically, we expected infants to increase their gaze shifts toward the cued object from baseline to test in Experiment 1.1 (Triesch et al., 2006). These analyses, thus, are rather confirmatory. Furthermore, we speculated that this enhancement may be systematically related to the number of gaze-contingent animations elicited during the training phase and investigated this in a rather exploratory way. If gaze following is a highly plastic behavior, shaped essentially by reinforcement learning, we expected infants in Experiment 1.2 to start to prefer the not-cued object during the test phase. In addition, we expected infants' change in preference for the cued object to be related to their experience with the potentially reinforcing animation when not following the person's gaze.
On the other hand, if infants' tendency to follow the gaze and head movement of another person is already strong and robust at 4 months of age or if reinforcement influences gaze following only when it occurs aligned with social signals, that is, when the gaze direction is cueing the animation, infants' looking behavior should not be influenced by reinforcement in Experiment 1.2. This hypothesis would be in line with the basic set model by Triesch et al. (2006). Therefore, analyses of the data of Experiment 1.2 are rather exploratory. Contrasting results of Experiment 1.1 and Experiment 1.2 will inform us about whether reinforcement learning works better if the reinforcement is placed in the direction of social cues, here the actor's head and gaze orientation.

Method

Participants
We tested sixty-one 4-month-olds, but had to exclude one infant who could not be calibrated, two infants due to a mistake in the presentation program, one infant who did not finish stimulus presentation and another one due to an experimenter error. Different exclusion criteria apply for the different analyses (see Table 1 for sample information and descriptive statistics of Experiment 1). Data collection took place at Heidelberg University, Germany. All participants came from Heidelberg (Germany) or surrounding areas, which represent an urban Western, industrialized context. All experiments were approved by the ethics committee of the Fakultät für Verhaltens- und Empirische Kulturwissenschaften, Heidelberg. All infants were born full term (> 37 weeks). Parents were asked to participate via phone on a voluntary basis. Telephone numbers were taken from the database of the institution where data acquisition took place. Parents and infants were recruited to the database at the maternity wards at the local hospitals or their contact details were provided by the residents' registration office. Parents were then contacted and asked to provide their consent to be added to the database. We did not collect any information on participants' socioeconomic or race and ethnicity status. Typically, families registered in the databases of our institutions represent a mid to high socioeconomic status. Written informed consent from the infants' caregiver was obtained. Participants received a certificate and, in Experiment 2, additionally a toy and €7.50 for their participation. We aimed at a sample of 20 infants in the looking time measure in each experiment. This sample size was defined based on previous research in the area of gaze following using similar methods (Senju & Csibra, 2008) and was higher than the one in the study by Corkum and Moore (1998), who included 14 participants per condition.
Data collection for Experiment 1 took place between December 2014 and February 2016.

Materials
For calibration, we used a blue and yellow star (6.2 × 6.2 cm, visual angle of 5.9 × 5.9). Before each phase of the experiment and before each trial started, a black star was presented (5.6 × 6.0 cm, visual angle of 5.3 × 5.7) accompanied by an attention-grabbing sound to attract infants' attention to the center of the screen. All stimuli consisted of the torso and face of a female actor looking directly to the camera (15 × 14.5 cm, visual angle of 14.3 × 13.8) and the same face with the head and gaze turned about 45° to one side (12.8 × 14.5 cm, visual angle of 12.2 × 13.8). The picture of the face was taken from the Radboud Faces Database (Langner et al., 2010). In addition, the identical picture of a famous comic mouse was presented left and right of the actor's face. The size of each mouse was 6.7 × 6.7 cm (visual angle of 6.4 × 6.4). The minimal distance between the turned face and the static mouse was 2.9 cm (visual angle of 2.8). The mouse was facing the infant and held up its hands in the direction of the infant.

Procedure
During the experiment, infants sat on their caregiver's lap in a dimly lit room at about 60 cm distance from the screen. Infants were calibrated using a 5-point calibration and a happy song to keep them attracted to the screen. A successful calibration is essential for interactive eye tracking. Therefore, the experimenter validated the calibration: We presented pink spirals (5.3 × 5.3 cm, visual angle of 5.1 × 5.1) on the left and on the right side of the screen after the calibration. The experimenter saw an overlay of the position of the spirals and the infant's captured gaze point. Only the two spirals were presented on screen. We therefore assumed that infants would look directly at the spirals. Thus, the experimenter visually checked whether the infant's gaze was located correctly at the position of the spirals and not systematically shifted to the side, upward, or downward. Only if this was confirmed did the experimenter start the presentation. If any shifts occurred after calibration, the infant was either recalibrated or not included in the analyses.
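The reported visual angles appear consistent with the standard visual-angle formula applied at the viewing distance of about 60 cm. The following is a minimal sketch under that assumption; the function name and the exact 60 cm default are illustrative choices and are not taken from the presentation software used in the study.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float = 60.0) -> float:
    """Visual angle (in degrees) subtended by a stimulus of a given size.

    Assumes the stimulus is centered on and perpendicular to the line of
    sight; the 60 cm default is the approximate viewing distance reported.
    """
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Side length of the calibration spiral (5.3 cm) and of the star (6.2 cm).
spiral_angle = round(visual_angle_deg(5.3), 1)
star_angle = round(visual_angle_deg(6.2), 1)
print(spiral_angle, star_angle)
```

Because infants sat on a caregiver's lap, the actual distance varied, so the computed angles should be read as approximations.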
As shown in Figure 1, the experiment was divided into three phases: (a) baseline, (b) training and (c) test phase. Each phase started with the presentation of a rotating black star and the attention-grabbing sound. Each trial in every phase began with the presentation of a static black star in the center of the screen and the attention-grabbing sound. The experimenter only started the next trial when the infant looked at the center of the screen. First, infants saw the actor facing them as well as a comic mouse to the left and right of her face for 1 s (front picture). On the next picture presented immediately afterwards, infants saw the actor with her eyes and head shifted 45° to one side, thereby looking at one mouse while looking away from the other mouse (side picture). The quick shift from the front to the side picture highlighted the movement of the face to one side and led to the impression of an apparent movement. This seems important as previous work has shown that movement is crucial for gaze following (Deák, 2015; Farroni, Johnson, Brockbank, & Simion, 2000). The duration of the presentation of this image as well as the number of trials varied between phases.
As in previous studies, the baseline phase consisted of four trials, each showing the actor's head and gaze turning to either the left or the right mouse in a fixed order (left, right, right, left), with the side picture being presented for 5 s (Senju & Csibra, 2008; Theuring, Gredebäck, & Hauf, 2007). The training phase consisted of eight trials, in which the actor turned to the side in the fixed order: right, left, right, right, left, left, right, left. The side picture was presented for a maximum of 10 s. Whenever the infant looked at the cued (Experiment 1.1) or the not-cued mouse (Experiment 1.2) during these 10 s, the corresponding mouse started to wiggle for about 2 s. We chose this operationalization of the potentially reinforcing stimulus as an animation of the target served as a successful reinforcement in Moore and Corkum (1998) and as infants show a preference for moving over static stimuli (Carpenter, 1974; Samuels, 1985; Wilcox & Clayton, 1968). The exact timing differed between trials due to the timing of the presentation computer. Averaged over all participants in both experiments, the wiggling lasted for 2,401 ms with a standard deviation of 67 ms. The wiggling of the mouse served as a potential reinforcement of either showing gaze following behavior (Experiment 1.1) or no gaze following (Experiment 1.2). Therefore, gaze data of the infant were analyzed constantly during the presentation of the side picture during the training phase. The animation of the mouse started about 100 ms after the first gaze point was located within the area of interest (AOI) of the cued (Experiment 1.1) or not-cued mouse (Experiment 1.2). As the average accuracy of the gaze coordinates is about 0.5° for the Tobii T60 eye tracker, we defined AOIs as being 1° visual angle larger than the maximal dimensions of the mice and the face, respectively (Gliga, Elsabbagh, Andravizou, & Johnson, 2009).
Note that a gaze point within the AOI of the respective mouse was sufficient to elicit the animation. Infants did not have to look at the face AOI (i.e., the cue) first to start the animation with their gaze. However, averaged over all four experiments, the animation started about 244 ms after infants' last look at the face AOI. More detailed information on the relation between infants' look at the face AOI and the elicitation of the gaze-contingent animation can be found in Supporting Information C.
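The gaze-contingent rule described above can be sketched roughly as follows. This is a hypothetical illustration, not the presentation program used in the study: the `AOI` class, the `(time_ms, x, y)` sample format, and the choice to add the 1° margin on every side are all assumptions, since the text does not specify how the enlargement was distributed around the stimulus.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AOI:
    """Rectangular area of interest in degrees of visual angle (hypothetical)."""
    left: float
    top: float
    right: float
    bottom: float

    def expanded(self, margin: float) -> "AOI":
        # Assumption: the margin is added on every side of the stimulus.
        return AOI(self.left - margin, self.top - margin,
                   self.right + margin, self.bottom + margin)

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def animation_onset_ms(samples: Iterable[Tuple[float, float, float]],
                       target: AOI,
                       max_ms: float = 10_000,
                       delay_ms: float = 100) -> Optional[float]:
    """Return the animation onset time, or None if the target was never hit.

    A single gaze sample inside the target AOI is sufficient; no prior look
    at the face AOI is required.
    """
    for t_ms, x, y in samples:
        if t_ms > max_ms:
            break  # side picture is shown for at most 10 s
        if target.contains(x, y):
            return t_ms + delay_ms  # animation starts about 100 ms later
    return None


mouse_aoi = AOI(left=5.0, top=-3.0, right=11.0, bottom=3.0).expanded(1.0)
gaze = [(0, 0.0, 0.0), (500, 4.5, 0.0), (600, 6.0, 0.0)]
print(animation_onset_ms(gaze, mouse_aoi))
```

In this sketch the sample at 500 ms already counts as a hit because it falls inside the enlarged AOI, which illustrates how the margin compensates for limited tracker accuracy.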
Following the animation, or after 10 s had passed without eliciting the animation (i.e., if the infant never looked within the AOI of the target mouse), the trial ended and the next trial started with the presentation of the black star. The subsequent test phase comprised four trials and was identical to the baseline phase. We only varied the order in which the actor turned her head and eyes to the side (right, left, left, right). Over the course of the experiment, we ensured that the actor turned to the left and the right side equally often and did not turn to the same side more than two times in a row. The number of trials in each phase of the experiments was chosen based on piloting. We wanted to make sure that the majority of infants would still pay attention during the final test phase of the experiment so that we could analyze changes in infants' looking behavior from the baseline to the test phase.

Eye-Tracking Recording and Analysis
We analyzed infants' looking behavior to the cued and the not-cued mouse during the presentation of the side pictures in the baseline and the test phase. To analyze infants' gaze following behavior, we focused our analyses on (a) first gaze shifts from the face to one of the mice during the side pictures and on (b) looking times to the mice during the side pictures. In all experiments and analyses, an infant had to contribute at least one valid trial in the baseline phase and one valid trial in the test phase to be included in the analyses. The general pattern of the first gaze shifts and looking times analyses remained very similar when only including infants with two or more trials (see Supporting Information A). Level of significance was set to .05 for all analyses and all comparisons were two-tailed.
First gaze shifts to cued and not-cued mouse. We analyzed whether infants more often looked first at the cued mouse as compared to the not-cued mouse after the onset of the side pictures. Therefore, a sample point in the eye tracking data within the face AOI had to be detected first. The first subsequent sample point within the AOI of the cued or not-cued mouse ended the first gaze shift and the analysis of this trial. We then calculated a difference score (DSGazeShift) separately for the baseline and the test phase by subtracting the number of trials with a first gaze shift to the not-cued mouse from the number of trials with a first gaze shift to the cued mouse and dividing the result by the total number of trials with a first gaze shift (Senju & Csibra, 2008). DSGazeShift values could range from +1 (first gaze shifts to the cued mouse only) to −1 (first gaze shifts to the not-cued mouse only). To test whether infants' gaze following differed between the baseline and the test phase in Experiment 1.1 and Experiment 1.2, we performed a 2 × 2 repeated measures analysis of variance (ANOVA) with the within-subject factor phase (baseline vs. test) and the between-subject factor experiment (Experiment 1.1 vs. Experiment 1.2). To test whether infants followed the central stimulus at all, we performed subsequent t-tests.
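The difference score for first gaze shifts can be illustrated with a minimal sketch. The trial coding used here ("cued", "not_cued", or None for trials without a first gaze shift) and the function name are assumptions made for illustration; they do not reflect the original analysis scripts.

```python
from typing import Optional, Sequence


def ds_gaze_shift(first_shifts: Sequence[Optional[str]]) -> Optional[float]:
    """Difference score for first gaze shifts within one phase.

    first_shifts holds one entry per trial: "cued", "not_cued", or None
    when no first gaze shift from the face to a mouse was detected.
    Returns None if the infant contributed no valid trial in this phase.
    """
    cued = sum(1 for s in first_shifts if s == "cued")
    not_cued = sum(1 for s in first_shifts if s == "not_cued")
    total = cued + not_cued
    if total == 0:
        return None
    return (cued - not_cued) / total


# Four baseline trials: two shifts to the cued mouse, one to the not-cued
# mouse, and one trial without a gaze shift -> (2 - 1) / 3.
print(ds_gaze_shift(["cued", "not_cued", "cued", None]))
```

Note that trials without a first gaze shift drop out of the denominator, so the score is based only on trials in which the infant left the face for one of the mice.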
Looking times at the cued and not-cued mouse. To extract looking times to the cued and the not-cued mouse, we summed up gaze points within the coordinates of the mice in the respective time window for each trial using the software R (R Core Team, 2014). To be able to compare looking times to both mice, we decided to only include trials in the analyses in which infants looked at one of the two mice for at least 200 ms while the actor directed her head and gaze to one side (Kochukhova & Gredebäck, 2010; Michel et al., 2017). Thus, trials in which infants kept looking at the actor for the entire duration of the trial were excluded. We applied this criterion to make sure that we captured trials in which infants actually looked at one mouse. Next, we computed the relative looking time for a given trial by dividing the number of gaze points to the cued as well as to the not-cued mouse by the number of gaze points to the entire screen for the respective trial. We then averaged relative looking times to the cued and the not-cued mouse for valid baseline trials as well as for valid test trials. Using these four variables (relative looking times to the cued mouse during baseline, relative looking times to the not-cued mouse during baseline, relative looking times to the cued mouse during test, relative looking times to the not-cued mouse during test), we excluded outliers: All infants whose values differed more than ±2 SD from the mean in at least one of these four variables were excluded from further analyses (Beier & Spelke, 2012; Cashon, Ha, Allen, & Barna, 2013). However, it should be noted that the general pattern of results remained very similar when including these infants (see Supporting Information B for the exact results).
To examine infants' looking times, we calculated a difference score (DSLook) for the baseline phase and for the test phase by subtracting relative looking times to the not-cued mouse from relative looking times to the cued mouse and dividing the result by the sum of both values (Kovács, Téglás, Gergely, & Csibra, 2017; Senju & Csibra, 2008). A DSLook > 0 indicated a preference for the cued mouse and a DSLook < 0 indicated a preference for the not-cued mouse. To test whether infants' looking times differed between baseline and test phase based on the differences in the gaze-contingent animations during the training phase, we performed a 2 × 2 repeated measures ANOVA with the within-subject factor phase (baseline vs. test) and the between-subject factor experiment (Experiment 1.1 vs. Experiment 1.2). To test whether the central stimulus influenced infant looking times at all, we performed subsequent t-tests.
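The outlier criterion can be sketched as follows. This is an illustrative Python version (the original analyses were run in R); the data layout, the function name, and the use of the sample standard deviation are assumptions, as the text does not state which SD estimator was applied.

```python
from statistics import mean, stdev
from typing import Dict, List


def retained_infants(scores: Dict[str, List[float]],
                     n_sd: float = 2.0) -> List[str]:
    """Return ids of infants within +/- n_sd SD on all four variables.

    scores maps an infant id to four mean relative looking times:
    [cued baseline, not-cued baseline, cued test, not-cued test].
    An infant is dropped if any one of the four values is an outlier.
    """
    ids = list(scores)
    keep = set(ids)
    for var in range(4):
        values = [scores[i][var] for i in ids]
        m, sd = mean(values), stdev(values)  # sample SD (assumption)
        for i in ids:
            if abs(scores[i][var] - m) > n_sd * sd:
                keep.discard(i)
    return [i for i in ids if i in keep]


# Made-up data: nine typical infants and one with an extreme first value.
data = {f"inf{k}": [0.30 + 0.01 * k, 0.10 + 0.01 * k,
                    0.30 + 0.01 * k, 0.10 + 0.01 * k] for k in range(9)}
data["inf9"] = [0.90, 0.14, 0.34, 0.14]
print(retained_infants(data))
```

The example values are invented purely to exercise the rule and carry no empirical meaning.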
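The computation of relative looking times and DSLook can be summarized in a short sketch. Function names and the gaze-point counts in the example are hypothetical; only the arithmetic follows the description above.

```python
from typing import Optional, Tuple


def relative_looking(n_cued: int, n_not_cued: int,
                     n_screen: int) -> Tuple[float, float]:
    """Relative looking times for one trial from gaze-point counts."""
    if n_screen <= 0:
        raise ValueError("trial contains no gaze points on the screen")
    return n_cued / n_screen, n_not_cued / n_screen


def ds_look(rel_cued: float, rel_not_cued: float) -> Optional[float]:
    """Difference score: (cued - not cued) / (cued + not cued).

    Values above 0 indicate a preference for the cued mouse, values
    below 0 a preference for the not-cued mouse.
    """
    total = rel_cued + rel_not_cued
    if total == 0:
        return None  # infant never looked at either mouse
    return (rel_cued - rel_not_cued) / total


# Hypothetical trial: 120 gaze points on the cued mouse, 40 on the
# not-cued mouse, 240 on the screen overall.
rel_cued, rel_not_cued = relative_looking(120, 40, 240)
print(ds_look(rel_cued, rel_not_cued))
```

In the study, relative looking times were first averaged across valid trials within a phase before the difference score was formed; the sketch shows a single trial for brevity.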
Relating the number of gaze-contingent animations to changes in gaze following from baseline to test phase.
To examine whether an increase in gaze following behavior from the baseline to the test phase varied with the amount of elicited gaze-contingent animations, we correlated the number of elicited gaze-contingent animations during the training phase with two variables reflecting an increase in gaze following behavior: (a) DSGazeShiftChange = DSGazeShift for the test phase − DSGazeShift for the baseline phase and (b) DSLookChange = DSLook for the test phase − DSLook for the baseline phase. To keep analyses consistent for all experiments and to consider that not all variables were normally distributed, we calculated the Spearman correlation coefficient for all correlational analyses.
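As an illustration of this analysis, the sketch below computes change scores and a Spearman coefficient as the Pearson correlation of average ranks. It is a standard-library approximation written for this example, not the original analysis code, and all data values are made up.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical infants: animations elicited in training, DS at baseline/test.
animations = [2, 4, 5, 6, 8]
ds_baseline = [0.5, 0.0, 0.2, 0.0, -0.5]
ds_test = [0.0, 0.0, 0.5, 0.5, 0.5]
ds_change = [t - b for t, b in zip(ds_test, ds_baseline)]
print(spearman(animations, ds_change))
```

A positive coefficient here would mean that infants who elicited more animations showed a larger increase in the difference score from baseline to test.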

First Gaze Shifts to the Cued and Not-Cued Mouse
The 2 (phase) × 2 (experiment) repeated measures ANOVA with DSGazeShift as the dependent variable revealed no significant main effects or interaction, all ps > .25. Infants' preference to look at the cued mouse first thus did not differ significantly between baseline and test in either experiment. To test whether infants overall followed gaze or not, we averaged DSGazeShift for baseline and test of Experiment 1.1 and 1.2 and performed a one-sample t-test against 0: Infants showed a significant preference for the cued mouse (M = 0.40, SD = 0.44), t(49) = 6.65, p < .001, Cohen's d = 0.91. This means that infants followed the head and gaze direction of the actor throughout both experiments.
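For readers who wish to retrace the reported effect size, the following sketch shows a one-sample t-test against 0 and the corresponding Cohen's d using only the Python standard library. The function names and the raw scores are illustrative; only the summary values M and SD come from the text, and because these are rounded, recomputed statistics can differ slightly from the reported ones.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def cohens_d_from_summary(m: float, sd: float, mu0: float = 0.0) -> float:
    """Cohen's d for a one-sample test: (M - mu0) / SD."""
    return (m - mu0) / sd


def one_sample_t(scores: Sequence[float],
                 mu0: float = 0.0) -> Tuple[float, int, float]:
    """Return (t, df, d) for a one-sample t-test against mu0."""
    n = len(scores)
    d = cohens_d_from_summary(mean(scores), stdev(scores), mu0)
    return d * sqrt(n), n - 1, d


# Reported summary statistics for DSGazeShift: M = 0.40, SD = 0.44.
print(round(cohens_d_from_summary(0.40, 0.44), 2))

# Made-up difference scores for four hypothetical infants.
t, df, d = one_sample_t([0.5, 1.0, 0.0, 0.5])
print(round(t, 2), df, round(d, 2))
```

The same function applied to the looking time summary (M = 0.38, SD = 0.37) gives a d of about 1.03.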

Looking Times at the Cued and Not-Cued Mouse
The 2 (phase) × 2 (experiment) repeated measures ANOVA with DSLook as the dependent variable revealed no significant main effects or interaction, all ps > .25. Infants' looking time at each mouse thus did not differ significantly between baseline and test in either experiment. To test whether infants overall preferred the cued mouse, we averaged DSLook for baseline and test of Experiment 1.1 and 1.2 and performed a one-sample t-test against 0: Infants showed a significant preference for the cued mouse (M = 0.38, SD = 0.37), t(40) = 6.66, p < .001, Cohen's d = 1.03. This means that infants looked longer at the mouse cued by the actor's head and gaze orientation throughout both experiments.

Relating the Number of Gaze-Contingent Animations to Changes in Gaze Following From Baseline to Test Phase
In Experiment 1.1, DSGazeShiftChange was positively correlated with the number of elicited gaze-contingent animations during training, r = .61, p = .001 (see Figure 2A) and DSLookChange was positively correlated with the number of elicited gaze-contingent animations during the training phase, r = .61, p = .004 (see Figure 2B).
In Experiment 1.2, DSGazeShiftChange was not significantly correlated with the number of elicited gaze-contingent animations during training, r = .05, p = .806 (see Figure 2C) and DSLook-Change was not significantly correlated with the number of elicited gaze-contingent animations during the training phase, r = À.18, p = .460 (see Figure 2D).
As infants in both experiments of Experiment 1 showed gaze following already during the baseline phase, infants in Experiment 1.1 consequently elicited more gaze-contingent animations, thus experiencing more reward, than in Experiment 1.2, t(39) = À2.05, p = .047 with M = 6.05 and SD = 1.53 for Experiment 1.1 and M = 4.90 and SD = 2.02 for Experiment 1.2 (this analysis was performed with data of the looking times sample). However, variances of the amount of gaze-contingent animations in both experiments did not differ significantly from each other as tested with the Levene's test for equality of variances, F(1, 39) = 2.80, p = .187. Therefore, results of the correlational analyses with regard to the DSLook-Change score should not be affected by the differences in the average number of elicited animations during the training phase. When performing the analysis on the first look sample, the same effect of a higher number of gaze-contingent animations in Experiment 1.1 was found t (43.49) = 2.76, p = .008. Levene's test for equality of variances revealed a significant difference in the variance of the number of elicited gaze-contingent animations, F(48) = 4.98, p = .03. The variance was larger in Experiment 1.2 than Experiment 1.1 (SD = 1.52 in Experiment 1.1 and SD = 2.12 in Experiment 1.2). We only found a significant correlation between the number of elicited gaze-contingent animations and DSGaze-ShiftChange in Experiment 1.1 but not in Experiment 1.2. As the variance was, in fact, smaller in Experiment 1.1 as compared to Experiment 1.2, we are confident that differences in variance do not account for the pattern of results found in the correlation analyses with regard to the DSGazeShiftChange score. The observed difference in variances would favor larger correlations (due to larger variances) in Experiment 1.2 compared to Experiment 1.1, that is, the opposite of our findings.
To test whether the correlations in Experiment 1.1 and Experiment 1.2 differed significantly, we compared the correlation coefficients for the DSGazeShiftChange and the DSLookChange post hoc, following Fisher's r-to-z transformation. The z-scores of the independent samples were tested against each other. The correlation coefficients for both the DSGazeShiftChange and the DSLookChange were significantly higher for Experiment 1.1 than for Experiment 1.2 (both ps < .05).
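The comparison of two independent correlation coefficients described above can be sketched as follows. This is a minimal illustration of the standard Fisher r-to-z procedure, not the authors' analysis script; the function names and the input values in the usage line are hypothetical, and the sample sizes must be supplied by the reader.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's r-to-z transformation (inverse hyperbolic tangent of r)."""
    return math.atanh(r)


def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int):
    """Test whether two correlations from independent samples differ.

    Returns the z statistic and the two-sided p value under the
    standard normal distribution.
    """
    # Standard error of the difference between two Fisher-transformed r values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    # Two-sided p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical inputs for illustration only (not the reported data).
z_stat, p_value = compare_independent_correlations(r1=0.60, n1=20, r2=0.05, n2=20)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

Because the standard error depends only on the two sample sizes, small samples such as those typical in infant research yield wide uncertainty, so even sizable differences between coefficients may fall short of significance.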

Discussion
In our eye-tracking paradigm, infants at 4 months of age showed spontaneous gaze following behavior during both the baseline and the test phase, as they preferred to look first and longer at the cued mouse as compared to the not-cued one. This result is in line with earlier studies presenting objects close to a person's face and using gaze shifts as the dependent variable for gaze following (Astor & Gredebäck, 2019; D'Entremont, 2000; D'Entremont et al., 1997). On the group level, infants in both experiments continued to follow the person's head and gaze orientation during the test phase. Importantly, we did not find any evidence for a change in gaze following behavior when comparing baseline with test performance. This may at least partly be due to the fact that infants followed the gaze direction of the actor on the screen already in baseline.
However, a significant and significantly larger correlation between the number of gaze-contingent animations and the change in infants' looking behavior from the baseline to the test phase was only found when gaze following was reinforced, and not the opposite behavior. This correlational pattern of findings suggests that reinforcement may stabilize the natural tendency of young infants to follow the head and gaze direction of another person, but is not effective when it comes to learning the opposite behavior. However, several third variables, which may have accounted for the positive correlation in Experiment 1.1, need to be taken into consideration: Infants who show a greater preference for a moving over a static mouse might have elicited more contingent reactions and thus been reinforced more often during the training phase. They might subsequently have tried more exhaustively to elicit the animation during test, for example, by looking around the screen, which may have increased the likelihood of showing a gaze shift toward the cued mouse. However, this does not explain why a corresponding correlation between reinforced trials and gaze shifts was not found for Experiment 1.2 as well. An alternative explanation draws on the idea that infants who show a stronger bias to turn their gaze in the same direction as the actor (as indicated by a higher DSLookBaseline) elicited more gaze-contingent animations during the training phase. To test this explanation, we correlated the DSLookBaseline and the DSGazeShiftBaseline with the number of elicited gaze-contingent animations in Experiment 1.1. However, this analysis did not reach significance (all ps > .25).
Yet a third potential explanation draws on the idea that infants' habituation to the presentation may explain the positive correlation in Experiment 1.1, meaning that infants who elicited more gaze-contingent animations may have habituated less to the presentation in general and therefore kept on showing more gaze following throughout the experiment. However, in the looking times sample of Experiment 1.1 the number of elicited gaze-contingent animations did not correlate with the absolute looking times to the screen at test (r = .20, p = .382), which should have been the case if the animations simply enhanced infants' interest in the presentation. Indeed, the number of gaze-contingent animations positively correlated with the DSLookTest (r = .63, p = .002) and the DSGazeShiftTest (r = .57, p = .003), reflecting the relation between experienced reinforcement and infants' gaze following. Descriptive data of an additional analysis suggest that infants who elicited more gaze-contingent animations and increased their gaze following behavior from baseline to test had a lower gaze following score during baseline (see Supporting Information D).
However, the more interesting relation is the positive correlation between the elicited animations and the increase in gaze following from the baseline to the test phase, which cannot be explained by a "global" preference to look in the head and gaze direction of the actor but may rather reflect early reward learning processes. As can be seen in Figures 2A and 2B, we did not find a clear increase in gaze following from baseline to test in Experiment 1.1: about half of the infants increased their gaze following behavior from baseline to test (indicated by a positive DSGazeShiftChange or a positive DSLookChange value). The other half decreased their preference. Thus, we cannot conclude that any reinforcement, for example, even eliciting the animation just once, increased gaze following behavior. Based on our data, conclusions with regard to the effects of reinforcement on infants' gaze following behavior can only be drawn very tentatively. Our findings point to a rather weak influence, if any at all, of reinforcement learning on already existing gaze following behavior in young infants, as infants only increased their gaze following behavior when they experienced a certain number of animations. In real life and natural face-to-face interactions, one can imagine a much higher number of rewarding experiences during gaze following episodes. In such cases, reinforcement may be one factor among many others that help infants maintain gaze following. Future studies are needed to replicate this correlation.
Experiment 1 still leaves open some questions that call for further testing: In Experiment 1, we used an animated stimulus (i.e., a wiggling comic mouse) as the potentially reinforcing event during the training phase to exploit infants' preference for moving over static stimuli (Carpenter, 1974; Samuels, 1985; Wilcox & Clayton, 1968) and to keep our procedure similar to that of Moore and Corkum (1998). However, we cannot be sure that this stimulus indeed worked as a reward for infants. In Experiment 2 we therefore recorded and analyzed infants' pupil dilation as a measure of emotional arousal in response to the elicited animation (Hepach & Westermann, 2016). If infants were rewarded or at least aroused by the animated stimulus, we expected enhanced pupil dilation in response to the animated mouse as compared to the static mouse. We did not analyze pupil size in Experiment 1 because lighting conditions in the laboratory were not controlled when running Experiment 1; changes in pupil size would therefore not reflect changes in the arousal of the infant but would be confounded with the changing lighting conditions in the laboratory.
In Experiment 1, infants showed spontaneous gaze following already during baseline in response to the shifted head and gaze direction. It remains unclear whether infants followed the obvious movement of the experimenter's head or if they used gaze information to spot the target object. Results from behavioral studies using live acting suggest that infants follow gaze shifts of a person at an older age than head turns in live situations (Lempers, 1979; Meltzoff & Brooks, 2007; but see Tomasello, Hare, Lehmann, & Call, 2007). However, live experiments often contain complex situations with the infant and target objects being located at relatively large distances from the experimenter. Reducing complexity and distances may enable even younger infants to use gaze direction to orient their attention.
In their computer-based experiment, Hood et al. (1998) were able to show that eye movements can guide infants' attention even at the young age of 3 months. In addition, eye movements alone were effectively enhancing the processing of novel objects in the same way as head movements or jointly moving head and eyes (Hoehl et al., 2014; Wahl et al., 2013) when presented in a simple screen-based setup. It is possible that infants' attention in these studies was shifted covertly in response to the gaze direction while no overt gaze following response was necessary. Thus, the difference between studies showing an early sensitivity for eye movements (Hoehl et al., 2014; Hood et al., 1998; Wahl et al., 2013) and studies suggesting a later onset of this sensitivity (Lempers, 1979) may have arisen from depicting different processes (covert vs. overt attention shifts) or using different methods (live vs. computer-based experiments; coding of head or gaze shifts). Nonetheless, the question remains whether infants participating in Experiment 1 revealed head or gaze following, or a combination of both.
Experiment 2 thus examines whether young infants' overt gaze following is already driven by information about the eye or head movement in a computer-based paradigm. We used the same paradigm as in Experiment 1, but we disentangled the influence of gaze and head direction: In Experiment 2.1, the actor shifted only her gaze to the side, while the head remained oriented to the front. In Experiment 2.2, the head of the actor shifted to the side while the eyes kept looking at the infant. If infants' attention is equally guided by eye and head movement, we expected to replicate the results of Experiment 1.1: Infants would spontaneously follow the central cue (head or eyes) during the baseline phase and continue to do so during the training phase. Analyses with regard to infants' spontaneous gaze following and gaze following after training are rather confirmatory. Hypotheses on how reinforcement shapes gaze following behavior in response to isolated gaze or head orientation can only be tentative, and the corresponding analyses are thus rather exploratory. Under the assumption that gaze and head direction both serve as influential cues guiding infants' attention and gaze following behavior, we expected to replicate the correlational relation between elicited gaze-contingent animations and infants' change in gaze following from baseline to test. However, if reinforcement shapes gaze following only when it is coupled with naturalistic social information which infants encounter every day, no such relation is expected, as infants rarely see other people shifting only their gaze or only their head direction.

Method

Participants
For Experiment 2, we tested 53 infants and had to exclude one infant due to a mistake in the presentation program and one infant who did not finish stimulus presentation (see Table 2 for sample information and descriptive statistics of Experiment 2). The experiment took place at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig. All participants came from Leipzig (Germany) or surrounding areas, which represent an urban Western, industrialized context. Data collection for Experiment 2 took place between March and May 2017.

Stimuli, Procedure, and Analysis
The stimuli and procedure were identical to Experiment 1.1, but the central stimulus changed. Figure 3 presents the central stimuli used in Experiment 2. Instead of shifting her head and gaze to the side, the actor now turned only her eyes (Experiment 2.1) or only her head (Experiment 2.2) to the side. We therefore edited the original stimulus using the software GIMP v. 2.8.16 (The GIMP Development Team, CA, USA, http://gimp.org). Additionally, before and after the experiment, we presented infants with a gray star (5 s) and the picture of the actor looking to the front (5 s). We planned to analyze whether pupil size in response to these stimuli changed from the beginning to the end of the presentation to investigate whether infants' general arousal (gray star) or infants' stimulus-related arousal when seeing the actor increased due to the potentially reinforcing and therefore arousing experiences during the training phase. However, infants did not pay enough attention to these stimuli. Applying the same inclusion criteria as in the pupil dilation analysis of the gaze-contingent animation, only 47% of the infants provided sufficient data. We therefore did not obtain enough clean data and did not further analyze responses to these stimuli.

In addition to the analyses done in Experiment 1, we investigated whether the animated mouse was arousing for infants, as indicated by pupil dilation. Analysis was performed based on the combined data set of Experiment 2.1 and 2.2 using Time Studio (v3.19, http://timestudioproject.com/, Nyström, Falck-Ytter, & Gredebäck, 2016). For this analysis, we took the mean of the left and the right eye if both eyes provided valid data. If only one eye had valid data, we took data of this eye for the analysis. If data of both eyes were invalid, we excluded this sample point.
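The rule for combining the two eyes can be written down compactly. The following is a minimal sketch under the assumption that invalid samples are coded as `None`; the function name `combine_eyes` is hypothetical and does not correspond to any Time Studio routine.

```python
from typing import Optional


def combine_eyes(left: Optional[float], right: Optional[float]) -> Optional[float]:
    """Combine left- and right-eye pupil size for one sample point.

    Assumption: an invalid sample is represented as None.
    - both eyes valid  -> mean of the two eyes
    - one eye valid    -> value of the valid eye
    - no eye valid     -> None (sample point is excluded)
    """
    if left is not None and right is not None:
        return (left + right) / 2.0
    if left is not None:
        return left
    return right  # either the valid right-eye value or None


# Illustrative values only (arbitrary units).
left_eye = [3.1, None, 3.3, None]
right_eye = [3.3, 3.2, None, None]
combined = [combine_eyes(l, r) for l, r in zip(left_eye, right_eye)]
```

Falling back to a single eye retains more sample points than requiring binocular data, at the cost of mixing monocular and binocular estimates within one trace.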
To investigate whether the contingent animation in our experiment aroused infants, we compared changes in pupil size during the first 2 s after the mouse started to wiggle during the training phase to the identical but static pictures during the baseline and test phase. We averaged trials from the baseline and the test phase. We first excluded spurious sample points based on the second derivative (Hoehl, Hellmer, Johansson, & Gredebäck, 2017). We then interpolated gaps of up to 10 sample points and applied a moving average filter with a width of five sample points. We only included a trial if at least 50% of its sample points were valid. A trial was excluded if it exceeded 3 SD from all trials from all participants (Fawcett, Wesevich, & Gredebäck, 2016). Baseline was set to 0-0.5 s and the analysis interval was set to 0.5-2.5 s after onset of the animation (training) and picture onset (baseline and test), which corresponds to the time the mouse was animated during training. To test whether infants were aroused by the wiggling mouse, we performed a paired sample t-test between the animated and the static mouse using pupil dilation as the dependent variable.
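Parts of this preprocessing pipeline can be illustrated with a short sketch. It is not the Time Studio implementation: linear interpolation, a centered moving average, and `None` as the code for invalid samples are assumptions made here, and the second-derivative artifact rejection and the 3 SD trial exclusion are omitted. All function names are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Sequence

Sample = Optional[float]


def interpolate_short_gaps(samples: Sequence[Sample], max_gap: int = 10) -> List[Sample]:
    """Linearly interpolate interior runs of None no longer than max_gap."""
    out = list(samples)
    n, i = len(out), 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        gap = i - start
        # Fill only gaps bounded by valid samples on both sides.
        if start > 0 and i < n and gap <= max_gap:
            left, right = out[start - 1], out[i]
            for k in range(start, i):
                out[k] = left + (k - start + 1) / (gap + 1) * (right - left)
    return out


def moving_average(samples: Sequence[Sample], width: int = 5) -> List[Sample]:
    """Centered moving average over valid samples; None stays None."""
    data, half, out = list(samples), width // 2, []
    for i, value in enumerate(data):
        if value is None:
            out.append(None)
        else:
            window = [v for v in data[max(0, i - half): i + half + 1] if v is not None]
            out.append(mean(window))
    return out


def proportion_valid(samples: Sequence[Sample]) -> float:
    """Share of valid sample points, used for the 50% trial criterion."""
    return sum(v is not None for v in samples) / len(samples)


def baseline_corrected_dilation(times, samples, baseline=(0.0, 0.5), window=(0.5, 2.5)):
    """Mean pupil size in the analysis window minus mean in the baseline."""
    base = [v for t, v in zip(times, samples) if v is not None and baseline[0] <= t < baseline[1]]
    resp = [v for t, v in zip(times, samples) if v is not None and window[0] <= t <= window[1]]
    return mean(resp) - mean(base)  # raises StatisticsError if either interval is empty
```

A trial would then be retained only if `proportion_valid` is at least 0.5, and the resulting per-trial values would be averaged within participant and condition before the paired comparison.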

Pupil Dilation in Response to the Animated Mouse as Compared to the Static Mouse
The paired sample t-test revealed an enhanced pupil dilation in response to the animated as compared to the static mouse (see Figure 4), t(49) = 5.10, p < .001, Cohen's d = 0.67.
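For readers who wish to reproduce this type of comparison, a paired-sample t statistic can be computed as sketched below. The data are hypothetical, the function name is illustrative, and the effect size shown is d_z (mean difference divided by the standard deviation of the differences); the text does not state which variant of Cohen's d was reported, so d_z is an assumption and need not match the published value.

```python
import math
from statistics import mean, stdev


def paired_t_test(condition_a, condition_b):
    """Paired-sample t statistic, degrees of freedom, and Cohen's d_z.

    Both inputs must contain one value per participant, in the same order.
    """
    if len(condition_a) != len(condition_b):
        raise ValueError("Both conditions need one value per participant.")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff
    return t, n - 1, d_z


# Hypothetical baseline-corrected pupil dilation values (arbitrary units).
animated = [2.0, 3.0, 4.0, 5.0]
static = [1.0, 1.0, 3.0, 3.0]
t_value, df, d_z = paired_t_test(animated, static)
print(f"t({df}) = {t_value:.2f}, d_z = {d_z:.2f}")
```

The p value is omitted here because the Python standard library has no t distribution; a statistics package would be needed to obtain it.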

First Gaze Shifts to the Cued and Not-Cued Mouse
The 2 (phase) × 2 (experiment) repeated measures ANOVA with DSGazeShift as the dependent variable revealed a significant main effect of experiment, F(1, 45) = 28.50, p < .001, partial η² = .39. All other main effects or interaction were not significant, all ps > .25. Infants' gaze following behavior was identical in baseline and test but varied between experiments. To test whether infants followed gaze or not, we averaged DSGazeShift for baseline and test of Experiment 2.1 and 2.2, respectively, and performed a one-sample t-test against 0 separately for both experiments. Infants in Experiment 2.1 showed significantly more first gaze shifts in the direction of the not-cued mouse, t

Looking Times at the Cued and Not-Cued Mouse
The 2 (phase) × 2 (experiment) repeated measures ANOVA with DSLook as the dependent variable revealed a significant main effect of experiment, F(1, 39) = 24.37, p < .001, partial η² = .39. All other main effects or interaction were not significant, all ps > .5. Infants' looking behavior was identical in baseline and test but varied between the experiments. To test whether infants' looking behavior differed between the gaze and the head condition, we averaged DSLook for baseline and test of Experiment 2.1 and of 2.2, respectively, and performed a one-sample t-test against 0 separately for both experiments: While we did not find

Relating the Number of Gaze-Contingent Animations to Changes in Gaze Following From Baseline to Test
There were no significant correlations between the number of elicited gaze-contingent animations during training and DSGazeShiftChange or DSLookChange in either of the two experiments, all ps > .05 (see Figure 5).
Infants in Experiment 2.2 (with the head turn as the central cue) elicited significantly more gaze-contingent animations (M = 6.5) than infants in Experiment 2.1 (with the eyes as the central cue, M = 5.4), t(39) = −2.50, p = .017 (this analysis was performed with data of the looking times sample). However, the variances of the number of gaze-contingent animations in the two experiments did not differ significantly from each other as tested with Levene's test for equality of variances, F(1, 39) = 0.03, p = .854. Therefore, results of the correlational analyses should not be affected by the differences in the average number of elicited animations during the training phase. When performing the analysis on the first look sample, the same effect of a larger number of gaze-contingent animations in Experiment 2.2 was found, t(45) = −3.13, p = .003, and Levene's test for equality of variances did not reveal a significant difference, F(1, 45) = 0.02, p > .25.
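The logic of Levene's test used above can be made explicit with a small sketch. It computes the classic mean-centered statistic, which is a one-way ANOVA on the absolute deviations from each group mean; whether the reported analyses used the mean-centered or the median-centered variant is not stated, so the choice here is an assumption. The data and the function name are hypothetical.

```python
from statistics import mean


def levene_statistic(*groups):
    """Mean-centered Levene statistic with its degrees of freedom.

    Returns (F, df_between, df_within). The p value would come from the
    F distribution, which is not available in the standard library.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    group_means = [mean(d) for d in deviations]
    grand_mean = mean([z for d in deviations for z in d])
    between = sum(len(d) * (m - grand_mean) ** 2 for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2 for d, m in zip(deviations, group_means) for z in d)
    df_between, df_within = k - 1, n_total - k
    return (between / df_between) / (within / df_within), df_between, df_within


# Hypothetical numbers of elicited animations in two groups.
group_a = [4.0, 5.0, 6.0, 7.0, 8.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
f_value, df1, df2 = levene_statistic(group_a, group_b)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

A large F indicates that the groups differ in spread, which matters here because restricted variance in one group could attenuate a correlation computed within that group.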

Discussion
In Experiment 2, we validated the arousing, presumably rewarding character of our animated stimulus (mouse). Infants showed enhanced pupil dilation in response to the animated as compared to the static mouse. We therefore have good reason to assume that our operationalization of reinforcement worked in both experiments. In Experiment 2, we also aimed at investigating the specific influence of gaze versus head orientation on infants' gaze following behavior and how reinforcement shapes this behavior. Infants followed the isolated head movement but not the eye movement of the actor. This is reflected in more first gaze shifts and longer looking times at the cued mouse as compared to the not-cued mouse during the baseline and the test phase in Experiment 2.2. In Experiment 2.1, infants did not follow the isolated gaze shifts during baseline or test. On the contrary, infants' first look even went to the not-cued mouse. One interpretation of this unexpected finding may be that the isolated movement of the eyes within the static face made the head appear to move to the opposite side. Infants then may have followed the apparent head movement. However, we did not find this effect with regard to infants' looking times. In general, effect sizes of the first gaze shift and looking time analyses in Experiment 2.1 were only small to medium while effect sizes of all other experiments were rather large. Our interpretation of these results can only be tentative as results of Experiment 2.1 were, in contrast to all the other experiments, susceptible to our inclusion criteria as reported in the Supporting Information A and B. If anything, isolated gaze direction weakly guided infants' attention in the opposite direction. We conclude that infants at 4 months of age in Experiment 2.1 did not follow gaze direction when presented in isolation.
On the contrary, infants followed the movement of the head, both spontaneously and after the training phase. This result highlights the important role head orientation plays for infant attention. This is in line with studies showing that infants follow isolated eye movements only later in development (Lempers, 1979) and with the finding that infants take the visibility of eyes into account when following the head direction only at the age of 10-11 months (Meltzoff & Brooks, 2007). Furthermore, in research by Moore et al. (1997), 9-month-olds only showed gaze following behavior when they saw the moving head turn, but not when they saw a static turned head. In addition, the study by Johnson, Ok, and Luo (2007) found that 9-month-olds associated the head and gaze direction of an actor with a target object only if several head movements towards the objects were visible to the infant. Thus, infants' attention seems to be driven primarily by head movements early in development, but becomes more and more fine-tuned to gaze direction over time. This development fits the perceptual narrowing account of gaze following, which states that infants first follow rather broad cues like head orientation (Del Bianco et al., 2019). One may therefore argue that infants simply followed the gross movement of the actor and not the social information. Indeed, young infants' attention is guided by motion (Deák, 2015). However, infant studies on object encoding showed that in addition to motion, the cue itself matters. For instance, 4-month-olds encoded objects more efficiently only when they were cued by stimuli containing eyes but not when they were cued by an equally moving stimulus without eyes (Michel et al., 2017, 2019). Thus, we assume that the effect of gaze following in Experiment 2.2 goes beyond the shift of attention purely elicited by the motion of the head.
Further studies using a nonsocial central cue are needed to test this assumption in the current paradigm which focuses on overt rather than covert attention shifts, thus differing from the studies mentioned above.
In Experiment 2 we did not find evidence that the training phase and the reinforcing animations influenced infants' gaze following behavior, even though infants consistently followed the head movement in Experiment 2.2.

General Discussion
In our four experiments, we investigated 4-month-olds' spontaneous gaze following behavior in response to either the joint or isolated movement of gaze or head orientation. Furthermore, we explored how reinforcement, operationalized by an arousing animation, influences gaze following. Applying an interactive eye-tracking paradigm allowed us to build on the work of Moore and Corkum (1998) and to transfer their learning paradigm to a method that is suitable for testing even younger infants. Infants at 4 months of age spontaneously followed the head and gaze direction of the stimulus (Experiment 1.1 and Experiment 1.2). This behavior seems to be driven by the direction of the head and not by the gaze direction (Experiment 2.1 and Experiment 2.2). Reinforcement had, if any, only a weak effect on infants' looking behavior. We found a significant correlation between the number of elicited gaze-contingent animations and infants' change in gaze following behavior only in Experiment 1.1. Only infants who experienced more rewarding animations increased their gaze following behavior. Thus, our tentative interpretation of the results is that encountering a larger amount of reward for their gaze following behavior may be one factor supporting the maintenance of infants' already existing gaze following behavior. This idea is in line with the integrative basic set model by Triesch et al. (2006), highlighting that the ability for reward-driven learning is crucial for gaze following behavior. However, as infants' looking behavior was influenced by the central cue already in baseline in three out of four of our experiments, we cannot draw conclusions on whether reinforcement learning is crucial for infants' initial acquisition of gaze following behavior.
However, a comparison of the correlational analyses of Experiment 1.1 and Experiment 2.2 suggests that the tentative interpretation that reward may promote the maintenance of gaze following holds only when the social cue contains the naturalistic motion of combined head and gaze direction. Experiment 1.1 and Experiment 2.2 only differed with regard to the gaze direction. In addition, we did not find any evidence for overt gaze following in Experiment 2.1. Thus, gaze direction of another person does not seem to be the driving factor initializing gaze following behavior in young infants.
We decided to test very young infants in order to explore the ontogenetic foundations of gaze following. As 4-month-olds performed gaze and head following already during the baseline phase, we were not able to investigate how gaze following first emerges. However, after being reinforced, infants did not start to follow the back of the head in Experiment 1.2 and infants did not start following the gaze direction in Experiment 2.1. It may be that the given reward only reinforced already established gaze following behavior. However, it is also conceivable that reinforcement can in principle influence gaze following when presented opposite the social cue (Experiment 1.2) or that it can be a driving factor for the acquisition of gaze following behavior (Experiment 2.1) but that our paradigm was not able to elicit such effects. It is possible that the maximum of eight training trials was not sufficient to alter infants' already existing tendency to follow a social cue (Experiment 1.2) or to elicit a novel behavior (Experiment 2.1). Due to infants' relatively short attention span and due to the requirement that infants still pay attention to the stimuli at the end of the presentation, an increase in the number of training trials was not feasible in our paradigm. Future studies may vary the central cues (e.g., using different faces) and targets on a trial by trial basis to attract infants' attention longer to allow for the presentation of more learning trials. In line with the idea that a reward increases a behavior while a negative outcome decreases a behavior, it would also be very interesting to investigate whether infants reduce their gaze following when experiencing that a stimulus disappears whenever they look at it. This would be another way of investigating how reward learning shapes already existing gaze following behavior.
As infants reacted with enhanced pupil dilation to our animated stimulus, we are confident that our operationalization of the reinforcement worked and caused some arousal in infants. Our results can, of course, not be generalized beyond the stimuli presented in the current experiments. It would be highly interesting to investigate whether the effects can be replicated and generalized to other forms of reinforcement (e.g., interesting sounds, visual effects like lighting up the target) or visual stimuli other than comic mice. Here, infants' arousal with regard to different reinforcement stimuli can be captured via pupil dilation. Furthermore, it would be of great interest to investigate whether the amount of arousal that different potentially reinforcing stimuli elicit, as measured via pupil dilation, is related to an increase in gaze following behavior. In addition to varying the reward itself, future studies may include analyses of infants' eye-blink rate. Eye-blink rate reflects reward-related central dopamine activity and has already been successfully applied in infant studies investigating reward learning (Tummeltshammer, Feldman, & Amso, 2019; Werchan, Collins, Frank, & Amso, 2015).

As mentioned in the Introduction, we decided to present our stimuli on a screen, making it feasible to test eye movements of young infants via eye tracking. As in any other screen-based experiment using 2D stimuli, conclusions drawn from our experiments are only valid for 2D and not 3D stimuli. However, our finding that 4-month-olds' looking behavior is influenced by the head direction of another person is in line with live studies showing an early onset of gaze following behavior when gaze direction is embedded in larger head movements (D'Entremont, 2000; D'Entremont et al., 1997; Gredebäck et al., 2010). We are thus confident that our results are at least partly predictive for real-world social interactions taking place in 3D.
Of course, live interactions differ from highly controlled screen-based experiments in more characteristics than only the 2D versus 3D comparison (e.g., more flexibility in movements and timing, reciprocity with the infant). A live training study could overcome these differences and enable more ecologically valid conclusions on how gaze following emerges in infancy, but this would involve a trade-off in terms of less experimental control and higher variability. As mentioned in the Methods section of Experiment 2, we planned to investigate whether infants associated the reward with the central cue by analyzing changes in pupil dilation in response to the central facial cue from the baseline to the test phase. In Experiment 2, we therefore presented the central cue at the beginning and at the end of the experiment for 5 s. We were not able to analyze responses to these stimuli as infants did not provide sufficient gaze data. However, similar analyses are necessary in future studies to examine whether infants associate the rewarding stimulus with the cue (Tummeltshammer et al., 2019).
The current experiments addressed learning mechanisms underlying gaze following behavior in early infancy. The findings of four independent experiments revealed that 4-month-olds are already able to follow another person's head and gaze orientation in a simple computer-based paradigm and that this behavior is likely to be based on their tendency to follow the movement of the head of another person. Within our paradigm, reinforcement had only a very weak effect on infants' gaze following behavior, and only when it occurred in line with a naturalistic social stimulus, that is, the joint movement of head and gaze.