This was a parallel-group efficacy trial of game-based therapy in patients with chronic schizophrenia. The trial is registered at http://www.chictr.org.cn (identifier ChiCTR2100048403).
Participants
Eighty clinically stable patients with chronic schizophrenia (SZ) were recruited from Beijing Huilongguan Hospital. Thirty-two age-matched healthy controls (HCs), recruited through advertisement, also participated in this study. The patients were required to meet the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) criteria for schizophrenia or schizoaffective disorder based on interviews and a review of their clinical histories. The other inclusion criteria were age between 16 and 45 years, more than 6 years of education, moderate negative symptoms as rated on the Positive and Negative Syndrome Scale (PANSS) [35], and relative clinical stability. The exclusion criteria for all subjects were cognitive impairment caused by head trauma or cranial surgery, neurological deficits such as visual or hearing loss, alcohol or substance abuse/dependence within the previous 6 months, and game addiction (playing games for ≥3 h/day for the last 6 months). All volunteers received financial compensation for their participation. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. The experimental procedure was approved by the Institutional Review Boards (IRBs) of Beijing Huilongguan Hospital, Peking University. Written informed consent was obtained from all volunteers after the study had been fully explained.
Study procedures
Eligible SZ patients completed baseline assessments, including cognitive, clinical, and functional tests and eye-tracking tasks, in the ward. These patients were then randomly assigned to either the game training condition (GT; Komori Life, described below, plus routine treatment) or the treatment as usual condition (TAU; routine treatment only) (see also the CONSORT checklist). Patients were grouped into blocks of two with similar demographic characteristics (age, sex and education), and the two patients in each block were then randomly assigned to different groups to minimize imbalance. Participants in the GT group were loaned tablet computers, given logins and instructed to complete their training intervention in 5 sessions per week of 60 min each for 4 weeks. After 4 weeks, all patients were asked to complete the posttraining assessment battery (the same tests and task as in the baseline assessments). During the training, staff were available to supervise participants who indicated difficulty in completing it. Participants in both the TAU and GT groups took part in their daily rehabilitative activities. Following consent and initial screening, HCs completed the eye-tracking task only.
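The matched-pair allocation described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys (`id`, `sex`, `age`, `education`), the sort-based matching rule and the function name `pair_randomize` are hypothetical and are not taken from the trial protocol.

```python
import random

def pair_randomize(participants, seed=None):
    """Assign matched pairs to GT/TAU, one member of each pair per arm.

    `participants` is a list of dicts with the hypothetical keys
    'id', 'sex', 'age' and 'education'.
    """
    rng = random.Random(seed)
    # Assumed matching rule: sort so that neighbours are similar in
    # sex, age and education, then pair adjacent participants.
    ordered = sorted(participants,
                     key=lambda p: (p["sex"], p["age"], p["education"]))
    assignment = {}
    # With an odd number of participants the last one stays unassigned.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # coin flip within the block of two
        assignment[pair[0]["id"]] = "GT"
        assignment[pair[1]["id"]] = "TAU"
    return assignment
```

Because every block contributes exactly one participant to each arm, the two arms end up equal in size and roughly balanced on the matching variables.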
The Komori Life training program
This training program was modified from Komori Life (https://komori.qq.com), which was deployed on an online, browser-playable platform by Tencent. Komori Life is a social/life and farming simulation game. In the game, the user can choose to play the role of a student or an office worker who has just moved to a small town in the Japanese countryside. The user's goal, starting from scratch, is to decorate the house, fix up garden patches and plant fruit trees or vegetables, catch animals, cut down trees, extract useful minerals, cook food and get to know all the inhabitants of the town (see online Supplementary Material Fig. S1 and Supplementary Table 15). In this process, participants complete exercises spanning the cognitive domains of attention, memory, executive function and social cognition. The program begins with lower cognitive demands and progressively advances to more complex exercises. Progression through the training is guided interactively, and feedback is given when a level in the game is completed. Participants completed unique games during every training session for ~60 min per session. Game-related measures included the game grades, total playing time, total game playing behaviors (planting, cooking, decorating, hunting, felling and mining) and degree of activity.
Assessments
Cognitive and clinical assessments were performed at baseline and after completion of the training intervention among all participants with SZ. All neuropsychological tests were conducted by graduate psychologists working in hospitals.
Positive and negative symptoms were assessed with the PANSS [35] and the Brief Negative Symptom Scale [36]. Social function, quality of life and pleasure experience were measured with the Personal and Social Performance Scale (PSP), Self-Esteem Scale [37], Schizophrenia Quality of Life Scale and Temporal Experience of Pleasure Scale [38, 39]. Cognitive function was assessed with the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) Consensus Cognitive Battery (MCCB), which covers 7 cognitive domains: speed of processing, attention/vigilance, working memory, verbal learning, visual learning, reasoning and problem solving, and social cognition [40].
Eye-tracking task
A free-viewing task with emotional scenes was performed at baseline and after completion of the training intervention among all SZ participants as well as among HCs. The full description of this task has been published elsewhere [32]; the most relevant details of the task procedures are described below.
Stimuli
The stimuli included 80 images selected from the International Affective Picture System [41]. The stimuli were categorized as sad, happy, threatening, or neutral following a pilot study [42]. A total of 12 sad, 12 happy, 12 threatening and 12 neutral target social images as well as 32 neutral control images were chosen. The sad, happy and threatening target images depicted people showing different emotions in different situations. Neutral target images depicted people in nonemotional activities. The neutral control images depicted nonliving objects [43]. Threatening images had the greatest arousal rating, and neutral images had the lowest arousal rating among the four categories of images. The sad and happy images did not differ from each other in arousal. The happy and neutral images had higher valence ratings than the other categories of images, whereas the sad and threatening images did not differ from each other in valence ratings. The images did not differ in visual complexity, which was assessed from the JPEG-compressed image files [44].
Free-viewing task
A total of twenty trials (12 study trials and 8 filler trials) were presented, each showing four images simultaneously. Each trial began with a 1000-ms centrally displayed fixation cross, after which the four images were displayed for 20 s. Each of the twelve study trials contained one image from each of the four categories of target social images: sad, happy, threatening and neutral. The eight filler trials, each containing four neutral control images, were displayed to obscure the nature of the task. For each study trial, the position of each image was randomly selected, with the constraint that each valence must occur in each of the four positions three times across the 12 trials. The presentation order of the trials was also randomized across the subjects [32, 43] (see Fig. 1).
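One simple way to satisfy the position constraint above (each valence in each of the four screen positions exactly three times over the 12 study trials) is to repeat the four rows of a cyclic Latin square three times and shuffle the trial order. The sketch below is an assumption about how such layouts could be generated, not the procedure the authors used; the function name `build_study_layouts` is hypothetical.

```python
import random

VALENCES = ["sad", "happy", "threatening", "neutral"]

def build_study_layouts(seed=None):
    """Return 12 layouts; layout[pos] is the valence shown at position pos."""
    rng = random.Random(seed)
    base = VALENCES[:]
    rng.shuffle(base)
    # The four cyclic rotations form a Latin square: every valence
    # appears exactly once in every position.
    rotations = [base[k:] + base[:k] for k in range(4)]
    # Three copies of each rotation give 12 trials with each valence
    # three times per position.
    layouts = [list(r) for r in rotations for _ in range(3)]
    rng.shuffle(layouts)  # randomize presentation order
    return layouts
```

This construction samples only a subset of all layouts that meet the constraint, which is usually acceptable for counterbalancing purposes.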
Apparatus
A remote binocular eye-tracking system (aSee Pro, 7invensun, Beijing, China), which allows a free range of head movement, was used to measure the subjects’ eye movements. The sampling rate of the eye positions was 135 Hz. Subjects were seated 65 cm away from the screen. Camera adjustments were made to best capture the subjects’ eyes.
Procedure
After completing the baseline assessments, participants were tested individually in a silent room. The experimental session began once the calibration was accepted (the average error was less than 1.5° of the visual angle for each calibration point). The following instruction was then displayed on the computer screen: “Look freely at the images as if you were watching television or looking at a photo album”. The experimenter was located in the same room and monitored the whole process. The size of the pictures was 10.95° (wide) × 6.20° (high), and the distance between the pictures was 5.44° (horizontal axis) and 3.06° (vertical axis).
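For readers who want physical stimulus sizes, visual angles can be converted to on-screen lengths with the standard relation size = 2 · d · tan(θ/2). The sketch below assumes the 65 cm viewing distance reported under Apparatus and a stimulus centred on the line of sight, so the values for the gaps between pictures are approximations; the function name is hypothetical.

```python
import math

def angle_to_cm(angle_deg, distance_cm=65.0):
    """On-screen extent (cm) subtending `angle_deg` at `distance_cm`.

    Assumes the extent is centred on the line of sight.
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

picture_width_cm = angle_to_cm(10.95)   # roughly 12.5 cm
picture_height_cm = angle_to_cm(6.20)   # roughly 7.0 cm
```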
Data analysis
To check that the two SZ patient groups were balanced at baseline, eye movement measures and cognitive and clinical assessments were compared between them. Continuous data are presented as means and standard deviations. Group differences in demographic measures were tested using one-way analysis of variance (one-way ANOVA) for quantitative variables and chi-square analyses or Fisher’s exact tests for qualitative variables. Group differences in cognitive and clinical measures at baseline between the two SZ patient groups were tested using the independent-samples t-test. Group differences in cognitive and clinical measures after training were analyzed in separate repeated-measures ANOVAs with group (the GT group and the TAU group) as a between-subject factor and time (before and after assessments) as a within-subject factor.
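For a design with two groups and two time points, the group × time interaction F from a repeated-measures ANOVA equals the squared independent-samples t statistic (pooled variance) computed on the post-minus-pre change scores. The sketch below uses that equivalence to illustrate what the interaction tests; it is not the SPSS procedure used in the study, it omits covariates, and the function name is hypothetical.

```python
import math
from statistics import mean, variance

def interaction_f(pre_a, post_a, pre_b, post_b):
    """Group x time interaction F for a 2 x 2 mixed design.

    Computed as the squared pooled-variance t on change scores.
    Returns (F, (df1, df2)).
    """
    change_a = [post - pre for pre, post in zip(pre_a, post_a)]
    change_b = [post - pre for pre, post in zip(pre_b, post_b)]
    na, nb = len(change_a), len(change_b)
    pooled = ((na - 1) * variance(change_a) +
              (nb - 1) * variance(change_b)) / (na + nb - 2)
    t = (mean(change_a) - mean(change_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t * t, (1, na + nb - 2)
```

A significant interaction therefore means that the mean change from baseline differs between the GT and TAU groups.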
For eye movement data, the areas of interest were identified for each study trial and corresponded to the total area of each of the four images. Each area of interest was extended by 2 mm beyond the target image on all four sides (top, bottom, left and right). A total of 5 measurements were computed to evaluate attention across the different target images. Two measurements assessed the allocation of attention: (1) the percentage of total fixations (i.e., percentage of times that each subject fixated, and refixated, on a particular target image); and (2) the percentage of total duration (i.e., percentage of fixation time attending to each target image). In addition, three measurements were employed to assess subsequent attentional engagement: (1) first-pass fixations (i.e., the sum of fixations made on the image when looking at it for the first time, before fixating away from it); (2) percentage of first fixation (i.e., percentage of times that the first fixation lands on the image); and (3) gaze duration (i.e., the sum of fixation durations made on the target image when looking at it for the first time) [32, 33]. All measures were averaged across the trials. Each eye movement measure at baseline was analyzed in two-way ANOVAs with group (the GT group, the TAU group and the HC group) as a between-subject factor and valence (sad, happy, threatening and neutral) as a within-subject factor. For eye movement measures after training, group differences were analyzed in repeated-measures ANOVAs with group (the GT group and the TAU group) as a between-subject factor and time (before and after assessments) as a within-subject factor. The aforementioned ANOVAs were also repeated with the participant’s sex, age, and years of education as covariates. We were only interested in the interaction effect of group by valence or group by time. If the interaction effect was significant, simple effects tests were performed with Bonferroni correction for multiple comparisons.
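The five eye movement measures defined above can be sketched for a single trial as follows. The input format (an ordered list of `(aoi, duration_ms)` pairs, with `None` for fixations outside every area of interest), the choice of fixations on the four images as the denominator for the percentages, and all function and key names are assumptions made for illustration.

```python
VALENCES = ("sad", "happy", "threatening", "neutral")

def trial_measures(fixations):
    """Per-image measures for one trial of ordered (aoi, duration_ms) pairs."""
    on_aoi = [(a, d) for a, d in fixations if a is not None]
    out = {v: {"pct_fix": 0.0, "pct_dur": 0.0, "first_pass_fix": 0,
               "gaze_dur": 0, "pct_first_fix": 0.0} for v in VALENCES}
    if not on_aoi:
        return out
    n_total = len(on_aoi)
    t_total = sum(d for _, d in on_aoi)
    for a, d in on_aoi:
        out[a]["pct_fix"] += 100.0 / n_total
        out[a]["pct_dur"] += 100.0 * d / t_total
    # 100 for the image receiving the first fixation, so the mean
    # across trials is the percentage of first fixations.
    out[on_aoi[0][0]]["pct_first_fix"] = 100.0
    visited, current, prev = set(), None, None
    for a, d in fixations:
        if a != prev:
            # A first pass starts only on the first entry into an image.
            current = a if (a is not None and a not in visited) else None
            if current is not None:
                visited.add(current)
        if current is not None:
            out[current]["first_pass_fix"] += 1
            out[current]["gaze_dur"] += d
        prev = a
    return out

def average_over_trials(trials):
    """Average every measure across trials, per valence."""
    per_trial = [trial_measures(t) for t in trials]
    n = len(per_trial)
    return {v: {k: sum(m[v][k] for m in per_trial) / n
                for k in per_trial[0][v]} for v in VALENCES}
```

In this sketch a first pass ends as soon as gaze leaves the image, whether to another image or to the background, and later returns to the same image count only toward the two allocation measures.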
Additionally, correlational analyses were performed to characterize the relationships among game-related measures, changes in cognitive and clinical measures from before to after training, and changes in eye movement measures from before to after training, separately in each of the two SZ patient groups. Bonferroni correction was applied to control for type I errors due to multiple testing. SPSS (version 21.0, IBM) was used for statistical analyses. A value of p < 0.05 was considered statistically significant.
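As a brief illustration of the Bonferroni step, each p value in a family of m tests can be multiplied by m (capped at 1) and compared with the 0.05 threshold. The sketch below is generic; how tests were grouped into families in this study is not specified here, and the function name is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p values, significance flags) for one family of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant
```

For example, with three correlations in a family, an uncorrected p of 0.01 becomes 0.03 and remains significant, whereas an uncorrected p of 0.04 becomes 0.12 and does not.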