Abstract
Background: South African township learners are known to perform poorly in international tests, including reading and science. Language is a complicating variable as English, the learners’ second language (L2), is generally the language of learning and teaching, including for reading science texts.
Aim: The study’s objectives were to determine the factors limiting South African township learners’ comprehension of science texts so as to inform intervention efforts.
Setting: This study used eye-tracking equipment to measure higher-achieving South African township learners’ reading speeds, fixation durations and fixations per word for Sesotho (L1), English (L2) and science texts at various difficulty levels.
Methods: Reading gaze metrics were compared between languages and difficulty levels and against benchmark levels, and comprehension tests were administered.
Results: Learners were found to read very slowly, particularly for science texts. The number of fixations per word was typical of reading; there were few regressions and long fixation durations.
Conclusion: Results suggest that reading is not automated, even for L1, but rather that learners struggled to decode all texts. Longer fixations and higher comprehension scores for the Sesotho (L1) texts suggest that this decoding struggle was more thoroughly engaged in the vernacular L1. The number and length of fixations, with few regressions, may indicate minimal attempts at comprehension.
Contribution: The most basic aspects of decoding limit reading comprehension in both L1 and L2, and therefore, these should be the first skills targeted in the quest to improve science performance.
Keywords: reading; eye tracking; fixations; comprehension; gaze.
Introduction
The low general language and literacy skills of South African (SA) learners, including in their mother tongue and especially in poorer contexts, are well established (Mullis et al. 2023). A total of 81% of Grade 4 learners cannot read for meaning in any language (Böhmer & Wills 2023), and this statistic improves little across schooling (Van der Berg et al. 2016). Furthermore, before the coronavirus disease 2019 (COVID-19) pandemic, a learner in the richest 10% of schools was five times more likely to learn to read for meaning than a learner in the poorer 90% (Spaull & Pretorius 2019:147–168), with this socioeconomic disparity enlarging significantly as a result of school closures during the pandemic (Böhmer & Wills 2023). Low language and literacy skill levels impact all aspects of learning, including the sciences (Stott & Beelders 2019:72–80). What is less well established is the status of each of the components of reading that need to be mastered for comprehension, particularly in the bilingual context prevalent in South Africa (Spaull, Pretorius & Nompumelelo 2020:1–14). These components are text decoding and language proficiency (Gough & Tunmer 1986:6–10) for the various languages and how these interact (Spaull & Pretorius 2019:147–168).
Most South African township learners receive instruction in their vernacular for the first 4 years of schooling, after which the language of learning and teaching (LoLT) changes to English. Although African languages are phonetically transparent, so that their decoding should be mastered relatively quickly, this is not the case (Böhmer & Wills 2023). English, in contrast, has a phonetically more opaque spelling, offering more decoding difficulty and reducing the bi-directionality of learning to decode bilingually (Mohohlwane et al. 2023:687–710). Decoding, fluency and reading speed are interconnected, but it is generally accepted that a decoding threshold must be reached before comprehension and speed will increase (Wang et al. 2019:387–401). Although decoding automation is necessary for comprehension, it is not sufficient. People who can decode text yet fail to comprehend it beyond a surface level are said to bark at text (Samuels & Farstrup 2011). High incidences of barking at science texts in English were found among even relatively high-achieving middle-school learners from township schools (Stott & Beelders 2019:72–80). Departmental benchmarking of Sesotho reading found that the majority of Grade 3 learners were capable of reading at the Grade 1 fluency level (Education 2022). Furthermore, comprehension increased substantially for Sesotho reading as the reading speed improved (Education 2022). However, the comprehension tended to stabilise at 60 wpm, giving a strong indication that these skills were underdeveloped and posed the biggest obstacle to understanding and learning (Education 2022). This supports the notion that a large percentage of these learners are barking at text.
Language proficiency influences the effectiveness of comprehension activities that can be engaged in once decoding has been automated (Spaull & Pretorius 2019:147–168). Although it appears obvious that learners should perform better in their vernacular than in English, in the Progress in International Reading Literacy Study (PIRLS) reading studies, the groups of South African learners who performed the worst were those who wrote the test in their vernacular (Böhmer & Wills 2023; Mthimkulu, Roux & Mihai 2024:1–10; Pretorius & Klapwijk 2016:1–20). However, socioeconomic status is a confounding variable in this comparison (Graham & Mtsweni 2024). When learners of similar socioeconomic status are compared, it seems that reading in the vernacular does afford an advantage. However, performance remains very poor in absolute terms and relative to learners of higher socioeconomic status reading English (Mohohlwane et al. 2023:687–710).
This study aims to determine the factors limiting South African township learners’ comprehension of science texts, particularly the role that language plays in this. This is important for informed intervention efforts. This study therefore extends existing work on the relationship between text decoding and language proficiency for learners from low socioeconomic contexts learning in a bilingual context. To achieve this, the study describes the eye-movement metrics of Grade 8–11 township learners while reading non-scientific texts of various difficulty levels in English and Sesotho, as well as science texts in English. Furthermore, the study compares these metrics with one another and with baseline metrics from the literature. This comparison aims to understand the limiting factors to these learners’ science reading. Such understanding is crucial to mitigate low-skilled reading in this demographic. Eye tracking provides rich, objective data from which conclusions about cognitive processing can be drawn and which can potentially be applied in intelligent tutors if sufficiently understood for a particular population.
Background
The South African context
South African township schools are known to suffer from poor learning and teaching time utilisation, with multiple consequences, including insufficient time for the repetition required to develop decoding automation (Pretorius & Spaull 2016:1449–1471). Furthermore, learners from these schools tend to come from text-poor homes with little adult input into their learning (Chikoko & Mthembu 2021:69–90; Graham & Mtsweni 2024). It is unsurprising that decoding is often not even mastered, let alone automated (Böhmer & Wills 2023).
Readers can be classified into reading level groups according to their decoding accuracy and comprehension levels (Lesiak & Bradley-Johnson 1983). Reading at the frustration level (below 90% decoding accuracy and below 60% comprehension) may correspond to barking at text, or such low decoding ability that not even barking at text can be engaged in (Stott & Beelders 2019:72–80). Learners who read at the instructional level (95% decoding accuracy and 75% comprehension) require comprehension instruction to promote learning through reading. In contrast, those at the independent level (98% decoding accuracy and 95% comprehension) can self-direct their learning through reading (Lesiak & Bradley-Johnson 1983).
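The classification above can be illustrated with a minimal sketch. It assumes that the percentages quoted for the instructional and independent levels act as minimum thresholds and that both criteria of a level must be met; the function name and the "borderline" label for scores falling between the stated bands are hypothetical and are not part of the cited scheme.

```python
def classify_reading_level(decoding_accuracy: float, comprehension: float) -> str:
    """Classify a reader from two percentage scores (0-100).

    Assumption: the quoted percentages are treated as lower bounds for the
    independent and instructional levels, and as upper bounds for frustration.
    """
    if decoding_accuracy >= 98 and comprehension >= 95:
        return "independent"
    if decoding_accuracy >= 95 and comprehension >= 75:
        return "instructional"
    if decoding_accuracy < 90 and comprehension < 60:
        return "frustration"
    # Hypothetical label: the text does not name scores between the bands.
    return "borderline"


# Illustrative (invented) scores, not data from this study.
levels = [
    classify_reading_level(99, 96),
    classify_reading_level(96, 80),
    classify_reading_level(85, 50),
    classify_reading_level(92, 70),
]
```

Different handling of the gaps between bands would be equally defensible; the sketch only makes the ordering of the three levels explicit.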
Language and literacy learning are bidirectional (Kim & Piper 2019:839–871). Comprehension skills learned in one language are transferable to another (Pretorius & Klapwijk 2016:1–20), with improved transfer measured from the vernacular to English in the African context, particularly for weaker learners (Mohohlwane et al. 2023:687–710). The extent to which such transfer occurs is influenced by instruction (Kim et al. 2024:171–194). Regarding South African reading instruction, Spaull and Pretorius (2019:147–168) question the appropriateness of social constructivism as a theoretical framework for teaching reading instruction during undergraduate teacher preparation, and Fleisch (2023) criticises the weak phonics guidance of the Curriculum and Assessment Policy Statement (CAPS) and suggests that more contextually appropriate strategies are needed.
Eye movements during reading
Basic eye movements consist of saccades and fixations that alternate to enable humans to see an object of interest. Saccades are very fast movements that position the eye (Rayner 1998). Visual sensitivity is suppressed during saccades (Rayner 1998:372). In contrast, visual sensitivity is enhanced during fixations, when the eyes are held relatively still (Rayner 1998:372). Eye movements have been investigated in several different scenarios, such as visual search or attention, but the movements found during reading are of particular interest for this article.
Saccades and fixations are naturally present while reading; saccades position the eye over new text that needs to be read, and fixations process the text (Rayner 1998:372). When reading English, fixations typically last between 200 ms and 250 ms, and saccades span 7–9 letters. These measures are for L1 adult readers but are nevertheless used for benchmark comparisons as ‘typical’ eye movements. Most saccades during reading are from left to right, but regressions move the eyes from right to left to previously read lines (Rayner 1998:372). Short regressions, such as in-word regressions, indicate difficulties processing the current word, while longer regressions indicate difficulty in understanding. Good readers are more proficient at accurately positioning their eyes with regressions than poorer readers, who tend to have many more regressions (Rayner 1998:372).
Fixation lengths can be quite variable during reading, ranging from less than 100 ms to more than 500 ms, but it has been shown that fixation durations increase (Rayner 1998:372; Rayner et al. 2006:241–255), and the number of fixations also increases (Rayner et al. 2006:241–255) as text gets more difficult and hence more problematic to process. In these instances, there are also more regressions, and saccade lengths tend to decrease (Rayner 1998:372). Furthermore, while younger readers have longer mean fixations than adult readers typically exhibit, as the child progresses the mean fixation durations and regressions decrease (Strandberg et al. 2022:3).
Gaze behaviour can be used as an indicator of whether a reader is struggling to understand or read the text as this impacts the duration and number of fixations and regressions in adult readers (Rayner 1998:372). Eye movements in children are also impacted by reading ability, with significant differences in fixations and non-significant differences in saccades being observed as reading ability changes (Strandberg et al. 2023:8). Beyond detecting reading difficulty, eye tracking can also be used as a tool to predict the level of comprehension for both adults (Mézière et al. 2023:425–449) and children, as better comprehension is accompanied by shorter fixations (De-la-Peña 2024). Many studies have been conducted on adult readers and considerably fewer on children. To our knowledge, no eye-tracking reading study addresses South African Sesotho mother-tongue children. This study aims to address this gap by concentrating on Sesotho mother-tongue learners. These learners have previously exhibited low reading skills, with eye movements indicative of barking when reading English. This study will further explore these findings to determine whether the same eye movements are present when reading a Sesotho text or whether reading skills are more advanced in the first language.
Bilingual reading
Many studies that have determined gaze behaviour were conducted using L1 English readers. However, the findings from such studies do not necessarily apply to L2 readers, who are known to have more and longer fixations than L1 readers except during reading-while-listening, when L1 and L2 readers have similar eye movements (Conklin et al. 2020:257–276). As with L1 reading, other factors influence eye movements – such as word frequency and the number of meanings of the word, among others. Low-frequency words and novel words elicit longer processing time, but the processing time decreases with each subsequent encounter of the novel word (various studies, as discussed in Pellicer-Sánchez [2020:134–146]) for both L1 and L2 readers.
An eye-tracking study on reading Afrikaans texts for L1 and L2 readers found that L1 readers exhibited typical gaze behaviours of readers, but L2 readers did not. The L2 readers had longer and more fixations than L1 readers (Dednam et al. 2014:334–342). Conversely, Demareva and Edeleva (2020:673–676) found that L2 reading results in shorter fixation times than L1 reading. However, this faster reading time by L2 readers is more likely to indicate shallow parsing (Pulido 2021), in other words shallow decoding and limited understanding. Readers with both high and low chunking abilities exhibit high reading speeds, the former showing fast integration and the latter showing a lack of integration (Pulido 2021:1–18). This confirms prior findings that, when studying a text in L2, word processing was slower and less information could be processed simultaneously (Dirix et al. 2020:371–397). Furthermore, reading time was longer when studying an L2 text (Dirix et al. 2020:371–397), as with Dednam et al. (2014:334–342), perhaps as a forced mechanism to avoid shallow parsing. The nature of the task, that is, studying to be able to answer questions, could explain the longer reading time in this instance (Dirix et al. 2020:371–397). In summary, as cited in Nahatame (2023:724–737), eye-tracking studies show that L1 reading is more efficient than L2 reading. Reading in a second language is characterised by a longer reading time, shorter saccades and less frequent word skipping (Nahatame 2023:724–737). This study investigated L1 and L2 reading of school-going learners to determine whether there is a significant difference in their gaze behaviour.
Research questions
The study is guided by the research question: How do eye-tracking measures of reading behaviour (e.g. fixation duration, reading speed) and comprehension differ across L1, L2 and scientific texts among South African learners? In this study, this question is answered with reference to higher-achieving Grade 8–11 township learners reading Sesotho (L1) and English (L2) narrative and English scientific texts at various difficulty levels.
Theoretical framework
Cognitive load theory (Sweller 2011:37–76) is used as a theoretical framework to guide the interpretation of the findings. All learning, including reading, requires active processing within working memory, which is of limited size. There are three possible types of active processing, each offering a corresponding type of cognitive load: extraneous, intrinsic and germane. Extraneous cognitive load results from the design features of the medium that distract the learner without aiding the learning of the target information. Intrinsic cognitive load is related to the number of interactive items experienced by the learner within the presented media. More knowledgeable learners experience a lower intrinsic cognitive load from a particular medium because of their ability to chunk information. Any space left over in working memory is available for sense making; in other words, it can be allocated for germane cognitive load. A learner feels discomfort when the extraneous and intrinsic load of a task together demand as much working memory space as is available, or more. Without access to sufficient space for germane load, the learner cannot make sense of the information and is likely to suffer demotivation (Kirschner et al. 2018:213–233).
The processes involved during reading comprehension and the existing literature on inferring cognitive load from eye-tracking data clarify the relevance of this theoretical framework for interpreting eye-tracking data during reading. Reading comprehension involves decoding the written words, after which their individual and combined meanings must be comprehended using language skills (Gough & Tunmer 1986:6–10). Comprehension requires understanding at the surface (sentence level), situation and global levels (Cain 2010). Inference, knowledge of text structures and comprehension monitoring are needed to form the latter two levels of understanding, which refer to the integration of successive sentences and prior knowledge. Given the severe size limitation of working memory, in which all these processes must occur, comprehension is impossible without decoding automation. This is because processes that have been automated through considerable repetitive practice cease to occupy space within working memory. This frees up space for the cognitive activities required for comprehension. Consistent with this theory, empirical research has shown that reading speed is a reliable means of detecting decoding automation and comprehension (Pertiwi & Sujarwati 2023:534–540; Strandberg et al. 2022:3), and a minimum reading speed threshold is required for comprehension (Pretorius & Spaull 2016:1449–1471).
Research methods and design
Study design
This study uses a quantitative descriptive research design. This type of research aims at describing, rather than manipulating, variables to understand the individual variables rather than cause-effect relationships between variables (Loeb et al. 2017). The measured variables (reading speed, fixation durations and numbers and comprehension) are compared between the vernacular and English for non-scientific texts and scientific texts in English for various difficulty levels. This comparison is performed descriptively to understand the general patterns in these comparisons. A descriptive research design cannot empirically claim cause-effect relationships. However, it can set the stage for further causal research (Loeb et al. 2017). Although the discussion in the next section focuses on the patterns evident in the data, cause-effect explanations are suggested in terms of literature and the theoretical framework. These explanations could be empirically tested in future research.
Study population and sampling strategy
The population for this study was South African high school learners. Purposive and convenience sampling was used to recruit participants. The learners who were approached to participate were part of a school-university partnership project, and thus the second author had access to these learners. These learners were purposefully chosen for inclusion in this research because of their voluntary production of an empirical research project for a science fair competition. This is an extraordinary achievement, particularly in contexts of poverty where extracurricular activity is rare, and learners receive little support (Mupezeni & Kriek 2018:1577). These learners were the first in their schools ever to have produced such projects and none of them received help from family members in this, which is consistent with their low socioeconomic context. Based on this, these learners are considered higher achievers within their school contexts. The second author also had access to the learners’ and their peers’ Natural or Physical Sciences marks, from which it was evident that the sample represented a wide range of achievement, but all achieved marks higher than their class averages. These class averages were, however, very low, often below the failure mark, which is in accordance with the general low competence level of learners in these schools (Van der Berg et al. 2016). The reason why these relatively high-achieving learners were sampled was to increase the likelihood that they would be relatively fluent in both English (officially their LoLT) and Sesotho (their vernacular), as well as able to read science texts with a reasonable level of comprehension. As reading skills are low in school-going learners, the study included English narrative and Sesotho narrative texts because these are, respectively, the tuition language and the vernacular of these learners. This allows the study to determine whether barking is present in narrative texts, as found previously in science texts (Stott & Beelders 2019:72–80). To ensure proper comparison between learners, a science text was also included. The fact that the sample is not representative of the population of learners in the schools of poverty they attended may limit claims made in this study.
The sample comprised 27 Grade 8–11 learners from five schools in a township (semi-rural but densely populated area of low socio-economic status) near Bloemfontein, in the Free State province of South Africa. The reason for the relatively small sample size is twofold: firstly, participants had to conform to the characteristics described here, which limits the number of available learners, and secondly, the eye-tracking data must be collected on an individual basis, which makes it a very time-consuming exercise and thus prohibits a very large sample, particularly as the study required that participants read multiple texts, which is itself time consuming.
Data collection
Comprehension texts and questions were accessed and slightly modified for Grades 5–6, 8–9 and 10–11 in each language (English and Sesotho) and for science in Grades 8–9 and 10–11. These small grade ranges, rather than exact grades, were used to guide instrument selection, given variations in difficulty levels within grade-related tests. The title and source of each text and its associated questions, as well as the texts’ word counts and average word lengths, are given in Table 1. The subsequent section provides an argument for the validity of each text as pertaining to the degree to which they reflect the actual South African context and concludes that the texts, together with their associated questions, are a reasonable representation and can thus be included in the study. Furthermore, care is taken in interpreting the findings to avoid overreaching the claims that can be made within the study’s limitations.
TABLE 1: Details about the eight texts used for data collection.
For each of the eight texts, four multiple-choice questions were set for this research, in addition to the questions obtained from the sources listed in Table 1. The second author and a colleague fluent in Sesotho and English set these additional multiple-choice questions to evaluate global text comprehension immediately after the learners had read each text, without allowing them to refer back to it. The learners answered these questions on the eye-tracking-equipped computer. In contrast, the learners answered the original questions sourced with each text in writing after they had finished the associated eye-tracked session. They were given printed copies of the associated texts for reference for these questions. This combination of questioning forms is assumed to have enhanced the validity of the testing process.
The Grade 5–6 English narrative text, questions and marking guidelines were obtained from the PIRLS released texts and items 2011 user guide (permission number IEA-17-208). In contrast, the other narrative texts and questions were taken from South African tests, examinations or prescribed textbooks, as indicated in Table 1. This was why the Grade 5–6 English text was longer than the other texts, breaking the general trend of increasing text length along the grades. Although this text was intended for use by Grade 4 learners, South Africa chose to enter Grade 5 learners into the PIRLS study to better match the English second language standard in South Africa (Howie et al. 2012). The science texts, questions and marking guidelines were created specifically for this research by a teacher with over 10 years of experience teaching, writing text and setting tests for Grades 8 to 12 Natural and Physical Sciences in South Africa. These measures are assumed to have ensured that each of these tests was appropriate for the grade range for which it was intended and therefore to allow for valid measurement of each participant’s narrative comprehension at their grade level for each language, English science comprehension at their grade level, as well as narrative comprehension, per language, below and/or above their grade level.
Each participant read two English (L2) narrative texts, two Sesotho (L1) narrative texts and an English science text. The English and Sesotho narrative texts were presented at different difficulty levels to determine whether barking is present at all. Table 2 summarises which of these texts were presented to the participants.
TABLE 2: A summary of the texts that each grade of learner in the sample read.
For brevity, the remainder of the article will refer to the texts as Sesotho (low), Sesotho (high), English (low), English (high) and science. Low and high refer to the text at the lower- and higher-grade levels. Hence, all participants read a low and high text in English and Sesotho and then a science text appropriate to their grade range.
An experienced language teacher who is fluent in both English and Sesotho evaluated the cognitive demand of each question across all the tests, categorising each at one of three levels: low (information extraction), medium (comprehension) and high (evaluation and justification). Informed by this, she made minor adjustments to question inclusion to obtain a similar spread of cognitive demand tested across the narrative tests. The second author independently applied the same classification to the English narrative tests, with an inter-rater agreement of over 80%. The two evaluators discussed and adjusted those evaluations in which they differed until they reached a consensus. In addition to these questions, these two evaluators set four multiple-choice questions per text. These aimed to evaluate participants’ global understanding of the text without requiring them to revisit the text.
Each participant read the five texts relevant to them across a week, being removed, for these sessions, from other activities they were engaged in as part of a week-long intervention during their winter holidays. By spreading data collection per participant out in this way, reading and question-answering fatigue was reduced, further enhancing the validity of comparisons made between their eye movements and reading and their comprehension across the texts.
The Tobii Pro Spectrum eye tracker was used to collect gaze data, with a sampling rate of 600 Hz and a screen resolution of 1920 × 1080. For each sitting, the participant’s eye movements were first calibrated using a nine-point calibration (participants were recalibrated until acceptable precision and accuracy were achieved), after which they read the assigned text and answered its associated multiple-choice questions on this machine. The entire process was first explained to the participants before data capturing commenced. Participants read silently and at their own speed, turning the page by pressing a key on the keyboard or clicking the mouse button. The Spectrum is a remote eye tracker and is non-invasive and not harmful to participants. The stimuli, in this case the texts, are presented to the participant on a normal computer screen. The Spectrum is also a high-speed eye tracker, meaning eye positions are sampled frequently. This is imperative for reading studies as the eye moves very quickly when reading. The Spectrum is accepted in the field as an eye tracker capable of capturing data at high speeds and is therefore acceptable for use in reading studies.
While participants read the texts, the facilitator was seated next to them to monitor that the participant remained within the head box – and that the tracker captured the eyes and movement. After completing this process, the participant was given a printed version of the texts and questions (excluding the multiple-choice questions they had already answered on the computer) and given as much time as they required to answer these questions in writing. The use of questions for which the participants were not allowed to refer to the text (i.e. the multiple-choice, computer-based questions), as well as those where they could do this (i.e. the longer written questions), is assumed to have improved the range of comprehension measurement and therefore the validity of the resulting comprehension score. This score was obtained by applying the marking guidelines, with moderation of this marking across the two markers.
Data analysis
The following metrics, typically associated with reading behaviour, were compared and analysed to answer the research question comprehensively:
- Reading speed in words per minute (wpm)
- Fixation length, measured in milliseconds (ms)
- Mean number of fixations per word (as the lengths of the texts differed, the mean number of fixations per word was calculated for each participant and each text)
- Comprehension as determined by the number of correctly answered questions on the comprehension test.
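As an illustration of how the four metrics listed above relate to the raw measurements, the following minimal sketch computes them for one participant and one text. The function name, argument names and example values are hypothetical and are not taken from the study's data or analysis scripts.

```python
from statistics import mean


def reading_metrics(word_count, reading_time_s, fixation_durations_ms,
                    correct_answers, total_questions):
    """Compute per-participant, per-text reading metrics.

    All argument names are illustrative assumptions, not the study's own.
    """
    return {
        # Words read per minute of reading time.
        "wpm": word_count / (reading_time_s / 60.0),
        # Mean fixation duration in milliseconds.
        "mean_fixation_ms": mean(fixation_durations_ms),
        # Normalising by word count makes texts of different lengths comparable.
        "fixations_per_word": len(fixation_durations_ms) / word_count,
        # Share of comprehension questions answered correctly.
        "comprehension_pct": 100.0 * correct_answers / total_questions,
    }


# Invented example: a 300-word text read in 4 minutes with 360 fixations
# of 250 ms each, and 6 of 8 questions answered correctly.
example = reading_metrics(300, 240.0, [250.0] * 360, 6, 8)
```

In this invented example the reading speed works out to 75 wpm with 1.2 fixations per word.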
Fixations were extracted using Tobii Pro Lab. Various eye movement classification algorithms can identify and classify a range of eye movements, from fixations to smooth pursuit (Komogortsev et al. 2010:2635–2645). As the study concerns fixations, a suitable algorithm would be a fixation classification algorithm. These algorithms use spatial or temporal eye movement characteristics (Salvucci & Goldberg 2000:71–78). Velocity-Threshold Identification (I-VT) is a velocity-based spatial algorithm that relies on the fact that saccades have high velocity and fixations have low velocity (Salvucci & Goldberg 2000:71–78), and it is a popular classification method because of its simplicity (Munn, Stefano & Pelz 2008:33–42). Threshold settings affect the classification of gaze movement (Birawo & Kasprowski 2022:8810). Hence, the algorithm and the settings must be reported for comparison and replication of results.
The Tobii I-VT fixation filter is an implementation of the standard I-VT fixation algorithm and was used to detect fixations, with a minimum duration of 60 ms for a gaze event to be classified as a fixation and a velocity threshold of 30°/s. Outliers (fixations much longer than the average for the sample) were removed from the fixation data. Eye-tracking software is often used for research studies to extract fixations and other eye movements.
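The basic I-VT idea with the reported settings (30°/s velocity threshold, 60 ms minimum fixation duration) can be sketched as follows. This is a simplified illustration and not the Tobii implementation, which may apply further processing steps that are not reproduced here. The sketch assumes that gaze positions have already been converted to degrees of visual angle; the function and variable names are hypothetical, and the sample data are synthetic.

```python
import math


def ivt_fixations(t_s, x_deg, y_deg, velocity_threshold=30.0, min_duration_ms=60.0):
    """Return (start_s, end_s, duration_ms) tuples for detected fixations.

    Assumes timestamps in seconds and gaze positions in degrees of visual angle.
    """
    n = len(t_s)
    # Point-to-point angular velocity; a sample is "slow" below the threshold.
    is_slow = []
    for i in range(1, n):
        dt = t_s[i] - t_s[i - 1]
        dist = math.hypot(x_deg[i] - x_deg[i - 1], y_deg[i] - y_deg[i - 1])
        is_slow.append(dist / dt < velocity_threshold)
    # The first sample has no predecessor, so it inherits the next label.
    is_slow.insert(0, is_slow[0] if is_slow else False)

    fixations = []
    start = None
    for i, slow in enumerate(is_slow):
        if slow and start is None:
            start = i
        if start is not None and (not slow or i == n - 1):
            end = i if slow else i - 1
            duration_ms = (t_s[end] - t_s[start]) * 1000.0
            # Discard candidate fixations shorter than the minimum duration.
            if duration_ms >= min_duration_ms:
                fixations.append((t_s[start], t_s[end], duration_ms))
            start = None
    return fixations


# Synthetic 600 Hz recording: 60 still samples, a 5-sample saccade of
# 0.5 degrees per sample (300 deg/s), then 20 still samples.
xs = [0.0] * 60 + [0.5, 1.0, 1.5, 2.0, 2.5] + [2.5] * 20
ys = [0.0] * len(xs)
ts = [i / 600.0 for i in range(len(xs))]
detected = ivt_fixations(ts, xs, ys)
```

In this synthetic recording only the first still period is long enough to count as a fixation; the second lasts roughly 32 ms and is discarded by the minimum-duration rule.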
Because measurements were repeated as the same participant read multiple texts, the nonparametric Friedman analysis of variance (ANOVA) was used for analysis. Where a significant difference was found, a series of Wilcoxon signed-rank tests, with a Bonferroni adjustment, was conducted to determine which factors accounted for the significant difference.
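To make the repeated-measures logic concrete, the sketch below computes the Friedman test statistic from within-participant ranks and the Bonferroni-adjusted significance level for pairwise follow-up tests. It is an illustration only: it omits the correction for ties and the p-value, which would normally be obtained from a statistics package, and it assumes a nominal alpha of 0.05 and all pairwise comparisons between the five texts, neither of which is stated in the text. The data are invented.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """rows[p][c] is the measurement of participant p under condition c.

    Returns the Friedman chi-square statistic without tie correction.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for c, rank in enumerate(average_ranks(row)):
            rank_sums[c] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def bonferroni_alpha(conditions, alpha=0.05):
    """Return all condition pairs and the per-comparison alpha (assumed 0.05)."""
    pairs = list(combinations(conditions, 2))
    return pairs, alpha / len(pairs)


# Invented reading speeds (wpm) for 4 participants under 3 conditions.
example_rows = [[100, 90, 60], [110, 95, 70], [105, 100, 65], [98, 99, 55]]
q_stat = friedman_statistic(example_rows)

texts = ["Sesotho (low)", "Sesotho (high)", "English (low)", "English (high)", "science"]
pairs, adjusted_alpha = bonferroni_alpha(texts)
```

With five texts there are ten possible pairwise comparisons, so under these assumptions each pairwise test would be evaluated at 0.05 / 10 = 0.005.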
Ethical considerations
Each participant and their guardian provided written informed consent for the anonymous inclusion of their data in this research. Additionally, ethical clearance was received from the University of the Free State via the Education Ethics Committee, which evaluated the research proposal for compliance with ethical standards (ethical clearance no: UFS-HSD2016/1391).
Results and discussion
The findings are presented and discussed next in answer to the research question.
Reading speeds are very low, with science reading being significantly lower
The mean reading speed for each group is given in Table 3.
TABLE 3: Mean reading speed data for the various treatments.
Many studies indicate an average reading speed of 300 wpm for adults, but the figure is more likely closer to 238 wpm for non-fiction and 260 wpm for fiction (Brysbaert 2019). These speeds apply only to adult native English speakers. Children read more slowly, with Grade 8–11 high school learners expected to read between 200 wpm and 240 wpm (Brysbaert 2019). From an African language perspective, benchmarks published by the Department of Education suggest that Grade 1 learners should be capable of speeds of 40 wpm, and at the end of Grade 3, learners should be reading Setswana-Sesotho at 60 wpm (Education 2022). Thus, inspection of Table 3 highlights that, compared to these prior studies and benchmarks, the reading speeds were very low, particularly for the science text. Even the highest mean reading speeds, for English (low) and Sesotho (high), are only half of the speeds expected from prior studies. This illustrates that the reading skills of these learners are very low and clearly not at benchmark or expected levels. The lowest reading speed, an average of one word per second (60 wpm), was recorded for the science text. It is to be expected that more technical language would slow readers down. However, learners are expected to have reached 60 wpm by the end of Grade 1 (Abadzi 2006). Within the constraints of working memory, this is the minimum speed required to recall a sentence of typical Grade 1 text to enable comprehension at the sentence level (Abadzi 2006). This speed can, therefore, certainly not be quick enough for making sense of more complex sentences, such as those in high school science texts. This is because working memory has a limited capacity and a short retention period (Sweller 2011:37–76). Slow reading speeds indicate that decoding is not automated and therefore imposes cognitive load, reducing the cognitive resources available for comprehension (Pretorius & Spaull 2016:1–20).
Furthermore, slow reading speeds may mean that by the time the end of the sentence is reached, the information from its start has already been lost from working memory. This makes it impossible to simultaneously represent the entire idea expressed in the sentence in working memory, as is needed for comprehension (Abadzi 2006). With the overall low speeds (Table 3) captured for these texts, it is highly likely that these learners are not capable of automatic decoding or comprehending the text they are reading.
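The arithmetic behind this argument can be made concrete with a small sketch. The 20-word sentence length is a hypothetical value chosen purely for illustration; the two reading speeds are the science-text speed and the adult non-fiction benchmark quoted above.

```python
def seconds_to_read(n_words, words_per_minute):
    """Time in seconds needed to read n_words at a constant reading speed."""
    return n_words * 60.0 / words_per_minute

sentence_length = 20  # hypothetical sentence length in words
for label, wpm in [("science text (this study)", 60),
                   ("adult non-fiction benchmark", 238)]:
    print(f"{label}: {seconds_to_read(sentence_length, wpm):.1f} s")
```

At 60 wpm such a sentence takes 20 seconds to read, against roughly 5 seconds at 238 wpm, so the start of the sentence must be retained about four times longer before the end is reached.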
Interestingly, the mean reading speed for Sesotho narrative (high) was slightly greater than for Sesotho narrative (low), and the mean for English narrative (low) was higher than for either of the Sesotho narratives (low and high). However, these differences were not statistically significant, with only the science reading speed showing a significant difference, as explained next.
A Friedman ANOVA was conducted to determine whether there was a significant difference in reading speeds. At an α-level of 0.05, there was a significant difference between the reading speeds (χ²(4) = 25.8, p < 0.05). Post hoc tests indicated that reading speeds in English (high and low) were significantly faster than for science. At the unadjusted significance level of 0.05, rather than the Bonferroni-adjusted level of 0.005, both Sesotho reading speeds were significantly faster than the science reading speeds. This shows that the learners read slowly for all the genres, languages and difficulty levels measured, and significantly more slowly when reading science texts in English than when reading the other texts measured.
Fixations are long, with L1 fixations being significantly longer
Figure 1 plots the mean fixation lengths per page and the mean number of fixations per word for each page. As the number of fixations and the duration are both of interest in reading behaviour and can indicate reading difficulties, it is important to report both and offset them against one another. For example, it might be found that there are many short fixations or few long fixations. The two measures are plotted on a single graph for brevity, but each language is plotted on its own graph. Throughout, the fixation durations are plotted as a line graph that corresponds to the values on the left axis. The number of fixations per word is shown as a bar chart corresponding to the right axis. For these graphs, the texts are shown individually and not grouped as low or high for each participant.
FIGURE 1: Fixation durations for separate pages of each English text. Fixations are clearly longer than typical reading fixations, and the mean number of fixations decreases as the reader progresses.
Figure 1 shows that for the non-scientific English texts, the mean fixation durations (300 ms – 375 ms) on all pages were higher than the typical range published for English L1 readers (200 ms – 250 ms). These long fixation durations for the learners’ L2 are consistent with Dednam et al.’s (2014) finding of longer fixations for L2 reading. However, as shown in Figure 2, for the Sesotho texts (their L1), these learners also showed longer fixations (from 260 ms) than given in the literature for average L1 reading, with some texts (up to 560 ms) eliciting even longer mean fixations than the English texts. The elevated fixation durations could indicate that the participants experienced difficulty in reading, comprehending and making sense of the text, as reflected in increased cognitive load. Coupled with the low reading speeds, this observation indicates that automated decoding is not present, that sense making requires more cognitive load, and that comprehension is more difficult for these participants than for a typical reader of their age or education.
FIGURE 2: Fixation durations for separate pages of each Sesotho text, showing longer than typical fixation durations.
The mean fixation durations of the science reading (Figure 3) are also high but comparable to both Sesotho (L1) and English (L2) reading. Again, this could be indicative of the low reading and decoding skills of the participants, but as it is in line with L1 and L2 reading, the nature of the text did not unduly influence the length of the fixations and these participants can potentially be considered low-skilled readers.
FIGURE 3: Fixation durations for separate pages of each science text, again showing longer than typical fixation durations.
At an α-level of 0.05, there is a significant difference between the fixation lengths while reading (χ²(4) = 24.69, P = 0.00006). Post hoc tests indicated that fixation lengths were significantly longer when reading Sesotho (L1) (low) than English (L2) (low) and science texts. At an α-level of 0.05, they were also significantly longer than Sesotho (L1) (high) and English (L2) (high). This shows that the learners fixated significantly longer on words in the Sesotho than in the English or science texts.
Longer fixations indicate greater cognitive load (Raney, Campbell & Bovee 2014). From this, we deduce that, on average, the learners experienced greater cognitive load when reading Sesotho narrative, their L1, than English narrative, their L2. What is less well understood, however, is how to distinguish, from eye-tracking data alone, between generative cognitive processing (sense making), which is necessary for comprehension, and extraneous cognitive processing, that is, experiencing difficulty in reading the text (Rosch & Vogel-Walcutt 2013:313–327). Therefore, it is impossible from the given data alone to determine whether this greater cognitive load when reading Sesotho resulted from less familiarity with Sesotho text, and therefore more extraneous cognitive load, or from deeper sense making, and therefore more germane cognitive load. The former explanation could result from English being their LoLT and, therefore, their primary reading language. Either way, the learners appear to be below-average readers in both languages, as they require considerably longer fixations than reported in the literature. This suggests that they must decode and comprehend at the word level. The long fixations and slow reading speeds indicate that reading is not automated, on average, for any of the languages, genres and difficulty levels measured. This is consistent with Spaull and Pretorius’s (2019:147–168) findings that the greater phonetic transparency of African languages’ spelling relative to English has not been translated into heightened decoding mastery of such vernacular languages in South Africa.
Fixations per word are typical and not significantly different
Figure 1 shows that at the start of the English narrative texts, the number of fixations per word (1.1–1.7) is higher than the literature-based average (1.2), but this decreases steadily until, on later pages, the number of fixations per word is, on average, even lower than is typical for L1 English reading. This may have been a sign of fatigue or that the learners were becoming more comfortable with the text. Figure 2 shows the ranges of the average number of fixations per word for the Sesotho texts, as does Figure 3 for the science texts, and these are found to correspond closely with typical L1 reading. One noticeable idiosyncrasy, however, was a spike in the fixation length on Page 2 for the English science texts, while for Sesotho and English narratives, there was a decrease on Page 2. This could be because of the nature of the content.
The Friedman ANOVA shows no significant difference between the mean number of fixations per word on the different reading texts (χ²(4) = 1.93, P = 0.75). This indicates that although fixation durations were, in some instances, significantly longer, the participants did not, on average, need to fixate more often on words. The average word length was between 4 and 5 characters for all texts. This finding of no significant difference in the number of fixations per word between languages contradicts Dednam et al. (2014), who found more fixations for L2 readers.
Therefore, from these findings, it can be deduced that learners fixated relatively few times per word and exhibited a similar visual span to that reported in previous studies (Rayner 1998). However, this could also be a rudimentary indication of the number of regressions, which we would expect to be higher if readers struggle with decoding. This could show that these learners were weak readers who made minimal attempts at decoding or processing, simply reading word by word but not processing the meaning of the text being read. This is consistent with Spaull and Pretorius’s (2019:147–168) findings that reading decoding is the limiting factor in South African reading in all languages. It is also consistent with our prior findings that another sample of learners of this same demographic showed few regressions as they barked at (i.e. decoded without comprehension) science text written in English (Stott & Beelders 2019:72–80).
Comprehension was low for all but significantly higher for L1
The comprehension score for each treatment is shown in Table 4. This shows comprehension scores below 75% for all treatments, with average scores below 50% for the English and science texts but better average scores for the Sesotho texts.
TABLE 4: Mean comprehension data per treatment.
The Friedman ANOVA showed a significant difference between the comprehension scores (χ²(4) = 26.85, P = 0.00002). Post hoc tests showed that the comprehension for Sesotho narrative (low) is significantly higher than for English narrative (high) and English science. At an α-level of 0.05, Sesotho (high) has a significantly higher comprehension than English science and English narrative (high). In addition, Sesotho narrative (low) has a significantly higher comprehension than English narrative (low). Therefore, the learners’ comprehension was significantly better for Sesotho than for English narratives or English science. However, it was low to very low for all texts.
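As a consistency check on the Friedman statistics reported in this section, the upper-tail probability of a chi-square variable with 4 degrees of freedom has the closed form exp(−x/2)(1 + x/2). The sketch below applies it to the four reported statistics. The function name and dictionary labels are illustrative assumptions, and the check concerns only the conversion from statistic to P value, not the underlying data.

```python
import math

def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

# Friedman statistics as reported in the text, each with df = 4.
reported = {
    "reading speed": 25.8,
    "fixation duration": 24.69,
    "fixations per word": 1.93,
    "comprehension": 26.85,
}
for label, stat in reported.items():
    print(f"{label}: chi2(4) = {stat}, P = {chi2_sf_df4(stat):.5f}")
```

To the precision given, this reproduces the reported P values of 0.00006 for fixation duration, 0.75 for fixations per word and 0.00002 for comprehension, and it gives a value well below 0.05 for reading speed.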
This finding helps to resolve the earlier ambiguity in interpreting the fixation lengths. Although fixations were longer for Sesotho (low), comprehension was significantly better. This suggests that the shorter fixations measured for English narrative and English science may have resulted from shallower parsing (limited decoding and understanding) of these texts, which the learners found more difficult and for which they made less effort to read for meaning. In contrast, the longer fixations for Sesotho narrative may indicate a greater effort at comprehension (Pulido 2021:1–18). This suggests that reading the vernacular imposes less extraneous cognitive load, freeing up more capacity for sense making. However, even when reading the vernacular, comprehension was poor. This suggests that even reading the vernacular imposed considerable extraneous cognitive load, consistent with the explanation that decoding was not yet automated (Pretorius & Spaull 2016:1449–1471).
Limitations
The small size of the sample (27) is limiting. The time-consuming nature of a study such as this one, where specialist equipment is used and data are collected on an individual basis, necessitates relatively small sample sizes, which prohibits generalisation and reduces the power of the statistics. However, findings from such studies provide in-depth data that larger-scale studies cannot (Loeb et al. 2017). These higher-achieving learners are also not representative of South African township high school learners, as has already been discussed. Therefore, the reading metrics of more typical learners can be expected to be considerably worse than those found here. Nevertheless, as no equivalent eye-tracking study could be found within this context and most reading research and models concentrate on English L1 adult readers, this study provides valuable information regarding the components of reading difficulties for South African township learners. The findings are, however, not generalisable to the wider Sesotho reading population, and therefore, the study will not propose a Sesotho reading model. To develop a Sesotho reading model, a larger sample and more developed readers will have to be tested. Besides these limitations to the generalisability of the findings, the discussion has revealed limitations within the existing body of literature regarding how reading metrics, particularly fixation duration, should be interpreted. Finally, despite the measures taken to ensure comparability of the Sesotho and English comprehension tests, the subjective nature of these measures means that comparability cannot be claimed absolutely, possibly threatening the validity of the claim of higher comprehension for the Sesotho texts. To mitigate this limitation, the instruments are all available on request.
Conclusion
This study used eye-tracking technology to contribute to understanding the components of reading comprehension, particularly of science texts in English, in bilingual contexts where English is a second language, and how these components interact. The findings suggest that these higher-achieving learners had not attained reading automation even for reading narrative texts in the vernacular. The reading metrics suggested even less availability of cognitive resources for sense-making processes in English than in vernacular narrative and the least in English science texts. This study is consistent with current South African literature on reading, which shows that the most basic aspects of decoding text in any language limit reading comprehension in all languages and genres, including science texts. It contributes eye metric data, previously absent, to the body of literature that reaches this conclusion. This body of literature suggests that it is inappropriate and unhelpful to blame poor science performance only on learners studying science in a second language. Effectively developing learners’ basic decoding skills in both their first and second languages is the currently needed first step in remedying poor science performance.
Acknowledgements
Competing interests
The authors declare that they have no financial or personal relationships that may have inappropriately influenced them in writing this article.
Authors’ contributions
T.R.B. and A.E.S. contributed to conceptualisation, methodology, writing of original draft, writing – review and editing and project administration. T.R.B. also contributed to formal analysis and A.E.S. contributed to investigation.
Funding information
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Data availability
The data that support the findings of this study are available from the corresponding author, T.R.B., upon reasonable request.
Disclaimer
The views and opinions expressed in this article are those of the authors and are the product of professional research. They do not necessarily reflect the official policy or position of any affiliated institution, funder, agency or that of the publisher. The authors are responsible for this article’s results, findings and content.
References
Abadzi, H., 2006, Efficient learning for the poor: Insights from the frontier of cognitive neuroscience. World Bank Publications, viewed 21 May 2025, from https://openknowledge.worldbank.org/handle/10986/7023.
Birawo, B. & Kasprowski, P., 2022, ‘Review and evaluation of eye movement event detection algorithms’, Sensors 22(22), 8810. https://doi.org/10.3390/s22228810
Böhmer, B. & Wills, G., 2023, COVID-19 and inequality in reading outcomes in South Africa: PIRLS 2016 and 2021 COVID-Generation working paper 1, Stellenbosch.
Brysbaert, M., 2019, ‘How many words do we read per minute? A review and meta-analysis of reading rate’, Journal of Memory and Language 109, 104047. https://doi.org/10.1016/j.jml.2019.104047
Cain, K., 2010, Reading development and difficulties, vol. 8, John Wiley & Sons, New Jersey, USA.
Chikoko, V. & Mthembu, P., 2021, ‘The impact of poverty on basic education in South Africa: A systematic review of literature’, in F. Maringe (ed.), Systematic reviews of research in basic education in South Africa, pp. 69–90, African Sun Media, Stellenbosch.
Conklin, K., Alotaibi, S., Pellicer-Sánchez, A. & Vilkaitė-Lozdienė, L., 2020, ‘What eye-tracking tells us about reading-only and reading-while-listening in a first and second language’, Second Language Research 36(3), 257–276. https://doi.org/10.1177/0267658320921496
De-la-Peña, C., 2024, ‘Eye-tracking contribution on processing of (implicit) reading comprehension’, Journal of New Approaches in Educational Research 13, 13. https://doi.org/10.1007/s44322-024-00013-w
Dednam, E., Brown, R., Wium, D. & Blignaut, P., 2014, ‘The effects of mother tongue and text difficulty on gaze behaviour while reading Afrikaans text’, in ACM international conference proceeding series, 28 September, pp. 334–342, Centurion.
Demareva, V. & Edeleva, Y., 2020, ‘Eye-tracking based L2 detection: Universal and specific eye movement patterns in L1 and L2 reading’, Procedia Computer Science 169, 673–676. https://doi.org/10.1016/j.procs.2020.02.185
Dirix, N., Vander Beken, H., De Bruyne, E., Brysbaert, M. & Duyck, W., 2020, ‘Reading text when studying in a second language: An eye-tracking study’, Reading Research Quarterly 55(3), 371–397. https://doi.org/10.1002/rrq.277
Education, 2022, Benchmarks report – Sesotho-Setswana Early Grade Reading, viewed from https://www.education.gov.za/Portals/0/Documents/Reports/ReadingBenchmarks22/7.%20Sesotho-Setswana%20Language%20Group%20Benchmarks%20Report.pdf.
Fleisch, B., 2023, ‘Theory of change and theory of education: Pedagogic and curriculum defects in early grade reading interventions in South Africa’, Education as Change 27, 14. https://doi.org/10.25159/1947-9417/13316
Gough, P.B. & Tunmer, W.E., 1986, ‘Decoding, reading, and reading disability’, Remedial and Special Education 7(1), 6–10.
Graham, M.A. & Mtsweni, M.U., 2024, ‘Parental involvement predicts Grade 4 learners’ reading literacy: An analysis of PIRLS data for students in Mpumalanga, South Africa’, Educational Review 1–23. https://doi.org/10.1080/00131911.2024.2379416
Howie, S.J., Van Staden, S., Tshele, M., Dowse, C. & Zimmerman, L., 2012, PIRLS 2011: South African children’s reading literacy achievement report, Centre for Evaluation and Assessment (CEA), Pretoria.
Kim, Y.S.G. & Piper, B., 2019, ‘Cross-language transfer of reading skills: An empirical investigation of bidirectionality and the influence of instructional environments’, Reading and Writing 32(4), 839–871. https://doi.org/10.1007/s11145-018-9889-7
Kim, Y.S.G., Stern, J., Mohohlwane, N. & Taylor, S., 2024, ‘Instruction influences cross-language transfer of reading skills: Evidence from a longitudinal randomized controlled trial’, Reading and Writing 38, 171–194. https://doi.org/10.1007/s11145-023-10508-1
Kirschner, P.A., Sweller, J., Kirschner, F. & Zambrano, J., 2018, ‘From cognitive load theory to collaborative cognitive load theory’, International Journal of Computer-Supported Collaborative Learning 13(2), 213–233. https://doi.org/10.1007/s11412-018-9277-y
Komogortsev, O.V., Gobert, D.V., Jayarathna, S., Koh, D.H. & Gowda, S.M., 2010, ‘Standardization of automated analyses of oculomotor fixation and saccadic behaviors’, IEEE Transactions on Biomedical Engineering 57(11), 2635–2645. https://doi.org/10.1109/TBME.2010.2057429
Lesiak, J. & Bradley-Johnson, S., 1983, Reading assessment for placement and programming, Charles C Thomas, Springfield.
Loeb, S., Dynarski, S., Mcfarland, D., Morris, P., Reardon, S. & Reber, S., 2017, Descriptive analysis in education: A guide for researchers, The National Center for Education Evaluation and Regional Assistance (NCEE), viewed from http://ies.ed.gov/ncee/pubs/20174023/.
Mézière, D.C., Yu, L., Reichle, E.D., Von der Malsburg, T. & McArthur, G., 2023, ‘Using eye-tracking measures to predict reading comprehension’, Reading Research Quarterly 58, 425–449. https://doi.org/10.1002/rrq.498
Mohohlwane, N., Taylor, S., Cilliers, J. & Fleisch, B., 2023, ‘Reading skills transfer best from home language to a second language: Policy lessons from two field experiments in South Africa’, Journal of Research on Educational Effectiveness 17(4), 687–710. https://doi.org/10.1080/19345747.2023.2279123
Mohohlwane, N.L., 2019, ‘How language policy and practice sustains inequality in education’, in South African schooling: The enigma of inequality: A study of the present situation and future possibilities, pp. 127–146, Springer.
Mthimkulu, S., Roux, K. & Mihai, M., 2024, ‘Investigating measurement invariance in PIRLS 2021 across English and isiZulu language groups’, Reading & Writing 1(15), 1–10. https://doi.org/10.4102/rw.v15i1.455
Mullis, I., Von Davier, M., Foy, P., Fishbein, B., Reynolds, K. & Wry, E., 2023, PIRLS 2021 international results in reading, viewed 21 May 2025, from https://pirls2021.org/results/.
Munn, S.M., Stefano, L. & Pelz, J.B., 2008, ‘Fixation-identification in dynamic scenes: Comparing an automated algorithm to manual coding’, in Applied perception in graphics and visualization, pp. 33–42, Los Angeles.
Mupezeni, S. & Kriek, J., 2018, ‘Out-of-school activity: A comparison of the experiences of rural and urban participants in science fairs in the Limpopo Province, South Africa’, EURASIA Journal of Mathematics, Science and Technology Education 14(8), em1577.
Nahatame, S., 2023, ‘Predicting processing effort during L1 and L2 reading: The relationship between text linguistic features and eye movements’, Bilingualism: Language and Cognition 26(4), 724–737. https://doi.org/10.1017/S136672892200089X
Pellicer-Sánchez, A., 2020, ‘Expanding English Vocabulary Knowledge through reading: Insights from eye-tracking studies’, RELC Journal 51(1), 134–146. https://doi.org/10.1177/0033688220906904
Pertiwi, L. & Surjarwati, I., 2023, ‘The correlation between reading speed and reading comprehension’, Cendikia: Media Jurnal Ilmiah Pendidikan 13(3), 534–540, viewed 06 February 2025, from https://www.iocscience.org/ejournal/index.php/Cendikia/article/download/3396/2598.
Pretorius, E.J. & Klapwijk, N.M., 2016, ‘Reading comprehension in South African schools: Are teachers getting it right?’, Per Linguam 32(1), 1–20. https://doi.org/10.5785/32-1-627
Pretorius, E.J. & Spaull, N., 2016, ‘Exploring relationships between oral reading fluency and reading comprehension amongst English second language readers in South Africa’, Reading and Writing 29(7), 1449–1471. https://doi.org/10.1007/s11145-016-9645-9
Pulido, M.F., 2021, ‘Individual chunking ability predicts efficient or shallow L2 processing: Eye-tracking evidence from multiword units in relative clauses’, Frontiers in Psychology 11(January), 1–18. https://doi.org/10.3389/fpsyg.2020.607621
Raney, G.E., Campbell, S.J. & Bovee, J.C., 2014, ‘Using eye movements to evaluate the cognitive processes involved in text comprehension’, Journal of Visualized Experiments 83, e50780. https://doi.org/10.3791/50780
Rayner, K., 1998, ‘Eye movements in reading and information processing: 20 years of research’, Psychological Bulletin 124(3), 372.
Rayner, K., Chace, K.H., Slattery, T.J. & Ashby, J., 2006, ‘Eye movements as reflections of comprehension processes in reading’, Scientific Studies of Reading 10(3), 241–255.
Rosch, J.L. & Vogel-Walcutt, J.J., 2013, ‘A review of eye-tracking applications as tools for training’, Cognition, Technology and Work 15, 313–327. https://doi.org/10.1007/s10111-012-0234-7
Salvucci, D.D. & Goldberg, J.H., 2000, ‘Identifying fixations and saccades in eye-tracking protocols’, in Proceedings of the 2000 symposium on Eye tracking research & applications (ETRA ’00), pp. 71–78, New York, NY.
Samuels, S.J. & Farstrup, A.E., 2011, What research has to say about reading instruction, International Reading Assoc, Newark, DE.
Spaull, N. & Pretorius, E., 2019, ‘Still falling at the first hurdle: Examining early grade reading in South Africa’, in N. Spaull & J. Jansen (eds.), South African schooling: The enigma of inequality, pp. 147–168, Springer.
Spaull, N., Pretorius, E. & Nompumelelo, M., 2020, ‘Investigating the comprehension iceberg: Developing empirical benchmarks for early-grade reading in agglutinating African languages’, South African Journal of Childhood Education 10(1), 1–14. https://doi.org/10.4102/sajce.v10i1.773
Stott, A.E. & Beelders, T., 2019, ‘The influence of science reading comprehension on South African township learners’ learning of science’, South African Journal of Science 115(1–2), 72–80. https://doi.org/10.17159/sajs.2019/5146
Strandberg, A., Nilsson, M., Östberg, P. & Seimyr, G., 2022, ‘Eye movements during reading and their relationship to reading assessment outcomes in Swedish Elementary School Children’, Journal of Eye Movement Research 15(4), 10. https://doi.org/10.16910/JEMR.15.4.3
Strandberg, A., Nilsson, M., Östberg, P. & Seimyr, G., 2023, ‘Eye movements are stable predictors of word reading ability in young readers’, Frontiers in Education 8, 1077882.
Sweller, J., 2011, ‘Cognitive load theory’, Psychology of Learning and Motivation 55, 37–76. https://doi.org/10.1016/B978-0-12-387691-1.00002-8
Van der Berg, S., Spaull, N., Wills, G., Gustafsson, M. & Kotzé, J., 2016, ‘Identifying binding constraints in education’, in Research on socio-economic policy, Department of Economics, University of Stellenbosch. https://dx.doi.org/10.2139/ssrn.2906945
Wang, Z., Sabatini, J., O’Reilly, T. & Weeks, J., 2019, ‘Decoding and reading comprehension: A test of the decoding threshold hypothesis’, Journal of Educational Psychology 111(3), 387–401. https://doi.org/10.1037/edu0000302