ICT & literacy meta-analysis

A systematic review and meta-analysis of the effectiveness of ICT on literacy learning in English, 5-16. Summary

Background

The English Review Group completed an overarching systematic review of the impact of Information and Communication Technology (ICT) on literacy learning in English in 2002 (Andrews et al., 2002). In this review, a 'map' described all the included research in the field. An in-depth sub-review reported on the impact of networked ICT on literacy learning. This present review is one of a further four in-depth sub-reviews that address aspects of the overarching question - what is the impact of Information and Communication Technology on literacy learning in English? The broad background to the descriptive map and the in-depth sub-reviews is that there is a growing concern internationally that the investment in ICT in schools is not impacting on literacy development. This concern arises from a belief held by many - including governments as well as schools - that ICT is beneficial to learning and specifically literacy learning. The question is a specific one and has to be seen within a wider political, social and technological context in which the symbiosis between new technologies and new literacies (and thus literacy learning) is acknowledged.

Policy and practice background

The use of ICT in schools to support literacy learning is pervasive. Since the mid-1980s, successive governments have invested large amounts of resources to develop ICT in schools. However, little robust evidence based on effectiveness research has been used to underpin this use of ICT.

Research background

The first review identified five systematic reviews in the field of ICT and literacy. All five reviews synthesised research in various aspects of literacy, such as spelling or writing (Bangert-Drowns, 1993; Blok et al., 2001; Fulk and Stormont-Spurgin, 1995; MacArthur et al., 2001; and Torgerson and Elbourne, 2002). In some of the reviews, the studies included focused on participants with specific learner characteristics, such as pupils experiencing learning disabilities. Most of the reviews included papers of all study types, whilst others were restricted to experimental research (randomised controlled trials and controlled trials or RCTs and CTs). Not all the systematic reviews included detailed assessment of the quality of the included studies. From these reviews, the evidence to date on the effectiveness of ICT on literacy learning is equivocal.

In summary, whilst there have been some systematic reviews relevant to the policy question of whether ICT is effective in improving literacy learning, the extant reviews are either insufficiently rigorous in that they include non-randomised or poor quality trials; or, they focus on only one aspect of literacy. There is, therefore, a need for a systematic review of recent ICT effectiveness research on all aspects of literacy.

Aims

The overall aim of the two-year project is to determine the impact of ICT on literacy learning in English for 5-16 year olds.

The main aim of this in-depth sub-review is to investigate whether or not ICT is effective in improving young people's literacy learning in English. Subsidiary aims are to assess the effectiveness of ICT on different literacy outcomes and, within those outcomes, to assess whether effectiveness varies according to different interventions.

Review questions

The overall research question for the two-year project is:

What is the impact of ICT on literacy learning in English, 5-16?

The research question for this effectiveness sub-review is:

What is the evidence for the effectiveness of ICT on literacy learning in English, 5-16?

This research question was developed because, despite huge investment by this and previous governments in the use of ICT in schools generally in the UK and with a view to improving literacy learning in particular, to date no generic systematic review has been undertaken to examine the effectiveness or otherwise of ICT on literacy learning in English. Specifically, no systematic review of effectiveness has reviewed studies in all aspects of literacy learning, with diverse learner characteristics, focused exclusively on the most appropriate study design for judging effectiveness: the randomised controlled trial (RCT). This study design is the most appropriate for judging effectiveness because it is the only experimental design that can control for all known and unknown extraneous variables, and for regression to the mean effects commonly observed in uncontrolled study designs.

Structure of the review

The structure of this review is unusual in that it includes a two-stage 'mapping' process, followed by an in-depth review.

In the descriptive map of the overarching review, the process of identifying, including and characterising the studies for the systematic review of the impact of ICT on literacy learning is described. This map is an updated version of the original map described in the first review. In total, a series of five sub-reviews has been undertaken to address aspects of the original research question. In the present review, the effectiveness map describes the process of identifying, including and characterising the studies for one of the five sub-reviews. This review is a systematic review and meta-analysis of the effectiveness of ICT on literacy learning. In the in-depth effectiveness review, the inclusion criteria have been refined to identify a sub-section of RCTs that can be used to address the question: 'What is the effectiveness of ICT on literacy learning in English, 5-16?'

Methods

Defining relevant studies for the descriptive map of the overarching review: inclusion and exclusion criteria

The earlier systematic review mapped the research on the impact of ICT on literacy learning in English, 5-16. The relevant research was searched for, located, sent for and mapped for the years 1990-2001. In addition to updating the searches for the period 2001-2002, and screening for inclusion of any potentially relevant studies for the period 2001-2002, all the included studies in the original map were re-keyworded using revised generic and review-specific keywording sheets. The English Review Group working document for the inclusion and exclusion of potentially relevant studies was updated to reflect changes made to the keywording sheets, both generic and review-specific. See Appendix 2.1 of the technical report for further details.

Defining relevant studies for the effectiveness map: inclusion and exclusion criteria

As the focus of the sub-review was 'effectiveness', papers using rigorous methods to assess effectiveness were required; this implies the identification of relevant randomised controlled trials (RCTs).

Defining relevant studies for the in-depth review: inclusion and exclusion criteria

For this review, studies were only included if they had randomly allocated pupils to an ICT or a non-ICT treatment for the teaching of literacy. Both individually randomised trials and cluster randomised trials were included, but cluster trials were only included if each arm contained more than four clusters, and if the unit of analysis was at the cluster level (not the individual level), i.e. if there was no unit of analysis error. Because this is an effectiveness review, it was necessary to look at effect sizes (with confidence intervals). If the authors did not present effect sizes and the reviewers were unable to calculate them, the RCT was excluded. In practice, this means that a study had to report either (a) post-test means or mean gain scores, together with the numbers of participants in the intervention and control groups and the standard deviations of the scores; or (b) the means and numbers of participants in each group, together with either a t-value or a precise p-value from which the reviewers could calculate the standard deviations. RCTs were included if they presented sufficient data. It was decided not to reanalyse poorly analysed cluster trials, and not to approach authors for further data.
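The data requirements above can be illustrated with a minimal sketch. It assumes the conventional standardised mean difference (Cohen's d) with a pooled standard deviation and a large-sample 95 percent confidence interval; this summary does not state which formulae the reviewers actually used, and the function names and example figures are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def smd_from_means(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference (intervention minus control)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def smd_from_t(t, n1, n2):
    """Recover the same effect size from an independent-samples t-value."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def smd_confidence_interval(d, n1, n2, z=1.96):
    """Approximate 95% CI using the usual large-sample variance of d."""
    variance = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    half_width = z * math.sqrt(variance)
    return d - half_width, d + half_width


# Hypothetical trial (not taken from the review): 20 pupils per arm.
d = smd_from_means(52.0, 10.0, 20, 47.0, 10.0, 20)
lower, upper = smd_confidence_interval(d, 20, 20)
d_from_t = smd_from_t(d / math.sqrt(1 / 20 + 1 / 20), 20, 20)
```

With such a small trial the interval is wide and spans zero, which is why the review required confidence intervals as well as point estimates.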

Identification of potential studies for the descriptive map of overarching review: search strategy

The potential studies for this review were identified from the original systematic review and through an updating of the original electronic searches and handsearches, for the period 2001-2002.

Identification of potential studies for the effectiveness map: search strategy

The earlier systematic review mapped the research on the impact of ICT on literacy learning in English, 5-16. All included studies were keyworded according to study type. A research question looking at the effectiveness of ICT on literacy learning would therefore include studies keyworded as 'RCTs'. After updating searches to locate any further relevant studies that were undertaken after 2001 and re-keywording using the EPPI Centre core keywording strategy, the keyword 'RCT' was used to identify any RCTs from the updated database.

Identification of potential studies for the in-depth review: search strategy

All RCTs identified in the effectiveness map were re-screened for inclusion in the in-depth effectiveness review, using pre-established inclusion and exclusion criteria.

Screening studies for the descriptive map of the overarching review: inclusion and exclusion criteria

The updated database for 2002-2003, which included potentially relevant studies published after October 2001, was screened by a member of the Review Team, using titles and abstracts and the updated working document with inclusion and exclusion criteria.

Screening studies for the effectiveness map: inclusion and exclusion criteria

All studies keyworded as RCTs were re-screened to check that they were individual or cluster RCTs and fulfilled the inclusion criteria for the effectiveness map. They were then included in the map.

Screening studies for the effectiveness in-depth review: inclusion and exclusion criteria

All identified RCTs were screened for inclusion in the in-depth effectiveness review using the pre-established inclusion/exclusion criteria.

Characterising included studies for the descriptive map of the overarching review: EPPI Centre and review-specific keywording

All the studies included in the original database from the review of 2001 were re-keyworded by members of the Review Team, using the new guidelines from the EPPI Centre. The studies retrieved for the updated database were keyworded by a member of the Review Team, with assistance from other members of the Review Team and the EPPI Centre where there was any doubt about keywording. The database was fully annotated with the keywords. For pragmatic reasons, the database for 2002 was closed on November 30th 2002. Any studies received after that time will be included in the next update.

Characterising included studies for the effectiveness map: EPPI Centre and review-specific keywording

Because the included RCTs were drawn from the descriptive map of the overarching review, they had already been characterised using the EPPI Centre and review-specific keywords as part of that map.

Identifying and describing studies for the descriptive map of the overarching review: quality assurance process

For the purposes of quality assurance, two members of the Review Team and one member of the EPPI Centre screened a random sample of 10 percent of the studies in the updated database. Screening was undertaken independently, using the inclusion/exclusion criteria working document. After double-screening, the inter-rater reliability scores between the pairs of reviewers were calculated using Cohen's Kappa.
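As an illustration of the statistic named above, Cohen's Kappa compares the observed agreement between two screeners with the agreement expected by chance. The sketch below is a plain implementation of the standard formula; the function name and the include/exclude decisions are hypothetical and are not data from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if expected == 1.0:
        return 1.0  # Both raters used a single identical code throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for ten reports.
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 6 + ["include"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this invented example the reviewers agree on eight of ten reports, but because half of that agreement would be expected by chance, Kappa is noticeably lower than the raw 80 percent agreement.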

Identifying and describing studies for the effectiveness map: quality assurance process

Two reviewers independently re-screened the studies retrieved from the database and then compared results. In cases where there was disagreement, a member of the EPPI Centre was asked to advise.

In-depth effectiveness review

Screening

Two reviewers independently screened all included RCTs and coded them for inclusion or exclusion using the four exclusion criteria.

Data extraction and quality assessment of included RCTs

Data extraction was undertaken by two reviewers. The included RCTs were data-extracted and quality appraised using the EPPI Centre Guidelines. Any disagreements between the reviewers were discussed and resolved. In addition, because this is an effectiveness review and meta-analyses of effect sizes were calculated in order to judge the effectiveness of various interventions, outcomes data were extracted from all the RCTs. This was undertaken by two reviewers. Any disagreements were discussed and resolved. Quality assurance of data extraction and quality assessment of the included RCTs was provided by the data-extraction undertaken by staff from the EPPI Centre.

Weight of evidence judgements about included RCTs

The methodological quality of each trial (A) was reviewed in terms of how well it was executed. In addition, each trial was assessed for how much weight of evidence it provided for the specific review in terms of (B) the appropriateness of research design for the review question, and (C) the relevance of the study for the review question. Finally, on the basis of judgements about (A), (B) and (C), an overall weight (D) was ascribed to each randomised trial. The weight of evidence assessments were taken into consideration in both the narrative syntheses and the meta-analyses. Only studies assessed as 'medium' or 'high' on overall weight of evidence were included in the syntheses.

Narrative synthesis of included RCTs

A narrative synthesis of the included trials was undertaken. The conceptual framework which formed the basis of the synthesis focused firstly on different ICT interventions and secondly on different literacy outcomes. This resulted in two approaches to synthesising the evidence:

(1) Interventions: an analysis of the effectiveness of different types of ICT interventions on a range of literacy outcomes
(2) Outcomes: an analysis of the effectiveness of different types of ICT on specific literacy outcomes.

Statistical synthesis of outcomes (meta-analysis)

A meta-analysis essentially averages the results from a number of studies using a statistical method that gives the greatest weight to the studies with the smallest standard errors, which usually means the largest studies. We pooled some of the studies in a series of meta-analyses that investigated effectiveness in different aspects of ICT and literacy.
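The weighting described above can be sketched as an inverse-variance average. The sketch assumes a fixed-effect model, which this summary does not confirm; the function name and the two example studies are hypothetical.

```python
import math


def fixed_effect_pool(effects, standard_errors):
    """Inverse-variance weighted average of study effect sizes.

    Returns the pooled effect, its standard error and each study's
    relative weight (the weights sum to 1).
    """
    weights = [1 / se**2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1 / total)
    relative_weights = [w / total for w in weights]
    return pooled, pooled_se, relative_weights


# Hypothetical studies: a small trial (large SE) and a larger trial (small SE).
pooled, pooled_se, rel_weights = fixed_effect_pool([0.5, 0.1], [0.4, 0.2])
```

Here the study with half the standard error receives four times the weight, so the pooled estimate sits much closer to its result than to that of the smaller study.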

Publication bias

One source of bias for systematic reviews is publication bias. If studies showing a positive (beneficial) effect are more likely to be published than negative or inconclusive studies, this will give a biased estimate of effect. One method of determining the existence of publication bias is to draw a funnel plot. This plots the effect size of a study (on the x-axis) against its sample size (on the y-axis). Very small studies will have a high probability of showing an inconclusive effect even if the intervention is effective, just as they will have a raised probability of showing a positive effect if the intervention is ineffective. If there is no publication bias, small studies should be scattered along the x-axis, with the larger trials being situated closer to the true estimate of effect (as they are less subject to variability). A funnel plot was drawn to investigate whether or not there was any publication bias in research in the effectiveness of ICT on literacy learning.
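The logic of a funnel plot can be sketched without drawing it. The code below prepares the (effect size, sample size) points and makes a crude count of how many small studies fall on either side of a reference effect; it is an informal illustration only, not a formal test of asymmetry and not the procedure used in the review. All names, the threshold and the data are hypothetical.

```python
def funnel_points(studies):
    """Return (effect_size, sample_size) pairs ordered from smallest to largest study."""
    return sorted(((s["effect"], s["n"]) for s in studies), key=lambda p: p[1])


def small_study_balance(studies, reference_effect, small_n):
    """Count small studies lying below and above a reference effect size."""
    small = [s for s in studies if s["n"] < small_n]
    below = sum(1 for s in small if s["effect"] < reference_effect)
    above = sum(1 for s in small if s["effect"] > reference_effect)
    return below, above


# Invented data: one large trial and four small ones.
studies = [
    {"effect": -0.05, "n": 200},
    {"effect": 0.60, "n": 20},
    {"effect": 0.40, "n": 30},
    {"effect": 0.80, "n": 16},
    {"effect": -0.20, "n": 24},
]
points = funnel_points(studies)
largest_effect = points[-1][0]  # Largest trial taken as the reference point.
below, above = small_study_balance(studies, largest_effect, small_n=50)
```

An imbalance such as one small study below the reference and three above is the kind of pattern that would suggest small negative studies are missing from the published record.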

Results

Identification of studies: descriptive map of overarching review

A total of 2,319 potentially relevant reports were identified for the current review. Of these, 1,891 (just over 81 percent) were excluded by screening titles and/or abstracts and 428 were sent for. Of these, 34 (fewer than 8 percent) were not received within the timeframe of the review or were unavailable. A reading of the full report resulted in the exclusion of a further 182 reports, leaving a total of 212 that met the criteria for inclusion in the mapping study.

Identification of studies: effectiveness map

A total of 45 RCTs were identified from the updated database, using the keyword 'RCT'. Three of these failed to meet inclusion criteria, leaving a total of 42 trials included in the effectiveness map.

Most of the studies in the map were identified through the electronic searches on PsycInfo and ERIC. Most of the RCTs were undertaken in the US, with only three being undertaken in the UK.

Identification of studies: in-depth review

Forty-two RCTs were included in the map. Thirty of these were excluded from the in-depth review for the following reasons: no appropriate non-ICT control (19 RCTs); no literacy outcome measures (two RCTs); no data or insufficient data (six RCTs); cluster randomised trials with too few clusters or inappropriate analysis of cluster trial (three RCTs). This left 12 RCTs in the in-depth review. All the studies included in the in-depth review were retrieved from searches on two electronic databases: PsycInfo and ERIC.
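The counts reported in the three identification steps above can be checked with simple arithmetic. The figures below are taken from the text; the variable names and the shortened exclusion labels are illustrative.

```python
# Descriptive map of the overarching review.
identified = 2319
excluded_on_title_abstract = 1891
sent_for = identified - excluded_on_title_abstract
not_received = 34
excluded_on_full_report = 182
in_descriptive_map = sent_for - not_received - excluded_on_full_report

# Percentages quoted in the text.
pct_excluded = round(100 * excluded_on_title_abstract / identified, 1)
pct_not_received = round(100 * not_received / sent_for, 1)

# Effectiveness map.
rcts_keyworded = 45
rcts_failing_criteria = 3
in_effectiveness_map = rcts_keyworded - rcts_failing_criteria

# In-depth review: reasons for exclusion and number of RCTs for each.
exclusions = {
    "no appropriate non-ICT control": 19,
    "no literacy outcome measures": 2,
    "no data or insufficient data": 6,
    "cluster trial: too few clusters or inappropriate analysis": 3,
}
in_depth_review = in_effectiveness_map - sum(exclusions.values())
```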

In-depth effectiveness review

Publication bias

We plotted the effect size of the identified trials against their sample size. If there is no publication bias, then the included trials should form an 'inverted' funnel with the largest trial at the top. The largest trial reported a negative effect size; however, the studies with the largest effect sizes were positive. There is a suggestion, therefore, that some of the 'missing' trials would have had negative effect sizes. The absence of these trials will tend to make any estimates of effect biased towards the positive.

Synthesis

A range of five different kinds of ICT interventions emerged from the twelve RCTs included in the review: (1) computer-assisted instruction (CAI), (2) networked computer system (classroom intranet), (3) word-processing software packages, (4) computer-mediated texts (electronic text) and (5) speech synthesis systems. There were also three literacy outcomes: (1) reading, including reading comprehension and phonological awareness (pre-reading understandings), (2) writing and (3) spelling.

Six RCTs evaluated CAI interventions (Berninger et al., 1998; Heise et al., 1991; Jinkerson and Baggett, 1993; Lin et al., 1991; MacArthur et al., 1990; Mitchell and Fox, 2001). The CAI interventions consisted of studies designed to increase spelling abilities, reading abilities or phonological awareness (pre-reading understandings). One RCT evaluated a networked computer system intervention (Golden et al., 1990) and two RCTs evaluated word-processing interventions; three RCTs evaluated computer-mediated texts interventions and one RCT evaluated a speech synthesis intervention.

In synthesis (1), for five different ICT interventions, overall we included 20 comparisons from the 12 RCTs: 13 were positive and seven were negative. Of the 13 positive comparisons, three were statistically significant, whilst of the seven negative comparisons, one was statistically significant. These data would suggest that there is little evidence to support the widespread use of ICT in literacy learning in English. This also supports the findings from previous systematic reviews that have used data from rigorous study designs. It also supports the most recent observational data from the Impact2 study. These findings support the view that ICT use for literacy learning should be restricted to pupils participating in rigorous, randomised trials of such technology.

In synthesis (2), we undertook three principal meta-analyses: one for each of the three literacy outcome measures in which we were interested. In two, there was no evidence of benefit or harm; that is, in spelling and reading the small effect sizes were not statistically significant. In writing, there was some evidence for a positive effect, but it was weak because only 42 children altogether were included in this meta-analysis.

Quality assurance results: descriptive map of overarching review

Screening

The inter-rater reliability score between one pair of reviewers was 0.65 (good); the inter-rater reliability scores between the other two pairs were fair.

Keywording: EPPI Centre generic keywording sheet

Inter-rater agreement was very high. Out of a total possible 180 'keywords', disagreement occurred in only 30 keywords (i.e. 16.7 percent).

Keywording: English Review Group ICT and literacy keywording sheet

Agreement was again very good. Out of a total possible 794 keywords, disagreement occurred in 88 cases (i.e. 11 percent).
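The disagreement rates quoted for the two keywording sheets follow directly from the counts given. A short check, with a hypothetical helper name:

```python
def percent_disagreement(disagreements, total_keywords):
    """Percentage of keywords on which the reviewers disagreed, to one decimal place."""
    return round(100 * disagreements / total_keywords, 1)


generic_sheet = percent_disagreement(30, 180)          # EPPI Centre generic sheet
review_specific_sheet = percent_disagreement(88, 794)  # ICT and literacy sheet
```

Both values agree with the rounded figures reported above (16.7 percent and 11 percent).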

Quality assurance results: effectiveness map

Both reviewers agreed on the exclusion of three RCTs from the map and the inclusion of 42 RCTs. They also agreed about whether these RCTs were individually randomised trials or cluster randomised trials.

Quality assurance results: in-depth effectiveness review

Both reviewers included the same 12 RCTs and excluded the same 30 RCTs. In addition, there was complete agreement on exclusion codes. One reviewer excluded on the basis of the first exclusion code in the hierarchy to apply to a trial, while the other reviewer excluded on the basis of all codes that applied to any trial.

There are 12 RCTs in the in-depth review for this research question. All 12 trials were independently double data-extracted by Carole Torgerson, Graham Low and Die Zhu (all three from the University of York), and by Diana Elbourne and Katy Sutcliffe (both from the EPPI Centre). The data extractions were compared and all disagreements resolved. The English Review Group data extraction for each of the 12 RCTs was then uploaded.

Conclusions: in-depth review

We identified 12 relatively small RCTs for the in-depth review. Some were so small that they could only really be considered to be pilot studies. This group of tiny trials represents the sum of the most rigorous effectiveness evidence available to date upon which to justify or refute the policy of spending millions of pounds on ICT equipment, software and teacher training. Given that the trials showed little evidence of benefit, large, rigorously designed, randomised trials are urgently required.

Strengths and limitations

We focused on a robust research design (RCT) appropriate for an effectiveness review. We applied rigorous inclusion and exclusion criteria for including studies in the in-depth review. All the included RCTs were highly relevant to the review and were assessed as being of high or medium overall weight of evidence. We did not include studies of other experimental designs; we did not attempt to combine the results of RCTs with trials of other designs. There was high quality assurance for the review: independent double-screening, data extraction and quality assessment at each stage.

We did not search for any studies published before 1990. The reason for this is that we felt that the ICT of the 1980s and before was relatively unsophisticated compared with current ICT provision. Therefore, trying to inform current ICT policy from studies that used 1980s technology could be misleading. We may have missed some studies. We also accept the possibility that our results could have been affected by publication bias, which is a very real problem for any systematic review. The fact that we have found and included some negative studies of ICT and literacy is somewhat reassuring, as publication bias tends mainly to affect negative studies. Nevertheless, one interpretation of our data could be that our results are over-optimistic, as the studies that remain unpublished are more likely to be negative studies than positive ones. If this is true, then the overall effects of ICT could be harmful. All the studies included in the in-depth review were undertaken in the US, so they may be of limited generalisability to the UK. All of the participants in the studies were either very young children in the stages of beginning literacy, or slightly older children who were experiencing difficulties or disabilities in literacy learning.

Implications for policy, practice and research

Policy-makers should refrain from any further investment in ICT and literacy until at least one large and rigorously designed randomised trial has shown it to be effective in increasing literacy outcomes.

Teachers should be aware that there is no evidence that non-ICT methods of instruction and non-ICT resources are inferior to the use of ICT to promote literacy learning.

A series of large, rigorously designed RCTs to evaluate ICT and literacy learning across all age ranges is urgently required.

References

Bangert-Drowns RL (1993) The word processor as an instructional tool: a meta-analysis of word processing in writing instruction. Review of Educational Research 63: 69-93.

Berninger V, Abbott R, Rogan L, Reed E, Abbott S, Brooks A, Vaughan K, Graham S (1998) Teaching spelling to children with specific learning disabilities: the mind's ear and eye beat the computer or pencil. Learning Disability Quarterly 21: 106-122.

Blok H, Van Daalen-Kapteijns MM, Otter ME, Overmaat M (2001) Using computers to learn words in the elementary grades: an evaluation framework and a review of effect studies. Computer Assisted Language Learning 14: 99-128.

Fulk BM, Stormont-Spurgin M (1995) Spelling interventions for students with disabilities: a review. Journal of Special Education 28: 488-513.

Golden N, Gersten R, Woodward J (1990) Effectiveness of guided practice during remedial reading instruction: an application of computer-managed instruction. Elementary School Journal 90: 291-304.

Heise BL, Papalewis R, Tanner DE (1991) Building base vocabulary with computer-assisted instruction. Teacher Education Quarterly 18: 55-63.

Jinkerson L, Baggett P (1993) Spell checkers: aids in identifying and correcting spelling errors. Journal of Computing in Childhood Education 4: 291-306.

Lin A, Podell DM, Rein N (1991) The effects of CAI on word recognition in mildly mentally handicapped and nonhandicapped learners. Journal of Special Education Technology 11: 16-25.

MacArthur CA, Haynes JA, Malouf DB, Harris K, Owings M (1990) Computer assisted instruction with learning disabled students: achievement, engagement, and other factors that influence achievement. Journal of Educational Computing Research 6: 311-328.

MacArthur CA, Ferretti RP, Okolo CM, Cavalier AR (2001) Technology applications for students with literacy problems: a critical review. Elementary School Journal 101: 273-301.

Mitchell MJ, Fox BJ (2001) The effects of computer software for developing phonological awareness in low-progress readers. Reading Research and Instruction 40: 315-332.

Torgerson CJ, Elbourne D (2002) A systematic review and meta-analysis of the effectiveness of information and communication technology (ICT) on the teaching of spelling. Journal of Research in Reading 25: 129-143.

This report should be cited as: Torgerson C, Zhu D (2003) A systematic review and meta-analysis of the effectiveness of ICT on literacy learning in English, 5-16. In: Research Evidence in Education Library. London: EPPI Centre, Social Science Research Unit, Institute of Education, University of London.