version B). The reliability of both version A and version B varies by age but is at least .90 (Geelhoed & Reitsma, 1999).

Predictor measures

Phonological awareness

Two subtests from the “Screening Test for Dyslexia” (Kort et al., 2005a) were used. First, during “Phoneme Deletion,” children were asked to omit a phoneme from an orally presented word and say the remaining word aloud (e.g., “dak” [roof] minus “k” [f] is “da” [roo]). Testing was terminated after four consecutive mistakes. Second, during the subtest “Spoonerism,” children had to switch the first sounds of two words (e.g., “John Lennon” becomes “Lohn Jennon”). Testing was terminated after five consecutive mistakes. The reliability varies by age but is at least .60. A composite score was calculated by adding the z-scores of both subtests.

Rapid automatized naming

Rapid automatized naming (RAN) was measured using two subtests of “Continuous Naming and Reading Words” (van den Bos & Lutje Spelberg, 2010). During “Naming Letters,” children had to read aloud 50 letters. During “Naming Digits,” they were asked to read aloud 50 digits. Children were asked to name these visual stimuli as quickly as possible. The time in seconds needed to finish each subtest was used for analysis, which means that a higher score reflects weaker RAN performance. The reliability of this measure varies by age but is at least .75. A composite score was calculated by adding the z-scores of both subtests.

Verbal working memory

Verbal working memory was measured using the backward task of the Number Recall subtest from the Wechsler Intelligence Scale for Children-III (WISC-IIINL) (Kort et al., 2005b). In this task, the experimenter pronounced sequences of digits that the child was asked to repeat in backward order. Testing was terminated after two consecutive mistakes. The number of correctly recalled sequences was counted.
The reliability of this measure varies by age but is at least .50.

Semantic abilities

Semantic abilities were measured by adding the z-scores of four subtests from the WISC-IIINL (Kort et al., 2005b). Based on the manual, the child received zero, one, or two points for each item. Testing was terminated after four consecutive mistakes, or five for the Information subtest. The reliability varies by age but is between .64 and .77 (Kort et al., 2005b).
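
The composite scores described above (the sum of the z-scores of the subtests that make up a measure) can be illustrated with a minimal sketch. It assumes that z-scores are computed against the study sample's own mean and sample standard deviation; the text does not state whether sample or norm-based standardization was used. The function names and the raw scores are hypothetical and serve only as illustration.

```python
from statistics import mean, stdev


def z_scores(raw):
    """Standardize raw scores against the sample mean and sample SD (assumption)."""
    m = mean(raw)
    s = stdev(raw)
    return [(x - m) / s for x in raw]


def composite(*subtests):
    """Sum the z-scores across subtests, child by child."""
    standardized = [z_scores(scores) for scores in subtests]
    return [sum(child) for child in zip(*standardized)]


# Hypothetical raw scores for five children on two subtests.
phoneme_deletion = [8, 10, 12, 9, 11]
spoonerism = [4, 6, 9, 5, 6]

pa_composite = composite(phoneme_deletion, spoonerism)
```

The same function accepts four lists for a four-subtest composite such as the semantic abilities measure. For measures scored in seconds, such as RAN, a higher composite built this way still reflects weaker performance.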