Do Education System Characteristics Moderate the Socioeconomic, Gender and Immigrant Gaps in Math and Science Achievement?

Using data from the 2011 Trends in International Mathematics and Science Study for 45 countries, we examined the size of socioeconomic, gender, and immigrant status related gaps, and their relationships with education system characteristics, such as differentiation, standardization, and proportion of governmental spending on education. We find that higher socioeconomic status is positively and significantly associated with higher math and science achievement; immigrant students lag behind their native peers in both math and science, with first generation students faring worse than second generation; and girls show lower math performance than boys. A higher degree of differentiation makes socioeconomic gaps larger in both math and science achievement, whereas higher governmental spending reduces socioeconomic achievement gaps.   

Education systems across the globe differ in the kinds of opportunities they provide their students along several institutional dimensions. For example, countries vary in the degree of standardization in their education: in curriculum, teachers' preparation, and the types and timing of the mandatory exams that students take. Countries also use different means to separate students into different tracks or ability groups, i.e., differentiation. Finally, countries differ in the funding models used for their primary and secondary schools; there is considerable cross-national variation in the level of governmental spending on education.
In this paper we build on research that connects the institutional characteristics of national education systems to student achievement. We expand this literature in several important ways. Using data from the 2011 Trends in International Mathematics and Science Study (TIMSS) for 45 countries, we examine socioeconomic, gender and immigrant status gaps in math and science achievement. Further, we link these gaps to differentiation, standardization, and percent of governmental spending on education, thus examining whether these features of the education systems moderate the stratification of math and science achievement. By doing this, we simultaneously account for several dimensions of the education systems rather than focusing on just one specific feature. While the literature has addressed the association between countries' education systems and average achievement and its dispersion (Bodovski et al., 2017;Bol et al., 2014), it has not examined how education systems can affect boys' and girls' achievement and the achievement of immigrant students in a comprehensive way. More specifically, while a few studies have examined the effects of a particular feature of education systems on girls' or immigrants' math and science achievement (Ayalon & Livneh, 2013;Ruhose & Schwerdt, 2016), none have examined several features of education systems and their effects on math and science achievement of girls and immigrant students simultaneously. This is an important contribution to the literature because certain features of education systems can interact in how they affect students (Bol et al., 2014), and therefore exploring education systems in a multidimensional way ensures that the effects of education systems on immigrant students, as well as boys and girls, are understood in their full complexity. 
By focusing our analysis on math and science achievement, we contribute to the literature on the mechanisms behind differences in science, technology, engineering and mathematics (STEM) education. By analyzing 45 countries that differ in many substantial dimensions, such as relative size, wealth, and level of inequality, we shed light on the features of education systems that can ameliorate educational disparities.

Socioeconomic Differences in Academic Achievement
Research in the sociology of education has long linked family socioeconomic background to academic achievement, showing that children from advantaged backgrounds perform better in school than their less fortunate peers (Bowles & Gintis, 1976; Levels et al., 2008; Marks, 2005; 2006). Numerous studies have examined the relationships between parental resources and practices and children's outcomes (Farkas, 2003; Lareau, 2011; Bodovski et al., 2014). While the importance of family influence persists, a vital policy question is whether national education system characteristics can moderate the effects of family background. In other words, while it is hard to change the circumstances of a particular family and significant reforms are needed to battle socioeconomic inequality at the macro-level, a more tractable aim might be to identify which features of education systems exacerbate or ameliorate socioeconomic inequality.
Differentiation, and track placement in particular, has been shown to affect student achievement, with students in higher tracks showing greater achievement gains than their peers in lower tracks (Alexander et al., 1978; Dauber et al., 1996; Gamoran, 1987; 1996; Gamoran & Mare, 1989; Kerckhoff, 1986). Due to socioeconomic differences in track assignment, with students from disadvantaged backgrounds being more likely to attend vocational or low academic tracks, several studies have argued that tracking aggravates educational inequality (Bol & Van de Werfhorst, 2013; Kerckhoff, 1995; Oakes, 1985; Pfeffer, 2008). The negative association with educational attainment is particularly strong if tracking happens when students are younger (Pfeffer, 2008).
Previous studies presented mixed evidence of the effects of standardization on academic achievement. Bishop (1997) found that students in countries with central exit exams in math and science outperform their peers in countries without such exams. Similarly, Schutz et al. (2007) found that exit exams are associated with overall better student mathematics performance, and that the relationship is stronger for students from middle and higher socioeconomic classes than from lower socioeconomic classes. On the other hand, Park (2005) did not find significant effects of a country's level of standardization on average achievement. However, Park (2008) argues that standardized curriculum and instruction provide students and their families with a clear idea of what students are expected to learn and as such may help low socioeconomic status (SES) families monitor their children's educational progress. Using the Programme for International Student Assessment (PISA) 2006 data on 36 countries, Bol et al. (2014) found that parental SES influences student achievement more in education systems without central exams. Where central exams were present, the relationship between SES and tracking was attenuated. Furthermore, in several countries, most notably in Singapore, sharp increases in math and science achievement on international assessments have occurred alongside within-country changes towards more centralized curricula, such as producing guidelines regarding how subjects should be taught (Walberg et al., 2000).

Gender Gaps in Achievement
For decades, researchers have been concerned with girls' disadvantages in math and science. At the same time, early waves of international data showed that gender differences have shrunk over time (Baker & Jones, 1993;Wiseman et al., 2009). Interestingly, in some countries the gap has now flipped, with girls outperforming boys in math (Bodovski et al., 2014;Guiso et al., 2008;Hyde & Mertz, 2009). It is important to note that even when girls' math and science achievement is on par with boys', girls are less likely to pursue STEM majors in postsecondary education (Charles, 2011;Riegle-Crumb et al., 2012). Cross-nationally, girls are more likely than boys to aspire to graduate from an institution of higher education (Lauglo & Liu, 2019). Despite increased women's participation in higher education at all levels, sex segregation by field of study is not only persistent, but more pronounced in wealthier developed societies (Charles & Bradley, 2009). In addition, several studies documented that males are more likely to be enrolled in vocationally oriented tracks while females are at a higher likelihood of being assigned to tracks that lead to university matriculation (Buchmann & Park, 2009;Gerber & Hout, 1995;Titma, Tuma, & Roosma, 2003). Schnepf (2010) shows that the math advantage largely results from males' dominance at the top of the math achievement distribution; more specifically, male high achievers outperform female high achievers. The differences in the upper tail are important because how well students achieve at the top of the distribution serves as a gateway to mathematics and science careers (Ellison & Swanson, 2010). Findings regarding the gender gap that are based solely on U.S. samples, however, vary greatly depending on the covariates that scholars include in their analyses, with certain model specifications showing no difference between male and female students in math achievement after controlling for other factors (Cheema & Galluzzo, 2013). 
While male students consistently outperform female students on the mathematics section of the Scholastic Aptitude Test exam (Tsui, 2007), when all Educational Testing Service tests are analyzed, there is no mathematics gap across genders (Cole, 1997). Buchmann et al. (2008) provided a comprehensive review of the literature on gender inequalities from early childhood to young adulthood. The authors summarized the findings on academic achievement in elementary and secondary school, in transition from high school to college and college attendance. They surveyed the gendered trajectories in skills, grades, and test scores, as well as in the behaviors and expectations that boys and girls exhibit in school and in their families. That review, however, did not include the connection between gender gaps in educational outcomes and country-level characteristics. A more recent study examined the role of standardization and differentiation in gender gaps in reading (Van Heck et al., 2019). Using the six waves of PISA data, the authors found that girls hold an advantage in reading in all OECD countries, and this advantage is further bolstered in countries with later track selection. They also found a negative relationship between standardization and a country's overall reading performance, with boys having a greater disadvantage in standardized systems.

Immigrant Students' Achievement
Research has found substantial heterogeneity in immigrant students' performance, depending on the country of destination and origin (Alba et al., 2011;Crosnoe & Lopez Turley, 2011;Kasinitz et al., 2008;Lee & Zhou, 2015;Levels et al., 2008;Wang & Goldschmidt, 1999). In the United States, for example, students of Asian origin do better in school than native-born white students, while students of Mexican origin exhibit lower achievement and graduation rates (Crosnoe & Lopez Turley, 2011;Lee & Zhou, 2015;Telles & Ortiz, 2008). Lee and Zhou (2015) attribute this higher achievement to the model minority image many hold of Asian students, as well as to the institutions that Asian families create upon arrival that reinforce higher achievement. The authors discuss the structural factors behind the achievement of Asian students, pointing out the high selectivity of the group (both in comparison to the country of origin and to the country of destination). Other scholars attribute differential achievement patterns to length of stay in the host country (Schnepf, 2008) and language proficiency (Schlicht et al., 2010). They explain that immigrants who are in the host country for a longer period of time have more opportunities to better their language skills, which in turn has a positive influence on their achievement. Given that the percentage of language minority students in Europe and the United States is likely to increase (Brown, 2015;OECD-UNDESA, 2013), it is important to understand under what conditions they perform best.
Further, having immigrant parents is associated with a unique set of benefits and disadvantages as well. Quite often, these parents lack the knowledge of the education systems of their host countries, which results in lack of ability to help their children with schooling (Barban & White, 2011;Goldenberg et al., 2001;Rosenbaum & Rochford, 2008). On the other hand, these parents are known to have higher levels of motivation and grit that they can potentially pass on to their children (Kao & Tienda, 1995;Madood, 2004). Scholars often refer to this grit as 'immigrant drive' (Portes & Rumbaut, 2001).
Evidence is mixed regarding immigrants' propensity to enroll or be assigned to lower or higher tracks. For example, all else being equal, immigrant students in Italy are more likely to enroll into vocational tracks than non-immigrants (Barban & White, 2011), while immigrant students in Germany are at a higher likelihood to be recommended by teachers for entrance into the college track (Caro et al., 2009). Furthermore, track misallocation is arguably more likely to occur in countries with more tracks; in other words, holding everything else constant, the probability of misallocation increases when there are more tracks to choose from (Combet, 2015). It remains an empirical question as to whether there are consistent patterns of relationships between different education system characteristics and immigrant students' performance.
The literature continues to debate the role governmental spending on education plays in shaping academic achievement of different groups of students. West and Wößmann (2008), for instance, advocate that even privately operated schools should be financially supported by the government, as alternative arrangements could damage educational equity. Hanushek (2003) and Marlow (2000) show that simply increasing public spending on education does little to increase student achievement; they also demonstrate, though, that in many European countries, as public spending on education rises, the effect of parental education on achievement becomes smaller, and at the highest level of spending insignificant (Schlicht et al. 2010).
While incorporating every relevant institutional difference that might affect educational equality is virtually impossible (Meier & Schutz, 2007), an analysis that examines a wider array of features of education systems comes as a timely addition to the expanding literature on the relationship between inequality and institutional characteristics of education systems across countries. Several studies (Bodovski et al., 2017; Bol et al., 2014) have incorporated multiple features of education systems into their analyses but these studies only tangentially touch upon equity issues, such as SES-achievement gaps in Bodovski et al. (2017). However, equity issues are not limited to SES-achievement gaps. For the education system to perform its function as "the great equalizer" (Mann, 1848), it also needs, among other equality benchmarks, to narrow and potentially eliminate gender-achievement gaps and immigrant student-achievement gaps (UNESCO, 2016). In order to truly understand under which conditions an education system is best equipped to do so, the system characteristics and student characteristics need to be examined in the same analyses. In addition to examining SES-achievement gaps, the current study uses multi-level analyses to also focus on gender-achievement gaps and immigrant status-achievement gaps. Specifically, our study examines two main research questions:
1. To what extent are SES, gender, and immigrant status related to academic achievement in math and science cross-nationally?
2. To what extent do differentiation, standardization, and proportion of governmental spending on education moderate the socioeconomic, gender, and immigrant status gaps in achievement?

Data and Sample
We used data from TIMSS 2011 and supplemented them with countries' information on economic and education systems from various sources. TIMSS employs a two-stage stratified cluster sample design, where schools are selected using probability proportional-to-size sampling at the first stage, and one or two classes are randomly sampled within each school at the second stage (Joncas, 2008). In addition to assessing students' math and science proficiency, TIMSS also collects background and school information for fourth and eighth grade students in 45 countries. We focused on eighth-grade students because in most countries track placement takes place in secondary education, which makes the eighth grade a crucial year during which student performance is assessed and evaluated as a basis for these decisions. Country-specific information on standardization, differentiation, government spending on education, and Gross Domestic Product (GDP) per capita was collected using both websites for international organizations (e.g., the European Union; the Organization for Economic Co-operation and Development; the United Nations Educational, Scientific and Cultural Organization; and the World Bank) and national governmental websites (mainly, websites of the ministries of education). For our analysis, we included all individuals and schools assessed in each country. Our data include 261,747 students from 8,430 schools across 45 countries.

Measures
Academic achievement. The dependent variables are math and science achievement scores. TIMSS uses item response theory (IRT) and multiple imputation techniques to calculate five plausible values for each academic subject on a scale with a mean of 500 and a standard deviation of 100. Using the average of these five plausible values as the dependent variable would underestimate standard errors, which would increase the odds of committing a Type I error (Willms & Smith, 2005). Thus, for both academic subjects, we simultaneously use all five plausible values to estimate correct standard errors.
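To illustrate why averaging the plausible values understates uncertainty, the sketch below pools a coefficient estimated separately on each plausible value using the standard multiple-imputation combining rules (Rubin's rules). This is an assumption about the general technique, not a description of the internal routine of the software used in the analysis; the function name and the five coefficient and standard-error values are hypothetical.

```python
from statistics import mean, variance

def combine_plausible_values(estimates, standard_errors):
    """Pool one coefficient estimated once per plausible value (Rubin's rules)."""
    m = len(estimates)
    pooled = mean(estimates)
    # Average sampling variance across the separate analyses.
    within = mean(se ** 2 for se in standard_errors)
    # Variance of the estimates across plausible values (measurement uncertainty).
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5

# Hypothetical SES coefficients and standard errors, one per plausible value.
est = [21.0, 21.5, 21.3, 21.6, 21.6]
se = [1.0, 1.1, 1.0, 0.9, 1.0]
pooled, pooled_se = combine_plausible_values(est, se)
print(round(pooled, 2), round(pooled_se, 3))
```

The between-value term is what a single averaged score discards, which is why the pooled standard error is larger than the average within-analysis standard error.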
Student-level variables. At the student level, we consider three key individual and family predictors: gender, immigration status, and SES. Gender was based on students' report of their sex (male = 0; female = 1). Immigration status was measured using information on the place of birth of students and parents/guardians. Thus, a student who was born inside the country with parents also born inside the country was coded as a "native student", a student who was born inside the country with at least one parent born outside the country was coded as a "second-generation immigrant student", and a student who was born outside the country with at least one parent born outside the country was coded as a "first-generation immigrant student". To measure SES, we constructed a standardized composite index based on father's education, mother's education, and the number of books at home. Finally, we include the student's age measured in months as a control variable at the student level.
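The coding rule for immigration status can be sketched as a small function. The function and argument names are hypothetical, and the treatment of the one combination the rule does not mention (a foreign-born student whose parents were both born in the country) is an assumption: the sketch returns None rather than guessing a category.

```python
from typing import Optional

def immigrant_status(student_born_abroad: bool,
                     mother_born_abroad: bool,
                     father_born_abroad: bool) -> Optional[str]:
    """Classify a student following the three categories described in the text."""
    any_parent_abroad = mother_born_abroad or father_born_abroad
    if not student_born_abroad and not any_parent_abroad:
        return "native"
    if not student_born_abroad and any_parent_abroad:
        return "second-generation"
    if student_born_abroad and any_parent_abroad:
        return "first-generation"
    # Foreign-born student with both parents born in the country:
    # not covered by the stated rule, so left unclassified here.
    return None

print(immigrant_status(False, True, False))
```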
School-level variables. At the school level, we controlled for school location. School location is measured by a dichotomous variable, where schools in "urban (densely populated) areas", "suburban areas", and "medium size city or large town" were categorized as urban; and schools in "small town or village" and "remote rural" locations as rural. Table 1 presents descriptive statistics for the student- and school-level variables included in our analysis.
Country-level variables. At the country level, we collected measures on standardization, differentiation, and government spending on education. The standardization index was constructed by conducting a Principal Component Analysis (PCA) on a set of measures that included whether the central government controlled the curriculum, prescribed textbooks, and required students to take a school exam at any given point that had consequences for their progression through the education system. The differentiation index was created by conducting PCA on measures that captured the number of available tracks at the secondary level and the age at which tracking occurs. Both standardization and differentiation indices were scaled to have a mean of zero and standard deviation of one across all countries. Government spending on education was measured as the percentage of total government spending. In addition to these country-level predictors, we controlled for GDP per capita (logged).
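A minimal sketch of how an index of this kind could be built is given below: the indicators are standardized, the first principal component is extracted from their correlation matrix, and the component scores are rescaled to a mean of zero and a standard deviation of one. The six countries and their indicator values are hypothetical, and the use of power iteration to find the leading component is an implementation choice made here to avoid external libraries, not the procedure reported in the text.

```python
from statistics import mean, pstdev

def zscores(values):
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def first_component_index(rows, iterations=200):
    """rows: one list of indicator values per country.
    Returns first-principal-component scores scaled to mean 0, SD 1."""
    columns = [zscores(col) for col in zip(*rows)]  # standardize each indicator
    n, k = len(rows), len(columns)
    # Correlation matrix of the standardized indicators.
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / n
             for j in range(k)] for i in range(k)]
    # Power iteration converges to the leading eigenvector (the loadings).
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    scores = [sum(vec[j] * columns[j][c] for j in range(k)) for c in range(n)]
    return zscores(scores)

# Hypothetical indicators per country:
# [central curriculum control, prescribed textbooks, consequential exam]
rows = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1], [0, 0, 0], [0, 0, 0]]
index = first_component_index(rows)
print([round(x, 2) for x in index])
```

In this toy data all indicators are positively correlated, so a country with all three features receives the highest index value and countries with none receive the lowest.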

Analytical Strategy
To investigate how the institutional features of education systems interact with student-level characteristics to affect students' academic achievement, we used the HLM-7 software to estimate random-intercepts and slopes three-level hierarchical linear models (Raudenbush & Bryk, 2002). This approach allows us not only to address the clustering of students within schools and within countries, but also to examine the extent to which academic achievement as well as the relationships between academic achievement and student-level variables vary across schools and countries (Raudenbush & Bryk, 2002). The final model was specified as follows:
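The equation does not survive in this version of the text. As a hedged sketch only, a three-level specification consistent with the description above, written in the notation of Raudenbush and Bryk (2002) for student i in school j in country k, might take the following form. The symbol names, the choice to treat student-level slopes as fixed across schools but random across countries, and the fixed slopes for age and school location are assumptions, not the authors' reported equation.

```latex
\begin{align*}
\text{Level 1 (students):}\quad
Y_{ijk} &= \pi_{0jk} + \pi_{1jk}\,\text{Female}_{ijk} + \pi_{2jk}\,\text{SecondGen}_{ijk}
           + \pi_{3jk}\,\text{FirstGen}_{ijk} \\
        &\quad + \pi_{4jk}\,\text{SES}_{ijk} + \pi_{5jk}\,\text{Age}_{ijk} + e_{ijk} \\
\text{Level 2 (schools):}\quad
\pi_{0jk} &= \beta_{00k} + \beta_{01k}\,\text{Urban}_{jk} + r_{0jk}, \qquad
\pi_{pjk} = \beta_{p0k} \quad (p = 1,\dots,5) \\
\text{Level 3 (countries):}\quad
\beta_{00k} &= \gamma_{000} + \gamma_{001}\,\text{Std}_{k} + \gamma_{002}\,\text{Diff}_{k}
             + \gamma_{003}\,\text{Spend}_{k} + \gamma_{004}\,\ln\text{GDP}_{k} + u_{00k} \\
\beta_{p0k} &= \gamma_{p00} + \gamma_{p01}\,\text{Std}_{k} + \gamma_{p02}\,\text{Diff}_{k}
             + \gamma_{p03}\,\text{Spend}_{k} + \gamma_{p04}\,\ln\text{GDP}_{k} + u_{p0k}
             \quad (p = 1,\dots,4) \\
\beta_{01k} &= \gamma_{010}, \qquad \beta_{50k} = \gamma_{500}
\end{align*}
```

Under this reading, the coefficients \gamma_{p01} through \gamma_{p04} are the cross-level interactions: they describe how the gender, immigrant status, and SES slopes shift with country-level characteristics.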
For both math and science achievement scores, we sequentially estimated the following six models. We first estimated a null model (M0) to show the proportion of the total variance in student achievement scores that is accounted for by the clustering of students within schools and countries.
Second, we fitted a model (M1) by regressing academic achievement only on the main student-level predictors. The next model (M2) added all student-, school-, and country-level variables. The final set of models (M3, M4, and M5) added cross-level interaction terms between student- and country-level predictors, separately. For all models, we centered all student-level predictors around the group mean and school- and country-level variables around the grand mean. We applied the final student SENATE weights in our analyses to take into account the effects of stratification or disproportional sampling of subgroups, non-response adjustments, and to calibrate each country to have an equal weight (Joncas, 2008). To address missing data, we used the multiple imputation by chained equations (MICE) technique. We included all dependent and independent variables in the imputation model to predict missing values and generated five imputed datasets to be simultaneously used in our analyses (Royston, 2004).
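The two centering choices can be sketched as follows. Group-mean centering subtracts the mean of a student's own group, so the predictor captures standing relative to peers; grand-mean centering subtracts the overall mean. The sketch assumes the grouping unit is the school and ignores sampling weights, and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def group_mean_center(values, groups):
    """Subtract each observation's own group mean (e.g., the school mean)."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - means[g] for v, g in zip(values, groups)]

def grand_mean_center(values):
    """Subtract the overall mean (e.g., across all countries)."""
    mu = mean(values)
    return [v - mu for v in values]

# Hypothetical SES values for four students in two schools.
ses_centered = group_mean_center([0.5, 1.5, -1.0, 0.0], ["a", "a", "b", "b"])
# Hypothetical logged GDP per capita for three countries.
gdp_centered = grand_mean_center([9.0, 10.0, 11.0])
print(ses_centered, gdp_centered)
```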

Academic Achievement, Student Characteristics, and Education Systems
Tables 3 and 4 show the results from the three-level hierarchical linear models for math and science achievement, respectively. The first column of each table displays results for the null model. In the case of math achievement, the intraclass correlations for the school and country variance are 0.23 and 0.36, respectively. Likewise, in the case of science achievement, the intraclass correlations are 0.23 and 0.31, respectively. These numbers suggest that more than half of the total variance in students' academic achievement lies between schools and between countries, which justifies the need for a multilevel modeling approach.
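The intraclass correlations are the shares of total variance located at each level of the null model. The sketch below shows the arithmetic for math. The within-school (5074.8) and between-country (4417.6) variances are the values reported in the results; the between-school variance is not reported in the text, so 2780.0 is an illustrative assumption chosen to be consistent with the reported correlations.

```python
def intraclass_correlations(within_school, between_school, between_country):
    """Return the school-level and country-level shares of total variance."""
    total = within_school + between_school + between_country
    return between_school / total, between_country / total

# 5074.8 and 4417.6 come from the text; 2780.0 is an assumed value.
icc_school, icc_country = intraclass_correlations(5074.8, 2780.0, 4417.6)
print(round(icc_school, 2), round(icc_country, 2))
```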
The second column shows the relationships between academic achievement and the three student characteristics. The results show that female students performed significantly lower than males in math achievement (3.5 points lower) but there were no significant gender differences in science in this model specification. Second- and first-generation immigrant students performed lower than their native counterparts in both math (5 and 27 points lower, respectively) and science (7.4 and 33.1 points lower, respectively), with first-generation students also performing worse than second-generation ones (see Note 1). Higher SES was associated with higher math and science achievement. In particular, a unit increase in the SES index was associated with a 21.4 point increase in both math and science. After including the student-level variables, variation within schools decreased from 5074.8 to 4576.1 for math and from 5355.2 to 4782.1 for science. This indicates that about 10% of the within-school variation in academic achievement scores can be explained by the three student-level predictors. Furthermore, the estimates for the country-level variance of the slopes for gender (72.7 for math and 121.3 for science), second-generation students (189.3 and 285.6), first-generation students (397.1 and 433.4), and SES (112.9 and 91.4) are statistically significant at the 0.01 level, which confirms the existence of differences in slopes among countries.
In the third column, student's age, school location, and education system variables at the student-, school-, and country-level, respectively, were added. Results show that student's age was negatively related to academic achievement and that students from schools located in urban areas performed significantly higher than students from rural schools. Standardization, differentiation, privatization, and government spending on education were not associated with either of the two measures of academic achievement. Finally, GDP per capita was positively associated with both math and science achievement. After including the country-level variables, between-country variation decreased by 37% for math (from 4417.6 to 2781.8) and by about 42% for science (from 3581.3 to 2060.9).
Notes: Unstandardized coefficients are reported with standard errors in parentheses. Wald test for the null hypothesis that second generation and first generation coefficients are equal has χ²(1)=59.3 (p-value<0.001). Number of students=261,747, schools=8,430, countries=45. + p < 0.10; * p < 0.05; ** p < 0.01.
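The shares of variance explained that are quoted in this section follow from a simple proportional-reduction calculation on the reported variance components. The sketch below reproduces that arithmetic; the function and dictionary names are hypothetical, while the variances are those given in the text.

```python
def proportional_reduction(baseline_variance, model_variance):
    """Share of a variance component removed after adding predictors."""
    return (baseline_variance - model_variance) / baseline_variance

reductions = {
    "within-school, math": proportional_reduction(5074.8, 4576.1),
    "within-school, science": proportional_reduction(5355.2, 4782.1),
    "between-country, math": proportional_reduction(4417.6, 2781.8),
    "between-country, science": proportional_reduction(3581.3, 2060.9),
}
for label, share in reductions.items():
    print(f"{label}: {share:.1%}")
```

With these inputs the within-school reductions are close to 10% and the between-country reductions are roughly 37% for math and between 42% and 43% for science.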

Cross-Level Interactions Between Student and Education System Characteristics
Next, Tables 5 and 6 show the results from cross-level interactions between student- and country-level variables for math and science achievement, respectively. Each column shows cross-level interactions for gender (M3), immigrant status (M4), and SES (M5), respectively. With respect to gender, we found a negative and significant interaction with differentiation only for science achievement. This result suggests that girls' disadvantage in science achievement is greater in countries with higher levels of differentiation. With respect to immigration status, the results show a positive and significant interaction term between first-generation status and differentiation for science achievement. This suggests that immigrant students' disadvantage in science achievement is attenuated in countries with higher levels of differentiation. Furthermore, the interaction term between immigration status and GDP per capita is positive for both math and science achievement. These results suggest that math and science achievement gaps between native and immigrant students are smaller in countries with higher levels of economic development.
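To make the interpretation of a cross-level interaction concrete, the sketch below computes the student-level gap implied at different values of a country-level moderator. Because the moderators are standardized or grand-mean centered, a moderator value of zero corresponds to an average country. The coefficients are hypothetical and chosen only to show the direction of the effect; they are not estimates from the tables.

```python
def conditional_gap(main_effect, interaction, moderator):
    """Gap implied at a given value of a centered country-level moderator."""
    return main_effect + interaction * moderator

# Hypothetical values: a 5-point disadvantage in an average country and a
# negative interaction with the differentiation index.
main_effect, interaction = -5.0, -2.0
gaps = {z: conditional_gap(main_effect, interaction, z) for z in (-1.0, 0.0, 1.0)}
print(gaps)
```

A negative interaction widens an existing disadvantage as the moderator rises, whereas a positive interaction, as reported for first-generation students and differentiation in science, narrows it.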
Finally, with respect to SES, the results show a positive and significant interaction between SES and differentiation for both math and science achievement, suggesting that a higher level of differentiation disproportionally benefits higher SES students. No significant interaction was found between the level of standardization and SES. Further, we found a negative interaction between SES and government spending on education for both math and science achievement, which suggests that the disadvantage of low-SES students is attenuated in countries with higher levels of government spending on education. Finally, the interaction between SES and GDP per capita was positive and statistically significant only for science achievement, suggesting that the gap between low- and high-SES students is greater in wealthier countries.
RISE - International Journal of Sociology of Education, 9(2) 143

Discussion
Using data from the 2011 TIMSS for 45 countries, we examined the socioeconomic, gender and immigrant status related gaps in math and science achievement. We linked these gaps to the characteristics of education systems, such as the degree of differentiation, standardization, and the share of governmental spending on education. We found that overall higher SES is positively and significantly associated with higher math and science achievement; immigrant students lag behind their native peers in both math and science with first generation students performing worse; and girls show lower math performance while their science achievement is not significantly different from boys'. Not surprisingly, students in wealthier countries showed higher academic performance in both math and science. We found that a higher degree of differentiation makes socioeconomic gaps larger in both math and science achievement (i.e., in more rigidly differentiated systems low-SES students perform worse). Further, both first- and second-generation immigrant students' disadvantage in science achievement is attenuated in countries with higher levels of differentiation. Second-generation students also perform better in math in countries with more rigidly tracked systems. In addition, the achievement gaps between native and immigrant students in both math and science are smaller in countries with higher GDP. Moreover, a higher proportion of governmental spending on education reduces the disadvantage of low-SES students in both math and science.
Education systems are deeply embedded within the economic, political, social, and cultural contexts of their respective countries, making it rather hard to come up with specific policy recommendations that will be effective universally. That being said, our findings show that higher educational spending attenuates the disadvantage of low-SES students in both math and science, thus highlighting the importance of governmental investments in schools. Further, our investigation shows that rigid differentiation exacerbates SES-based educational inequality; thus, having more flexible opportunities for students to switch among more or less advanced course options (both within and across subjects) seems beneficial for these students. This description fits the comprehensive high school model that is prevalent in the United States. However, such a model can only be successful if advanced options are truly available for all students. It is critical that the advanced curriculum (International Baccalaureate programs and/or a large enough variety of Advanced Placement courses) be offered in all schools, including those in disadvantaged areas (rural and urban).
Although our findings show that differentiation may reduce the immigrant-native gaps, particularly in science, our study should not be viewed as a call for de-tracking across the board. Previous studies have shown that immigrants' expectations regarding how much education they will achieve (Chykina, 2019), as well as their eventual educational attainment (Griga & Hadjar, 2014), decrease in tracked education systems. Further, immigrants report feeling silenced and less comfortable speaking up in tracked classes and schools, even if placed in a higher track (Gibson & Carrasco, 2009). Our finding of overall disadvantage of immigrant students in both math and science calls for careful and thoughtful policy measures to support these students. Since a significant proportion of immigrant students come from lower socio-economic backgrounds, policies focused on additional investment in resources, both monetary and pedagogical, are clearly needed. Culturally sensitive and socially appropriate educational policies targeting immigrant students, especially first-generation students, will be the most successful in ensuring their brighter future in their new home countries.
Our study has several limitations. The main limitation is the cross-sectional nature of the analysis. By using the TIMSS data, we are unable to control for previous achievement or tease out the processes by which achievement is shaped over time. Second, as with any comparative international quantitative study, the results may hide important country-to-country differences and nuances in what it means to be a female, an immigrant student, or a student from a low socio-economic background. Yet, we believe that our findings are important in providing the overall picture of the relationships between individual student characteristics and academic performance, and of how these relationships vary by country educational context.

Notes
1. We conducted Wald tests to determine whether the coefficients for first- and second-generation students are statistically different from each other. We found that the differences are significant both for math (χ² = 59.3, p<0.001) and science (χ² = 71.3, p<0.001) achievement.