U.S. public schools are highly segregated by both race and class. Prior research shows that the desegregation of Southern schools in the 1960s and 1970s led to significant benefits for black students, including increased educational attainment and higher earnings. We do not know, however, whether segregation today has the same harmful effects as it did 50 years ago, nor do we have clear evidence about the mechanisms through which segregation affects achievement patterns. In this paper we estimate the effects of current-day school segregation on racial achievement gaps. We use 8 years of data from all public school districts in the U.S. We find that racial school segregation is strongly associated with the magnitude of achievement gaps in 3rd grade, and with the rate at which gaps grow from third to eighth grade. The association of racial segregation with achievement gaps is completely accounted for by racial differences in school poverty: racial segregation appears to be harmful because it concentrates minority students in high-poverty schools, which are, on average, less effective than lower-poverty schools. Finally, we conduct exploratory analyses to examine potential mechanisms through which differential enrollment in high-poverty schools leads to inequality. We find that the effects of school poverty do not appear to be explained by differences in the set of measurable teacher or school characteristics available to us.
This paper describes a method for pooling grouped, ordered-categorical data across multiple waves to improve small-sample heteroskedastic ordered probit (HETOP) estimates of latent distributional parameters. We illustrate the method with aggregate proficiency data reporting the number of students in schools or districts scoring in each of a small number of ordered “proficiency” levels. HETOP models can be used to estimate means and standard deviations of the underlying (latent) test score distributions, but may yield biased or very imprecise estimates when group sample sizes are small. A simulation study demonstrates that pooled HETOP models can reduce the bias and sampling error of standard deviation estimates when group sample sizes are small. An analysis of real test score data suggests the pooled models are likely to improve estimates in applied contexts.
Socioeconomic achievement gaps have long been a central focus of educational research. However, not much is known about how (and why) between-district gaps vary among states, even though states are a primary organizational level in the decentralized education system in the United States. Using data from the Stanford Education Data Archive (SEDA), this study describes state-level socioeconomic achievement gradients and the growth of these gradients from Grades 3 to 8. We also examine state-level correlates of the gradients and their growth, including school system funding equity, preschool enrollment patterns, the distribution of teachers, income inequality, and segregation. We find that socioeconomic gradients and their growth rates vary considerably among states, and that between-district income segregation is positively associated with the socioeconomic achievement gradient.
We estimate racial/ethnic achievement gaps in several hundred metropolitan areas and several thousand school districts in the United States using the results of roughly 200 million standardized math and reading tests administered to public school students from 2009 to 2013. We show that achievement gaps vary substantially, ranging from nearly 0 in some places to larger than 1.2 standard deviations in others. Economic, demographic, segregation and schooling characteristics explain roughly three-quarters of the geographic variation in these gaps. The strongest correlates of achievement gaps are local racial/ethnic differences in parental income, local average parental education levels, and patterns of racial/ethnic segregation, consistent with a theoretical model in which family socioeconomic factors affect educational opportunity partly through residential and school segregation patterns.
I use standardized test scores from roughly forty-five million students to describe the temporal structure of educational opportunity in more than eleven thousand school districts in the United States. Variation among school districts is considerable in both average third-grade scores and test score growth rates. The two measures are uncorrelated, indicating that the characteristics of communities that provide high levels of early childhood educational opportunity are not the same as those that provide high opportunities for growth from third to eighth grade. This suggests that the role of schools in shaping educational opportunity varies across school districts. Variation among districts in the two temporal opportunity dimensions implies that strategies to improve educational opportunity may need to target different age groups in different places.
Are public schools in the United States engines of mobility or agents of inequality? Can schools in low-income communities provide a pathway out of poverty, or are the constraints of poverty too great for schools to overcome? Such questions are at the heart of debates about the role of education in social mobility in the United States. Despite decades of research, however, we still lack clear answers.
In this article, I provide new evidence to inform these debates. It suggests that the lack of a clear answer to the question is explained in part by the substantial variation in the role of schooling in shaping educational opportunity across places. Early childhood conditions are more important in some places, educational opportunities during the elementary and middle school years more important in others.
In the first systematic study of gender achievement gaps in U.S. school districts, we estimate male-female test score gaps in math and English Language Arts (ELA) for nearly 10,000 school districts in the U.S. We use state accountability test data from third through eighth grade students in the 2008-09 through 2014-15 school years. The average school district in our sample has no gender achievement gap in math, but a gap of roughly 0.23 standard deviations in ELA that favors girls. Both math and ELA gender achievement gaps vary among school districts and are positively correlated – some districts have more male-favoring gaps and some more female-favoring gaps. We find that math gaps tend to favor males more in socioeconomically advantaged school districts and in districts with larger gender disparities in adult socioeconomic status. These two variables explain about one fifth of the variation in the math gaps. However, we find little or no association between the ELA gender gap and either socioeconomic variable, and we explain virtually none of the geographic variation in ELA gaps.
A data file with the gender achievement gap estimates produced in this paper is available from the Stanford Education Data Archive to users who sign the data use agreement.
This map displays Empirical Bayes estimates of the average achievement gaps in math and English language arts in nearly 10,000 U.S. public school districts. A gap of zero indicates that there is no achievement gap in that district. Negative gaps (shown in orange) indicate that female students score higher on average than male students in the district; positive achievement gaps (shown in blue) indicate that male students score higher on average than female students in the district. The gaps displayed are in standard deviation units; for reference, a third of a standard deviation gap is approximately a one grade level difference.
Prior research suggests that males outperform females, on average, on multiple-choice items
compared to their relative performance on constructed-response items. This paper
characterizes the extent to which gender achievement gaps on state accountability tests
across the United States are associated with those tests’ item formats. Using roughly eight
million fourth and eighth grade students’ scores on state assessments, we estimate state- and
district-level math and reading male-female achievement gaps. We find that the estimated
gaps are strongly associated with the proportions of the test scores based on multiple-choice
and constructed-response questions on state accountability tests, even when controlling for
gender achievement gaps as measured by the NAEP or NWEA MAP assessments, which have
the same item format across states. We find that test item format explains approximately 25
percent of the variation in gender achievement gaps among states.
This paper provides the first population-based evidence on how much standardized test
scores vary among public school districts within each state and how segregation explains that
variation. Using roughly 300 million standardized test score records in math and ELA for
grades 3 through 8 from every U.S. public school district during the 2008-09 to 2014-15 school
years, we estimate intraclass correlations (ICCs) as a measure of between-district variation.
We characterize the variation in the ICCs across states, as well as the patterns in the ICCs over
subjects, grades and cohorts. Further, we investigate the relationship between the ICCs and
measures of racial and socioeconomic segregation. We find that between-district variation is
greatest, on average, in states with high levels of both white-black and economic segregation.
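The intraclass correlation used above as a measure of between-district variation can be illustrated with a minimal sketch. It assumes the textbook one-way decomposition, ICC = between-district variance / (between-district variance + within-district variance), weighted by enrollment. The function name and the three-district inputs are hypothetical, and this is not the paper's estimator, which works from state test score records rather than known district means and variances.

```python
def intraclass_correlation(means, variances, sizes):
    """Share of total score variance that lies between groups (districts).

    means, variances, sizes: per-district mean score, within-district
    score variance, and number of students (all hypothetical inputs).
    """
    total_n = sum(sizes)
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / total_n
    # Enrollment-weighted variance of district means around the state mean.
    between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, sizes)) / total_n
    # Enrollment-weighted average of within-district variances.
    within = sum(n * v for v, n in zip(variances, sizes)) / total_n
    return between / (between + within)


# Hypothetical state with three equally sized districts.
icc = intraclass_correlation(
    means=[-0.5, 0.0, 0.5],
    variances=[0.9, 0.9, 0.9],
    sizes=[1000, 1000, 1000],
)
print(round(icc, 5))
```

An ICC near 0 would mean districts in a state have similar average scores; a larger ICC means more of the variation in scores lies between districts.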
A comparison of Chicago public school students’ standardized test scores in 2009-2014 with
those of public school students across the U.S. reveals two striking patterns. First, Chicago students’
scores improved dramatically more, on average, between third and eighth grade than those
of the average student in the U.S. This is true for students of all racial/ethnic groups. The
average Chicago student’s test scores improved by roughly 6 grade-level equivalents in the 5
years from third to eighth grade. Second, at each grade level in grades three through eight,
Chicago students’ scores improved more from 2009 to 2014 than did the average scores of all
students in the U.S. Test scores rose in Chicago by roughly two-thirds of a grade level from
2009 to 2014, compared to an increase of one-sixth of a grade level nationally. Again, this was
equally true for black, Hispanic, and white students. These patterns do not appear to result
from increasingly test-aligned instruction or from changing city demographics and enrollment
Nonmedical exemptions from school-entry vaccine mandates are receiving increased policy
and public health scrutiny. This paper examines how expanding the availability of exemptions
influences vaccination rates in early childhood and academic achievement in middle school. We
leverage 2003 legislation that granted personal belief exemptions (PBE) in Texas and Arkansas,
two states that previously allowed exemptions only for medical or religious reasons. We find that
PBE decreased vaccination coverage among black and low-income preschoolers by 16.1% and
8.3%, respectively. Furthermore, we find that those cohorts affected by the policy change in
early childhood performed less well on standardized tests of academic achievement in middle
school. Estimated effects on mathematics and English Language Arts test scores were largest
for black students, especially those residing in economically disadvantaged counties.
The school meals program is the largest nutritional assistance program for school-aged children.
Whereas program eligibility was historically determined by family income, recent reforms allow
schools to offer free meals to all students. This paper evaluates the effect of the Community
Eligibility Provision (CEP), the largest schoolwide free meals program, on academic performance. I
leverage within- and across-state variation in the timing of CEP participation and find that universal
free meals increase breakfast and lunch participation by 38 and 12 percent, respectively.
Math performance improves in districts with baseline low free meal eligibility, particularly
among racial/ethnic groups with low income-based participation rates.
We conduct an online survey experiment in which participants are asked to imagine that they are
parents moving to a new metropolitan area. They then choose between the five largest school
districts in that area. All participants receive demographic data for each district. In addition,
some participants are randomly assigned to receive average achievement and/or average growth
data for each district. While there are strong relationships between student demographics and
student achievement, the links between student demographics and student growth are much weaker.
We find that, on average, the provision of growth data causes participants to choose less
white and less affluent districts. Moreover, the provision of both achievement and growth data
causes participants to choose less white and less affluent districts than the provision of achievement data alone.
There is growing interest in the relation between the racial achievement gap and the racial discipline gap.
However, few studies have examined this relation at the national level. This study combines data from the Stanford
Education Data Archive and the Civil Rights Data Collection and employs a district fixed effects analysis to examine
whether and the extent to which racial discipline gaps are related to racial achievement gaps in Grades 3 through 8
in districts across the United States. In bivariate models, we find evidence that districts with larger racial
discipline gaps have larger racial achievement gaps (and vice versa). Though other district-level differences
account for the positive association between the Hispanic-White discipline gap and the Hispanic-White achievement
gap, we find robust evidence that the positive association between the Black-White discipline gap and the Black-White
achievement gap persists after controlling for a multitude of confounding factors. We also find evidence that the
mechanisms connecting achievement to disciplinary outcomes are more salient for Black than White students.
This study investigates the effect of violent crime on school district-level achievement in English Language Arts (ELA) and Mathematics. The research design exploits geographic variation in achievement and crime across 337 school districts and temporal variation across
seven birth cohorts of children born between 1996 and 2002. To generate causal estimates of
the effect of crime on achievement, the identification strategy leverages exogenous shocks to
crime rates arising from the availability of federal funds to hire police officers in the local
police departments where the school districts operate. Results show that birth cohorts who
entered the school system when violent crime was lower score higher in ELA by the end of
eighth grade, relative to birth cohorts attending schools in the same district but who entered
the school system when crime rates were substantially higher. A 10 percent decline in violent
crime raises eighth-grade ELA achievement in the district by 0.04 standard deviations. Analyses
by race/ethnicity and gender indicate that black children, Hispanic children, and boys
experienced the largest gains in ELA achievement as violent crime dropped. The effects on
Mathematics achievement are smaller and imprecisely estimated. These findings extend our
understanding of the geography of educational opportunity in the United States and reinforce
the idea that understanding inequalities in academic achievement requires evidence on what
happens inside schools as well as what happens outside of schools.
This study examines whether county-level estimates of implicit bias predict black-white test score gaps in county schools. Data from over 1 million respondents from across the United States who completed an online version of the Race Implicit Association Test (IAT) were combined with data from the Stanford Education Data Archive covering over 300 million test scores from U.S. schoolchildren in grades 3 through 8. In both bivariate and multivariate models, counties with higher levels of racial bias had larger black-white test score disparities. This relationship was primarily explained by sorting mechanisms: The black-white test score gap was larger in counties with higher levels of implicit bias because these counties’ schools were more racially segregated and were characterized by larger racial gaps in gifted and talented assignment as well as special education placement.
Over the past decade, U.S. immigration enforcement policies have increasingly targeted unauthorized immigrants residing in the U.S. interior, many of whom are the parents of U.S.-citizen children. Heightened immigration enforcement may affect student achievement through stress, income effects, or student mobility. I use one immigration enforcement policy, Secure Communities, to examine this relationship. I use the staggered activation of Secure Communities across counties to measure its relationship with average achievement for Hispanic students, as well as non-Hispanic black and white students. I find that the activation of Secure Communities was associated with decreases in average achievement for Hispanic students in English Language Arts (ELA), as well as black students in ELA and math. Similarly, I find that increases in removals are associated with decreases in achievement for Hispanic and black students. I note that the timing of rollout is potentially correlated with other county trends affecting results.
In this paper we compare two approaches to measuring the average rate at which students learn in a given school or district. One type of measure—longitudinal growth measures—relies on student-level longitudinal data. A second type—cohort growth measures—relies only on repeated aggregated, cross-sectional data. Because student-level data is often not readily available, cohort growth measures are sometimes the only type available. The estimated school and district learning rates reported in the Stanford Education Data Archive (SEDA), for example, are cohort growth measures based on aggregated data. Understanding how much researchers and policymakers can rely on these cohort growth estimates requires one to know how well, and under what conditions, the estimates obtained from this approach align with those based on longitudinal data. In this report we address these questions.
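The idea behind a cohort growth measure can be sketched as follows. The example assumes that average scores are available by year and grade, and that a cohort is followed by pairing the cell for grade g in year t with the cell for grade g+1 in year t+1. The function name and the scores are hypothetical, and the raw averaging of differences is a simplification for illustration, not the estimation procedure used in SEDA.

```python
def cohort_growth(mean_scores):
    """Average one-year gain within cohorts from aggregated data.

    mean_scores: dict mapping (year, grade) to the average score of the
    students tested in that grade and year (hypothetical inputs).
    """
    gains = []
    for (year, grade), score in mean_scores.items():
        # The same cohort is observed one grade higher in the following year.
        next_score = mean_scores.get((year + 1, grade + 1))
        if next_score is not None:
            gains.append(next_score - score)
    if not gains:
        raise ValueError("no cohort is observed in two consecutive years")
    return sum(gains) / len(gains)


# Hypothetical district averages on a grade-equivalent scale.
scores = {
    (2009, 3): 3.0, (2010, 4): 4.1, (2011, 5): 5.3,
    (2009, 4): 3.9, (2010, 5): 5.0,
}
growth = cohort_growth(scores)
print(round(growth, 3))
```

Because no individual student is tracked, a measure of this kind can differ from a longitudinal measure when students move in or out of the district between years.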
The Stanford Education Data Archive (SEDA) is part of the Educational Opportunity Project at Stanford University (https://edopportunity.org), an initiative aimed at harnessing data to help scholars, policymakers, educators, and parents learn how to improve educational opportunities for all children. SEDA includes a range of detailed data on educational conditions, contexts, and outcomes in schools, school districts, counties, commuting zones, and metropolitan statistical areas across the United States. Available measures differ by aggregation; see Sections I.A. and I.B. for a complete list of files and data.
By making the data files available to the public, we hope that anyone who is interested can obtain detailed information about U.S. schools, communities, and student success. We hope that researchers will use these data to generate evidence about what policies and contexts are most effective at increasing educational opportunity, and that such evidence will inform educational policy and practices.
Linking score scales across different tests is considered speculative and fraught, even at the
aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation
methods for aggregate linkages, using the challenge of linking U.S. school district average
test scores across states as a motivating example. We show that aggregate linkages can be
validated both directly and indirectly under certain conditions, such as when the scores for at
least some target units (districts) are available on a common test (e.g., the National
Assessment of Educational Progress). We introduce precision-adjusted random effects
models to estimate linking error, for populations and for subpopulations, for averages and for
progress over time. These models allow us to distinguish linking error from sampling
variability and illustrate how linking error plays a larger role in aggregates with smaller sample
sizes. Assuming that target districts generalize to the full population of districts, we can show
that standard errors for district means are generally less than 0.2 standard deviation units,
leading to reliabilities above 0.7 for roughly 90% of districts. We also show how sources of
imprecision and linking error contribute to both within- and between-state district
comparisons. This approach is applicable whenever the essential
counterfactual question—“what would means/variance/progress for the aggregate units be,
had students taken the other test?”—can be answered directly for at least some of the units.
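The link between standard errors and reliabilities mentioned above can be illustrated with a short sketch. It assumes the conventional definition of reliability as true variance divided by true variance plus error variance; the abstract does not state its formula, so this definition is an assumption. The true standard deviation of district means used here (0.35) is a hypothetical value chosen for illustration, not a figure from the paper.

```python
def reliability(true_sd, standard_error):
    """Conventional reliability: true variance / (true + error variance)."""
    true_var = true_sd ** 2
    return true_var / (true_var + standard_error ** 2)


# Hypothetical: district means vary with a true SD of 0.35, and a given
# district mean is estimated with a standard error of 0.2.
r = reliability(true_sd=0.35, standard_error=0.2)
print(round(r, 3))
```

Under this definition, a smaller standard error or a wider spread of true district means both push reliability upward.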
Test score distributions of schools or demographic groups are often summarized by
frequencies of students scoring in a small number of ordered proficiency categories. We show
that heteroskedastic ordered probit (HETOP) models can be used to estimate means and
standard deviations of multiple groups’ test score distributions from such data. Because the
scale of HETOP estimates is indeterminate up to a linear transformation, we develop formulas
for converting the HETOP parameter estimates and their standard errors to a scale in which
the population distribution of scores is standardized. We demonstrate and evaluate this novel
application of the HETOP model with a simulation study and using real test score data from
two sources. We find that the HETOP model produces unbiased estimates of group means
and standard deviations, except when group sample sizes are small. In such cases, we
demonstrate that a “partially heteroskedastic” ordered probit (PHOP) model can produce
estimates with a smaller root mean squared error than the fully heteroskedastic model.
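The intuition for recovering a mean and standard deviation from proficiency counts can be sketched for a single group. The example assumes that scores are normally distributed and that the cutscores are already known on the scale of interest, so that the probit of each cumulative proportion equals (cutscore - mean) / SD. This is a deliberately simplified illustration, not the HETOP model itself, which estimates cutscores and many groups' parameters jointly by maximum likelihood. The function name and inputs are hypothetical.

```python
from statistics import NormalDist


def latent_mean_sd(counts, cutscores):
    """Recover (mean, sd) of a normal latent score from category counts.

    counts: students in each ordered category, lowest first.
    cutscores: assumed-known thresholds between adjacent categories.
    Requires every cumulative proportion to lie strictly between 0 and 1.
    """
    if len(counts) != len(cutscores) + 1 or len(cutscores) < 2:
        raise ValueError("need K >= 2 cutscores and K + 1 category counts")
    total = sum(counts)
    standard_normal = NormalDist()
    probits, below = [], 0
    for count in counts[:-1]:
        below += count
        probits.append(standard_normal.inv_cdf(below / total))
    # Least-squares line probit = intercept + slope * cutscore,
    # where slope = 1 / sd and intercept = -mean / sd.
    k = len(cutscores)
    c_bar = sum(cutscores) / k
    z_bar = sum(probits) / k
    slope = sum((c - c_bar) * (z - z_bar) for c, z in zip(cutscores, probits)) / sum(
        (c - c_bar) ** 2 for c in cutscores
    )
    intercept = z_bar - slope * c_bar
    return -intercept / slope, 1 / slope


# Build counts from a known distribution so the recovery can be checked.
true_dist = NormalDist(mu=0.3, sigma=0.8)
cuts = [-1.0, 0.0, 1.0]
cdfs = [true_dist.cdf(c) for c in cuts]
shares = [cdfs[0], cdfs[1] - cdfs[0], cdfs[2] - cdfs[1], 1 - cdfs[2]]
counts = [round(share * 100000) for share in shares]
mean_hat, sd_hat = latent_mean_sd(counts, cuts)
print(round(mean_hat, 2), round(sd_hat, 2))
```

In a small group, one or more categories may be empty, so a cumulative proportion can equal 0 or 1 and the probit is undefined. This offers some intuition for why estimates from small groups can be imprecise.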
Ho and Reardon (2012) present methods for estimating achievement gaps when test scores are
coarsened into a small number of ordered categories, preventing fine-grained distinctions between
individual scores. They demonstrate that gaps can nonetheless be estimated with minimal bias across a
broad range of simulated and real coarsened data scenarios. In this paper, we extend this previous work
to obtain practical estimates of the imprecision imparted by the coarsening process and of the bias
imparted by measurement error. In the first part of the paper, we derive standard error estimates and
demonstrate that coarsening leads to only very modest increases in standard errors under a wide range
of conditions. In the second part of the paper, we describe and evaluate a practical method for
disattenuating gap estimates to account for bias due to measurement error.
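Disattenuation can be illustrated with the standard classical test theory correction. The sketch assumes that the gap is expressed in pooled standard deviation units and that both groups' scores have the same reliability, in which case measurement error shrinks the observed gap by a factor of the square root of the reliability. The function name and the numbers are hypothetical, and the paper's method may differ in its details.

```python
from math import sqrt


def disattenuate_gap(observed_gap, test_reliability):
    """Correct a standardized gap for attenuation due to measurement error.

    Assumes classical test theory: observed SD = true SD / sqrt(reliability),
    so the gap in true-score SD units is observed_gap / sqrt(reliability).
    """
    if not 0 < test_reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return observed_gap / sqrt(test_reliability)


# Hypothetical: an observed gap of 0.80 SD on a test with reliability 0.90.
corrected = disattenuate_gap(0.80, 0.90)
print(round(corrected, 3))
```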
Test scores are commonly reported in a small number of ordered categories. These contexts
include state accountability testing, Advanced Placement tests, and English proficiency tests.
This paper introduces and evaluates methods for estimating achievement gaps on a familiar
standard-deviation-unit metric using data from these ordered categories alone. These methods
hold two practical advantages over alternative achievement gap metrics. First, they require only
categorical proficiency data, which are often available where means and standard deviations are
not. Second, they result in gap estimates that are invariant to score scale transformations,
providing a stronger basis for achievement gap comparisons over time and across jurisdictions.
We find three candidate estimation methods that recover full-distribution gap estimates well
when only censored data are available.
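A gap of this kind can be sketched from category counts alone. The example assumes the gap is defined as V = sqrt(2) * Phi^{-1}(P(a > b)), where P(a > b) is the probability that a randomly chosen student from group a outscores one from group b, and it approximates that probability by counting ties within a category as one half. This simple approximation is chosen here for illustration and is not claimed to be one of the paper's three candidate methods. The function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def v_gap(counts_a, counts_b):
    """Standard-deviation-unit gap (a minus b) from ordered category counts.

    counts_a, counts_b: students per category, lowest category first.
    Students in the same category are treated as tied (counted as 0.5).
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    b_below = 0  # group-b students in strictly lower categories
    prob_a_higher = 0.0
    for count_a, count_b in zip(counts_a, counts_b):
        prob_a_higher += (count_a / n_a) * ((b_below + 0.5 * count_b) / n_b)
        b_below += count_b
    return sqrt(2) * NormalDist().inv_cdf(prob_a_higher)


# Hypothetical counts in four proficiency levels for two groups.
group_a = [10, 20, 40, 30]
group_b = [30, 40, 20, 10]
gap = v_gap(group_a, group_b)
print(round(gap, 2))
```

Because the calculation uses only the ordering of the categories, relabeling or monotonically rescaling the underlying scores leaves the result unchanged, which is the invariance property noted above.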
We use data from multiple national surveys to describe trends in private elementary school
enrollment by family income from 1968 to 2013. We note several important trends. First, the
private school enrollment rate of middle-income families declined substantially over the last
five decades, while that of high-income families remained quite stable. Second, there are
notable differences in private school enrollment trends by race/ethnicity, urbanicity, and
region of the country. Although racial/ethnic differences in private school enrollment are
largely explained by income differences, the urban/suburban and regional differences in
private school enrollment patterns are large even among families with similar incomes.
Factors contributing to these patterns may include trends in income inequality, private school
costs and availability, and the perceived relative quality of local schooling options.
Although trends in the racial segregation of schools are well documented, less is known
about trends in income segregation. We use multiple data sources to document trends in
income segregation between schools and school districts. Between-district income
segregation of families with children enrolled in public school increased by over 15% from
1990 to 2010. Within large districts, between-school segregation of students who are eligible
and ineligible for free lunch increased by over 40% from 1991 to 2012. Consistent with research
on neighborhood segregation, we find that rising income inequality contributed to the rise in
income segregation between schools and districts during this period. The rise in income
segregation between both schools and districts may have serious implications for inequality
in students’ access to resources that bear on academic achievement.
Since the Supreme Court’s 1954 Brown v. Board of Education decision, researchers and policymakers have paid close attention to trends in school segregation. While Brown focused on black-white segregation, we review the evidence regarding trends and consequences of both racial and economic school segregation. In general, the evidence regarding trends in racial segregation suggests that the most significant declines in black-white school segregation occurred at the end of the 1960s and the start of the 1970s. Although there is disagreement about the direction of more recent trends in racial segregation, this disagreement is largely driven by different definitions of segregation and different ways of measuring it. We conclude that the changes in segregation in the last few decades are not large, regardless of what measure is used, though there are important differences in the trends across regions, racial groups, and institutional levels. Limited evidence on school economic segregation makes documenting trends difficult, but in general, students are more segregated by income across schools and districts today than in 1990. We also discuss the role of desegregation litigation, demographic changes, and residential segregation in shaping trends in both racial and economic segregation.
One of the reasons that scholars, policymakers, and citizens are concerned with school segregation is that segregation is hypothesized to exacerbate racial or socioeconomic disparities in educational success. The mechanisms that would link segregation to disparate outcomes have not often been spelled out clearly or tested explicitly. We develop a general conceptual model of how and why school segregation might affect students and review the relatively thin body of empirical evidence that explicitly assesses the consequences of school segregation. This literature suggests that racial desegregation in the 1960s and 1970s was beneficial to blacks; evidence of the effects of segregation in more recent decades, however, is mixed or inconclusive. We conclude with discussion of aspects of school segregation on which further research is needed.
In this paper we investigate whether the school desegregation produced by court-ordered desegregation plans persists when school districts are released from court oversight.
Over 200 medium-sized and large districts were released from desegregation court orders
from 1991 to 2009. We find that racial school segregation in these districts increased gradually
following release from court order, relative to the trends in segregation in districts remaining
under court order. These increases are more pronounced in the South, in elementary grades,
and in districts where pre-release school segregation levels were low. These results suggest
that court-ordered desegregation plans are effective in reducing racial school segregation, but
that their effects fade over time in the absence of continued court oversight.
Almost fifty years ago, in 1966, the Coleman Report famously highlighted the relationship between family socioeconomic status and student achievement. Family socioeconomic characteristics continue to be among the strongest predictors of student achievement, but while there is a considerable body of research that seeks to tease apart this relationship, the causes and mechanisms of this relationship have been the subject of considerable disagreement and debate. Much of the scholarly research on the socioeconomic achievement gradient has focused largely on trying to understand the mechanisms through which factors like income, parental educational attainment, family structure, neighborhood conditions, school quality, as well as parental preferences, investments, and choices lead to differences in children’s academic and educational success. Still, we know little about the trends in socioeconomic achievement gaps over a lengthy period of time.
The question posed in this article is whether and how the relationship between family socioeconomic characteristics and academic achievement has changed during the last fifty years, with a particular focus on rising income inequality. As the income gap between high- and low-income families has widened, has the achievement gap between children in high- and low-income families also widened? The answer, in brief, is yes. The achievement gap between children from high- and low-income families is roughly 40 percent larger among children born in 2001 than among those born twenty-five years earlier.