Data SGP and Student Background Variables

Data SGP is a software package that allows researchers to construct and analyze longitudinal student assessment data sets. It provides a common format for representing students' time-dependent data. In the WIDE format, each row represents a single student and the columns hold the variables associated with that student at different times; in the LONG format, each row represents a single student at a single time. The package includes exemplar WIDE and LONG data sets (sgpData and sgpData_LONG, respectively) to help users set up their own longitudinal data sets.
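To make the two layouts concrete, here is a minimal Python sketch of reshaping wide records into long records. It is an illustration only, not the package's own code: the column names (SS_2022, SS_2023, SCALE_SCORE) and the sample values are hypothetical.

```python
# Hypothetical wide-format rows: one row per student, one score column per year.
wide_rows = [
    {"ID": 1001, "SS_2022": 410, "SS_2023": 432},
    {"ID": 1002, "SS_2022": 388, "SS_2023": None},  # missing 2023 score
]


def wide_to_long(rows, prefix="SS_"):
    """Turn one-row-per-student records into one-row-per-student-per-year records."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Only the time-varying score columns are unpivoted; missing scores are skipped.
            if column.startswith(prefix) and value is not None:
                long_rows.append({
                    "ID": row["ID"],
                    "YEAR": int(column[len(prefix):]),
                    "SCALE_SCORE": value,
                })
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

In the long layout a missing assessment is simply an absent row, whereas the wide layout needs an empty cell for it.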

It is well known that aggregated student growth percentiles (SGPs) are highly correlated with student background variables. This correlation is problematic because it can obscure teacher effects and may lead to unwarranted conclusions about teacher effectiveness. It can also lead to unsupported claims about the value of SGPs in evaluating educational programs and individuals.

One of the purported benefits of SGPs is that by ranking students against others with similar prior achievement levels, they provide a fairer assessment of student progress than simply examining unadjusted test scores. This claim has led to the increased use of SGPs in evaluating teachers and schools. In order to understand the validity of this claim, it is necessary to study the relationship between SGPs and student background variables.
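The idea of ranking students against peers with similar prior achievement can be illustrated with a deliberately simplified Python sketch. The SGP package estimates these conditional percentiles with quantile regression. The sketch below swaps in a much cruder technique: it bins students by prior score and takes a mid-rank percentile within each bin. The function name, the band width, and the data are all assumptions made for illustration.

```python
from collections import defaultdict


def simple_growth_percentiles(students, band_width=50):
    """Percentile rank of each current score among peers in the same prior-score band.

    students: list of (student_id, prior_score, current_score) tuples.
    Returns {student_id: percentile clamped to 1..99}.
    """
    bands = defaultdict(list)
    for student_id, prior, current in students:
        # Students whose prior scores fall in the same band are treated as peers.
        bands[prior // band_width].append((student_id, current))

    percentiles = {}
    for peers in bands.values():
        scores = [score for _, score in peers]
        for student_id, score in peers:
            below = sum(s < score for s in scores)
            ties = sum(s == score for s in scores)
            # Mid-rank percentile within the peer group.
            pct = round(100 * (below + 0.5 * ties) / len(scores))
            percentiles[student_id] = min(99, max(1, pct))
    return percentiles


# Hypothetical data: two students with low prior scores, two with high prior scores.
students = [
    ("A", 310, 340), ("B", 320, 360),
    ("C", 510, 520), ("D", 520, 515),
]
result = simple_growth_percentiles(students)
print(result)
```

Students B and C receive the same growth percentile even though their raw scores differ by 160 points, because each is compared only with peers who started from a similar level.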

In this paper, we use a panel data set of 8 windows (3 windows annually) of student assessment data to examine the relationship between SGPs and a variety of student background characteristics. In particular, we look at the degree to which SGPs correlate with student gender, ethnicity, family income, and parents’ education level. We also examine the extent to which SGPs correlate with classroom teacher characteristics.

We find that the relationships between SGPs and student background variables are much more pronounced than commonly believed. Moreover, we find that the results would be similar even if tests had no measurement error. This result has important implications for the ability of alternative estimation methods to improve the fairness of SGP measures.

The sgpData dataset is an anonymized, downloadable panel data set of student assessment results in long format, covering 8 windows (3 windows annually) across 3 content areas (Early Literacy, Math, and Reading). Each row contains the unique student identification number, the grade level at which the student was assessed each year, and the scale score assigned to the student that year. The first five columns, which include ID, sgptData_INSTRUCTORNUMBER, YEAR, and sgpdata_INSTRUCTOR_YEAR, are demographic and student categorization variables that the summarySGP function uses to create student aggregates for each of the content areas.
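The kind of student aggregate described above can be sketched in Python as a grouped median. This is not the summarySGP implementation and does not reproduce its interface. The field names (INSTRUCTOR, SGP), the helper function, and the records are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical long-format records; the field names are illustrative only.
records = [
    {"ID": 1, "YEAR": 2023, "INSTRUCTOR": "T01", "SGP": 35},
    {"ID": 2, "YEAR": 2023, "INSTRUCTOR": "T01", "SGP": 60},
    {"ID": 3, "YEAR": 2023, "INSTRUCTOR": "T01", "SGP": 71},
    {"ID": 4, "YEAR": 2023, "INSTRUCTOR": "T02", "SGP": 48},
    {"ID": 5, "YEAR": 2023, "INSTRUCTOR": "T02", "SGP": 52},
]


def median_sgp_by_group(rows, keys=("INSTRUCTOR", "YEAR")):
    """Group rows by the given keys and report the median SGP and group size."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["SGP"])
    return {
        group: {"MEDIAN_SGP": median(values), "COUNT": len(values)}
        for group, values in groups.items()
    }


summary = median_sgp_by_group(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
```

Reporting the group size next to the median makes it easier to spot aggregates that rest on only a handful of students.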