###### How can you evaluate the impact of a teaching initiative?

###### Is this year's cohort doing better than previous years' cohorts?

###### Do students in one demographic do better than others?

###### What data do you need, and how should you analyse them?

##### Sound Data analysis dashboards use the quasi-experimental methods of educational research and your assessment results to help schools answer these important questions.

Our dashboards use appropriate statistical tests to check whether the assessment results of different cohorts differ significantly, estimate how large those differences are, and present the findings in a teacher-friendly way. Sound Data dashboards support valid analysis and conclusions.

Comparing raw averages or percentages is just not good enough. Statistical tests give us powerful ways of describing the differences between the distributions of various cohorts.

The tests for significant differences are the most important, and their result is summarised by the p-value (p). In teacher speak, the p-value is the probability of seeing a difference at least as large as the one we found if only random variation were at work (e.g., some year groups are more able than others). When the p-value falls below a critical value (e.g., 5% or 1%), random variation becomes an unlikely explanation, and we can claim that there are significant or highly significant differences between the outcomes of our initiative cohort and the rest. For example, if the p-value for an improvement in grades is one percent, chance alone would rarely produce an improvement that large, so you can be fairly confident that your initiative has made a difference!

The tests that establish how large a difference is are also important, and their result is summarised by the effect size (e.g. R or D) or the strength of an association (e.g. G). In teacher speak, the effect size tells you how large the difference is relative to the spread of the assessment results. When the effect size passes certain thresholds (e.g., 0.1, 0.3, 0.5), we can describe the difference as small, moderate, or large.
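To make the p-value and effect size concrete, here is a minimal sketch of one test that could sit behind such a comparison. It assumes a Mann-Whitney U test with a normal approximation and the effect size r = |Z| / √N; the text above does not name the tests the dashboards actually use, and the function name and scores are hypothetical.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (u_a, z, p_value, effect_size_r). Illustrative only: no
    continuity correction, and the approximation is rough for tiny samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    # Pool the scores, tagging each with its group (0 = a, 1 = b).
    pooled = sorted([(x, 0) for x in sample_a] + [(x, 1) for x in sample_b])

    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold tied scores; they share the average rank.
        avg_rank = (i + 1 + j) / 2
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every score identical: no evidence of any difference
        return u_a, 0.0, 1.0, 0.0

    z = (u_a - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    effect_size_r = abs(z) / math.sqrt(n)
    return u_a, z, p_value, effect_size_r


# Hypothetical scale scores, invented for illustration.
initiative_cohort = [52, 58, 61, 63, 66, 70, 71, 75]
previous_cohorts = [45, 48, 50, 53, 55, 57, 60, 62, 64, 68]

u, z, p, r = mann_whitney_u(initiative_cohort, previous_cohorts)
print(f"U = {u:.1f}, Z = {z:.2f}, p = {p:.3f}, r = {r:.2f}")
```

A rank-based test is used here because it compares whole distributions without assuming the scores are normally distributed; a t-test with Cohen's d would be the usual alternative when that assumption is reasonable.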

Sound Data teacher-friendly dashboards not only present the numerical test results but also provide common-language interpretations, so all educators can quickly learn to extract meaning. When there are significant differences (the p-value is less than or equal to 0.05, or 5%), the common-language interpretation is highlighted in pink.
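As a rough sketch of how a numerical result might be turned into a common-language interpretation, the hypothetical helper below applies the 5% and 1% critical values and the 0.1, 0.3 and 0.5 effect-size thresholds mentioned in the text. The wording of the sentences, the "negligible" label and the function name are assumptions, not the dashboards' actual output.

```python
def interpret(p_value, effect_size, alpha=0.05):
    """Return (sentence, highlight) for a single cohort comparison.

    `highlight` is True when p <= alpha, mirroring the pink highlighting
    described in the text. Labels and phrasing are illustrative assumptions.
    """
    if effect_size >= 0.5:
        size = "large"
    elif effect_size >= 0.3:
        size = "moderate"
    elif effect_size >= 0.1:
        size = "small"
    else:
        size = "negligible"  # assumed label for effects below 0.1

    highlight = p_value <= alpha
    if p_value <= 0.01:
        sentence = f"There is a highly significant, {size} difference (p = {p_value:.3f})."
    elif highlight:
        sentence = f"There is a significant, {size} difference (p = {p_value:.3f})."
    else:
        sentence = f"There is no significant difference (p = {p_value:.3f})."
    return sentence, highlight


for p, r in [(0.004, 0.55), (0.03, 0.25), (0.40, 0.05)]:
    text, pink = interpret(p, r)
    print(("[pink] " if pink else "       ") + text)
```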

Different comparisons require different methods. Comparing the results of this year’s students with those of previous years is called a longitudinal comparison (e.g. 2022 with 2012–2021). Comparing the results of different demographics is called a cross-sectional comparison (e.g. female/male, Maori/non-Maori, Pacifica/non-Pacifica, Asian/non-Asian, Pakeha/non-Pakeha). If you have results from a single point in time (e.g. NCEA data, entrance data), you can compare the achievement of different cohorts. If you have results from two points in time (e.g. PAT results from Term 1 and Term 4), you can compare the progress or learning of different cohorts. Your unit of analysis could be the subject, curriculum area, year level, room and so on.

Sound Data dashboards are bespoke and co-constructed with school leaders. The displays and statistical tests are determined by each school’s research questions and the data they want to analyse. This data can include: NCEA grades, credits and qualifications; PAT, JAM, GloSS, IKAN, Probe, Running Records, e-asTTle, curriculum levels; overall teacher judgements; surveys of student and parent voice. Here are some examples of Sound Data teacher-friendly dashboards.

The NCEA Subject Dashboard analyses the NCEA data for different subjects. This demonstration dashboard looks at Level 1 English, Mathematics and Science. Achievement data includes: Grades – the total combined grade distribution for all standards, externals and internals, as well as for individual achievement and unit standards; Credits – internal, external, total and 14+; Course endorsements. Longitudinal comparisons are carried out by comparing the 2023 cohort with the combined previous years’ cohort (2012-2022). Cross-sectional comparisons are carried out for Gender, Maori, Pacifica, Asian, MELAA and Pakeha.

The NCEA Level Qualifications Dashboard analyses the NCEA qualifications gained by Year 11, 12 and 13 students. Qualifications include NCEA Level 1, 2 and 3 and University Entrance.

The NZCER Progress Dashboard analyses the progress students make during the year by comparing their scale score results in Term 1 and Term 4. This demonstration dashboard looks at PAT Comprehension and PAT Mathematics data. Longitudinal comparisons are carried out by comparing the 2022 cohort with the combined previous years’ cohort (2019-2021 for Comprehension and 2017-2021 for Mathematics). Cross-sectional comparisons are carried out for Gender, Maori, Pacifica, Asian, MELAA and Pakeha. Analyses are also carried out to establish whether learning has taken place by comparing the Term 4 distributions with the Term 1 distributions for each cohort. These analyses are carried out for all students as well as for individual year groups.
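One simple way to check whether learning has taken place between Term 1 and Term 4 is a paired test on each student's two scores. The sketch below assumes an exact two-sided sign test, chosen only because it is short and needs no distributional assumptions; the dashboards' actual test is not stated, and a Wilcoxon signed-rank test would be a common, more sensitive alternative. The scores and function name are hypothetical.

```python
import math


def sign_test(term1_scores, term4_scores):
    """Exact two-sided sign test on paired scores.

    Returns (n_improved, n_declined, p_value). Students whose score did
    not change are dropped, as is conventional for the sign test.
    """
    if len(term1_scores) != len(term4_scores):
        raise ValueError("Each student needs both a Term 1 and a Term 4 score.")

    gains = [t4 - t1 for t1, t4 in zip(term1_scores, term4_scores)]
    n_improved = sum(g > 0 for g in gains)
    n_declined = sum(g < 0 for g in gains)
    n = n_improved + n_declined
    if n == 0:
        return n_improved, n_declined, 1.0

    # Under "no learning", improving or declining is a fair coin flip.
    k = min(n_improved, n_declined)
    one_tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * one_tail)
    return n_improved, n_declined, p_value


# Hypothetical PAT scale scores for the same ten students.
term1 = [38, 41, 44, 45, 47, 50, 52, 53, 55, 58]
term4 = [42, 40, 49, 51, 47, 56, 57, 59, 61, 63]

improved, declined, p = sign_test(term1, term4)
print(f"{improved} improved, {declined} declined, p = {p:.3f}")
```

The same per-student gains (Term 4 minus Term 1) could then be compared across cohorts to test whether one group made more progress than another.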

The NZCER Term 1 Dashboard analyses achievement at the start of the year to see whether the students’ prior knowledge is significantly different from that of previous years, which may call for a change in your curriculum. This demonstration dashboard looks at PAT Comprehension and PAT Mathematics data. Longitudinal comparisons are carried out by comparing the 2022 cohort with the combined previous years’ cohort (2018-2021). Cross-sectional comparisons are carried out for Gender, Maori, Pacifica, Asian, MELAA and Pakeha.

The e-asTTle Progress Dashboard analyses the progress students make during the year by comparing their scale score results in Term 1 and Term 4. This demonstration dashboard looks at e-asTTle Writing data. Longitudinal comparisons are carried out by comparing the 2023 cohort with the combined previous years’ cohort (2012-2022). Cross-sectional comparisons are carried out for Gender, Maori, Pacifica, Asian, MELAA and Pakeha. Analyses are also carried out to establish whether learning has taken place by comparing the Term 4 distributions with the Term 1 distributions for each cohort. These analyses are carried out for all students as well as for individual year groups. The progress of each room is compared with the school results.

The e-asTTle Term 1 Dashboard analyses achievement at the start of the year to see whether the students’ prior knowledge is significantly different from that of previous years, which may call for a change in your curriculum. This demonstration dashboard looks at e-asTTle Writing data. Longitudinal comparisons are carried out by comparing the 2023 cohort with the combined previous years’ cohort (2012-2022). Cross-sectional comparisons are carried out for Gender, Maori, Pacifica, Asian, MELAA and Pakeha.

The Primary Stage Assessments Progress Dashboard incorporates assessments such as GloSS, JAM and Reading Age-Level Records, and analyses the progress students make during the year by comparing their assessment results in Term 1 and Term 4. This demonstration dashboard looks at GloSS Strategy-Level, JAM and Reading Age-Level data. Longitudinal comparisons are carried out by comparing the 2022 cohort with the combined previous years’ cohort (2018-2021). Cross-sectional comparisons are carried out for Gender, Maori, Pacifica, Asian, MELAA and Pakeha. Analyses are also carried out to establish whether learning has taken place by comparing the Term 4 distributions with the Term 1 distributions for each cohort. These analyses are carried out for all students as well as for individual year groups.

*Ordinal Logistic Regression*. This dashboard demonstrates how an Ordinal Logistic Regression Model can be used to describe how Mathematics and Reading results, gender and membership of a Kāhui Ako can influence the chances of gaining a higher grade in NCEA Level 3 Qualifications.
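To show what such a model expresses, the sketch below takes an already-fitted proportional-odds (ordinal logistic) model and converts a student's predictors into a probability for each ordered grade. Every number, the grade labels, the predictor names and the function names are hypothetical placeholders chosen for illustration, not results from any school's data, and fitting the model itself is out of scope here.

```python
import math

# Ordered outcome categories, lowest to highest (assumed labels).
GRADES = ["Not Achieved", "Achieved", "Merit", "Excellence"]

# Hypothetical fitted values: one coefficient per predictor and one
# increasing cutpoint between each pair of adjacent grades.
COEFFICIENTS = {
    "maths_score": 0.04,
    "reading_score": 0.03,
    "female": 0.20,
    "kahui_ako_member": 0.10,
}
CUTPOINTS = [3.0, 5.0, 7.0]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def grade_probabilities(predictors, coefficients=COEFFICIENTS, cutpoints=CUTPOINTS):
    """Return P(grade = k) for each ordered grade.

    Proportional-odds form: P(grade <= k) = sigmoid(cutpoint_k - x . beta),
    so a larger linear predictor shifts probability towards higher grades.
    """
    linear = sum(coefficients[name] * value for name, value in predictors.items())
    cumulative = [sigmoid(c - linear) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for cum in cumulative:
        probabilities.append(cum - previous)
        previous = cum
    return probabilities


student = {"maths_score": 60, "reading_score": 55, "female": 1, "kahui_ako_member": 0}
for grade, prob in zip(GRADES, grade_probabilities(student)):
    print(f"{grade:>12}: {prob:.2f}")

# exp(coefficient) is the odds ratio: how the odds of a higher grade
# multiply for each one-unit increase in that predictor.
print(f"Odds ratio per maths point: {math.exp(COEFFICIENTS['maths_score']):.3f}")
```

In this form each predictor has a single coefficient across all grade boundaries, which is what lets the model summarise its influence as one odds ratio.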