How can you evaluate the impact of a teaching initiative?
Is this year's cohort doing better than previous years' cohorts?
Do students in one demographic do better than others?
What data do you need, and how should you analyse them?
Sound Data analysis dashboards use the quasi-experimental methods of educational research and your assessment results to help schools answer these important questions.

Our dashboards use the appropriate statistical tests to check whether there are significant differences between the assessment results of different cohorts, measure how large these differences are, and present the findings in a teacher-friendly way. Sound Data dashboards support valid analysis and conclusions.

Comparing raw averages or percentages is just not good enough. Statistical tests provide powerful ways to describe the differences between the distributions of various cohorts. The tests for significant differences are most important and their meaning is summarised by the p-value (p). In teacher speak, the p-value is the probability of seeing differences at least as large as the ones we found if only random variation were at work (e.g., some year groups are more able than others). When the p-value falls below a critical value (e.g., 5% or 1%), random variation alone becomes an unlikely explanation, and we can claim that there are significant or highly significant differences between the outcomes of our initiative cohort and the rest. For example, if an improvement in grades this large would turn up by chance only one percent of the time, you can be pretty confident that the improvement is more than luck!

The tests to establish the magnitude of a difference are also important and their meaning is summarised by the effect size (e.g. Rosenthal’s r or Cohen’s d) or the strength of an association (e.g. Gamma). In teacher speak, the effect size tells you how large the difference is in comparison to the spread of the assessment results. When the effect size passes different thresholds (e.g., 0.1, 0.3, 0.5), we can refer to the difference as small, moderate, or large.
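As an illustration only, here is a minimal Python sketch of one such comparison. The text above does not name the specific tests the dashboards run, so this assumes a two-sided Mann-Whitney U test (normal approximation, corrected for ties, no continuity correction) paired with Rosenthal's r = |z| / sqrt(N). The function names and the scores are made up for the example.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(initiative, comparison):
    """Return (p_value, rosenthal_r) for two independent cohorts."""
    n1, n2 = len(initiative), len(comparison)
    n = n1 + n2
    combined = list(initiative) + list(comparison)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tied scores shrink the variance of U.
    counts = {}
    for value in combined:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every score identical: no evidence of a difference
        return 1.0, 0.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value, abs(z) / math.sqrt(n)


# Hypothetical scores, invented for this example.
initiative_cohort = [62, 71, 75, 78, 80, 83, 85, 88, 90, 94]
previous_cohorts = [48, 52, 55, 58, 60, 63, 66, 70, 73, 77]
p_value, effect_size = mann_whitney(initiative_cohort, previous_cohorts)
print(f"p = {p_value:.4f}, r = {effect_size:.2f}")
```

With these invented scores the p-value falls below 0.01 and r is above 0.5, which the thresholds above would describe as a highly significant, large difference. The normal approximation is rough for very small cohorts, where an exact test would be the safer choice.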

Sound Data teacher-friendly dashboards not only present the numerical test results but also provide common-language interpretations so all educators can quickly learn to extract meaning. When significant differences of sufficient size exist (p-value ≤ 0.05 and effect size ≥ 0.1), the common-language interpretation is highlighted in pink.
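The highlighting rule can be sketched as a small Python function. This is not the dashboards' actual code: the function names, the "negligible" label, and the choice to treat each threshold as inclusive are assumptions made for the example. The cut-offs themselves (0.05 for p, and 0.1, 0.3, 0.5 for effect size) come from the text.

```python
def describe_effect(effect_size):
    """Turn an effect size into a common-language label."""
    size = abs(effect_size)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "small"
    return "negligible"  # label assumed here; not taken from the text


def should_highlight(p_value, effect_size, alpha=0.05, min_effect=0.1):
    """True when a difference is both significant and of sufficient size."""
    return p_value <= alpha and abs(effect_size) >= min_effect


print(describe_effect(0.34), should_highlight(0.03, 0.34))
print(describe_effect(0.05), should_highlight(0.03, 0.05))
```

The second call shows why both conditions matter: a difference can be statistically significant yet too small to be worth highlighting.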

Different comparisons require different methods. Comparing the results of this year’s students to previous years is called a longitudinal comparison (e.g. 2022 with 2012-2021). Comparing the results of different demographics is called a cross-sectional comparison (e.g. female/male, Māori/non-Māori, Pacifica/non-Pacifica, Asian/non-Asian, Pākehā/non-Pākehā). If you have results from a single point in time (e.g. NCEA data, entrance data) you can compare the achievement of different cohorts. If you have results from two points in time (e.g. PAT results from Term 1 and Term 4, or PAT results from Year 9 and NCEA results from Year 13) you can compare the progress or learning of different cohorts. Your unit of analysis could be the subject, curriculum area, year level, room, teaching initiative and so on.

Sound Data dashboards are available either as packages or as bespoke dashboards co-constructed with you and your team. Sound Data packages include multiple dashboards containing common forms of student data and analysis, plus data coaching so you and your team can learn to get the best from your analyses. The data can include NCEA grades, credits and qualifications; PAT, e-asTTle, SMART, JAM, GloSS, IKAN, Probe and Running Records; curriculum levels and overall teacher judgements; and surveys of student and parent voice. Here are some examples of Sound Data teacher-friendly dashboards.

The NCEA Subject Dashboard analyses the NCEA data for different subjects. This demonstration dashboard looks at Level 2 English, Mathematics and Physics. Achievement data includes: Grades (the total combined grade distribution for all standards, externals and internals, as well as for individual achievement and unit standards); Credits (internal, external, total and 14+); and Course endorsements. Longitudinal comparisons compare the 2025 cohort with a combined previous years’ cohort (2013-2024). Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA and Pākehā.

The NCEA Level Qualifications Dashboard analyses the NCEA qualifications gained by Year 11, 12 and 13 students. Qualifications include NCEA Level 1, 2 and 3 and University Entrance. Longitudinal comparisons are carried out comparing the 2025 cohort with a combined previous years’ cohort (2013-2024). Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA and Pākehā. Theoretical comparisons are carried out comparing the demonstration school with national statistics.

The NZCER Progress Dashboard analyses student progress by comparing students’ scale score results over specified periods, for example, between Term 1 and Term 4. This demonstration dashboard examines PAT Comprehension and Mathematics data. Longitudinal comparisons are carried out by comparing the latest cohort with the combined previous years’ cohort. Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA, Pākehā, and ESOL. Analyses are also carried out to establish whether learning has taken place by comparing the Term 4 distributions with the Term 1 distributions for each cohort. Teaching group results are compared with school Year Level norms. Theoretical comparisons compare actual results with national means.

The NZCER Term Achievement Dashboard analyses achievement at different time points, e.g. in Term 1, Term 2, Term 3 and Term 4. This demonstration dashboard looks at PAT Comprehension and Mathematics. Longitudinal comparisons are carried out by comparing the latest cohort with the combined previous years’ cohorts. Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA, Pākehā and ESOL. Teaching group results are compared with school Year Level norms. Theoretical comparisons compare actual results with national means.

The e-asTTle Progress Dashboard analyses student progress by comparing students’ scale score results over specified periods, for example, between Term 1 and Term 4. This demonstration dashboard examines e-asTTle Reading data. Longitudinal comparisons are carried out by comparing the latest cohort with the combined previous years’ cohort. Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA, Pākehā, and ESOL. Analyses are also carried out to establish whether learning has taken place by comparing the Term 4 distributions with the Term 1 distributions for each cohort. Teaching group results are compared with school Year Level norms. Theoretical comparisons compare actual results with expectations.

The e-asTTle Term Achievement Dashboard analyses achievement at different time points, e.g. in Term 1, Term 2, Term 3 and Term 4. This demonstration dashboard looks at e-asTTle Reading. Longitudinal comparisons are carried out by comparing the latest cohort with the combined previous years’ cohorts. Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA, Pākehā and ESOL. Teaching group results are compared with school Year Level norms. Theoretical comparisons compare actual results with expectations.

The Primary Stage Assessments Progress Dashboard incorporates assessments such as GloSS, JAM and Reading Age-Level Records, and analyses the progress students make during the year by comparing their assessment results in Term 1 and Term 4. This demonstration dashboard looks at GloSS Strategy-Level, JAM and Reading Age-Level data. Longitudinal comparisons are carried out by comparing the 2022 cohort with the combined previous years’ cohort (2018-2021). Cross-sectional comparisons are carried out for Gender, Māori, Pacifica, Asian, MELAA and Pākehā. Analyses are also carried out to establish whether learning has taken place by comparing the Term 4 distributions with the Term 1 distributions for each cohort. These analyses are carried out for all students as well as individual year groups.

To evaluate progress and impact across sectors we need to combine different assessments. One way of doing this is to use a regression model to describe the relationship between (1) a dependent variable and (2) a set of independent variables. This allows us to explore how the independent variables influence the dependent variable. Simple linear regression models require the variables to have particular properties, such as being continuous scales. Think of the scatter graphs you drew in your school science classes to explore the relationship between the acceleration of an object and the unbalanced force acting on it. However, in New Zealand many of our important educational outcomes are not scale variables but ordered grades. For example, the NCEA Level 3 Qualifications of Not Achieved, Achieved, Merit and Excellence form an ordered rather than a continuous variable. Although we cannot use linear regression to explore what factors influence NCEA Level 3 Qualifications, we can use an Ordinal Logistic Regression instead. This dashboard uses an Ordinal Logistic Regression Model to describe how Year 9 Reading and Mathematics results, gender and a professional development initiative influence the chances of gaining a higher grade in NCEA Level 3 Qualifications.
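To make the idea concrete, here is a minimal Python sketch of how an ordinal logistic (proportional odds) model turns predictors into grade probabilities. It assumes the common cumulative-logit form, P(grade ≤ k) = logistic(cutpoint_k − score). The coefficients, cutpoints and predictor names are invented for illustration; they are not fitted to any real data and are not taken from the dashboard.

```python
import math

GRADES = ["Not Achieved", "Achieved", "Merit", "Excellence"]

# Hypothetical values, invented for this example (not fitted to real data).
COEFFICIENTS = {"reading": 0.8, "maths": 0.6, "female": 0.2, "initiative": 0.5}
CUTPOINTS = [-1.5, 0.5, 2.0]  # must be increasing; one fewer than the grades


def logistic(x):
    return 1 / (1 + math.exp(-x))


def grade_probabilities(predictors, coefficients=COEFFICIENTS, cutpoints=CUTPOINTS):
    """Return the probability of each grade for one student."""
    score = sum(coefficients[name] * value for name, value in predictors.items())
    # Cumulative probabilities P(grade <= k); the top grade closes at 1.
    cumulative = [logistic(cut - score) for cut in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return dict(zip(GRADES, probabilities))


# One hypothetical student, with Year 9 results expressed as standardised scores.
student = {"reading": 0.5, "maths": -0.2, "female": 1, "initiative": 0}
without_initiative = grade_probabilities(student)
with_initiative = grade_probabilities({**student, "initiative": 1})

for grade in GRADES:
    print(f"{grade:13s} {without_initiative[grade]:.2f} -> {with_initiative[grade]:.2f}")
```

In this form, a coefficient of 0.5 for the initiative multiplies the odds of reaching a higher grade by exp(0.5), roughly 1.65, at every grade boundary. That shared multiplier is the "proportional odds" assumption, and it is worth checking before relying on a fitted model.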