Section 2: Criteria for regulation

32. The criteria for regulating National Assessments comprise the common criteria (see paragraphs 34-36) and the assessment criteria (see paragraphs 37-41).

33. Responsible bodies with functions relating to the development of National Assessments should develop assessments that meet the common and assessment criteria, and should ensure that their suppliers meet these requirements.

Common criteria

34. The common criteria are set by Ofqual and are used to judge the extent to which the outcomes of statutory National Curriculum assessment arrangements meet, or are likely to meet, their statutory or specified purpose. Ofqual must consult with relevant stakeholders on any proposed changes to the common criteria from one year to the next.

35. The common criteria apply to the development, implementation, marking and reporting of externally assessed tests and internally assessed tasks, and the moderation of teacher assessment and Early Years Foundation Stage profile judgements. The responsible body should ensure that the procedures for each part of the assessment process and activities meet the relevant common criteria, and that the common criteria are considered during any evaluation.

36. Ofqual will apply the following five common criteria to regulation:

  • Validity - the central concept in evaluating the quality of assessment outcomes. It is the overarching concept and subsumes other concepts. All regulatory activities will consider the extent to which an assessment activity provides a valid outcome. Validity is a multi-faceted but unitary concept: other concepts, such as reliability, comparability and minimising bias, are aspects of validity, but validity remains one whole, indivisible concept. However, aspects of validity can be in conflict; for instance, to make test scores more reliable, one might make a test longer, but this would tend to make the test less manageable.

    Validity pertains to the arguments or interpretations placed on assessment outcomes, results or scores. In evaluating validity, one would be evaluating arguments such as:

    1. 'The results of this assessment provide a sound basis for informing teachers on pupils’ or children’s progress at the next stage of learning.'
    2. 'The results of this assessment provide useful information for teachers in succeeding years to understand pupils’ strengths and weaknesses.'

    The evaluation of validity will amount to working out whether the outcomes of the assessment procedure (grades, profile of the child, etc) provide adequate information to sustain the argument being made. One needs to evaluate validity separately for each interpretation that is made.

    Validity will rarely be an absolute condition; for example, one would be more likely to decide that ‘the outcomes from this assessment are sufficiently valid’ rather than saying that they are valid in absolute terms.

  • Reliability - relates to the propensity of an assessment procedure to generate consistent outcomes. If an assessment procedure tends to give the same result when repeated, then it will tend to be reliable. Reliability is a property of the assessment outcomes (scores, grades, levels, etc), not of the test itself.

    Reliability is a necessary condition for validity; if an assessment procedure is not measuring consistently, in effect it is not measuring at all. Achieving reliability in teacher assessment depends on, for example, the quality of the criteria, the moderation processes, and the exemplification and guidance provided.

  • Comparability - relates to the extent to which several (different) assessment procedures generate consistent outcomes. For example, one might consider comparability between:
    1. years - was this year’s assessment easier or harder than last year’s?
    2. areas of learning or subjects - is the standard expected in one area of learning higher or lower than in another?

    Like reliability, comparability relates to assessment outcomes such as scores or grades rather than the assessment itself. As with validity, aspects of comparability can be in conflict: for example, one may have to choose whether to retain comparability between years or seek to improve the moderation of assessments. One may not be able to guarantee both.

  • Manageability - an assessment can be deemed manageable if it is feasible to carry it out, given known practical constraints such as time, budget, numbers of pupils, etc. Unlike validity, reliability and comparability, manageability does relate to the assessment procedure itself; for example, one could have a very long test that produced very reliable results, yet was not sufficiently manageable.

  • Minimise bias - an assessment can be said to minimise bias in so far as it does not produce unreasonably adverse outcomes for a group of learners with given social characteristics (for example, gender, age, disability, sexuality, ethnic origin, socioeconomic status, etc). Minimising bias as a characteristic of assessment is closely related to statutory equality duties. Ofqual’s approach to the discharge of those duties can be found in its Equality Scheme.

Assessment criteria

37. The assessment criteria supplement the common criteria and must:

  • clarify how the assessment meets the purposes of the assessment that have been set by the Secretary of State
  • illustrate how the responsible body is delivering its remit
  • describe how the assessment will meet the common criteria
  • describe the form of the assessment
  • define the coverage of the relevant area of learning relating to that key stage.

38. The assessment criteria for each National Assessment are produced by the responsible body that has the remit for developing and/or implementing the relevant National Assessment.

39. The assessment criteria should be agreed with Ofqual. A responsible body should seek approval from Ofqual if it wishes to modify the assessment criteria. Changes will be agreed if they are deemed to be in the interest of the learner or to secure the integrity of the assessment.

40. Responsible bodies should develop specifications and associated assessment arrangements that meet both the common and assessment criteria.

41. The responsible body for a particular assessment, or part of an assessment, should keep the relevant assessment criteria under review. Any proposed revisions should be agreed with Ofqual. Any changes advised by Ofqual should be implemented by the responsible body. Revisions may arise as a result of the monitoring and evaluation of the assessment arrangements, changes in assessment policy, revised statutory orders or National Curriculum requirements. The arrangements described in Section 4, ‘Managing change’, should be followed.
